Some Key Locational Terms and Concepts in Network Analysis

Here is my own working list of key terms and definitions in network analysis. Several of these are also discussed in the wonderfully fun Six Degrees of Spaghetti Monsters blog site, with examples from the social network of Harry Potter. This list accompanies my tutorial on Network Analysis and Cytoscape for XML Coders and my Thalaba project post, “Spectacular Intersections of Place.”

Walk: A sequence of nodes and lines with a beginning node and an end node. A walk can double back on itself, so it need not be a straight route. A walk (like a trail or a path) has a length: the number of lines it uses, counting a line each time it is traversed.

Trail—A walk with distinct lines—no connection (or communication or link) is used more than once, but a node can come up more than once (doubling back).

Path: A walk in which all nodes and all lines are distinct. No node is visited more than once along a path, which makes it a direct route.

Closed Walk: begins and ends at the same node (it loops back to its start).

Cycle: a closed walk of 3+ nodes in which all lines are distinct and all nodes are distinct, except that the start node = the finish node.

Tour: a closed walk that uses each line in the whole graph at least once.
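The distinctions among walk, trail, path, closed walk, and cycle can be checked mechanically. Here is a minimal Python sketch, assuming a simple undirected graph stored as a dictionary of neighbor sets. The toy graph and the function names are my own illustrations, not drawn from any particular library.

```python
def lines_of(seq):
    """Return the undirected lines used by a node sequence, in order."""
    return [frozenset(pair) for pair in zip(seq, seq[1:])]

def walk_length(seq):
    """Length of a walk: the number of lines it traverses."""
    return len(seq) - 1

def is_walk(graph, seq):
    """Every consecutive pair of nodes must be joined by a line."""
    return len(seq) >= 1 and all(
        b in graph.get(a, set()) for a, b in zip(seq, seq[1:])
    )

def is_trail(graph, seq):
    """A walk in which no line is used more than once."""
    lines = lines_of(seq)
    return is_walk(graph, seq) and len(lines) == len(set(lines))

def is_path(graph, seq):
    """A trail in which no node is visited more than once."""
    return is_trail(graph, seq) and len(seq) == len(set(seq))

def is_closed_walk(graph, seq):
    """A walk that ends at the node where it began."""
    return is_walk(graph, seq) and len(seq) > 1 and seq[0] == seq[-1]

def is_cycle(graph, seq):
    """A closed walk of 3+ nodes with distinct lines and distinct inner nodes."""
    inner = seq[:-1]
    return (
        is_closed_walk(graph, seq)
        and is_trail(graph, seq)
        and len(inner) >= 3
        and len(inner) == len(set(inner))
    )

# Hypothetical toy graph: a triangle A-B-C with a pendant node D hanging off B.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

doubling_back = ["A", "B", "D", "B", "C"]   # reuses the line B-D
revisits_node = ["D", "B", "A", "C", "B"]   # reuses node B, but no line
direct_route = ["D", "B", "A", "C"]         # nothing repeated
```

In this sketch `doubling_back` counts as a walk but not a trail, `revisits_node` as a trail but not a path, and `direct_route` as a path of length 3.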

Connected vs. Disconnected: Is there a path between every pair of nodes in the graph? If not, the graph is disconnected, and we can refer to its components (the connected units of it).

Geodesic: shortest path between two nodes. Geodesic distance: length of the shortest path. If there’s no path between nodes, the geodesic distance is either considered infinite or undefined, since they can’t be reached.

Eccentricity (or association number): largest geodesic distance between a node and any other node

Diameter of a graph: the largest geodesic distance between any two nodes.
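Geodesic distance, eccentricity, and diameter build on one another, and a breadth-first search is enough to compute all three on an unweighted graph. The following is a sketch in Python under that assumption; the toy graph is hypothetical, and I follow the convention of treating unreachable nodes as infinitely far away.

```python
from collections import deque

def geodesic_distances(graph, start):
    """Breadth-first search: geodesic distance from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def eccentricity(graph, node):
    """Largest geodesic distance from node to any other node.

    Returns infinity when some node cannot be reached at all.
    """
    dist = geodesic_distances(graph, node)
    if len(dist) < len(graph):
        return float("inf")
    return max(dist.values())

def diameter(graph):
    """Largest eccentricity found among all nodes."""
    return max(eccentricity(graph, node) for node in graph)

# Hypothetical toy graph: a triangle A-B-C with a pendant node D hanging off B.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

from_d = geodesic_distances(graph, "D")
```

Here D is two steps from A and from C, so D has eccentricity 2, B has eccentricity 1, and the diameter of the whole graph is 2.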

Connectivity: Does a graph remain connected without particular nodes or lines?

Vulnerability: how easily a graph is broken apart by removing a few nodes or edges.

Cutpoint and Cutset: Cutpoint = a node that, if removed, leaves more components than before (it splits a unified graph). Cutset = a set of nodes necessary to maintain connectedness, so that removing the whole set disconnects the graph.

Bridge: a line (edge) critical to connectedness, such that removing it disconnects the graph.
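One way to make cutpoints and bridges concrete is simply to remove each node or line in turn and count the components that remain. The Python sketch below takes that brute-force approach on a small hypothetical graph with no self-loops. It is fine for small graphs, though faster algorithms exist and this is not how any particular tool necessarily does it.

```python
def count_components(graph):
    """Count the connected components of an undirected graph."""
    seen, components = set(), 0
    for start in graph:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
    return components

def without_node(graph, removed):
    """Copy of the graph with one node and all its lines deleted."""
    return {n: nbs - {removed} for n, nbs in graph.items() if n != removed}

def without_line(graph, a, b):
    """Copy of the graph with the single line a-b deleted."""
    copy = {n: set(nbs) for n, nbs in graph.items()}
    copy[a].discard(b)
    copy[b].discard(a)
    return copy

def cutpoints(graph):
    """Nodes whose removal leaves more components than before."""
    base = count_components(graph)
    return {n for n in graph if count_components(without_node(graph, n)) > base}

def bridges(graph):
    """Lines whose removal leaves more components than before."""
    base = count_components(graph)
    lines = {frozenset((a, b)) for a in graph for b in graph[a]}
    return {
        line for line in lines
        if count_components(without_line(graph, *line)) > base
    }

# Hypothetical toy graph: a triangle A-B-C with a pendant node D hanging off B.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}
```

In this toy graph B is the only cutpoint and the line B-D is the only bridge: the triangle survives the loss of any one of its lines.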

Centralities of Various Kinds: A Useful Site for Telling Them Apart

Degree Centrality – The most central node has the highest number of ties to other nodes

Ego Density: a node’s ties / the maximum number of ties it could possibly have (one to every other node in the graph).

Closeness and Closeness Centrality: How quickly can a node interact with all the other nodes? Does the node need to rely on lots of other nodes to connect across the graph, or can it get to all these nodes relatively quickly?
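Degree and closeness can both be computed from the definitions above with very little code. This Python sketch assumes a connected, unweighted, undirected graph. It standardizes degree by the maximum possible number of ties, and it scores closeness as the number of other nodes divided by the sum of geodesic distances to them, which is one common normalization among several. The graph and function names are illustrative only.

```python
from collections import deque

def geodesic_distances(graph, start):
    """Breadth-first search distances from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def degree_centrality(graph):
    """A node's ties divided by the most ties it could have (g - 1)."""
    most = len(graph) - 1
    return {node: len(nbs) / most for node, nbs in graph.items()}

def closeness_centrality(graph):
    """(g - 1) divided by the sum of geodesic distances to all other nodes.

    Nodes that cannot reach the whole graph are scored 0.0 in this sketch.
    """
    scores = {}
    for node in graph:
        dist = geodesic_distances(graph, node)
        total = sum(dist.values())
        reachable_all = len(dist) == len(graph)
        scores[node] = (len(graph) - 1) / total if reachable_all and total else 0.0
    return scores

# Hypothetical toy graph: a triangle A-B-C with a pendant node D hanging off B.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
```

B is adjacent to everyone, so it scores 1.0 on both measures, while the pendant node D has the lowest closeness because it has to go through B to get anywhere.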

Betweenness and Betweenness Centrality: Which nodes are in-between other nodes—which are necessary to control or mediate interactions?
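To see how a node with few ties can still be the crucial go-between, it helps to compute betweenness on a toy example. The sketch below is my own Python rendering of Brandes's well-known algorithm for unweighted, undirected graphs, reporting raw (unnormalized) scores. The graph is hypothetical: two triangles joined only through a broker node D.

```python
from collections import deque

def betweenness_centrality(graph):
    """Raw betweenness: for each node, the number of shortest paths between
    other pairs of nodes that pass through it (fractional when paths tie)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search, counting shortest paths (sigma) to each node.
        order = []
        preds = {node: [] for node in graph}
        sigma = {node: 0 for node in graph}
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both ends in an undirected graph.
    return {node: value / 2 for node, value in score.items()}

# Hypothetical graph: triangles A-B-C and E-F-G, joined only through D.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D", "F", "G"},
    "F": {"E", "G"},
    "G": {"E", "F"},
}

betweenness = betweenness_centrality(graph)
```

D has only two ties, fewer than C or E, yet every shortest path between the two triangles runs through it, so it comes out with the highest betweenness in this example.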

Eigenvector Centrality: measures the influence of a node on the other nodes around it–a way of studying the relative importance of nodes to making other nodes more central
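The idea that a node is important when its neighbors are important can be approximated by repeatedly replacing each node's score with the sum of its neighbors' scores. This Python sketch uses that power-iteration approach and assumes a connected, non-bipartite, undirected graph, since otherwise the scores may not settle. The iteration count and the toy graph are arbitrary choices of mine.

```python
def eigenvector_centrality(graph, iterations=100):
    """Approximate eigenvector centrality by power iteration.

    Scores are rescaled after each round so that the top node scores 1.0.
    """
    score = {node: 1.0 for node in graph}
    for _ in range(iterations):
        # Each node's new score is the sum of its neighbors' current scores.
        new = {node: sum(score[nb] for nb in graph[node]) for node in graph}
        top = max(new.values())
        if top == 0:
            return score  # no lines at all: nothing to propagate
        score = {node: value / top for node, value in new.items()}
    return score

# Hypothetical toy graph: a triangle A-B-C with a pendant node D hanging off B.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

scores = eigenvector_centrality(graph)
```

In this toy graph B ends up with the top score, A and C tie below it, and D trails behind even though its one neighbor is the best-connected node in the graph.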

Random Walk Centrality: involves starting from any node and randomly moving about, and asks how long it takes to traverse the network. It is sort of like turning on a tap at one node and watching to see where the water runs.

Information and Information Centrality: Information of a path = inverse of its length. “In brief, the length of any path is directly related to the variance of transmitting a signal from one node to another; thus the information contained in this path is the reciprocal of this variance. Thus any path (and hence, each and every combined path) has an ‘information content.’” (Wasserman and Faust 194)

Clique: 3+ nodes that are all adjacent to each other, forming a subset of nodes in which no other node is also adjacent to ALL the members. (Thalaba is full of cliques.)
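The two halves of the clique definition (everyone adjacent to everyone, and nobody outside adjacent to all the members) translate directly into two checks. Here is a small Python sketch of those checks on a hypothetical graph. It tests one candidate set at a time rather than searching for all cliques.

```python
from itertools import combinations

def is_complete(graph, nodes):
    """True when every pair of the given nodes is adjacent."""
    return all(b in graph[a] for a, b in combinations(nodes, 2))

def is_clique(graph, nodes):
    """True for 3+ mutually adjacent nodes that no outside node could join."""
    members = set(nodes)
    if len(members) < 3 or not is_complete(graph, members):
        return False
    outsiders = set(graph) - members
    # Maximality: no outsider may be adjacent to ALL the members.
    return not any(members <= graph[outsider] for outsider in outsiders)

# Hypothetical graph: A, B, C, D are all mutually adjacent; E hangs off D.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D"},
}
```

Here `{"A", "B", "C"}` is complete but not a clique, because D is adjacent to all three, whereas `{"A", "B", "C", "D"}` is a clique.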

Small World: Most nodes aren’t connected to each other, but can be reached in one or two steps (strangers linked by mutual acquaintance)


Borgatti, Stephen P. “Centrality and Network Flow.” Social Networks 27 (2005) 55–71.

Newman, M. E. J. “A measure of betweenness centrality based on random walks.” arXiv:cond-mat/0309045v1 [cond-mat.stat-mech] (Submitted on 1 Sep 2003).

Wasserman, Stanley and Katherine Faust. Social Network Analysis: Methods and Applications (Cambridge UP, 1994).

A Network Analysis and Cytoscape Tutorial

I’ve finally completed a tutorial I’d long promised on network analysis and plotting graphs with Cytoscape, making use of the Network Analyzer tools to exemplify important concepts in graph theory! It’s probably riddled with errors, so thanks for any feedback and corrections here. It is also liberally dotted with screen captures, and I hope this provides a useful, in-depth introduction.

This accompanies my posts here on Locational Terms and Concepts in Network Analysis and on my Thalaba Network Analysis Project.


New Developments in the Thalaba Antisocial Network Analysis

Practical Matters: Working in Cytoscape:

See my Tutorial, An Introduction to Network Analysis and Cytoscape for XML Coders for a much more detailed explanation of how to read and make network graphs, and step-by-step advice on things you can do and try with Cytoscape’s network analyzer tools.

Sample .tsv file for import to Cytoscape:

Use import wizard to designate nodes and edges, and node + edge attributes for use in labelling info.

Eliminate self-loops and remove duplicated edges in Edit menu (or find ways to filter out unnecessary information that clutters your graph).
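If you would rather tidy the edge table before it ever reaches Cytoscape, the same cleanup can be sketched in a few lines of Python. The column names (`source`, `interaction`, `target`) and every row below are hypothetical placeholders, not my actual Thalaba data. The sketch treats edges as undirected, so A to B and B to A count as duplicates.

```python
import csv
import io

# Hypothetical tab-separated edge table, with one duplicate row and one self-loop.
SAMPLE_TSV = (
    "source\tinteraction\ttarget\n"
    "lg_1\tmentions\tPlace_A\n"
    "lg_1\tmentions\tPlace_A\n"
    "Place_A\tmentions\tPlace_A\n"
    "lg_1\tmentions\tPlace_B\n"
    "lg_2\tmentions\tPlace_A\n"
)

def clean_edges(rows):
    """Drop self-loops and duplicated (undirected) edges, keeping first sightings."""
    seen = set()
    cleaned = []
    for row in rows:
        source, target = row["source"], row["target"]
        if source == target:
            continue  # self-loop
        key = (frozenset((source, target)), row["interaction"])
        if key in seen:
            continue  # duplicated edge
        seen.add(key)
        cleaned.append(row)
    return cleaned

def to_tsv(rows, columns=("source", "interaction", "target")):
    """Serialize rows back to tab-separated text for import."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
edges = clean_edges(rows)
cleaned_tsv = to_tsv(edges)
```

Of the five sample rows, three survive. The resulting text can be saved with a `.tsv` extension and its columns designated as source, interaction, and target during import.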

Use the Network Analyzer to calculate network statistics. Think about them. Choose among graphical layouts with care.

Of interest in my original graph was Betweenness Centrality of Nodes. Where does my network of places break (cutpoints)? Metaplaces were essential to network coherence.

Now, I’ve generated some new graphs to simplify our view of the places, eliminate the clutter of line-group nodes (moving that information to the edges). And I’m interested in Path Lengths of Edges (Average Shortest Path Length), and Closeness of Nodes.
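One way to read "moving that information to the edges" is as a projection: instead of linking each line group to the places it mentions, link two places directly whenever they share a line group, and record the line group on that edge. The Python sketch below shows that interpretation with made-up identifiers. It is an illustration of the idea, not the scripts I actually used.

```python
from collections import defaultdict
from itertools import combinations

def project_places(mentions):
    """Turn (line_group, place) pairs into place-to-place edges.

    Returns a dict mapping each (place_a, place_b) pair to the sorted list of
    line groups in which both places occur.
    """
    places_by_group = defaultdict(set)
    for line_group, place in mentions:
        places_by_group[line_group].add(place)

    edges = defaultdict(list)
    for line_group in sorted(places_by_group):
        # Every pair of places sharing this line group gets an edge labelled by it.
        for a, b in combinations(sorted(places_by_group[line_group]), 2):
            edges[(a, b)].append(line_group)
    return dict(edges)

# Hypothetical mentions: line-group identifiers and place names are invented.
mentions = [
    ("lg_1", "Place_A"),
    ("lg_1", "Place_B"),
    ("lg_1", "Place_C"),
    ("lg_2", "Place_A"),
    ("lg_2", "Place_B"),
]

place_edges = project_places(mentions)
```

The line-group nodes disappear from the graph, and the number of line groups on each edge can then serve as an edge weight.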

Here’s a new network graph oriented to Closeness Centrality of Nodes (SVG output from Cytoscape).

I’m also measuring the Eccentricity of the Nodes–how far they are from each other, which produced a rather remarkable result: (SVG output from Cytoscape)

Day of DH 2014, and Pennsylvania Digital Humanists–Forming a Network?


There is a new Keystone DH Group forming on the Day of DH 2014 site, initially organized by Chris Long at Penn State University and Diane Jakacki at Bucknell University. It’s an effort to organize DH people from the state of PA together, and it could be good for interested Pennsylvanians to help build this and promote it. (Who knows–we could find ourselves pooling resources to organize a “PADH” conference or some such madness!) 
The group is organizing through Day of DH 2014, coming up next Tues. April 8. I’ve just signed up to participate in Day of DH and to join the Keystone DH group there (on the group pages). If you’re here with me in the state of PA, and you work on digital humanities projects you may want to sign up, too! 
Keystone DH Google Plus Community Page:
The Keystone Group seems to be just now forming–not many members yet: I know there are many more of us DH-istas here in PA, and it would be great to see us all form a supportive network! 


Project Overhaul: Pacific Update!

Exhausted, happy, a little anxious about not being prepared for the week ahead, I’ve spent the entire weekend overhauling and updating Digital Archives and Pacific Cultures, and I confess to great fascination and love for this project. I discover that we’re probably better known on our mirror site at, which is right and proper, since our site was raised in an Australian monotreme’s nest and maintains a home there. I needed to update the site with the results of our course projects from the Digital Humanities course last December: new voyage files, graphs, charts, and maps—adventures with latitude and longitude extraction and conversion from 18th-century records. We’re making a point of sharing resources on the Pacific voyages that are hard to find, and this has sent us to studying the Forsters, father and son, who accompanied Captain Cook on his second circumnavigation voyage and shared an in-depth perspective on their cultural encounters with Pacific islanders.

So, this weekend Georg Forster (the younger Forster) sent me off on an unexpected adventure chasing after, I kid you not, belching seals in ancient Greek with diacritical markings. We autotagged the Forster texts from ancient word processing files sent us by Nicholas Thomas, who’d edited them for print publication–and we’re grateful to have them since they’re the ONLY digital resource we have of their work! But autotagging TEI XML from old word-processing files of gigantic voyage publications is fraught with perils, one of them being that you lose track of ancient Greek text that didn’t manage to be typed in a Unicode font. So, I happened to stumble into a passage of nonsense that, on consulting the printed text, turned out indeed to be the kind of classical Greek with an impressive variety of little accents and circumflexes and suchlike… and after some dedicated research of a few hours on Saturday, I was able to produce this:

We fell in with many herds of sea-bears, and sea-lions, which we did not attack, as another party was sent out upon that errand. We observed however, that these two species, though sometimes encamped on the same beach, always kept at a great distance asunder, and had no communication with each other. A strong rank stench is common to them, as well as to all other seals; a circumstance as well known to the ancients, as their inactivity and drowsiness whilst they lie on shore.

__________Φῶκαι νέποδες__________ 

ἀθρόαι εὕδουσιν, πολιῆς ἁλὸς ἐξαναδῦσαι, 

πικρὸν ἀποπνείουσαι ἁλὸς πολυβενθέος ὀδμήν.

 Webfooted seals forsake the whitening waves,
And sleep in herds, exhaling nauseous stench. 

Rowing along shore, we fell in with a spot where several thousand shags had built their nests, on those elevated tufts which I have mentioned before. Here was an opportunity to provide the whole ship’s company with a fresh meal, which was not neglected. The birds were for the greatest part so tame, as to let our boat’s crew come among them with clubs and staves; by which means several hundreds of them were killed. On this day’s excursion we found a bird of a new genus, which was of the size of a pigeon, and perfectly white. It belonged to the class of wading water-fowl, its toes were half webbed, and its eyes, as well as the base of the bill, surrounded by many little white glands or warts. It had such an horrid offensive smell, that we could not taste the flesh, though at this time we were not easily disgusted. 

It’s a wonderfully stinky passage–redolent of much of what ought to fascinate us about the Pacific voyages if we could only read them in fascinating snippets like this. Forster’s source for the Greek is none other than the Odyssey’s Book 4, as I quickly learned from the Perseus Project.

Meanwhile, the question is whether I’ve mismanaged my time horribly by posting students’ project work from last December rather than concentrating on the steadily aging digital grading piles my students have submitted this semester. I don’t know, but I can say this: I’m glad I had students to help with the Pacific project, and that they’ve had a chance to contribute to some real research resources—their time was not wasted and the Worldwide Web of Ideas is smarter because of their work and my finally getting around to posting it. And the site is actually fun, after all—if you haven’t spun one of our Google Earth KML viewers and read out bits of the voyages, it’s high time to go try that out… Spin Cook’s Second Voyage map over to the Cape of Good Hope and read about the fire-in-the-water from  the wee phosphorescent floating creatures that Cook and company sampled in buckets to study, and be amazed at the sight of strange worlds and the pungent odors of unknown species!

Spectacular Intersections of Place in Southey’s Thalaba the Destroyer

I began this project in 2013, as the first stage in what may grow into a long-range network analysis project on epic poems by Robert Southey and his contemporaries, composed around the turn of the nineteenth century. These elaborate poems feature multiple books and cantos with long prose footnotes that hold an encyclopedic range of references to the sciences and world cultures. Such complicated texts, with their intricate structures and their complex syntheses of worldviews, make very promising candidates for XML-based digital scholarship to study content that human readers have difficulty processing without computational help. With XML markup and network analysis tools we can trace distinctive patterns in Southey’s references to locations and cultures around the world, as well as above, beneath, and beyond it.


Back in the early 1990s when we had “humanities computing” but not “digital humanities” yet, Javed Majeed published a wonderfully illuminating discussion of Robert Southey’s Orientalist epics in his book on James Mill. I’ve long admired Majeed’s description of Southey as writing “as though he were in a laboratory of cultures, experimenting with and constructing different cultural identities.”[1] The image is entirely befitting for Southey the scholarly poet who sustained his youthful goal of writing epic poems “exhibiting the most remarkable forms of mythology,” and set to work scientifically with a scholarly dedication to a complex modelling of diverse belief systems in complex juxtaposition. Thalaba the Destroyer, drafted between 1798 and 1801, represents a remarkable accomplishment in juxtaposition with its effort to investigate the world from a fatalist Islamic point of view, as a scripted, written world, where the language of the divine is discernible in faces of grasshoppers and where the wordplay of sorcerers attempts to interfere with the source code of nature. (In one scene a young girl, Oneiza, reads letters on the face of a grasshopper and pores over them with Thalaba who interprets them as a divine message.) Taking its epic cue from a Muslim concept of God as writer, the worldview of Thalaba takes shape as a text over which sorcerers labor to comprehend so as to rewrite. Words are a code controlling the things of the world, and are the tools of empirical experiment.

Southey’s scientific efforts appear to have involved careful record-keeping in his common-place book, a research record that he appears to have mined exhaustively in the drafting of his epic poems with their elaborate annotations. Dahlia Porter has explored how Southey’s common-place book reflects his systematic use of Enlightenment methods to collect evidence of cross-cultural patterns.[2] We might further see this systematic exploration of cultural patterns in the elaborately structured epic poems Southey constructed, and particularly in their interplay of poetry and prose annotation. From Southey’s compilation and composition process, we can see his annotated epic poems as, indeed, conducting the laboratory work of “world cultural studies” at the turn of the nineteenth century. For Southey, no longer should “the epic” in English prioritize a classical heritage, but rather it should become the poetry of world cultures.  Southey’s cultural “lab” productions seem oddly comparable to the work of Franco Moretti on the novel as a “planetary form” as both scholars applied themselves to expanding a common frame of reference for the understanding of “the epic” and “the novel.”

Quite seriously we might ask, what has Robert Southey to do with Franco Moretti? Simply this—an emphasis on accumulating large quantities of cultural information on a planetary scale, and a sustained effort over decades to visualize and model that information in systematic and structured ways. With Moretti, it’s “graphs, maps, and trees” to help scholars to recognize more large-scale patterns than we could before, while with Southey, the intellectual inheritor of an eighteenth-century “Information Age,” we have the re-tooling of the ancient epic poem to model and study mythic thinking and ritual practices around the planet. There may be something Southeyan in today’s “laboratories” devoted to linguistic artifact analysis, like NUDHL (Northwestern University Digital Humanities Laboratory), the NYPL Labs, or the one that Moretti directs, the Stanford Literary Lab. Research labs at Digital Humanities institutes design their experiments with an emphasis on the quantitative and the structural in textual studies. Moretti, a pioneer of distant-reading methods, provocatively, even spectacularly, demonstrates how to deploy computers as a tool to reveal cultural patterns more effectively than we can decipher with unassisted eyes and brains. His 2011 article “Network Theory, Plot Analysis” (republished in his June 2013 book Distant Reading) offers a helpful introduction to how network theory can be applied to literary studies, and I’ve taken it as a starting point for the project I’m about to describe—an experiment with tagging and processing the 1801 edition of Thalaba the Destroyer in order to step back and study from a distance what patterns emerge from the elaborate structure of the text.


It is challenging for a theory-directed exegesis of texts guided by unassisted eyes to come to grips with densely allusive compendious constructions like Southey’s epic poems or, for that matter, Herman Melville’s novels, without handling them reductively—and indeed we may well turn to writers like Southey and Melville to challenge a prescriptive reductiveness in our theoretical constructions of textuality, cultural encounter, and empire. Southey’s complicated epics challenge us just as they did his immediate audience, and might well expose us in our 21st-century weakness: we cannot easily assess their elaborate interplay of contexts, their investigative reading of a centuries-old archive of records on cultural encounters, their blending of ancient and contemporary sources from voyage logs and travel narratives. Their massiveness of accumulation seems, indeed, remarkably suited to try out the methods of our current Computer Lab “Scientists” of the Humanities: We may be able to Read More of Southey without Getting Lost in his labyrinthine notes, if we deploy the tools of “distant-reading.” This summer I’ve begun the effort from the moderate distance of processing (at first) just one long compendious text, together with its weighty proliferation of paratext notes. I will not go so far as Moretti as to say that we should just quit reading texts, because I find that our close-reading methods are amplified and even intensified by work with the Tools and Coding available to us as practicing humanities scholars.

Southey’s poem is not exactly light reading, even for readers accustomed to long epic and romance poems of past centuries. Thalaba the Destroyer shares the encyclopedic dimensions of heavily annotated Enlightenment era epics like Erasmus Darwin’s The Botanic Garden of 1793, and its driving plot is stopped every other page or so with a long footnote offering a mixture of information, references to other texts in other languages, sometimes even another poem-within-the-poem inside a note, which makes for a complicated, frequently interrupted reading experience. Those who would try to read and absorb the poem in print as it was published in 1801 enjoy anything but a linear reading experience, and find themselves caught up in branches and tangential appendages in the big paragraphs of paratext running underneath the main text. At the time, reviewers and Southey’s publishers questioned the necessity of these notes, and later editions pushed them to the back of the text, converting them into endnotes rather than footnotes, but their prominence in the 1801 text gives greater significance to these notes as printed in the first edition of the text, a significance that was likely lost or unrecognized in later editions.

So, why are the notes (or running “paratext”) important to the poem, if at all? On their own, regardless of their placement, the notes could largely (though not entirely) be seen as proof of Southey’s thorough and systematic research on the world’s belief systems as documented before 1801, a sort of intellectual bolstering–like the footnotes in a scholarly essay–to justify the poem’s fantastical, supernatural plot to educated nineteenth-century readers. On the evidence of Southey’s paratext, Dahlia Porter has described Southey as an empirical scientist whose annotations reflect his data sampling, his accumulation of many pieces of cultural evidence to form patterns—and she suggests that Southey’s methods reflect “a conscious cultivation of amalgams that refuse to coalesce,” or that can’t be pulled into a single, universal whole in the tradition of classical epic or Enlightenment all-encompassing encyclopedic knowledge.[3]  My own article on “Southey’s Gothic Science” (2009) addressed Southey’s investment in the natural philosophy of his moment in representing his Domdaniel Sorcerers—whose “magic” noticeably reflects the contemporary work of Humphry Davy and other scientists in investigating the electrical spark of life.[4] The magicians in the poem appear to be practicing scientists in the way they tamper with codes and language that, within the poem, control the vital fires of life, its flowing (electrical) charges and radiant energies. In their vitalist experimentation, we see a team of Domdaniel Sorcerers coordinate with each other to challenge and threaten the written destiny of an Islamic deity, and undermine an Islamic worldview that Southey associates strongly with dutiful obedience to a divinely ordained script for earthly existence. 
Southey’s attention to scientific practices seems especially significant to this poem’s representation of cultures as they relate to this central conflict between sorcery and divine power for control over the life forces of nature–whether he is presenting in his notes a sampling of comparative evidence observed in multiple world cultures at different times and places, or whether he is projecting the contemporary excitement over electricity onto medieval sorcerers in an ostensibly medieval Islamic context.

If we take Southey’s extensive footnotes to the 1801 edition as an equivalent form that interacts with his main text, we begin to see Thalaba  as a scientific poem-with-dependent-prose-parts—an interlocking mechanism to “tease us [into] thought” rather than out of it, by way of contrast with Keats’s later “Ode on a Grecian Urn” (composed nearly twenty years later in 1819). Keats’s famous urn, an artifact of a singular lost culture frozen in time, seems pointedly to refuse a thoughtful response in the viewer. Unlike Southey’s Arabian epic, Keats’s Grecian urn absorbs the viewer completely in itself, needing no contextual frames of reference–and thus it became iconic to the old mid-twentieth-century theoretical school of New Criticism in its emphasis on aesthetic form, and as an iconic representation of “Romanticism” in rejecting a drive to “know.” Southey’s epics don’t fit this urn-containable view of “Romantic aesthetics” because they embed their cultural artifacts in layers and layers of contextual strata. Southey is not the poet to celebrate a singular “timeless” universal beauty but instead, he is the poet of alienating multiplicity in his references to ritual practices and belief systems from many times and places, pointedly disturbing rather than comforting to Western eyes.

As I’ve mentioned, the 1801 edition of Thalaba continually diverts the readers’ eyes away from the “main” text and into notes in the lower portions of the pages—notes which do not comment on or explain events in the main text, but make reference to similarities and differences to those events grounded in stories collected from (usually) many different cultures around the world. These notes—the paratext running beneath the “main” text—frequently seem to dominate the eye, sometimes taking up well over half the page, and they seem to challenge the reader to a parallel reading experience. We can see how Thalaba’s highly elaborate structure sets its text and paratext in dialogue in a view of a page from the 1801 edition, for example, here:


Notice how much space is taken up by the notes on this page.

Another fascinating example in which the notes seem to “take over,” is the passage in Book 1 in which a second apparently satirical poem begins beneath the main text of the poem:  Southey presents “the Old Poulter’s Mare ballad” in the notes that mirror with a bizarre twist the story being told in lines above it: In the main text, a story is told of a camel tragically left to die at the grave of his owner, while the notes present an English folk ballad that Southey pretends to have “discovered” but has really made up, about an old horse left to starve: both animals, positioned across from each other in main text and notes, stare humans in the face in an apparently double-edged mute accusation. Here the text and notes set up a complicated ironic interplay—to compare belief systems, and to implicate English and European traditions alongside non-European cultures as comparable in their tendencies to cruelty.


Given the complexity of the paratext notes, it is very challenging for us to synthesize Southey’s highly elaborate writing about locations and cultures in this poem: A description of Babylon in Book V of the poem leads to a note that mentions about 13 other places–typically a heterogeneous mix of Middle Eastern, European, and English locations—in this case, Heaven, Arabia, the Tigris River, the Euphrates River (mentioned here separate from the Tigris), the Tower of Babel, St. Paul’s Steeple in London, Assyria, Sodom and Gomorrah, Naples, the kingdom of Judah, and Media.


We can see that frequently the notes radiate outward from the poem in their treatment of place—that for any given place mentioned in this text, there’s likely to be a plurality of places described in an annotation to it. In reading Thalaba I have often wondered whether there’s any rhyme or reason to Southey’s positioning of places, and this is why I have turned to coding the poem in TEI XML.

[TEI code of the same page]

Over the past couple of years, I’ve been working on text encoding in XML and related transformational scripting—and I’ve been taking workshops and a whole semester-long class to learn how to extract data from coded text—to do things like tag and count references to persons and places in stanzas, or parts of speech in sentences, to graph the results, and to use SVG and JavaScript (among others) to render their presentation effectively on a website among other things. This has been an adventure, to discover in code a power to manipulate virtual worlds made out of texts—and I’ve sometimes enjoyed the odd parallel between me and a Domdaniel sorcerer—in learning to control the abstruse codes of a scripted system. I can mark up Southey’s epic poem and systematically extract information from my code in order to model his virtual world, to see patterns and study relationships that the maker of this text could not have seen as I do. Even so, such data-modelling seems inherent in Southey’s representation of “sorcery.”

Compare our ability to generate a GPS modeling of the Earth with Southey’s envisioning of the Domdaniel Sorcerers in their cave at the end of Book 12. William Hawkes Smith’s illustration of the scene merely suggests the titanic power of the Idol-Automaton created by the Domdaniel:


It was a Living Image, by the art
Of magic hands of flesh and bones composed,

And human blood thro’ veins and arteries
That flowed with vital action.
In the shape Of Eblis it was made,
Its stature such and such its strength
As when among the Sons of God
Pre-eminent, he raised his radiant head,
Prince of the Morning.
On his brow
A coronet of meteor flames,
Flowing in points of light.

Self-poised in air before him,
Hung the Round Altar, rolling like the World
On its diurnal axis, like the World
Checquered with sea and shore,
The work of Demon art.
For where the sceptre in the Idol’s hand
Touched the Round Altar, in its answering realm
Earth felt the stroke, and Ocean rose in storms,
And ruining Cities shaken from their seat
Crushed all their habitants.
His other arm was raised, and its spread palm

Up-bore the ocean-weight
Whose naked waters arched the sanctuary,
Sole prop and pillar he.
           —Thalaba the Destroyer, Book XII

Had Victor Frankenstein collaborated with a network of scientists, perhaps his Creature too could have been modeled on a radiant Lucifer like this and wrought devastating virtual havoc: This is a visualization of a virtual world-within-a-world, designed by sorcerers with an interface that lets them manipulate nature by remote control. Just before Southey obliterates them in the closing lines of the poem, he renders the Domdaniel’s demonic world-imaging in a sudden brightly illuminated revelation—which prompts reflection on the world Southey has generated with his poem: Of what is this world made? Should we think of it as a fantasy world generated from an experience of place from an Islamic point of view? If so, when is this place? What is its relation to the places described in the notes? Where and when are we—and how many wheres and whens are we exploring at once?

In a general way, we can see a multiplicity of places coming to bear on each other in this poem, and we can also see that places are associated together that are not literally geographically connected to each other. At first I thought of plotting the locations on a map to show their global geographic distribution. Here’s what that looks like:

But re-reading the poem and its notes suggested another idea—that place is not merely rendered physically in this poem but conceptually—and that to render the places in this poem by geo-coordinates would reduce them to “known” recorded geographic space and miss many places that simply can’t be located that way. Instead Southey’s poem presents an abstract, speculative modelling of “real” and “imagined” places as conceptualized together in 1801—and we could perhaps study this with help from network analysis, to see which places are referred to in conjunction with others, and to consider how places are connected across the mediating edge boundary of Southey’s text and his notes. To begin thinking about this as a network analysis I’m indebted to local colleagues of mine in the Pittsburgh area, one of whom, Tom Lombardi, an Asst. Prof of Computer and Info Systems at Washington & Jefferson College, recently gave me an extremely helpful crash course on how network analysis works. Network analysis is more typically used to map social relations, and some seriously impressive digital humanities projects, like the Six Degrees of Francis Bacon project at Carnegie Mellon University, have built these based on the same kinds of networks social scientists and corporations use to model and discover nodes and hubs in social network systems like Facebook and Twitter.


But network analyses can be developed to visualize any kind of relationship. For example, they are used in biological sciences to study protein interactions in physiological systems.

Some basic things to understand about network analyses: They can be graphed as with the Six Degrees of Francis Bacon to show a social network of who interacted with whom—and sometimes how: who initiated contact, and how the contact was formed. The circles you see on the Six Degrees of Francis Bacon site are called “nodes” in a network analysis, and they indicate particular people interacting over the course of a long period—multiple generations in this network. The lines that you see here intersecting people are called “edges”—and the more edges (or radiating lines) a contact has, the more people they know. The Francis Bacon project has chosen a literal representation of the most well-known, highly connected people to make them stand out—James II is a HUGE node here because of the number of his connections. However, just as significant as how many interactions are shared is the matter of how important a node is to the maintenance of a coherent network—a factor calculated as “betweenness-centrality” in network statistics. You might think from looking at this that James II is the most important person to this network, but that’s not necessarily the case: Betweenness-centrality indicates something crucial: how central is a particular node to keeping the network as a whole connected together? Sometimes the most “central” figure, the one with the highest degree of betweenness-centrality, is a small and apparently unpopular one, who nevertheless formed interactions with well-connected people across otherwise divided groups—and here on the Francis Bacon graph, William Kent seems one such central figure.


Now I chose to produce what I’m calling an Antisocial Network—that is, I’ve purposefully decided to downplay the characters in Thalaba the Destroyer in order to concentrate our attention on place. I wanted to see to what extent the place references in the main text and in the annotations form a network: where do the places discussed in the notes coincide with places mentioned either in the main text or in other notes? When Southey mentions a wide array of places in a note, where does he come back to mention them again—if ever? Or when they do come up again, in what contexts do they come up? Which places are most frequently associated with other places? Which are most crucial for connecting the distinct imagined-worlds of the text and notes together?

After some experiments and counselling with Tom, I decided to chunk the poem by discrete line groups or stanzas: This would allow me to collect place references in groups of lines, together with their associated annotations. I did this tagging and collecting by starting with a base text from Project Gutenberg and reformatting it with autotagging in TEI XML.


TEI is the language of the Text Encoding Initiative—an organization founded in 1987 that has developed authoritative international standards “for encoding machine-readable texts in the humanities and social sciences.” Typically we deploy this language to build digital scholarly editions—like the Southey Letters on Romantic Circles or like our Digital Mitford project. However, this markup language is also well designed for collecting data for analysis—as I discovered with my experiment this summer. I learned that my colleagues who work on network analyses typically collect their data into tables in Microsoft Access or Excel to feed into open source analysis software like Gephi or Cytoscape—and their networks are thus based on the fields they have designated to construct a table.

By contrast, I’ve learned that I can extract data from my TEI coding in lots of sample sizes, working with different units of structure: I could choose to look only at places coded within discrete lines, or within stanzas, or move to a larger scale to look at whole books of the poem. Given some time I can easily generate new combinations to see if a network of shared properties exists between any tags I’ve used (for example, tags marking time, place, persons, or grammatical constructions in the poem vs. notes). I chose to mark edge-interactions within line-groups (or stanzas) (see the nodes named B1_lg14, etc.) because Southey’s notes are appended to specific clusters of lines, and each “stanza” or line group seems to have its own discrete assemblage of places. After tagging I wrote scripts to collect and import that data into Cytoscape. I chose Cytoscape because it has distinct advantages over the other freely available network analysis tools (Gephi and Pajek) more typically used in digital humanities work: Likely due to its use in mapping statistical relationships in the biological sciences, Cytoscape allows greater fine-tuning of edge-weights and node-sizes and also permits many more varieties of graph shapes by comparison with other network analysis tools.
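As a rough illustration of this kind of extraction, here is a minimal Python sketch. It is not one of the project’s own scripts (the tutorial mentioned at the end of this post uses XQuery). The XML below is an invented, simplified TEI-like fragment, and the element choices (lg, l, note, placeName) and the wording of the sample lines are assumptions made for the example. It emits one row per place reference, keyed to its line group and marked as coming from the main text or from a note.

```python
import xml.etree.ElementTree as ET

# ElementTree spells the xml:id attribute with its full namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented sample, NOT Southey's text: two line groups, one carrying a note.
SAMPLE = """
<div type="book" n="1">
  <lg xml:id="B1_lg14">
    <l>The palace of <placeName>Irem</placeName> rose before them
      <note>Compare the palaces of <placeName>Greece</placeName> and
        <placeName>Persepolis</placeName>.</note></l>
  </lg>
  <lg xml:id="B1_lg15">
    <l>Far from <placeName>Arabia</placeName></l>
  </lg>
</div>
"""

def place_edges(xml_text):
    """Return (line_group_id, place, source) rows; source is 'text' or 'note'."""
    root = ET.fromstring(xml_text)
    rows = []
    for lg in root.iter("lg"):
        lg_id = lg.get(XML_ID)
        # Remember which placeName elements sit inside a note in this line group.
        in_note = {id(p) for note in lg.iter("note") for p in note.iter("placeName")}
        for p in lg.iter("placeName"):
            name = "".join(p.itertext()).strip()
            source = "note" if id(p) in in_note else "text"
            rows.append((lg_id, name, source))
    return rows

rows = place_edges(SAMPLE)
for row in rows:
    # Tab-separated rows are one plausible shape for a table to import.
    print("\t".join(row))
```

Changing the unit of analysis (single lines, stanzas, whole books) would then only mean changing which element the outer loop iterates over.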

Here is a view of my workspace in Cytoscape when I was developing my first network graphs in this project, followed by a view of the TEI markup that generated the data mapped in the image:



Within my TEI code above I want to point out the placement of Southey’s annotations–very different indeed from how they are positioned on pages in a printed text. In TEI, notes are generally coded on the spot in the place at which they are signaled—and this way they are effectively contained with the lines they document. They can be transformed for web or print publication—extracted then to appear, say, in JavaScript pop-up boxes or in a panel alongside the screen, but for the purposes of archiving and analysis—or if you read poems and notes in the TEI—paratext notes are nested within main text and seem to move to the foreground. For the purposes of this project, the embedded structure of this coding makes it easy to locate the note within a point of interaction—its place with a particular line-group or passage of the poem. I wrote a script to apply distinct line-group numbers, which also signaled Books of the poem, so that these would show up as distinct nodes in the network: so B5_lg35 means Book 5 line group 35.
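The labelling step can be sketched in a few lines of Python. This is only an illustration of the B5_lg35 naming pattern, not the script actually used: it assumes each book is a div with type="book" and a number in n, and that line-group numbering restarts at 1 within each book.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def number_line_groups(root):
    """Give every <lg> an xml:id such as B5_lg35 (Book 5, line group 35)."""
    labels = []
    for book in root.iter("div"):
        if book.get("type") != "book":
            continue
        # Assumption: numbering restarts at 1 inside each book.
        for count, lg in enumerate(book.iter("lg"), start=1):
            label = f"B{book.get('n')}_lg{count}"
            lg.set(XML_ID, label)
            labels.append(label)
    return labels

# Invented skeleton with placeholder lines.
SAMPLE = """
<body>
  <div type="book" n="1"><lg><l>...</l></lg><lg><l>...</l></lg></div>
  <div type="book" n="5"><lg><l>...</l></lg></div>
</body>
"""

root = ET.fromstring(SAMPLE)
labels = number_line_groups(root)
print(labels)
```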

As you can see, I’ve coded quite a lot more than place names because I have a larger project in view involving Thalaba and the other Southey epics to compile and study references to arts, architecture, sorcery, and science. The work of markup embellishment effectively brings you up very close, myopically close to the text, in order ultimately to present a view of it from a distance. Effectively, the labelling of contexts superimposes a new layer of editorial paratext around the structure already present.

First Efforts

Because I’m working with a team of editors on a much larger long-term project, the Digital Mary Russell Mitford Archive, my time has been short to work on Southey lately, and I was only able to code about a third of Thalaba in preparation for the conference circuit in summer 2013. Knowing I couldn’t code the whole text in June 2013 without rushing through it, I first selected Books 1, 2, 5, 8 and 9, a little over a third of the 12-book poem. W. A. Speck has pointed out that Books 8 and 9 were written in a transitional moment in Southey’s life, around the time when he left England with Edith for his second trip to Portugal (during which incidentally he met “the Senhora” Mary Barker),[5] and the fact that Southey himself was in motion between places and cultures at the point when he was writing these books made me curious about whether his handling of place would differ substantially here. These are portions of the poem that move Thalaba significantly far away from Arabia into unidentified polar climes of ice and snow and toward increasingly metaphysical spaces. I wondered about whether and how the place referentiality in Books 8 and 9 would connect with the earlier books. Indeed the analysis reveals that they do connect in a selection of places. So, even as the settings in the main text of the poem move Thalaba away from the Middle Eastern locations where he began, we continue to see references to a handful of key earthly places that mattered in the text and notes in Books 1, 2, and 5.

I tagged two kinds of places in the poem, to distinguish “metaplaces” from “places”:


Metaphysical locations outside of mundane reality are of course the stock in trade of epic’s distinct interweaving of divine realms, cosmic spaces, places apart. I tagged as “metaplaces” those that may only be accessed through supernatural or magical intervention—and these appear in both the poem and the notes. By contrast, the place nodes are not so remote of access, though they may not all be “real” places in an historical sense. These are basically places that are accessible to human entities without divine assistance or without sorcery. Keep in mind that the places in the poem don’t all coexist at the same time—they are moreover highly diverse in scale and scope—the planet Venus for example, the land of one-legged men, Yemen and Persia, and Paris. The places are brought into imagined association with each other by their placement in the poem and its annotations—so everything I’m showing you is a map of associated juxtapositions, an attempt to visualize Southey’s laboratory technique of place shuffling.

In the resulting network of places and metaplaces across the books of the poem I coded, you’ll see three different kinds of nodes: 1) orange place nodes for what we’ll call mundane places, 2) purple nodes for the metaplaces, and 3) green diamond points for the line-group nodes. Remember that the connecting lines are the mediating “edges” in my network: Dotted lines indicate connections from a note, while solid lines are connections from the main text of the poem.


Thickness of the lines is an indication of edge-betweenness, something I was especially curious about. There are certain place and metaplace nodes that hold the realms of the notes and main text together as a coherent whole. I only coded Greece twice in the five books I coded: once in the main text of Book 1 and once in a note in Book 8. Greece is positioned in Book 8 with a cluster of typically esoteric places that Southey appears only to mention once: the vampire segment (a passage of Southey’s notes that is perhaps the first explicit discussion of vampires in English literature). Greece was the only place mentioned here that was mentioned anywhere else in the poem, and hence its edge-betweenness at its point of connection to Book 1 line-group 14 is exceptionally important to prevent Book 8 from being a completely isolated cluster of places. In that interesting cluster we see a foray into Germany (Wirtemberg) and Eastern Europe (Belgrade, Transylvania, Hungary), and the mention of Greece in this context of the undead metaplace of Vampires is this section’s sole spatial association with any other part of the poem I’ve yet coded. (It seems likely I’ll see more references to Greece as I finish coding the rest of the poem.) Here the size of the edge indicates not only the crucial positioning of Greece, but also its connection in Book 1 line-group 14 to one of the most central passages regarding place in the poem. (For those who know the poem, this is the passage when Thalaba and his mother first see the amazing palace of Irem, where they are about to meet the undead Aswad who tells his sad tale of ages past.) In that passage and its accompanying notes, Southey embellishes the grandeur of this place of Irem by comparing it to descriptions of palaces from antiquity in Greece as well as Yemen, Lebanon, Persepolis and others, mostly places that *are* referred to again in other parts of the poem and notes.

Notice how diverse and diffuse the place references are. The vast majority of places are referenced in annotations, and dashed off once and quickly, but a few places are repeatedly brought into play. In our distant reading, we can see that the poem is evidently not rooted in any one place in particular, but moves nomadically about. We might expect this extensiveness of place reference in an epic poem—but I’d hypothesize at this point that Thalaba wanders more than most epics in a multi-dimensional way—due to the interactivity of narratives in text and notes. We know that the poem Thalaba does not end where it began—gives us no circularity except perhaps what’s generated from the Domdaniel Cave introduced in Book II and concluded in Book XII.  On the other hand, nodes that feature lots of edge lines and dotted lines radiating outward have many references–so Babylon and Arabia are pretty important junctions, as are the metaplaces of Heaven and the Tombs. But you can see in the upper left that the Vampire Metaplace and Transylvania are a cluster unto themselves–not connecting much to the rest of the poem without that one reference to Greece.

The network can thus feature the places to which the poem returns at least once. We can also see plenty of places that are disconnected, and this demonstrates something that we can expect of Southey: that a given node often presents its own discrete clustering of juxtaposed places. See “Mohareb City” (my own labelling of the city over which the sorcerer Mohareb reigns), which looks huge in my graphical rendering of edge-betweenness because it’s disconnected from anything else in the network: it’s the center of its own tiny “world.” The disconnection of “Mohareb City” brings up another matter: The metaplaces we see in most proliferation on one side of this diagram do seem to be important nexus points for the association of place across the poem, Heaven and the Tombs for example. Here’s a table generated from the network analysis in which I have ranked the nodes by edge-betweenness.


What our constellation of places can show us, too, is the path by which places of high currency are shared, shuffled, and exchanged between paratext and text, as well as which parts of the poem expose a high density of place associations. Thus, we can refer to this graph to identify all the parts of the poem in which England is mentioned, and in what contexts.


Now, one thing you can do with a network analysis is to see if you can break it! With some tinkering and filtering of input data, I can remove certain kinds of information to determine how significant certain kinds of places are. So I have filtered out places to see if the metaplaces form a network on their own, and vice versa with the places. It’s interesting to see that, first of all, metaplaces hold together and seem to refer to each other.

In this case, when we drop metaplaces and look only at places, we lose some serious network coherence—and that’s evident with the Domdaniels’ cave at the bottom of the network—It’s only tenuously connected to the rest of the place network. What does this mean?


There are a few Places that can only be reached in the World of the Poem through Metaplace access, and this may show us something of the connecting work of supernatural agencies within the poem. Simply by compiling all the metaplaces in the poem, we begin to see how important they are to characterizing the experience of the supernatural, and how frequently multiple cultures are brought to bear in Thalaba’s ostensibly monotheistic and culturally singular view of Heaven and Hell. Our network analysis helps us to see the profusion of places that Southey generates by moving Thalaba around on the topography of the main text.
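The “break it” test can be sketched in Python as follows. The graph is a tiny invented example, not my data: one metaplace (“Heaven”) is the only thing linking two line groups, so dropping the metaplace nodes splits the network into separate components. The node names and the split into places and metaplaces here are assumptions made for the illustration.

```python
from collections import deque

def build_graph(edges):
    """Turn a list of (a, b) pairs into an undirected adjacency map."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def drop_nodes(adj, unwanted):
    """Copy the graph without the unwanted nodes or any edges touching them."""
    return {v: {w for w in nbrs if w not in unwanted}
            for v, nbrs in adj.items() if v not in unwanted}

def components(adj):
    """List the connected components, each as a set of nodes."""
    seen, found = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        group, queue = {start}, deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    group.add(w)
                    queue.append(w)
        found.append(group)
    return found

# Invented toy data: line groups linked to the places they mention.
edges = [("lg1", "Arabia"), ("lg1", "Heaven"),
         ("lg2", "Heaven"), ("lg2", "Remote place")]
metaplaces = {"Heaven"}

full = build_graph(edges)
places_only = drop_nodes(full, metaplaces)
print(len(components(full)), "component(s) with metaplaces")
print(len(components(places_only)), "component(s) without them")
```

Running the same filter the other way round (dropping the places and keeping the metaplaces) only requires passing a different set to drop_nodes.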

We should recognize the importance of the Domdaniel sorcerers in motivating that movement of Thalaba away from home—a worldly challenge giving shape to this text. It’s interesting that the Domdaniel’s Cave of operations maintains its connection to the rest of the place network only through a metaplace linkage: In what I’ve coded thus far, the Domdaniel Cave metaplace is not discussed much at all in connection with worldly places. While the Domdaniel sorcerers work in their own magically-generated metaplace, they operate on the scripts controlling the world of the poem. Our main character Thalaba moves to avenge his father’s death and respond to the Domdaniels’ aggression, and as he moves in Southey’s own fictive lab compilation of a mythic world, we see something like indicator lights blinking and flashing in the notes—reflecting which locations in Southey’s study of world cultures feature something comparable or something contrasting. As a sort of textual machine of poetry and prose paratext, we begin to see how Thalaba the Destroyer works as a world modelling exercise to map correlations across cultures.

Continuing Work with “Close” and “Far” in the Place Network:

Over the past two years, I have continued working on the place-network analysis of Thalaba. I finished coding the poem (which you can find here on my GitHub for this project), and created new network graphs. And I discovered that adding more places to the graph made it more difficult to read! I began experimenting with new layouts to ensure that nodes were not hidden in a tangled “spaghetti monster” or “foggy cloud” of data. In an effort to spread out my nodes and edges and study them systematically, I began thinking about the notions of “closeness” and “distance”: What do we mean when we say that one place is “close” to another place? If we think in terms of geography, of course, the distance is something we measure in quantifiable units (kilometers or miles). But if we think in terms of how places are plotted in relation to one another inside an epic poem and its prose annotations, we might think of “close” and “far” in terms of whether two places are frequently referenced together in the same passages of a poem: Perhaps Southey has “plotted” places around the world literally within the plot fabric of his narrative poem, and he has created little neighborhoods or clusters based on places referenced in the annotations of a given unit of text.

I became curious about a statistical measurement in network analysis graphs called “closeness centrality” and its companion measurement, “average shortest path length”. The two concepts are inversely connected, based on the concept of a “path-step” in network analysis: A single path step takes you from one node to another along its shared connecting edge. If a node has six other nodes connected directly to it, those six nodes are the ones that are one path step away from this node. We can study a network by considering how many steps it would take to go from the most centrally connected node to the ones that are “furthest away,” that is, the nodes that it would take the most path steps to reach from most of the other nodes. Thus, the node with the highest “closeness centrality” is the node that has the lowest average shortest path to reach any other node in the network.
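Here is a small Python sketch of how the two measurements relate. It assumes the common definition in which closeness centrality is the reciprocal of a node’s average shortest path length, and it assumes a connected graph; the five-node network is invented for illustration.

```python
from collections import deque

def build_graph(edges):
    """Turn a list of (a, b) pairs into an undirected adjacency map."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def path_steps(adj, start):
    """Breadth-first search: path steps from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def average_shortest_path(adj, node):
    """Mean number of path steps from node to each other reachable node."""
    dist = path_steps(adj, node)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others)

def closeness(adj, node):
    """Closeness as the reciprocal of the average shortest path length."""
    return 1 / average_shortest_path(adj, node)

# Invented toy network: a hub with three neighbours and one node hanging off the edge.
adj = build_graph([("Hub", "p1"), ("Hub", "p2"), ("Hub", "p3"), ("p3", "Remote")])

for node in sorted(adj, key=lambda n: closeness(adj, n), reverse=True):
    print(node, average_shortest_path(adj, node), round(closeness(adj, node), 3))
```

The hub reaches everything in the fewest steps on average, so it has the highest closeness, while the node at the far end of the chain has the longest average path and the lowest closeness.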

Because I am studying places (and metaplaces) in an unusual way by networking them together based on their position in an epic poem, I am interested in how “close” and how “far” these locations are from one another from the perspective of their distance in path steps on the network graph. Here is what this looks like (and since the network is much bigger now, you may wish to click on the image to see it enlarged to fill your screen):


Here we notice that Arabia has (not surprisingly) the highest degree of “closeness,” and Cytoscape has helpfully plotted the nodes with the highest closeness to the other nodes from the left and top of this graph. As we move down and to the right, we see the nodes ranked by their increasing distance from most other nodes. Closeness is a calculation based on the average number of steps it takes to “walk” to all the nodes in a network, and if I study my network tables for this statistic, Cytoscape has calculated the number of path steps. The most distant nodes on this network are four and five path steps away from most of the other nodes, and these include, interestingly, some of the places that were of interest for their being on the borders of the world the English had travelled in 1801: Southey took a great interest in faraway “Eastern” places, but their distance from England is not what makes them remote on this network. It is their distance from other places with shared frames of reference. When Southey referred to these places, it seems he did not make many references to Arabia, or Babylon, or Greece, or the other places that are plotted more centrally. Indeed, it is interesting to note that some English places, like St. Paul’s steeple, are positioned at around 2.5 path steps’ distance from most other nodes. What this network graph helps us to visualize is which places are most frequently compared and associated (or “juxtaposed”) with each other in the context of a stanza in Southey’s poem (keeping in mind that a stanza is our literal “locational” frame of reference for plotting places in the network).

To interact with this graph, and tug on its nodes and edges to get a clearer view of how the nodes are connected, try loading the dynamic web version I have loaded here on my site. Wait a few minutes for it to fully load, then set the Visual Style to “default black”  and click on the droplet icon on the right, to set the background a dark or black color so that you can read the node labels and properly view the node and edge colors. Then, grab a node that interests you, and give it a tug with your mouse to drag it about and get a sense of how the network web is formed.

With a graph this large and complicated, we need to begin filtering it to study it from particular vantage points of interest, such as the “Vampire” metaplace which stands out here much as it did in our earlier network graph, as it is part of a spectacular cluster of places that bursts like a firework display–a large ring of places all connected with each other by their association in a single, very big footnote in Book 8 of the poem.

What we see here is a filtered view of our network, showing us only the nodes that are within two path steps of the Vampire metaplace. I’ve pulled all the metaplaces to the left so they make their own cluster, to show which metaplaces are connected with the mappable places. The thick purple edges are weighted heavily where they indicate that particular places are connected by the character Thalaba’s actual movement between them in the poem. Thus, we see that Thalaba is associating particular worldly locations in the Middle East (Arabia, the desert, and Hirah, for example) with metaplaces in the poem, such as the Paradise of the magician Aloadin and the Tombs, where significant plot developments occur in the poem. Most of the other place references are connected to each other by way of Southey’s footnotes, which radiate a little world in themselves, with references from the Netherlands and Aix la Chapelle, to Hungary, Wirtemberg, Transilvania, Greece, Babylon, Lebanon, Persepolis, and out to India, Indonesia (Java), and into the New World with references to Brazil and Peru, and moving as far remote as the South Pacific island of Otahaeite. Zooming into this particularly interesting cluster in our network effectively gives us a circumnavigatory tour of the planet!
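A filter of this kind can be sketched in Python as a breadth-first search that stops after a set number of path steps. The miniature graph below is invented for illustration, and “B8_lgN” is a hypothetical label standing in for the line group that carries the vampire note.

```python
from collections import deque

def build_graph(edges):
    """Turn a list of (a, b) pairs into an undirected adjacency map."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def within_steps(adj, start, max_steps):
    """Return every node reachable from start in at most max_steps path steps."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if dist[v] == max_steps:
            continue  # do not walk any further out from this node
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)

# Invented miniature of the cluster; "B8_lgN" is a hypothetical node label.
adj = build_graph([
    ("Vampire metaplace", "B8_lgN"),
    ("B8_lgN", "Greece"),
    ("B8_lgN", "Transylvania"),
    ("Greece", "B1_lg14"),
    ("B1_lg14", "Irem"),
])

nearby = within_steps(adj, "Vampire metaplace", 2)
print(sorted(nearby))
```

Because places connect to each other only through line-group nodes in this kind of graph, two path steps from a place or metaplace reach exactly the places mentioned alongside it in the same line group.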

Reflecting on the concepts of “closeness” and “distance” shows us some fascinating perspectives on Southey’s poem and its “spectacular intersections of place.” We understand how close places are by how readily Southey mentions them together in close proximity to each other. Where places seem more isolated and remote, it is because they don’t appear to be so easily comparable or mentionable in the same breath with other locations–so that Southey seems to be treating such places (like Tibet or the Yellow River in China) with a certain distinction, not making them comparable or relatable to most other parts of the world that he more readily mixes together in the stanzas and notes of his poem. This particular set of questions yields perspective on distinctiveness vs. relatability of places in Southey’s structure of this poem. We can see from all of our network graphs how important place referencing is to this poem, so that stepping back and away from characters to produce an “anti-social” network yields a rewarding view of the spectacular proliferation of intersecting places that Southey has woven together in his textual fabric.

Though other projects call, the Thalaba experiment seems worth continuing. Here are some long-range questions I’d like to pursue next with this network analysis project:

  • Just how eccentric is Southey’s patterning of place references, and who are his nearest neighbors in the poetic/annotative compilation of places? How does his place referencing compare, for example, to Helen Maria Williams’s Peru, or Erasmus Darwin’s The Temple of Nature, or William Blake’s Jerusalem? Or to place referencing in Byron’s Childe Harold and Don Juan? Panning back, how might this mapping of place and metaplace compare with Milton’s Paradise Lost or Dante’s Commedia? (I’ve begun this work by marking up Erasmus Darwin’s The Temple of Nature and producing preliminary maps and network graphs to compare with Southey’s poem. These can be found on my github.io site for this project.)
  • More immediately, now that I’ve learned how to do a network analysis on Thalaba, I’m ready to begin applying what I’ve learned to my other ongoing projects on Mary Russell Mitford and on the Pacific voyages. The networks we generate for these other projects will likely be easier to conceptualize: a social network for the Mitford project is highly in order, for example—but Thalaba’s complexity is exactly what makes this avant-garde and influential poem so rewarding a study: Southey produced here an abstract intellectual modeling of a plurality of world views—a spectacularly prolific distribution of imagined places defined from a study of multiple empires: This is not simply an exotic Arabian poem, but a meta-imperial epic.

For Further Reading. . .

Here is my website for the project where I have posted network graphs and discussion, and where I’m also hosting a tutorial (still under construction) on using XQuery to extract and prepare network data for processing in Cytoscape. A longer article in draft, featuring newer and better network graphs, is posted here on my GitHub space.

I have also posted an introductory list of network analysis terms and definitions here on this blogsite, and a tutorial for XML Coders on Network Analysis and Cytoscape. Please feel free to contact me or leave a comment here if you would like some help and advice on a network project with Cytoscape.




[1] Javed Majeed, Ungoverned Imaginings: James Mill’s The History of British India and Orientalism (Oxford: Clarendon Press, 1992), 53.

[2] Dahlia Porter, “Poetics of the Commonplace: Composing Robert Southey,” Wordsworth Circle 42:1 (2011), 27-33.

[3] Dahlia Porter, “Formal Relocations: The Method of Southey’s Thalaba the Destroyer (1801),” ERR (Dec. 2009), 676.

[4] Elisa Beshero-Bondar, “Southey’s Gothic Science: Galvanism, Automata, and Heretical Sorcery in Thalaba the Destroyer,” Genre: Forms of Discourse and Culture 42 (Spring/Summer 2009), 1-32.

[5] W. A. Speck, Robert Southey: Entire Man of Letters (New Haven: Yale University Press, 2006), 83-87.


Our Digital Humanities Course This Fall

Much excitement as we work on the Pacific Project for our Digital Humanities course this fall! Here’s a link to our course site in progress:

We’ll be putting together the course schedule in August, when we’re also moving to a new Greensburg campus server.