Understanding the structure of your network
Most networks have a distinct structure that reflects what the network is about. They contain groups of nodes, called clusters, whose members have a stronger affinity with each other than with the rest of the network. These clusters help us to better understand our networks.
Every network has a structure. It might have a dense core at its center, with a number of other dense regions surrounding the core. Perhaps there are two or three distinct subsets that are set quite apart from each other or hardly connected with each other at all. The entire network could consist of just one central core, with very few outliers.
Why such structures may be visible in your graph is relatively easy to understand. Think of your own networks, online as well as offline, such as your family, a circle of friends and your work or study colleagues. Do all of these people know each other? If you tried to draw your entire network of relationships, you might end up with three separate clusters that are only connected by your role in each of them. Alternatively, there may be considerable overlap between them.
The same is true for other types of networks. In the data captured using TAGS, for instance, there may be multiple, distinct sub-networks, even though all of the tweets captured contain the same keyword or hashtag. This is because not everyone participating in the conversation will engage equally with everyone else. One reason for this may be language: users tweeting about an international event such as the Oscars might use the same hashtag to tweet in English, Spanish or Chinese, for instance, but because of language barriers, these accounts may mainly @mention or retweet speakers of their own language. Similarly, users discussing a political topic may retweet only those who subscribe to a similar ideology, so that the network splits into separate left-wing, moderate and right-wing communities. Other hashtags are simply used for a number of different purposes at the same time, and each purpose may attract its own group of users.
Identifying clusters in the network
These sub-networks or communities within your network are, in network theory, usually referred to as clusters. There are various approaches for identifying them. In fact, if you distinguished a number of different regions in your visualisation, you’ve probably already used one: visual inspection. Perhaps you’ve also explored what the key accounts in each region are, and from this you’ve formulated a hypothesis of what drives the emergence of these distinct structures.
Network analysis and visualisation programs such as Gephi provide tools to automate this process and step beyond mere visual detection. To do so, they use a variety of algorithms that analyse the modularity of a network. Modularity measures how strongly a network separates into different clusters: it is high when nodes have many more connections inside their own cluster than you would expect if the connections were spread at random. But even automated cluster detection is far from easy. Where do you draw the line around your personal network, for instance? When does an ‘acquaintance’ turn into a ‘friend’? Even if you could quantify exactly how close you are to everyone around you, what’s the threshold value to make that distinction?
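To make the idea of modularity more concrete, here is a minimal sketch in Python that scores three ways of dividing up a tiny, made-up network. It is not Gephi's implementation. The six-node network, the function name modularity and the three example partitions are all assumptions chosen for illustration, and the formula used is the standard Newman modularity score for an undirected, unweighted network.

```python
from collections import defaultdict

# Hypothetical toy network: two triangles joined by one bridging edge (c-d).
EDGES = [
    ("a", "b"), ("b", "c"), ("a", "c"),   # first triangle
    ("d", "e"), ("e", "f"), ("d", "f"),   # second triangle
    ("c", "d"),                           # bridge between the two
]


def modularity(edges, clusters):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over clusters of (L_c / m) - (d_c / 2m)^2, where m is the total
    number of edges, L_c the number of edges inside cluster c, and d_c the
    sum of the degrees of the nodes in cluster c.
    """
    m = len(edges)
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Map every node to the index of the cluster it belongs to.
    cluster_of = {node: i for i, group in enumerate(clusters) for node in group}

    q = 0.0
    for i, group in enumerate(clusters):
        # Count edges whose two endpoints both sit in this cluster.
        internal = sum(
            1 for u, v in edges if cluster_of[u] == i and cluster_of[v] == i
        )
        total_degree = sum(degree[node] for node in group)
        q += internal / m - (total_degree / (2 * m)) ** 2
    return q


# Three hand-picked ways of dividing the same network.
two_triangles = [{"a", "b", "c"}, {"d", "e", "f"}]
one_cluster = [{"a", "b", "c", "d", "e", "f"}]
singletons = [{node} for node in "abcdef"]

for name, partition in [("two triangles", two_triangles),
                        ("one cluster", one_cluster),
                        ("singletons", singletons)]:
    print(f"{name}: Q = {modularity(EDGES, partition):.3f}")
# two triangles: Q = 0.357
# one cluster: Q = 0.000
# singletons: Q = -0.173
```

The division into two triangles scores highest, which matches what you would probably pick out by eye. Treating the whole network as one cluster scores zero, and treating every node as its own cluster scores below zero.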
Rather than setting some universally applicable boundary that determines what is inside and what is outside of any given cluster, modularity algorithms in programs like Gephi can usually be set to be more or less inclusive, typically through a resolution setting. Make them more inclusive, and they’ll find fewer, larger clusters or even decide that the entire network is one single cluster. Make them less inclusive, and they’ll find more but smaller clusters. In this way, the large clusters that the more inclusive approach would have found can be broken up into smaller and smaller subsets, until, in the extreme case, each node becomes its own exclusive cluster.
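The effect of making cluster detection more or less inclusive can be sketched with a small Python example. It is an illustration under stated assumptions, not a description of how Gephi works internally: the toy network, the names modularity, CANDIDATES and best_candidate, and the gamma parameter are all hypothetical. Here gamma is a resolution factor that multiplies the penalty term of the modularity score, so a small gamma is more inclusive and a large gamma is less inclusive. Other tools may define their resolution setting differently, sometimes with the opposite direction, so check the documentation of the program you use. The sketch only compares three hand-picked divisions rather than searching all possible ones.

```python
from collections import defaultdict

# Hypothetical toy network: two triangles joined by one bridging edge (c-d).
EDGES = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]


def modularity(edges, clusters, gamma=1.0):
    """Modularity with a resolution factor gamma (assumed convention).

    Q = sum over clusters of (L_c / m) - gamma * (d_c / 2m)^2.
    With gamma = 1.0 this is the standard modularity score.
    """
    m = len(edges)
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    cluster_of = {node: i for i, group in enumerate(clusters) for node in group}

    q = 0.0
    for i, group in enumerate(clusters):
        internal = sum(
            1 for u, v in edges if cluster_of[u] == i and cluster_of[v] == i
        )
        total_degree = sum(degree[node] for node in group)
        q += internal / m - gamma * (total_degree / (2 * m)) ** 2
    return q


# Three hand-picked candidate divisions, from most to least inclusive.
CANDIDATES = {
    "one cluster": [{"a", "b", "c", "d", "e", "f"}],
    "two triangles": [{"a", "b", "c"}, {"d", "e", "f"}],
    "singletons": [{node} for node in "abcdef"],
}


def best_candidate(gamma):
    """Return the name of the candidate division with the highest score."""
    return max(
        CANDIDATES,
        key=lambda name: modularity(EDGES, CANDIDATES[name], gamma),
    )


for gamma in (0.2, 1.0, 3.0):
    print(f"gamma = {gamma}: best division is '{best_candidate(gamma)}'")
# gamma = 0.2: best division is 'one cluster'
# gamma = 1.0: best division is 'two triangles'
# gamma = 3.0: best division is 'singletons'
```

The network itself never changes. Only the criterion for what counts as a cluster does, and with it the answer.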
Again, none of these analyses is inherently right or wrong. They simply use different criteria to measure what constitutes a cluster, the same way you may use different criteria to distinguish a friend from an acquaintance. Despite these uncertainties, automated cluster detection is valuable: used critically, it can extend your understanding of a network’s structure. It also opens up additional options for visualising the network.