The rapid exploration of complex networks in recent years has been dogged by a lack of standardized naming conventions, as various groups use overlapping and contradictory[28][29] terminology to describe specific network configurations (e.g., multiplex, multilayer, multilevel, multidimensional, multirelational, interconnected). To exploit the directional information recorded in communication datasets, some authors consider only directed networks without labels on vertices and introduce edge-labeled multigraphs, which can cover many multidimensional situations.[30] The term "fully multidimensional" has also been used to refer to a multipartite edge-labeled multigraph.[31] Multidimensional networks have also recently been reframed as specific instances of multilayer networks.[1][5][6][32] In this case, there are as many layers as there are dimensions, and the links between nodes within each layer are simply all the links for a given dimension.
Definition
Unweighted multilayer networks
In elementary network theory, a network is represented by a graph $G=(V,E)$ in which $V$ is the set of nodes and $E$ the set of links between nodes, each typically represented as a pair of nodes $(u,v)$ with $u,v\in V$. While this basic formalization is useful for analyzing many systems, real-world networks often have added complexity in the form of multiple types of relations between system elements. An early formalization of this idea came through its application in the field of social network analysis (see, e.g.,[33] and papers on relational algebras in social networks) in which multiple forms of social connection between people were represented by multiple types of links.[34]
To accommodate the presence of more than one type of link, a multidimensional network is represented by a triple $G=(V,E,D)$, where $D$ is a set of dimensions (or layers), each member of which is a different type of link, and $E$ consists of triples $(u,v,d)$ with $u,v\in V$ and $d\in D$.[6]
Note that, as in all directed graphs, the links $(u,v,d)$ and $(v,u,d)$ are distinct.
By convention, the number of links between two nodes in a given dimension is either 0 or 1 in a multidimensional network. Consequently, the total number of links between two nodes across all dimensions is less than or equal to $|D|$.
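As an illustration, a network of this kind can be stored as a set of triples $(u,v,d)$. The following is a minimal Python sketch; the node names, the dimension labels and the helper `link_dimensions` are hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical toy data: each link is a triple (u, v, d).
V = {"a", "b", "c"}
D = {"email", "phone"}
E = {
    ("a", "b", "email"),
    ("a", "b", "phone"),
    ("b", "a", "email"),  # directed: distinct from ("a", "b", "email")
    ("b", "c", "phone"),
}

def link_dimensions(E, u, v):
    """Return the dimensions in which a directed link u -> v exists."""
    return {d for (x, y, d) in E if (x, y) == (u, v)}

# A set holds each triple at most once, so an ordered pair of nodes
# can be linked at most |D| times in total.
links_per_pair = Counter((u, v) for (u, v, d) in E)
```

Because a set cannot contain the same triple twice, the convention of at most one link per pair of nodes and dimension is enforced by the data structure itself.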
Weighted multilayer networks
In the case of a weighted network, this triplet is expanded to a quadruplet $e=(u,v,d,w)$, where $w$ is the weight on the link between $u$ and $v$ in the dimension $d$.
Further, as is often useful in social network analysis, link weights may take on positive or negative values. Such signed networks can better reflect relations like amity and enmity in social networks.[31] Alternatively, link signs may be figured as dimensions themselves,[35] e.g. $G=(V,E,D)$ where $D=\{-1,0,1\}$ and $E=\{(u,v,d) : u,v\in V,\ d\in D\}$. This approach has particular value when considering unweighted networks.
This conception of dimensionality can be expanded should attributes in multiple dimensions need specification. In this instance, links are n-tuples $e=(u,v,d_1,\dots,d_{n-2})$. Such an expanded formulation, in which links may exist within multiple dimensions, is uncommon but has been used in the study of multidimensional time-varying networks.[36]
General formulation in terms of tensors
Whereas unidimensional networks have two-dimensional adjacency matrices of size $|V|\times|V|$, in a multidimensional network with $|D|$ dimensions, the adjacency matrix becomes a multilayer adjacency tensor, a four-dimensional array of size $(|V|\times|D|)\times(|V|\times|D|)$.[3] By using index notation, adjacency matrices can be indicated by $A^{i}_{j}$, to encode connections between nodes $i$ and $j$, whereas multilayer adjacency tensors are indicated by $M^{i\alpha}_{j\beta}$, to encode connections between node $i$ in layer $\alpha$ and node $j$ in layer $\beta$. As in unidimensional matrices, directed links, signed links, and weights are all easily accommodated by this framework.
In the case of multiplex networks, which are special types of multilayer networks where nodes cannot be interconnected with other nodes in other layers, a three-dimensional array of size $(|V|\times|V|)\times|D|$ with entries $A^{\alpha}_{ij}$ is enough to represent the structure of the system[8][37] by encoding connections between nodes $i$ and $j$ in layer $\alpha$.
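The two representations can be sketched with nested Python lists. This is an illustrative layout only: the index order `M[i][a][j][b]` (node i in layer a to node j in layer b), the function names and the toy links are assumptions made here, not conventions from the cited references.

```python
def adjacency_tensor(n_nodes, n_layers, links):
    """Dense rank-4 tensor: M[i][a][j][b] is the weight of the link
    from node i in layer a to node j in layer b."""
    M = [[[[0.0] * n_layers for _ in range(n_nodes)]
          for _ in range(n_layers)] for _ in range(n_nodes)]
    for (i, a, j, b, w) in links:
        M[i][a][j][b] = w
    return M

def multiplex_tensor(M):
    """Rank-3 array keeping only intralayer entries: A[a][i][j] = M[i][a][j][a]."""
    n, L = len(M), len(M[0])
    return [[[M[i][a][j][a] for j in range(n)] for i in range(n)]
            for a in range(L)]

# Hypothetical example: 3 nodes, 2 layers, one interlayer link.
links = [
    (0, 0, 1, 0, 1.0),  # node 0 -> node 1 within layer 0
    (1, 0, 2, 0, 1.0),  # node 1 -> node 2 within layer 0
    (0, 1, 2, 1, 1.0),  # node 0 -> node 2 within layer 1
    (2, 0, 2, 1, 0.5),  # node 2 in layer 0 -> node 2 in layer 1
]
M = adjacency_tensor(3, 2, links)
A = multiplex_tensor(M)
```

The rank-3 array discards the interlayer link, which is why it suffices only when such links are absent or carry no information.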
Multidimensional network-specific definitions
Multi-layer neighbors
In a multidimensional network, the neighbors of some node $v$ are all nodes connected to $v$ across dimensions.
Multi-layer path length
A path between two nodes in a multidimensional network can be represented by a vector $\mathbf{r}=(r_1,\dots,r_{|D|})$ in which the $i$th entry is the number of links traversed in the $i$th dimension of $G$.[38] As with overlapping degree, the sum of these elements can be taken as a rough measure of the path length between two nodes.
Network of layers
The existence of multiple layers (or dimensions) makes it possible to introduce the concept of a network of layers,[3] which is specific to multilayer networks. In fact, layers might be interconnected in such a way that their structure can be described by a network, as shown in the figure.
The network of layers is usually weighted (and might be directed), although, in general, the weights depend on the application of interest. A simple approach is, for each pair of layers, to sum all of the weights in the connections between their nodes to obtain edge weights that can be encoded into a matrix $q_{\alpha\beta}$. The rank-2 adjacency tensor, representing the underlying network of layers in the space $\mathbb{R}^{L\times L}$, is given by
$$\Psi^{\gamma}_{\delta}=\sum_{\alpha,\beta=1}^{L}q_{\alpha\beta}E^{\gamma}_{\delta}(\alpha\beta),$$
where $E^{\gamma}_{\delta}(\alpha\beta)$ is the canonical matrix with all components equal to zero except for the entry corresponding to row $\alpha$ and column $\beta$, which is equal to one. Using the tensorial notation, it is possible to obtain the (weighted) network of layers from the multilayer adjacency tensor as $\Psi^{\gamma}_{\delta}=M^{i\gamma}_{j\delta}U^{j}_{i}$, where $U^{j}_{i}$ has all components equal to 1.[3]
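The simple aggregation described here, summing link weights for each pair of layers, can be sketched as follows. The sparse dictionary layout, in which the key `(i, a, j, b)` holds the weight of the link from node i in layer a to node j in layer b, is an assumption made for this illustration.

```python
def network_of_layers(links, n_layers):
    """Sum link weights over all node pairs for each ordered pair of layers."""
    psi = [[0.0] * n_layers for _ in range(n_layers)]
    for (i, a, j, b), w in links.items():
        psi[a][b] += w
    return psi

# Hypothetical example: 3 nodes, 2 layers.
links = {
    (0, 0, 1, 0): 1.0,  # intralayer, layer 0
    (1, 0, 2, 0): 1.0,  # intralayer, layer 0
    (0, 1, 2, 1): 1.0,  # intralayer, layer 1
    (0, 0, 0, 1): 0.5,  # interlayer, layer 0 -> layer 1
    (2, 0, 2, 1): 0.5,  # interlayer, layer 0 -> layer 1
}
psi = network_of_layers(links, 2)
```

In this sketch the diagonal entries collect the total intralayer weight of each layer, while the off-diagonal entries collect the coupling between distinct layers.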
Centrality measures
Degree
In a non-interconnected multidimensional network, where interlayer links are absent, the degree of a node $i$ is represented by a vector of length $|D|$: $\mathbf{k}_i=(k_i^{1},\dots,k_i^{|D|})$. Here $|D|$ is an alternative way to denote the number of layers $L$ in multilayer networks. However, for some computations it may be more useful to simply sum the number of links adjacent to a node across all dimensions.[3][39] This is the overlapping degree:[4] $\sum_{\alpha=1}^{|D|}k_i^{\alpha}$. As with unidimensional networks, a distinction may similarly be drawn between incoming links and outgoing links.
If interlayer links are present, the above definition must be adapted to account for them, and the multilayer degree is given by
$$k^{i}=M^{i\alpha}_{j\beta}U^{\beta}_{\alpha}u^{j}=\sum_{\alpha,\beta=1}^{L}\sum_{j=1}^{N}M^{i\alpha}_{j\beta},$$
where the tensors $U^{\beta}_{\alpha}$ and $u^{j}$ have all components equal to 1. The heterogeneity in the number of connections of a node across the different layers can be taken into account through the participation coefficient.[4]
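The three notions of degree can be compared on a small example. The sketch below assumes a sparse dictionary with keys `(i, a, j, b)` (node i in layer a to node j in layer b) and counts outgoing links; both choices are made here for illustration only.

```python
def degree_vector(links, i, n_layers):
    """Intralayer out-degree of node i in each layer (interlayer links ignored)."""
    k = [0.0] * n_layers
    for (u, a, v, b), w in links.items():
        if u == i and a == b:
            k[a] += w
    return k

def multilayer_degree(links, i):
    """Sum over all layers a, b and all nodes j of the entries leaving node i."""
    return sum(w for (u, a, v, b), w in links.items() if u == i)

# Hypothetical example: 3 nodes, 2 layers, with interlayer links.
links = {
    (0, 0, 1, 0): 1.0,
    (1, 0, 2, 0): 1.0,
    (0, 1, 2, 1): 1.0,
    (0, 0, 0, 1): 0.5,  # interlayer link of node 0
    (2, 0, 2, 1): 0.5,  # interlayer link of node 2
}
k0 = degree_vector(links, 0, 2)   # one entry per layer
overlapping = sum(k0)             # overlapping degree of node 0
total = multilayer_degree(links, 0)
```

The multilayer degree exceeds the overlapping degree exactly by the weight of the interlayer links of the node.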
Versatility as multilayer centrality
When extended to interconnected multilayer networks, i.e. those systems where nodes are connected across layers, the concept of centrality is better understood in terms of versatility.[10] Nodes that are not central in each layer might be the most important for the multilayer systems in certain scenarios. For instance, this is the case where two layers encode different networks with only one node in common: it is very likely that such a node will have the highest centrality score because it is responsible for the information flow across layers.
Eigenvector versatility
As for unidimensional networks, eigenvector versatility can be defined as the solution of the eigenvalue problem given by $M^{i\alpha}_{j\beta}\Theta_{i\alpha}=\lambda_1\Theta_{j\beta}$, where the Einstein summation convention is used for the sake of simplicity. Here, $\Theta_{j\beta}=\lambda_1^{-1}M^{i\alpha}_{j\beta}\Theta_{i\alpha}$ gives the multilayer generalization of Bonacich's eigenvector centrality per node per layer. The overall eigenvector versatility is simply obtained by summing up the scores across layers as $\theta_i=\Theta_{i\alpha}u^{\alpha}$.[3][10]
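One way to approximate the leading eigentensor numerically is power iteration over node-layer pairs. The sketch below is not the procedure of the cited references: it assumes a sparse dictionary with keys `(i, a, j, b)`, normalizes scores to sum to one, and relies on the example network being connected and non-bipartite so that the iteration converges. The helper `undirected` is hypothetical.

```python
def undirected(pairs):
    """Store each unweighted link in both directions."""
    links = {}
    for (i, a, j, b) in pairs:
        links[(i, a, j, b)] = 1.0
        links[(j, b, i, a)] = 1.0
    return links

def eigenvector_versatility(links, n_nodes, n_layers, iters=200):
    """Power iteration for the leading eigentensor, then sum over layers."""
    theta = {(i, a): 1.0 for i in range(n_nodes) for a in range(n_layers)}
    for _ in range(iters):
        new = dict.fromkeys(theta, 0.0)
        for (i, a, j, b), w in links.items():
            new[(j, b)] += w * theta[(i, a)]
        norm = sum(new.values())
        theta = {key: value / norm for key, value in new.items()}
    return [sum(theta[(i, a)] for a in range(n_layers)) for i in range(n_nodes)]

# Hypothetical example: a triangle in layer 0, a single link in layer 1,
# and every node coupled to its own replica in the other layer.
links = undirected([
    (0, 0, 1, 0), (1, 0, 2, 0), (0, 0, 2, 0),
    (0, 1, 1, 1),
    (0, 0, 0, 1), (1, 0, 1, 1), (2, 0, 2, 1),
])
versatility = eigenvector_versatility(links, 3, 2)
```

Nodes 0 and 1 play identical structural roles in this example and therefore receive the same score, while node 2, which has no intralayer link in layer 1, scores lower.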
Katz versatility
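As for its unidimensional counterpart, the Katz versatility is obtained as the solution $\Phi_{j\beta}=b\,[(\delta-aM)^{-1}]^{i\alpha}_{j\beta}u_{i\alpha}$ of the tensorial equation $\Phi_{j\beta}=aM^{i\alpha}_{j\beta}\Phi_{i\alpha}+bu_{j\beta}$, where $\delta^{i\alpha}_{j\beta}=\delta^{i}_{j}\delta^{\alpha}_{\beta}$, $a$ is a constant smaller than the reciprocal of the largest eigenvalue and $b$ is another constant generally equal to 1. The overall Katz versatility is simply obtained by summing up the scores across layers as $\phi_i=\Phi_{i\alpha}u^{\alpha}$.[10]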
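The Katz equation, in which the score of each node-layer pair equals a constant $b$ plus $a$ times the weighted scores of the pairs linking to it, can be solved by fixed-point iteration when $a$ is small enough for the iteration to converge. The sketch below uses that simple scheme rather than an explicit tensor inversion; the dictionary layout with keys `(i, a, j, b)` and the default values are assumptions for illustration.

```python
def katz_versatility(links, n_nodes, n_layers, a=0.1, b=1.0, iters=200):
    """Iterate phi <- a * M phi + b, then sum the scores over layers."""
    phi = {(i, al): b for i in range(n_nodes) for al in range(n_layers)}
    for _ in range(iters):
        new = dict.fromkeys(phi, b)
        for (i, al, j, be), w in links.items():
            new[(j, be)] += a * w * phi[(i, al)]
        phi = new
    return [sum(phi[(i, al)] for al in range(n_layers)) for i in range(n_nodes)]

# Hypothetical example: two nodes, one layer, a single directed link 0 -> 1.
# Node 0 receives nothing, so its score is b; node 1 receives a * b on top of b.
scores = katz_versatility({(0, 0, 1, 0): 1.0}, n_nodes=2, n_layers=1)
```

With the default constants the two scores are 1 and 1.1, matching the hand calculation in the comment.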
HITS versatility
For unidimensional networks, the HITS algorithm was originally introduced by Jon Kleinberg to rate web pages. The basic assumption of the algorithm is that relevant pages, named authorities, are pointed to by special web pages, named hubs. This mechanism can be mathematically described by two coupled equations which reduce to two eigenvalue problems. When the network is undirected, authority and hub centrality are equivalent to eigenvector centrality.
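These properties are preserved by the natural extension of the equations proposed by Kleinberg to the case of interconnected multilayer networks, given by
$(MM^{\top})^{i\alpha}_{j\beta}\Gamma_{i\alpha}=\lambda_1\Gamma_{j\beta}$ and $(M^{\top}M)^{i\alpha}_{j\beta}\Upsilon_{i\alpha}=\lambda_1\Upsilon_{j\beta}$, where $\top$ indicates the transpose operator, and $\Gamma_{i\alpha}$ and $\Upsilon_{i\alpha}$ indicate hub and authority centrality, respectively. By contracting the hub and authority tensors, one obtains the overall versatilities as $\gamma_i=\Gamma_{i\alpha}u^{\alpha}$ and $\upsilon_i=\Upsilon_{i\alpha}u^{\alpha}$, respectively.[10]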
PageRank versatility
PageRank, originally introduced to rank web pages, can also be considered as a measure of centrality for interconnected multilayer networks.
It is worth remarking that PageRank can be seen as the steady-state solution of a special Markov process on top of the network. Random walkers explore the network according to a special transition matrix and their dynamics is governed by a random walk master equation. It is easy to show that the solution of this equation is equivalent to the leading eigenvector of the transition matrix.
Random walks have also been defined for interconnected multilayer networks[15] and edge-colored multigraphs (also known as multiplex networks).[40] For interconnected multilayer networks, the transition tensor governing the dynamics of the random walkers within and across layers is given by $R^{i\alpha}_{j\beta}=rT^{i\alpha}_{j\beta}+\frac{1-r}{NL}u^{i\alpha}_{j\beta}$, where $r$ is a constant, generally set to 0.85, $N$ is the number of nodes and $L$ is the number of layers or dimensions. Here, $R^{i\alpha}_{j\beta}$ might be named the Google tensor and $u^{i\alpha}_{j\beta}$ is the rank-4 tensor with all components equal to 1.
Like its unidimensional counterpart, PageRank versatility consists of two contributions: one encoding a classical random walk with rate $r$ and one encoding teleportation across nodes and layers with rate $1-r$.
If we indicate by $\Omega_{i\alpha}$ the eigentensor of the Google tensor $R^{i\alpha}_{j\beta}$, denoting the steady-state probability of finding the walker in node $i$ and layer $\alpha$, the multilayer PageRank is obtained by summing the eigentensor over layers: $\omega_i=\Omega_{i\alpha}u^{\alpha}$.[10]
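The steady state can be approximated by repeatedly applying the two contributions, a random-walk step taken with rate r and uniform teleportation over all node-layer pairs with rate 1 - r. The sketch below assumes a sparse dictionary with keys `(i, a, j, b)`, builds the transition probabilities by normalizing outgoing weights, and requires every node-layer pair to have at least one outgoing link; the treatment of pairs without outgoing links is deliberately left out.

```python
def pagerank_versatility(links, n_nodes, n_layers, r=0.85, iters=200):
    """Power iteration on the teleporting walk, then sum over layers."""
    states = [(i, a) for i in range(n_nodes) for a in range(n_layers)]
    out = dict.fromkeys(states, 0.0)
    for (i, a, j, b), w in links.items():
        out[(i, a)] += w
    # Assumption of this sketch: no node-layer pair is a dead end.
    if any(total == 0.0 for total in out.values()):
        raise ValueError("every node-layer pair needs an outgoing link")
    p = dict.fromkeys(states, 1.0 / len(states))
    for _ in range(iters):
        new = dict.fromkeys(states, (1.0 - r) / len(states))  # teleportation
        for (i, a, j, b), w in links.items():
            new[(j, b)] += r * (w / out[(i, a)]) * p[(i, a)]  # random-walk step
        p = new
    return [sum(p[(i, a)] for a in range(n_layers)) for i in range(n_nodes)]

# Hypothetical example: triangle in layer 0, one link in layer 1,
# every node coupled to its replica; links stored in both directions.
pairs = [(0, 0, 1, 0), (1, 0, 2, 0), (0, 0, 2, 0), (0, 1, 1, 1),
         (0, 0, 0, 1), (1, 0, 1, 1), (2, 0, 2, 1)]
links = {}
for (i, a, j, b) in pairs:
    links[(i, a, j, b)] = 1.0
    links[(j, b, i, a)] = 1.0
ranks = pagerank_versatility(links, 3, 2)
```

Each iteration preserves the total probability, so the returned scores sum to one.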
Triadic closure and clustering coefficients
Like many other network statistics, the meaning of a clustering coefficient becomes ambiguous in multidimensional networks, because triples may be closed in different dimensions than the ones in which they originated.[4][41][42] Several attempts have been made to define local clustering coefficients, but these attempts have highlighted the fact that the concept must be fundamentally different in higher dimensions: some groups have based their work on non-standard definitions,[42] while others have experimented with different definitions of random walks and 3-cycles in multidimensional networks.[4][41]
Community discovery
While cross-dimensional structures have been studied previously,[43][44] these approaches fail to detect more subtle associations found in some networks. Adopting a slightly different definition of "community" in the case of multidimensional networks allows for reliable identification of communities without the requirement that nodes be in direct contact with each other.[3][8][9][45]
For instance, two people who never communicate directly yet still browse many of the same websites would be viable candidates for this sort of algorithm.
Modularity maximization
A generalization of the well-known modularity maximization method for community discovery was originally proposed by Mucha et al.[8] This multiresolution method assumes a three-dimensional tensor representation of the network connectivity within layers, as for edge-colored multigraphs, and a three-dimensional tensor representation of the network connectivity across layers. It depends on the resolution parameter $\gamma$ and the weight $\omega$ of interlayer connections. In a more compact form, making use of the tensorial notation, modularity can be written as $Q\propto S^{a}_{i\alpha}B^{i\alpha}_{j\beta}S^{j\beta}_{a}$, where $B^{i\alpha}_{j\beta}=M^{i\alpha}_{j\beta}-P^{i\alpha}_{j\beta}$, $M^{i\alpha}_{j\beta}$ is the multilayer adjacency tensor, $P^{i\alpha}_{j\beta}$ is the tensor encoding the null model, and the value of the components of $S^{a}_{i\alpha}$ is defined to be 1 when a node $i$ in layer $\alpha$ belongs to a particular community, labeled by index $a$, and 0 when it does not.[3]
Tensor decomposition
Non-negative matrix factorization has been proposed to extract the community-activity structure of temporal networks.[46] The multilayer network is represented by a three-dimensional tensor $T^{\tau}_{ij}$, like an edge-colored multigraph, where the order of layers encodes the arrow of time. Tensor factorization by means of Kruskal decomposition is thus applied to $T^{\tau}_{ij}$ to assign each node to a community across time.
Statistical inference
Methods based on statistical inference, generalizing existing approaches introduced for unidimensional networks, have been proposed. The stochastic block model, appropriately generalized to the case of multilayer networks, is the most widely used generative model.[47][48]
As for unidimensional networks, principled methods like minimum description length can be used for model selection in community detection methods based on information flow.[9]
Structural reducibility
Given the higher complexity of multilayer networks with respect to unidimensional networks, an active field of research is devoted to simplifying the structure of such systems by employing some kind of dimensionality reduction.[22][49]
A popular method is based on the calculation of the quantum Jensen-Shannon divergence between all pairs of layers, which is then exploited for its metric properties to build a distance matrix and hierarchically cluster the layers. Layers are successively aggregated according to the resulting hierarchical tree, and the aggregation procedure is stopped when the objective function, based on the entropy of the network, reaches a global maximum. This greedy approach is necessary because the underlying problem would require checking all possible groupings of layers of any size; the number of such groupings is given by the Bell number and scales super-exponentially with the number of layers. Nevertheless, for multilayer systems with a small number of layers, it has been shown that the method performs optimally in the majority of cases.[22]
Other multilayer network descriptors
Degree correlations
The question of degree correlations in unidimensional networks is fairly straightforward: do nodes of similar degree tend to connect to each other? In multidimensional networks, what this question means becomes less clear. When we refer to a node's degree, are we referring to its degree in one dimension, or collapsed over all? When we seek to probe connectivity between nodes, are we comparing the same nodes across dimensions, or different nodes within dimensions, or a combination?[6] What are the consequences of variations in each of these statistics on other network properties? In one study, assortativity was found to decrease robustness in a duplex network.[50]
Path dominance
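Given two multidimensional paths, $\mathbf{r}$ and $\mathbf{s}$, we say that $\mathbf{r}$ dominates $\mathbf{s}$ if and only if $\forall d\in\{1,\dots,|D|\}$, $r_d\leq s_d$, and $\exists i$ such that $r_i<s_i$.[38]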
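Dominance is a componentwise comparison of the two path vectors: no entry of the first may exceed the matching entry of the second, and at least one must be strictly smaller. A minimal sketch, with the function name `dominates` chosen here for illustration:

```python
def dominates(r, s):
    """True if path vector r dominates path vector s."""
    if len(r) != len(s):
        raise ValueError("path vectors must have one entry per dimension")
    no_worse = all(x <= y for x, y in zip(r, s))
    strictly_better = any(x < y for x, y in zip(r, s))
    return no_worse and strictly_better

# Hypothetical path vectors over two dimensions.
shorter = dominates((1, 2), (2, 2))       # fewer links in dimension 1
identical = dominates((1, 2), (1, 2))     # equal vectors never dominate
incomparable = dominates((1, 3), (2, 2))  # better in one dimension, worse in the other
```

Dominance is only a partial order, so two paths can be incomparable, as in the last case.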
Shortest path discovery
Among other network statistics, many centrality measures rely on the ability to assess shortest paths from node to node. Extending these analyses to a multidimensional network requires incorporating additional connections between nodes into the algorithms currently used (e.g., Dijkstra's). Current approaches include collapsing multi-link connections between nodes in a preprocessing step before performing variations on a breadth-first search of the network.[28]
Multidimensional distance
One way to assess the distance between two nodes in a multidimensional network is by comparing all the multidimensional paths between them and choosing the subset that we define as shortest via path dominance: let $MP(u,v)$ be the set of all paths between $u$ and $v$. Then the distance between $u$ and $v$ is a set of paths $P\subseteq MP(u,v)$ such that $\forall p\in P$, $\nexists p'\in MP(u,v)$ such that $p'$ dominates $p$. The length of the elements in the set of shortest paths between two nodes is therefore defined as the multidimensional distance.[38]
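On a small network this definition can be evaluated by brute force: enumerate the paths, record for each one the number of links used per dimension, and keep the vectors that no other vector dominates. The sketch below restricts the search to simple paths, which loses nothing because removing a cycle from a path never increases any entry of its vector. The data and function names are hypothetical.

```python
def dominates(r, s):
    """True if r is no larger than s everywhere and smaller somewhere."""
    return (all(x <= y for x, y in zip(r, s))
            and any(x < y for x, y in zip(r, s)))

def path_vectors(E, dims, u, v):
    """Per-dimension link counts of all simple directed paths u -> v."""
    index = {d: k for k, d in enumerate(dims)}
    found = set()

    def walk(node, visited, r):
        if node == v:
            found.add(tuple(r))
            return
        for (x, y, d) in E:
            if x == node and y not in visited:
                r[index[d]] += 1
                walk(y, visited | {y}, r)
                r[index[d]] -= 1

    walk(u, {u}, [0] * len(dims))
    return found

def multidimensional_distance(E, dims, u, v):
    """Path vectors between u and v that no other path vector dominates."""
    vectors = path_vectors(E, dims, u, v)
    return {r for r in vectors if not any(dominates(s, r) for s in vectors)}

# Hypothetical edge-labeled multigraph.
dims = ["email", "phone"]
E = {("a", "b", "email"), ("b", "c", "email"),
     ("a", "c", "phone"), ("a", "b", "phone")}
distance = multidimensional_distance(E, dims, "a", "c")
```

Here the path using one phone link and one email link is dominated by the direct phone link, while the two-email path and the direct phone path are incomparable and both remain.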
Dimension relevance
In a multidimensional network $G=(V,E,D)$, the relevance of a given dimension (or set of dimensions) $D'$ for one node $v$ can be assessed by the ratio $\frac{\text{Neighbors}(v,D')}{\text{Neighbors}(v,D)}$.[39]
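The ratio compares the number of neighbors reachable through the selected dimensions with the number reachable through all dimensions. The sketch below treats a node as a neighbor regardless of link direction, which is an assumption of this example, as are the data and function names.

```python
def neighbors(E, v, dims):
    """Nodes joined to v by at least one link whose dimension is in dims."""
    found = {y if x == v else x
             for (x, y, d) in E if d in dims and v in (x, y)}
    return found - {v}

def dimension_relevance(E, v, selected, all_dims):
    """Fraction of the neighbors of v that are reachable via `selected`."""
    total = len(neighbors(E, v, all_dims))
    return len(neighbors(E, v, selected)) / total if total else 0.0

# Hypothetical example.
D = {"email", "phone"}
E = {("a", "b", "email"), ("a", "c", "phone"),
     ("a", "b", "phone"), ("d", "a", "email")}
email_relevance = dimension_relevance(E, "a", {"email"}, D)
```

Node a has three neighbors overall, two of which are reachable by email, so the relevance of the email dimension for a is 2/3.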
Dimension connectivity
In a multidimensional network in which different dimensions of connection have different real-world values, statistics characterizing the distribution of links to the various classes are of interest. Thus it is useful to consider two metrics that assess this: dimension connectivity and edge-exclusive dimension connectivity. The former is simply the ratio of the number of links in a given dimension $d$ to the total number of links in all dimensions: $\frac{|\{(u,v,d)\in E \mid u,v\in V\}|}{|E|}$. The latter assesses, for a given dimension, the fraction of its links that join pairs of nodes connected only in that dimension: $\frac{|\{(u,v,d)\in E \mid u,v\in V\wedge\forall j\in D,\ j\neq d:(u,v,j)\notin E\}|}{|\{(u,v,d)\in E \mid u,v\in V\}|}$.[39]
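Both metrics reduce to counting triples. The following sketch computes them for a set of directed triples `(u, v, d)`; the data and function names are hypothetical, and an empty dimension is given a value of zero by convention in this example.

```python
def dimension_connectivity(E, d):
    """Share of all links that belong to dimension d."""
    return sum(1 for (_, _, x) in E if x == d) / len(E)

def exclusive_dimension_connectivity(E, d):
    """Share of the links in d whose node pair is linked in no other dimension."""
    in_d = [(u, v) for (u, v, x) in E if x == d]
    if not in_d:
        return 0.0
    exclusive = [
        (u, v) for (u, v) in in_d
        if not any((a, b) == (u, v) and x != d for (a, b, x) in E)
    ]
    return len(exclusive) / len(in_d)

# Hypothetical example with four directed links.
E = {("a", "b", "email"), ("a", "b", "phone"),
     ("b", "c", "phone"), ("c", "a", "phone")}
phone_share = dimension_connectivity(E, "phone")
phone_exclusive = exclusive_dimension_connectivity(E, "phone")
```

In this example three of the four links are phone links, and two of those three join pairs that are not also linked by email.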
Burst detection
Burstiness is a well-known phenomenon in many real-world networks, e.g. email or other human communication networks. Additional dimensions of communication provide a more faithful representation of reality and may highlight these patterns or diminish them. Therefore, it is of critical importance that our methods for detecting bursty behavior in networks accommodate multidimensional networks.[51]
Diffusion processes on multilayer networks
Diffusion processes are widely used in physics to explore physical systems, as well as in other disciplines such as the social sciences, neuroscience, urban and international transportation, and finance. Recently, simple and more complex diffusive processes have been generalized to multilayer networks.[23][52] One result common to many studies is that diffusion in multiplex networks, a special type of multilayer system, exhibits two regimes: 1) the weight of the inter-layer links connecting layers to each other is not high enough, and the multiplex system behaves like two (or more) uncoupled networks; 2) the weight of inter-layer links is high enough that layers are coupled to each other, giving rise to unexpected physical phenomena.[23] It has been shown that there is an abrupt transition between these two regimes.[53]
In fact, all network descriptors depending on some diffusive process, from centrality measures to community detection, are affected by the layer-layer coupling. For instance, in the case of community detection, low coupling (where information from each layer separately is more relevant than the overall structure) favors clusters within layers, whereas high coupling (where information from all layers simultaneously is more relevant than information from each layer separately) favors cross-layer clusters.[8][9]
Random walks
As for unidimensional networks, it is possible to define random walks on top of multilayer systems. However, given the underlying multilayer structure, random walkers are not limited to moving from one node to another within the same layer (jump), but are also allowed to move across layers (switch).[15]
Random walks can be used to explore a multilayer system with the ultimate goal of unraveling its mesoscale organization, i.e. partitioning it into communities,[8][9] and have recently been used to better understand the navigability of multilayer networks and their resilience to random failures,[15] as well as to explore this type of topology efficiently.[54]
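In the case of interconnected multilayer systems, the probability of moving from node $i$ in layer $\alpha$ to node $j$ in layer $\beta$ can be encoded into the rank-4 transition tensor $T^{i\alpha}_{j\beta}$, and the discrete-time walk can be described by the master equation
$$p_{j\beta}(t+1)=\sum_{\alpha=1}^{L}\sum_{i=1}^{N}T^{i\alpha}_{j\beta}\,p_{i\alpha}(t),$$
where $p_{i\alpha}(t)$ indicates the probability of finding the walker in node $i$ in layer $\alpha$ at time $t$.[3][15]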
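A single application of the master equation redistributes the probability of each node-layer pair along its outgoing links. The sketch below assumes a sparse dictionary with keys `(i, a, j, b)` and builds the transition probabilities by normalizing outgoing weights without distinguishing intralayer from interlayer links, which is one possible choice of walk among several.

```python
def transition_tensor(links):
    """Normalize the outgoing weights of every node-layer pair to sum to one."""
    out = {}
    for (i, a, j, b), w in links.items():
        out[(i, a)] = out.get((i, a), 0.0) + w
    return {key: w / out[(key[0], key[1])] for key, w in links.items()}

def step(T, p):
    """One time step: p_new[(j, b)] = sum over (i, a) of T[(i, a, j, b)] * p[(i, a)]."""
    new = dict.fromkeys(p, 0.0)
    for (i, a, j, b), t in T.items():
        new[(j, b)] += t * p[(i, a)]
    return new

# Hypothetical example: two nodes, two layers, one intralayer link per layer
# and each node coupled to its replica; links stored in both directions.
pairs = [(0, 0, 1, 0), (0, 1, 1, 1), (0, 0, 0, 1), (1, 0, 1, 1)]
links = {}
for (i, a, j, b) in pairs:
    links[(i, a, j, b)] = 1.0
    links[(j, b, i, a)] = 1.0

T = transition_tensor(links)
p0 = {(0, 0): 1.0, (1, 0): 0.0, (0, 1): 0.0, (1, 1): 0.0}
p1 = step(T, p0)  # the walker either jumps to node 1 or switches to layer 1
p2 = step(T, p1)
```

Starting from node 0 in layer 0, the walker jumps or switches with equal probability at the first step, and total probability is conserved at every step.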
There are many different types of walks that can be encoded into the transition tensor , depending on how the walkers are allowed to jump and switch. For instance, the walker might either jump or switch in a single time step without distinguishing between inter- and intra-layer links (classical random walk), or it can choose either to stay in the current layer and jump, or to switch layer and then jump to another node in the same time step (physical random walk). More complicated rules, corresponding to specific problems to solve, can be found in the literature.[23] In some cases, it is possible to find, analytically, the stationary solution of the master equation.[15][54]
Classical diffusion
The problem of classical diffusion in complex networks is to understand how a quantity will flow through the system and how much time it will take to reach the stationary state. Classical diffusion in multiplex networks has recently been studied by introducing the concept of the supra-adjacency matrix,[55] later recognized as a special flattening of the multilayer adjacency tensor.[3] In tensorial notation, the diffusion equation on top of a general multilayer system can be written, concisely, as
$$\frac{dX_{j\beta}(t)}{dt}=-L^{i\alpha}_{j\beta}X_{i\alpha}(t),$$
where $X_{i\alpha}(t)$ is the amount of diffusing quantity at time $t$ in node $i$ in layer $\alpha$. The rank-4 tensor $L^{i\alpha}_{j\beta}$ governing the equation is the Laplacian tensor, generalizing the combinatorial Laplacian matrix of unidimensional networks. It is worth remarking that in non-tensorial notation, the equation takes a more complicated form.
Many of the properties of this diffusion process are completely understood in terms of the second smallest eigenvalue of the Laplacian tensor. It is interesting that diffusion in a multiplex system can be faster than diffusion in each layer separately, or in their aggregation, provided that certain spectral properties are satisfied.[55]
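The relaxation towards the stationary state can be illustrated by integrating the diffusion equation numerically. The sketch below uses a plain forward Euler scheme on an undirected toy network, with the combinatorial Laplacian implied by the link weights, so that each node-layer pair changes at a rate equal to the weighted sum of differences with its neighbors. The step size, the number of steps and the dictionary layout with keys `(i, a, j, b)` are assumptions of this example.

```python
def diffuse(links, x0, dt=0.01, steps=2000):
    """Forward Euler integration of dx/dt = -L x on node-layer pairs."""
    x = dict(x0)
    for _ in range(steps):
        dx = dict.fromkeys(x, 0.0)
        for (i, a, j, b), w in links.items():
            # Flow into (j, b) from its neighbor (i, a).
            dx[(j, b)] += w * (x[(i, a)] - x[(j, b)])
        x = {key: x[key] + dt * dx[key] for key in x}
    return x

# Hypothetical example: two nodes, two layers, one intralayer link per layer
# and each node coupled to its replica; links stored in both directions.
pairs = [(0, 0, 1, 0), (0, 1, 1, 1), (0, 0, 0, 1), (1, 0, 1, 1)]
links = {}
for (i, a, j, b) in pairs:
    links[(i, a, j, b)] = 1.0
    links[(j, b, i, a)] = 1.0

x0 = {(0, 0): 1.0, (1, 0): 0.0, (0, 1): 0.0, (1, 1): 0.0}
x_final = diffuse(links, x0)
```

Because the links are symmetric, the total amount of the diffusing quantity is conserved and the state relaxes to an even spread over the four node-layer pairs. The rate of that relaxation is set by the smallest non-zero Laplacian eigenvalue.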
The spread of information (or diseases) through multilayer systems has recently been the subject of intense research.[56][1][57][58][59]
Multilayer network analysis software
Several software programs focusing on the analysis and visualization of multilayer networks have been introduced. Some popular solutions include multinet (C++ / Python / R), MuxViz (R), and Pymnet (Python), with each package typically specializing in different analytical functions.[60] However, most current software struggles to process very large multilayer networks, and interoperability between packages also needs improvement.
Coscia, Michele; Rossetti, Giulio; Pennacchioli, Diego; Ceccarelli, Damiano; Giannotti, Fosca (2013). "'You know because I know': A multidimensional network approach to human resources problem". Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. Advances in Social Network Analysis and Mining (ASONAM). Vol. 2013. pp. 434–441. arXiv:1305.7146. doi:10.1145/2492517.2492537. ISBN 9781450322409. S2CID 1810575.
Kivela, M.; Arenas, A.; Barthelemy, M.; Gleeson, J. P.; Moreno, Y.; Porter, M. A. (2014). "Multilayer networks". Journal of Complex Networks. 2 (3): 203–271. arXiv:1309.7233. doi:10.1093/comnet/cnu016. S2CID 11390956.
Costa, J.M.; Ramos, J.A.; Timóteo, S.; da Silva, L.P.; Ceia, R.C.; Heleno, R. (2018). "Species activity promote the stability of fruit-frugivore interactions across a five-year multilayer network". bioRxiv 10.1101/421941.
Zignani, Matteo; Quadri, Christian; Gaitto, Sabrina; Gian Paolo Rossi (2014). "Exploiting all phone media? A multidimensional network analysis of phone users' sociality". arXiv:1401.3126 [cs.SI]. Ch. 4: "Here we introduce the definition of edge-labeled multigraph which can cover many multidimensional situations. To fully leverage the dataset information on the directional nature of the communications, we consider only direct networks without any labels on vertices".
Kazienko, P. A.; Musial, K.; Kukla, E. B.; Kajdanowicz, T.; Bródka, P. (2011). "Multidimensional Social Network: Model and Analysis". Computational Collective Intelligence. Technologies and Applications. Lecture Notes in Computer Science. Vol. 6922. pp. 378–387. doi:10.1007/978-3-642-23935-9_37. ISBN 978-3-642-23934-2.
Bródka, Piotr; Kazienko, Przemysław; Musiał, Katarzyna; Skibicki, Krzysztof (2012). "Analysis of Neighbourhoods in Multi-layered Dynamic Social Networks". International Journal of Computational Intelligence Systems. 5 (3): 582–596. arXiv:1207.4293. doi:10.1080/18756891.2012.696922. S2CID 1373823.
Cai, D.; Shao, Z.; He, X.; Yan, X.; Han, J. (2005). "Community Mining from Multi-relational Networks". Knowledge Discovery in Databases: PKDD 2005. Lecture Notes in Computer Science. Vol. 3721. p. 445. doi:10.1007/11564126_44. ISBN 978-3-540-29244-9.
Berlingerio, M.; Pinelli, F.; Calabrese, F. (2013). "ABACUS: frequent pAttern mining-BAsed Community discovery in mUltidimensional networkS". Data Mining and Knowledge Discovery. 27 (3): 294–320. arXiv:1303.2025. doi:10.1007/s10618-013-0331-0. S2CID 17320129.