Network partitioning algorithms as cooperative games

The paper is devoted to game-theoretic methods for community detection in networks. The traditional methods for detecting community structure are based on selecting dense subgraphs inside the network. Here we propose to use the methods of cooperative game theory, which highlight not only the link density but also the mechanisms of cluster formation. Specifically, we suggest two approaches from cooperative game theory: the first approach is based on the Myerson value, whereas the second approach is based on hedonic games. Both approaches allow one to detect clusters with various resolutions. However, the tuning of the resolution parameter in the hedonic games approach is particularly intuitive. Furthermore, the modularity-based approach and its generalizations, as well as the ratio cut and normalized cut methods, can be viewed as particular cases of the hedonic games. Finally, for approaches based on potential hedonic games we suggest a very efficient computational scheme using Gibbs sampling.

to community detection. Most of the bibliography described in [4] is in fact dedicated to noncooperative game theory approaches. It appears that the application of cooperative, or coalition, games to the community detection problem is under-developed, and with this article we advance this research area.
There are definitely many relations among the above-mentioned classes. In particular, the conditions for minima of the objective functions can often be interpreted in terms of the eigen elements of the network matrices. The eigen elements of the network matrices also characterize the stationary or quasi-stationary state of a random walk on a network. In the present work, we show more connections between the approach based on cooperative games and other approaches.
In essence, all the above-mentioned approaches, with the exception of the game theory approach, try to detect dense subgraphs inside the network and do not address the question: what are the natural forces and dynamics behind the formation of network clusters? As noticed in [20], most traditional clustering methods pursue a top-down approach, whereas communities are typically formed by local interactions in a self-organizing fashion, often driven by egocentric decisions. Thus, it is very natural to apply game theory, and in particular coalition game theory, to the community detection problem. Also, in most of the above-mentioned methods, the number of communities is a prerequisite parameter. The game theory approach typically does not require a priori knowledge of the number of communities. One more very important benefit of using methods from game theory is that such methods are naturally distributed and can easily be implemented in clouds and decentralized multi-agent systems.
In the present work, we explore two cooperative game theory approaches to explain possible mechanisms behind cluster formation. Our first approach is based on the Myerson value in cooperative game theory, which particularly emphasizes the value allocation in the context of games with interactions between players constrained by a network. The advantage of the Myerson value is that it takes into account the impact of all coalitions. We extend the method developed in [21,22] to calculate the Myerson value in a network efficiently. A number of network centrality measures based on game-theoretic concepts have been developed, see [22][23][24][25][26][27][28] and references therein. It might be interesting to combine node ranking and clustering based on the same approach, such as the Myerson value, to analyze the network structure. Unfortunately, the computation of the Myerson value is a very difficult problem even for a moderately large number of players. Therefore, we propose the second approach, which has an efficient computational implementation and can easily be distributed.
The second approach is based on hedonic games [29], which are games that explain well the mechanism behind the formation of coalitions. Both our approaches allow one to detect clusters with varying resolutions and thus avoid the problem of the resolution limit [30,31]. The hedonic game approach is especially well suited to adjusting the level of resolution, as the limiting cases are given by the grand coalition and the sequential maximum clique decomposition, two very natural extreme cases of network partitioning. Furthermore, the modularity-based approaches as well as the ratio cut [32] and normalized cut [10,33] based methods can be cast in the setting of hedonic games. We find that this gives one more, very interesting, interpretation of the modularity-based methods. The advantage of casting the ratio cut and normalized cut in the framework of hedonic games is that we do not need to prespecify the number of clusters, as was needed in the original formulations of these methods.
Some hierarchical network partitioning methods based on tree hierarchy, such as [15], cannot produce a clustering on one resolution level with the number of clusters different from the predefined tree shape. Furthermore, the majority of clustering methods require the number of clusters as an input parameter. In contrast, in our approaches we specify the value of the resolution parameter(s) and the method gives a natural number of clusters corresponding to the given resolution parameter(s).
Let us point out major differences between our approaches and approaches suggested in the other works on cooperative game theory for network clustering. In [34], a cooperative game theory approach based on Shapley value has been proposed. However, with the proposed characteristic function, the players tend to form the grand coalition. In the subsequent work [35], a new characteristic function has been proposed, which combines both link-based as well as attribute-based information. The Shapley value associated with that characteristic function is very cumbersome to compute in comparison to the Myerson value for the characteristic function proposed in the first part of our paper. Of course, we admit that the computation of any type of Shapley value is computationally demanding and this is why we propose the second approach which has an efficient, naturally distributed, computational implementation.
The authors of [20] have also proposed to use hedonic games for community detection. They consider only the modularity metric as the value function. They have suggested an additional voting mechanism to overcome the resolution problem. Their algorithm is a version of greedy optimization. Our approach is much more general: not only do we show that modularity optimization is a particular case of our approach, but we also demonstrate that such known methods as ratio cut and normalized cut are particular cases of our approach. We also propose a couple of new functions that overcome the resolution problem without the need for an additional voting mechanism. Our Gibbs sampling-based algorithm can be used with both fixed and decreasing temperature and hence can be used for local as well as global maxima search. Setting the temperature to a very low value corresponds to the greedy approach.
The authors of [36] in the first part of their paper propose to use the concept of strong Nash equilibrium in addition to the concept of hedonic games. They also define a community as a ( , γ )-relaxation of the clique. There are several serious problems with their propositions. First of all, the strong Nash equilibrium might not exist (they acknowledge this fact themselves in their work), and such equilibrium is very hard to compute even if it exists. Furthermore, they give two definitions of a maximal ( , γ )-relaxation of the clique which are contradictory and therefore their algorithm can cycle.
We also note that our approaches based on cooperative games easily work with multigraphs, where several edges (links) are possible between two nodes. A multi-edge has several natural interpretations in the context of social networks. A multi-edge can represent a number of telephone calls; a number of exchanged messages; a number of common friends; or a number of co-occurrences in some social event.
Let us now summarize the main contributions of the paper (we place * in the items which are new additions to the work in comparison with the conference version [37]):
• First, the cooperative game theory approach based on the Myerson value is proposed for network partitioning.
• Then, the hedonic coalition formation framework is proposed for network partitioning, which has a more efficient computational implementation than the approach based on the Myerson value.
• A new interpretation in terms of hedonic games is given to the modularity, ratio cut, and normalized cut network partitioning methods. *
• Two new network partitioning methods based on potential hedonic games are proposed. (One method is a new addition with respect to the conference paper [37].) * These two methods are especially well suited to find partitions with different levels of resolution; the methods use only one or two parameters. We provide recommendations on how to set these parameters.
• For methods constructed on potential hedonic games, we suggest using a very efficient computational algorithm based on Gibbs sampling. *
• Several numerical evaluations using real * as well as synthetic networks are carried out. These numerical evaluations in particular demonstrate the efficacy of the clustering methods based on potential hedonic games with resolution regularization.
The paper is structured as follows: in the following section, we provide necessary definitions from graph theory, network partitioning, and network games. Then, in the "Myerson cooperative game approach" section, we present our first approach based on the Myerson value. The second approach, based on hedonic games, is presented in the "Hedonic coalition game approach" section. In both the "Myerson cooperative game approach" and "Hedonic coalition game approach" sections, we provide small illustrative examples to explain the essence of the methods. In the "Numerical validation" section, we evaluate our methods on a number of real as well as synthetic network examples. Finally, the "Conclusion and future research" section provides conclusions and directions for future research.

Preliminaries of graph theory, network partitioning, and network stability
Let g = (N, E) denote an undirected multi-graph consisting of the set of nodes N and the set of edges E. We denote an edge (link) between node i and node j as ij. The interpretation is that if ij ∈ E, then the nodes i ∈ N and j ∈ N have a direct connection in network g, while if ij ∉ E, then nodes i and j are not directly connected. Since we generally consider a multi-graph, there could be several edges between a pair of nodes. Multiple edges can be interpreted, for instance, as a number of telephone calls or as a number of message exchanges in the context of social networks. We view the nodes of the network as players in a cooperative game. Let N(g) = {i : ∃j such that ij ∈ E(g)}. For a graph g, a sequence of different nodes {i_1, i_2, ..., i_k}, k ≥ 2, is a path connecting i_1 and i_k if for all h = 1, ..., k − 1, i_h i_{h+1} ∈ g.
The length l of a path is the number of edges in that path, i.e., l = k − 1. A path with no repeated nodes is called a simple path. A graph g on the set N is connected if for any two nodes i and j there exists a path in g connecting i and j.
We refer to a subset of nodes S ⊂ N as a coalition. The coalition S is connected if any two nodes in S are connected by a path which consists of nodes from S. The graph g ′ is a (connected) component of g, if for all i ∈ N (g ′ ) and j ∈ N (g ′ ) , there exists a path in g ′ connecting i and j, and for any i ∈ N (g ′ ) and j ∈ N (g) , ij ∈ g implies that ij ∈ g ′ . Let N|g be the set of all (connected) components in g and let g|S be the subgraph with the nodes in S.
Let g − ij denote the graph obtained by deleting edge ij from the graph g and g + ij denote the graph obtained by adding edge ij to the graph g.
The result of community detection is a partition of the network (N, E) into subsets (coalitions) {S_1, ..., S_K} such that S_k ∩ S_l = ∅ for all k ≠ l and S_1 ∪ ... ∪ S_K = N. This partition is internally stable, or Nash stable, if for any player from coalition S_k it is not profitable to join another (possibly empty) coalition S_l. We also say that the partition is externally stable if for any player i ∈ S_l for whom it is beneficial to join a coalition S_k, there exists a player j ∈ S_k for whom it is not profitable to accept player i. The payoff definition and distribution will be discussed in the following two sections.

Myerson cooperative game approach
In general, a cooperative game of n players is a pair ⟨N, v⟩, where N = {1, 2, ..., n} is the set of players and v : 2^N → R is a map prescribing for a coalition S ∈ 2^N some value v(S) such that v(∅) = 0. This function v(S) is the total utility that members of S can jointly attain. Such a function is called the characteristic function of the cooperative game. An interested reader can find more details on cooperative games in, e.g., [38][39][40].
Additionally, as in [41], we assume that the cooperation is restricted by a network. The payoff to an individual player is called an imputation. The imputation specifies how the value associated with the network is distributed to the individual players. The imputation in our cooperative game will be based on the Myerson value [21,22,41], which was designed to take into account the effect of the network. The Myerson value [41] is the allocation rule

$$Y(v, g) = (Y_1(v, g), \ldots, Y_n(v, g)),$$

where $Y_i(v, g)$ is the payoff allocated to player i from graph g under the characteristic function v. The Myerson value is uniquely determined by the following two axioms [41]:

Axiom 1 If S is a connected component of g, then the members of the coalition S ought to allocate to themselves the total value v(S) available to them, i.e., ∀S ∈ N|g,

$$\sum_{i \in S} Y_i(v, g) = v(S). \tag{1}$$

Axiom 2 ∀g, ∀ij ∈ g, both players i and j obtain equal payoffs after adding or deleting a link ij,

$$Y_i(v, g) - Y_i(v, g - ij) = Y_j(v, g) - Y_j(v, g - ij). \tag{2}$$

The characteristic function (payoff of coalition S) can be defined in different ways. Here we use a general idea from [21,22,42,43], which is based on discounting paths. However, unlike [21,22,42,43], we do not consider shortest paths but rather simple paths.
Let us elaborate a bit more on the construction of the characteristic function. Each edge (or direct connection) gives to coalition S the value r, where 0 ≤ r ≤ 1. Moreover, players obtain a value from indirect connections. Namely, each simple path of length 2 belonging to coalition S gives to this coalition the value $r^2$, a simple path of length 3 gives to the coalition the value $r^3$, etc. Set $v(i) = 0$ for all $i \in N$. Thus, for any coalition S, we can define the characteristic function as follows:

$$v(S) = \sum_{k=1}^{\infty} a_k(g, S)\, r^k, \tag{3}$$

where $a_k(g, S)$ is the number of simple paths of length k in this coalition. Note that we write infinity as the upper limit of summation only for convenience. Clearly, the length of a simple path is bounded by n − 1. The following theorem provides a convenient way to calculate the Myerson value corresponding to the characteristic function (3).

Theorem 1 Let the characteristic function of a coalition S ∈ 2^N be defined by Eq. (3). Then the Myerson value of a node i is given by

$$Y_i(v, g) = \sum_{k=1}^{\infty} \frac{a^{(i)}_k(g, S)}{k + 1}\, r^k, \tag{4}$$

where $a^{(i)}_k(g, S)$ is the number of simple paths of length k containing node i.
Proof We shall prove the theorem by checking directly the Myerson value axioms, i.e., Axioms 1 and 2.
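First, we note the following: since every simple path of length k contains k + 1 different nodes, every simple path of length k is counted k + 1 times in the sum $\sum_{i \in S} a^{(i)}_k(g, S)$.
Thus, Axiom 1 is satisfied:

$$\sum_{i \in S} Y_i(v, g) = \sum_{i \in S} \sum_{k=1}^{\infty} \frac{a^{(i)}_k(g, S)}{k + 1}\, r^k = \sum_{k=1}^{\infty} a_k(g, S)\, r^k = v(S).$$

Let $a^{(ij)}_k(g, S)$ denote the number of paths of length k traversing the edge ij. Deleting the edge ij removes exactly these paths, and each of them contains both i and j. Then

$$Y_i(v, g) - Y_i(v, g - ij) = \sum_{k=1}^{\infty} \frac{a^{(ij)}_k(g, S)}{k + 1}\, r^k = Y_j(v, g) - Y_j(v, g - ij).$$

Thus, Axiom 2 is satisfied as well.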
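The formula of Theorem 1 can be illustrated by brute-force enumeration of simple paths. The following is only a minimal sketch for very small graphs, not the computational method of the paper: it assumes that each simple path of length k contributes r^k to the coalition value and that this contribution is shared equally among its k + 1 nodes. The function names and the dictionary-based graph representation are hypothetical choices made for the illustration.

```python
def directed_simple_paths(adj):
    """Yield (path, weight) for every simple path with at least one edge.

    adj maps node -> {neighbour: number of parallel edges} and must be symmetric.
    Each undirected path is produced twice, once from each end node. The weight
    counts the distinct choices of parallel edges along the path (multigraph).
    """
    def extend(path, visited, weight):
        if len(path) > 1:
            yield tuple(path), weight
        for nxt, multiplicity in adj[path[-1]].items():
            if nxt not in visited:
                path.append(nxt)
                visited.add(nxt)
                yield from extend(path, visited, weight * multiplicity)
                visited.remove(nxt)
                path.pop()

    for start in adj:
        yield from extend([start], {start}, 1)


def characteristic_value(adj, r):
    """Sum of r^k over all simple paths (factor 0.5 undoes the double count)."""
    return sum(0.5 * w * r ** (len(p) - 1) for p, w in directed_simple_paths(adj))


def myerson_values(adj, r):
    """Each path of length k gives r^k / (k + 1) to every node it contains."""
    values = {node: 0.0 for node in adj}
    for path, weight in directed_simple_paths(adj):
        k = len(path) - 1
        for node in path:
            values[node] += 0.5 * weight * r ** k / (k + 1)
    return values


# Triangle: every node lies on 2 paths of length 1 and 3 paths of length 2.
triangle = {"A": {"B": 1, "C": 1}, "B": {"A": 1, "C": 1}, "C": {"A": 1, "B": 1}}
values = myerson_values(triangle, r=0.5)
print(values, characteristic_value(triangle, r=0.5))
```

For the triangle, the payoffs sum to the value of the connected component, as Axiom 1 requires. The enumeration is exponential in the number of nodes, which is consistent with the remark that computing the Myerson value is hard for larger networks.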
We can propose the following algorithm for network partitioning based on the Myerson value. Start with a partition of the network N = {1, ..., n} in which each node forms her own coalition. Consider a coalition S_l and a player i ∈ S_l. In the cooperative game with partial cooperation presented by the graph g|S_l, we find the Myerson value for player i, Y_i(g|S_l). This is the reward of player i in coalition S_l. Suppose that player i decides to join the coalition S_k. In the new cooperative game with partial cooperation presented by the graph g|S_k ∪ i, we find the Myerson value Y_i(g|S_k ∪ i). If Y_i(g|S_l) ≥ Y_i(g|S_k ∪ i), then player i ∈ S_l has no incentive to join the new coalition S_k; otherwise, the player changes coalition.
The partition N = {S_1, ..., S_K} is Nash stable, or internally stable, if no player has an incentive to move from her coalition. Notice that our definition of the characteristic function implies that for any coalition it is always beneficial to accept a new player (of course, for the player herself it might not be profitable to join that coalition). Thus, it is important that in the above algorithm we consider internal and not external stability. If one makes moves according to external stability, then the result will always be the grand coalition.
We would like to note that the above approach also works in the case of multigraphs, where several edges (links) are possible between two nodes. In such a case, if two paths contain different links between the same pair of nodes, we consider these paths as different. We see that for player A it is not profitable to move from S_1 to which is valid for all r in the interval (0, 1]. Therefore, in this partition, node A has no incentive to change the coalition under any choice of r. Now consider a slightly modified example, where we change the weight 2 on the edge {B, C} to weight 1 (see Fig. 2). This change results in the following imputations: and Thus, if r is sufficiently large, the partition {S_1, S_2} becomes internally unstable and the grand coalition becomes the only stable configuration.
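The internal stability test described above can be sketched as follows. This is an illustrative brute-force check under the assumption that the payoff of a player is her Myerson value in the subgraph induced by the coalition, computed by sharing r^k equally among the nodes of each simple path of length k. The helper names (`build_adj`, `myerson_value`, `profitable_moves`) and the example graph are hypothetical.

```python
def build_adj(edges):
    """Symmetric multigraph adjacency: node -> {neighbour: number of edges}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, {})
        adj.setdefault(v, {})
        adj[u][v] = adj[u].get(v, 0) + 1
        adj[v][u] = adj[v].get(u, 0) + 1
    return adj


def myerson_value(adj, coalition, player, r):
    """Payoff of `player` in the game restricted to the subgraph g|coalition."""
    members = set(coalition)
    total = 0.0

    def extend(path, visited, weight):
        nonlocal total
        k = len(path) - 1
        if k >= 1 and player in visited:
            # 0.5: every undirected path is found once from each end node.
            total += 0.5 * weight * r ** k / (k + 1)
        for nxt, mult in adj[path[-1]].items():
            if nxt in members and nxt not in visited:
                path.append(nxt)
                visited.add(nxt)
                extend(path, visited, weight * mult)
                visited.remove(nxt)
                path.pop()

    for start in members:
        extend([start], {start}, 1)
    return total


def profitable_moves(adj, partition, r, tol=1e-12):
    """Deviations (player, source index, target index) that strictly pay off.

    Moving to an empty coalition yields payoff 0 and is never strictly
    profitable, so only existing coalitions are examined.
    """
    moves = []
    for s, coalition in enumerate(partition):
        for player in coalition:
            current = myerson_value(adj, coalition, player, r)
            for t, other in enumerate(partition):
                if t != s:
                    joined = set(other) | {player}
                    if myerson_value(adj, joined, player, r) > current + tol:
                        moves.append((player, s, t))
    return moves


# Two triangles joined by the single link C-D (an assumed toy network).
adj = build_adj([("A", "B"), ("A", "C"), ("B", "C"),
                 ("D", "E"), ("D", "F"), ("E", "F"), ("C", "D")])
print(profitable_moves(adj, [{"A", "B", "C"}, {"D", "E", "F"}], r=0.5))
```

A partition is Nash stable in this sketch exactly when the returned list is empty. Only the incentives of the moving player are examined, in line with the use of internal rather than external stability.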
In the above example the parameter r can be used to tune the resolution of network partitioning. Resolution scale tuning will be even more natural in the next approach. We shall see that the next approach is also much more computationally efficient than the Myerson value-based approach.

Hedonic coalition game approach
There is another game-theoretic approach for partitioning society into coalitions, based on the ground-breaking work [29] on hedonic games. Assume that the set of players N = {1, ..., n} is divided into K coalitions by the partition Π = {S_1, ..., S_K}. Let S_Π(i) denote the coalition S_k ∈ Π such that i ∈ S_k. A hedonic game is defined in terms of player preferences over coalitions. The preferences of player i are represented by a complete, reflexive, and transitive binary relation ⪰_i over the set {S ⊂ N : i ∈ S}. Denote by ≻_i the strict part of this relation.
Let us now apply the framework of hedonic games [29] to the network partitioning problem, in particular by specifying the preferences. First, in the next subsection, we consider the case of additively separable preferences, and then in "The case of non-additively separable preferences" section, we consider the case of non-additively separable preferences.

The case of additively separable preferences
The preferences are additively separable [29] if there exists a value function $v_i : N \to \mathbb{R}$ such that $v_i(i) = 0$ and

$$S_1 \succeq_i S_2 \iff \sum_{j \in S_1} v_i(j) \ge \sum_{j \in S_2} v_i(j).$$

The preferences are symmetric if $v_i(j) = v_j(i) = v_{ij} = v_{ji}$ for all i, j ∈ N. The symmetry property defines a very important class of hedonic games.
As in the previous section, the network partition Π is Nash stable if $S_\Pi(i) \succeq_i S_k \cup \{i\}$ for all i ∈ N, S_k ∈ Π ∪ {∅}. In a Nash-stable partition, there is no player who wants to leave her coalition.
A potential of a coalition partition Π = {S_1, ..., S_K} (see [29]) is

$$P(\Pi) = \sum_{k=1}^{K} P(S_k) = \frac{1}{2} \sum_{k=1}^{K} \sum_{i, j \in S_k} v_{ij}. \tag{5}$$

One natural method for detecting a stable community structure can be based on the following better response type dynamics: Start with any partition of the network N = {S_1, ..., S_K}. Choose any player i and any coalition S_k different from S_Π(i). If $S_k \cup \{i\} \succ_i S_\Pi(i)$, assign node i to the coalition S_k; otherwise, keep the partition unchanged and choose another node-coalition pair, etc.
Since the game has the potential (5), the above algorithm is guaranteed to converge in a finite number of steps.
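The better response dynamics can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: preferences are taken to be additively separable and symmetric with a user-supplied pairwise value `value(i, j)`, nodes are visited in a fixed order, a node moves only on a strict improvement, and cluster labels are integers. The example value (one for a linked pair, minus a constant penalty alpha) is just one possible symmetric choice.

```python
def better_response(nodes, value, labels=None, max_sweeps=100):
    """Run better-response moves until no player can strictly improve."""
    if labels is None:
        labels = {i: k for k, i in enumerate(nodes)}  # start from singletons
    labels = dict(labels)
    for _ in range(max_sweeps):
        moved = False
        for i in nodes:
            # Utility of i in every non-empty coalition (sum of pairwise values).
            utility = {}
            for j in nodes:
                if j != i:
                    utility[labels[j]] = utility.get(labels[j], 0.0) + value(i, j)
            current = utility.get(labels[i], 0.0)
            # An empty coalition is always available and yields utility 0.
            utility.setdefault(max(labels.values()) + 1, 0.0)
            best = max(utility, key=utility.get)
            if utility[best] > current + 1e-12:
                labels[i] = best
                moved = True
        if not moved:
            break
    return labels


def potential(nodes, value, labels):
    """Half of the sum of pairwise values over ordered pairs in the same coalition."""
    return 0.5 * sum(value(i, j) for i in nodes for j in nodes
                     if i != j and labels[i] == labels[j])


# Two triangles joined by one link; alpha is an assumed penalty parameter.
edges = {(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)}
alpha = 0.5

def value(i, j):
    linked = (i, j) in edges or (j, i) in edges
    return (1.0 if linked else 0.0) - alpha

nodes = list(range(6))
final = better_response(nodes, value)
print(final, potential(nodes, value, final))
```

Every accepted move strictly increases the potential by the utility gain of the moving player, which is why the loop terminates; the partition reached is a local maximum and may depend on the visiting order.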

Proposition 1 If players' preferences are additively separable and symmetric, then the coalition partition giving a local maximum of the potential P(Π) is the Nash-stable partition.

Avrachenkov et al. Comput Soc Netw (2018) 5:11

One natural way to define a symmetric value function v with a parameter α ∈ [0, 1] is as follows:

$$v_{ij} = \begin{cases} 1 - \alpha, & (i, j) \in E, \\ -\alpha, & (i, j) \notin E, \\ 0, & i = j. \end{cases} \tag{6}$$

For any subgraph g|S, denote the number of nodes in S as n(S), and the number of edges in S as m(S). Then, for the value function (6), the potential (5) takes the form

$$P_\alpha(\Pi) = \sum_{k=1}^{K} \left( m(S_k) - \frac{n(S_k)\,(n(S_k) - 1)\,\alpha}{2} \right). \tag{7}$$

We can characterize the limiting cases α → 0 and α → 1. Towards this goal, let us introduce a special decomposition of the network into cliques. At first, let us find a maximum clique S_1 in the network G (a maximum clique of a graph is a clique such that there is no clique with more vertices). Remove all vertices of S_1 from G and consider the new network G′. Let us find a maximum clique S_2 in the network G′ and continue this procedure until we derive the partition {S_1, ..., S_K} of the network G into cliques. Call this partition the sequential decomposition of the network into maximum cliques.

Proposition 2
If α = 0 , the grand coalition partition N = {1, . . . , n} gives the maximum of the potential (7). Whereas if α → 1 , the network sequential decomposition into maximum cliques corresponds to a maximum of the potential (7).
Proof It is immediate to check that for α = 0 the grand coalition partition N gives the maximum of the potential (7), and P α (N ) = m(N ).
For values of α close to 1, the partition into maximum cliques Π = {S_1, ..., S_K} gives the maximum of (7). Indeed, assume that a player i from the clique S_Π(i) of size m_1 moves to a clique S_j of size m_2 < m_1. The player i ∈ S_Π(i) and S_j are connected by at most m_2 links. The impact on P_α(Π) of this movement is not higher than

$$m_2 (1 - \alpha) - (m_1 - 1)(1 - \alpha) = (m_2 - m_1 + 1)(1 - \alpha) \le 0.$$

Now, suppose that player i from the clique S_Π(i) moves to a clique S_j of size m_2 ≥ m_1. Notice that the clique S_j was constructed in the procedure of sequential decomposition before the clique S_Π(i). The player i ∈ S_Π(i) is connected with the clique S_j by at most m_2 − 1 links. Otherwise, the clique S_j could be enlarged by adding node i, which contradicts the fact that S_j was a maximum clique at that step of the decomposition. If i moves from S_Π(i) to the clique S_j, then for the new partition the sum (7) exceeds that for the partition Π by at most

$$(m_2 - 1) - m_2 \alpha - (m_1 - 1)(1 - \alpha) = m_2 - m_1 - \alpha (m_2 - m_1 + 1).$$

For α close to 1, this impact is negative, so there is no incentive to join the coalition S_j.
The grand coalition and the sequential maximum clique decomposition are two extreme partitions into communities. By varying the parameter α we can easily tune the resolution of the community detection algorithm.
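The sequential decomposition into maximum cliques can be sketched with a simple exhaustive search. This is an illustrative brute-force version that is only practical for small graphs (finding a maximum clique is NP-hard in general); the function names and the set-based adjacency format are assumptions made for the example.

```python
def maximum_clique(adj, nodes):
    """Largest clique among `nodes`; adj maps node -> set of neighbours."""
    best = []

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = clique[:]
        # Prune: even taking every candidate cannot beat the best clique so far.
        if len(clique) + len(candidates) <= len(best):
            return
        for idx, v in enumerate(candidates):
            # Keep only later candidates adjacent to v, so the set stays a clique.
            narrowed = [u for u in candidates[idx + 1:] if u in adj[v]]
            expand(clique + [v], narrowed)

    expand([], sorted(nodes))
    return best


def sequential_clique_decomposition(adj):
    """Repeatedly remove a maximum clique until no nodes remain."""
    remaining = set(adj)
    cliques = []
    while remaining:
        clique = maximum_clique(adj, remaining)
        cliques.append(clique)
        remaining -= set(clique)
    return cliques


# A 4-clique {0,1,2,3} and a triangle {4,5,6} joined by the link 3-4.
edges = [(a, b) for a in range(4) for b in range(a + 1, 4)]
edges += [(4, 5), (4, 6), (5, 6), (3, 4)]
adj = {i: set() for i in range(7)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
print(sequential_clique_decomposition(adj))
```

When several maximum cliques exist, the result depends on the tie-breaking order (here, sorted node labels), so the decomposition need not be unique.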
Example 2 Consider graph G = (N , E) , which consists of n = 26 nodes and m = 78 edges (see Fig. 3). This graph includes 4 fully connected subgraphs: G 1 with 8 vertices N 1 connected by 28 links, G 2 with 5 vertices N 2 connected by 10 links, G 3 with 6 vertices N 3 connected by 15 links, and G 4 with 7 vertices N 4 connected by 21 links. Subgraph G 1 is connected with G 2 by 1 edge, G 2 with G 3 by 2 edges, and G 3 with G 4 by 1 edge.
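First, we calculate the potentials (7) for large-scale decompositions of G for any parameter α ∈ [0, 1]. It is easy to check that P(N) = 78 − 325α, P({N_1, N_2 ∪ N_3 ∪ N_4}) = 77 − 181α, P({N_1, N_2 ∪ N_3, N_4}) = 76 − 104α, and P({N_1, N_2, N_3, N_4}) = 74 − 74α. Other coalition partitions give smaller potentials for α > 0: P({N_1 ∪ N_2, N_3 ∪ N_4}) = 76 − 156α, P({N_1, N_2, N_3 ∪ N_4}) = 75 − 116α, and P({N_1 ∪ N_2, N_3, N_4}) = 75 − 114α are all below 76 − 104α, while P({N_1 ∪ N_2 ∪ N_3, N_4}) = 77 − 192α < 77 − 181α. We solve a sequence of linear inequalities in order to find the maximum of the potential for all α ∈ [0, 1]. The result is presented in the table.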

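Table: Nash-stable coalition partitions in Example 2

α                   Coalition partition              Potential
[0, 1/144]          N                                78 − 325α
[1/144, 1/77]       {N_1, N_2 ∪ N_3 ∪ N_4}           77 − 181α
[1/77, 1/15]        {N_1, N_2 ∪ N_3, N_4}            76 − 104α
[1/15, 1]           {N_1, N_2, N_3, N_4}             74 − 74α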
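The arithmetic behind Example 2 can be checked mechanically. The sketch below assumes that the potential of a partition is the number of links inside the clusters minus α times the number of node pairs inside the clusters, and it only considers the eight "large-scale" partitions obtained by keeping or cutting each of the three groups of bridging links along the chain G_1, G_2, G_3, G_4. The constant and function names are hypothetical.

```python
from itertools import product

SIZES = [8, 5, 6, 7]       # nodes in the cliques N1..N4
INNER = [28, 10, 15, 21]   # links inside each clique
BRIDGES = [1, 2, 1]        # links N1-N2, N2-N3, N3-N4


def chain_partitions():
    """All partitions obtained by cutting or keeping each bridge of the chain."""
    for cuts in product([False, True], repeat=3):
        blocks, block = [], [0]
        for idx, cut in enumerate(cuts):
            if cut:
                blocks.append(block)
                block = [idx + 1]
            else:
                block.append(idx + 1)
        blocks.append(block)
        yield blocks


def potential_coefficients(blocks):
    """Return (a, b) such that the potential equals a - b * alpha."""
    links, pairs = 0, 0
    for block in blocks:
        n = sum(SIZES[g] for g in block)
        links += sum(INNER[g] for g in block)
        links += sum(BRIDGES[g] for g in block[:-1])  # bridges kept inside the block
        pairs += n * (n - 1) // 2
    return links, pairs


def best_partition(alpha):
    """Chain partition with the largest potential for the given alpha."""
    def score(blocks):
        a, b = potential_coefficients(blocks)
        return a - b * alpha
    return max(chain_partitions(), key=score)


for alpha in (0.001, 0.01, 0.05, 0.5):
    print(alpha, best_partition(alpha))
```

Blocks are lists of clique indices, so `[[0], [1, 2], [3]]` stands for {N_1, N_2 ∪ N_3, N_4}. Comparing the linear functions pairwise gives the values of α at which the maximizing partition changes.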
Example 1 (ctnd) Note that for the unweighted version of the network example presented in Fig. 1, there are only two stable partitions: Π = N for small values of α ≤ 1/9 and Π = {{A, B, C}, {D, E, F}} for α > 1/9.
Another natural approach to define a symmetric value function is, roughly speaking, to compare the network under investigation with the configuration random graph model. The configuration random graph model can be viewed as a null model for a network with no community structure. Namely, the following value function can be considered:

$$v_{ij} = \beta_{ij} \left( A_{ij} - \delta\, \frac{d_i d_j}{2m} \right), \tag{8}$$

where $A_{ij}$ is the number of links between nodes i and j (a multi-graph is allowed), $d_i$ and $d_j$ are the degrees of nodes i and j, respectively, $m = \frac{1}{2} \sum_{l \in N} d_l$ is the total number of links in the network, and $\beta_{ij} = \beta_{ji}$ and δ are some parameters.
Note that if $\beta_{ij} = \beta$, ∀i, j ∈ N, and δ = 1, the potential (5) coincides with the network modularity [15,16]. If $\beta_{ij} = \beta$, ∀i, j ∈ N, and δ ≠ 1, we obtain the generalized modularity presented first in [18]. The introduction of the non-homogeneous weights was proposed in [19] with the following particularly interesting choice:
The introduction of the resolution parameter δ allows one to obtain clustering with varying granularity as well as to overcome the resolution limit [30].
Thus, we now have an interpretation of the modularity method based on a coalition game. Namely, the coalition partition Π = {S_1, ..., S_K} which maximizes the modularity gives the Nash-stable partition of the network in the hedonic game with the value function defined by (8).
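The link to modularity can be checked numerically on a toy network. The sketch below assumes a value function of the form v_ij = β (A_ij − δ d_i d_j / (2m)) with v_ii = 0 and a potential equal to half the sum of v_ij over ordered pairs in the same cluster; with the assumed scaling β = 1/m and δ = 1, this potential differs from the usual modularity only by a term that does not depend on the partition, so both are maximized by the same partitions. All function names are hypothetical.

```python
def adjacency(n, edges):
    """Symmetric adjacency matrix of a multigraph without self-loops."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] += 1
        A[v][u] += 1
    return A


def modularity(A, labels):
    """Standard modularity, summing over all ordered pairs including i == j."""
    d = [sum(row) for row in A]
    two_m = sum(d)
    n = len(A)
    total = sum(A[i][j] - d[i] * d[j] / two_m
                for i in range(n) for j in range(n) if labels[i] == labels[j])
    return total / two_m


def hedonic_potential(A, labels, beta, delta):
    """Half of the sum of v_ij over ordered pairs i != j in the same cluster."""
    d = [sum(row) for row in A]
    two_m = sum(d)
    n = len(A)
    total = sum(beta * (A[i][j] - delta * d[i] * d[j] / two_m)
                for i in range(n) for j in range(n)
                if i != j and labels[i] == labels[j])
    return 0.5 * total


# Two triangles joined by one link.
A = adjacency(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])
m = sum(sum(row) for row in A) / 2
for labels in ([0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 0, 0], [0, 1, 0, 1, 0, 1]):
    q = modularity(A, labels)
    p = hedonic_potential(A, labels, beta=1.0 / m, delta=1.0)
    print(labels, round(q, 4), round(p, 4), round(q - p, 4))
```

The printed difference is the same for every labeling; it comes from the diagonal terms i = j, which are excluded from the potential because v_ii = 0.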

The case of non-additively separable preferences
Now let us consider a few cases of non-additively separable preferences which still have potentials. First, we consider a slight modification of the preference structure (6) which makes it non-additively separable. Namely, define the preference relation as follows:

$$S_1 \succeq_i S_2 \iff \sum_{j \in S_1} v_{ij} - \gamma\, \mathbf{1}\{S_1 = \{i\}\} \ge \sum_{j \in S_2} v_{ij} - \gamma\, \mathbf{1}\{S_2 = \{i\}\}, \tag{10}$$

where 1{·} is the indicator function, giving one if the argument is true, $v_{ij}$ is defined as before in (6), and γ is a parameter that represents the cost of coalition creation and allows us to control further the clustering resolution and granularity. As verified in the following proposition, in this case the game also has a potential.

Proposition 3
The hedonic clustering game defined by the preference relation (10) has the following potential:

$$P_{\alpha,\gamma}(\Pi) = \sum_{k=1}^{K} \left( m(S_k) - \frac{n(S_k)\,(n(S_k) - 1)\,\alpha}{2} \right) - \gamma K. \tag{11}$$

Proof Suppose that partition Π = {S_1, ..., S_K} maximizes the function (11), possibly locally. Then, if i moves from S_Π(i) to S_k, the value of (11) corresponding to the new partition Π′ differs from the value corresponding to Π by

$$\sum_{j \in S_k} v_{ij} - \sum_{j \in S_\Pi(i)} v_{ij};$$

note that K is not changing. If i creates its own cluster, then the value of (11) corresponding to Π′ differs from that for Π by

$$-\sum_{j \in S_\Pi(i)} v_{ij} - \gamma,$$

and in the case S_Π(i) = {i}, the difference is

$$\sum_{j \in S_k} v_{ij} + \gamma.$$

If Π provides a maximum of (11), all these differences are non-positive. So, according to relation (10), player i indeed has no incentive to move from her coalition S_Π(i) to another coalition, and the function (11) can be interpreted as a potential.
Let us provide a few recommendations for the choice of α and γ. Similarly to [18], from the analysis of the mean field model corresponding to a stochastic block model (SBM), one can show that a value of α close to the link density ensures the internal stability of clusters in the mean field model of the SBM. Thus, if a network has one main scale, such a value of α gives good results. If a network has a nested clustering structure, one can vary α to obtain a clustering with the needed level of granularity. Again using the mean field model for the SBM, one can show that a good value of γ corresponds to the product of α and the smallest cluster size we would like to obtain.
Interestingly, under a specific choice of parameters, the globally optimal partition may contain disconnected clusters. However, such a choice of parameters is typically not natural.
Example 3 Let us consider a graph that consists of a clique of four nodes and two cliques of three nodes connected to it (see Fig. 4).
One can check that for α = 0.5 and γ = 5, the partitioning gives the maximum value to the potential P_{α,γ}(Π) = −8.5, while An intuitive interpretation for this choice of parameters is that α is chosen sufficiently large to encourage splitting of clusters from the grand coalition and γ is also chosen sufficiently large to penalize the creation of independent clusters.
Next we would like to note that two well-known network partitioning methods, normalized cut [10,33] and ratio cut [32], can also be viewed as particular instances of potential hedonic clustering games with non-additively separable preferences. Towards this end, let us introduce a few more definitions. Let S, T ⊂ N be two possibly overlapping sets of nodes. Then, we define a cut W(S, T) as Note that an edge is counted twice if both of its ends lie in the same set. The volume of a set S ⊂ N is defined as the number of edges between its vertices Similar to the normalized cut, we assign h_RCUT(S) = +∞ if |S| = 0, i.e., S = ∅. Then, the normalized cut [10,33] and ratio cut [32] network partitioning methods are based on the following potentials respectively. Similar to the proof of Proposition 3, one can check that the above potentials correspond to the following preferences: for the normalized cut and the ratio cut respectively. Thus, two more well-known network partitioning methods can be cast into our general framework. A very important benefit of such an interpretation is that, in contrast to the original formulations, we now do not require a priori knowledge of the number of clusters.
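The value −8.5 in Example 3 can be reproduced with a short computation. The sketch assumes that the potential is the sum over clusters of (internal links minus α times internal node pairs), minus γ times the number of clusters. Since Fig. 4 is not reproduced here, the node labels and the two links attaching the triangles to the 4-clique are assumptions; the first two values printed below do not depend on that choice, because those links are cut in both partitions.

```python
def potential(edges, partition, alpha, gamma):
    """Sum of (links - alpha * pairs) over clusters, minus gamma per cluster."""
    total = 0.0
    for cluster in partition:
        n = len(cluster)
        m = sum(1 for u, v in edges if u in cluster and v in cluster)
        total += m - alpha * n * (n - 1) / 2
    return total - gamma * len(partition)


clique = [(a, b) for a in range(4) for b in range(a + 1, 4)]   # nodes 0..3
triangle_1 = [(4, 5), (4, 6), (5, 6)]
triangle_2 = [(7, 8), (7, 9), (8, 9)]
attach = [(0, 4), (3, 7)]          # assumed attachment links, one per triangle
edges = clique + triangle_1 + triangle_2 + attach

merged_triangles = [{0, 1, 2, 3}, {4, 5, 6, 7, 8, 9}]
three_cliques = [{0, 1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
grand = [set(range(10))]

for name, part in [("clique + merged triangles", merged_triangles),
                   ("three cliques", three_cliques),
                   ("grand coalition", grand)]:
    print(name, potential(edges, part, alpha=0.5, gamma=5))
```

Under these assumptions, the partition that puts the two triangles into one (disconnected) cluster scores −8.5, slightly above the partition into the three cliques, which illustrates how a large γ can make a disconnected cluster optimal.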

Gibbs sampling approach for hedonic games with potential
We note that finding a Nash equilibrium in a game with potential is equivalent to finding a maximum of the game's potential. To find a maximum of the game's potential, we can follow the approach based on Gibbs sampling. Let us consider the following Gibbsian distribution over all partitions:

$$p(\Pi) = \frac{\exp(\beta P(\Pi))}{\sum_{\Pi'} \exp(\beta P(\Pi'))}. \tag{16}$$

It is easy to see that as β → ∞, the distribution concentrates on the partition corresponding to the maximum of the potential P(Π). Next, denote by Σ the set of indices of the network clusters and by $\Pi_{i \to \sigma}$ the (re)assignment of node i to cluster σ ∈ Σ, and run the Glauber dynamics [18,44] according to

$$p(\Pi_{i \to \sigma}) = \frac{\exp(\beta P(\Pi_{i \to \sigma}))}{\sum_{\sigma' \in \Sigma} \exp(\beta P(\Pi_{i \to \sigma'}))}, \tag{17}$$
that is, we choose a node at random and reassign this node to a new cluster according to the conditional Gibbsian distribution (clearly, it can happen that the node remains in its current cluster). It is well known that the Glauber dynamics corresponds to a reversible Markov chain with the stationary distribution given by (16), see e.g., [44]. One can also cool down the temperature as in simulated annealing [45] in order to find the partition with the global maximum of the potential. We define one iteration as n updates of nodes according to (17). Typically, and as will be demonstrated in the next section, if we take a reasonably high inverse temperature β, the process (17) often finds a good-quality partition already after 5-10 iterations. The complexity of one iteration is low in the case of sparse graphs, namely O(|E|). A sample generated by the above-described Glauber dynamics appears to have significant variance. To reduce it, the generalized empirical covariance matrix of several samples can be used, similar to [46], where the standard covariance matrix has been used for the case of two clusters. For one sample, the elements of the generalized covariance matrix are defined as follows: The empirical generalized covariance matrix of a set of samples is the average of their generalized covariance matrices, i.e., An (i, j)th value of the generalized covariance matrix indicates how often the ith and jth nodes appear in the same cluster.
Then, given a generalized covariance matrix one can extract the community structure using threshold-based or PCA-based methods.
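The sampling scheme can be sketched for one concrete potential. The code below is a minimal illustration, not the authors' implementation: it assumes the potential that counts internal links minus α times internal node pairs, for which moving node i into cluster σ changes the potential by (links from i to σ) − α·|σ|, so each update only needs the neighbours of i and the cluster sizes. The function names, the single-cluster initialization, and the plain co-occurrence frequency used in place of the generalized covariance matrix are assumptions.

```python
import math
import random


def glauber_sweeps(adj, n_labels, alpha, beta, sweeps, seed=0):
    """Glauber dynamics; adj maps node -> list of neighbours (repeats allowed)."""
    rng = random.Random(seed)
    nodes = list(adj)
    labels = {i: 0 for i in nodes}          # start with all nodes in cluster 0
    sizes = [0] * n_labels
    sizes[0] = len(nodes)
    for _ in range(sweeps * len(nodes)):    # one sweep = n single-node updates
        i = rng.choice(nodes)
        sizes[labels[i]] -= 1               # take node i out of its cluster
        links = [0] * n_labels
        for j in adj[i]:
            links[labels[j]] += 1
        # Potential gain of placing i in cluster s, up to a common constant.
        gains = [links[s] - alpha * sizes[s] for s in range(n_labels)]
        top = max(gains)                    # subtract the max for stability
        weights = [math.exp(beta * (g - top)) for g in gains]
        s = rng.choices(range(n_labels), weights=weights)[0]
        labels[i] = s
        sizes[s] += 1
    return labels


def potential_alpha(adj, labels, alpha):
    """Internal links minus alpha times internal node pairs."""
    inner = sum(1 for i in adj for j in adj[i] if labels[i] == labels[j]) / 2
    sizes = {}
    for i in adj:
        sizes[labels[i]] = sizes.get(labels[i], 0) + 1
    return inner - alpha * sum(n * (n - 1) / 2 for n in sizes.values())


def co_clustering_frequency(samples, nodes):
    """Fraction of samples in which two nodes share a cluster (a stand-in)."""
    n = len(nodes)
    freq = [[0.0] * n for _ in range(n)]
    for labels in samples:
        for a, i in enumerate(nodes):
            for b, j in enumerate(nodes):
                if labels[i] == labels[j]:
                    freq[a][b] += 1.0 / len(samples)
    return freq


# Two 4-cliques joined by the link 3-4.
pairs = [(a, b) for a in range(4) for b in range(a + 1, 4)]
pairs += [(a, b) for a in range(4, 8) for b in range(a + 1, 8)] + [(3, 4)]
adj = {i: [] for i in range(8)}
for u, v in pairs:
    adj[u].append(v)
    adj[v].append(u)

samples = [glauber_sweeps(adj, 2, alpha=0.5, beta=10.0, sweeps=20, seed=s)
           for s in range(5)]
print([potential_alpha(adj, lab, 0.5) for lab in samples])
```

Averaging the co-occurrence over several independent runs is one simple way to smooth out the variance of individual samples before thresholding; a decreasing temperature schedule could be added by letting `beta` grow with the sweep index.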

Numerical validation
In this section, we validate the proposed approaches on synthetic and real-world networks. As a benchmark, we take a widely used clustering method sklearn.cluster.spectralclustering from [47]. The method is based on the eigen elements of the normalized Laplacian and K-means postprocessing and have demonstrated good performance in many previous studies.
When it is available, the ground truth clustering is denoted by $\Pi_{\mathrm{true}}$ and the clustering obtained by an algorithm by $\Pi_{\mathrm{test}}$. Each time we specify which algorithm we test against the ground truth. We measure the difference between these two partitions by the function $E(\Pi_{\mathrm{true}}, \Pi_{\mathrm{test}})$ from [48].
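As a hedged illustration of comparing a test partition with the ground truth, the sketch below computes the fraction of misclassified nodes, minimized over relabelings of the clusters. This is one common choice and is only an assumption here: it is not necessarily the function of [48]. The name `misclassification_error` is hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def misclassification_error(true_labels, test_labels):
    """Fraction of nodes that disagree under the best matching of cluster labels."""
    true_labels = np.asarray(true_labels)
    test_labels = np.asarray(test_labels)
    t_ids, t_idx = np.unique(true_labels, return_inverse=True)
    s_ids, s_idx = np.unique(test_labels, return_inverse=True)
    # overlap[a, b] = number of nodes in true cluster a and test cluster b
    overlap = np.zeros((len(t_ids), len(s_ids)), dtype=int)
    np.add.at(overlap, (t_idx, s_idx), 1)
    # match true clusters to test clusters so that the total overlap is maximal
    rows, cols = linear_sum_assignment(overlap, maximize=True)
    return 1.0 - overlap[rows, cols].sum() / len(true_labels)

swapped = misclassification_error([0, 0, 1, 1], [1, 1, 0, 0])
one_wrong = misclassification_error([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1])
```

Matching the labels first makes the measure insensitive to how an algorithm happens to name its clusters, which matters for dynamics started from a random coloring.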

Synthetic network: stochastic block model
We first evaluate various clustering algorithms based on potential hedonic games on the stochastic block model (SBM), a synthetic network with known community structure. An SBM with $|\Sigma|$ clusters, $\Sigma$ being the set of cluster indices, is represented by a symmetric square matrix $P$, where $p_{\sigma,\sigma}$ is the density of edges inside cluster σ and $p_{\sigma,\sigma'} = p_{\sigma',\sigma}$ is the density of edges between clusters σ and σ′. Specifically, we use an SBM with two communities of 50 and 150 nodes, intra-cluster density $p_{11} = p_{22} = 0.1$, and inter-cluster density $p_{12} = p_{21} = 0.02$. We start from a random coloring and run the process for 100 iterations. In Fig. 5, we show an example of the Glauber dynamics using the NCUT potential (12). For small β = 10 we observe unstable behavior, while for large β = 500 the process evolves around a local maximum that provides a relatively bad clustering (49 out of 50 nodes of the first community and only 116 out of 150 nodes of the second community are clustered correctly).

Now let us take a closer look at RCUT (13). We discovered that in our example the ground truth partition does not maximize the potential. The process converges fast to a clustering that differs from the ground truth and has a larger $P_{\mathrm{RCUT}}$ than the ground truth partition. We tested the algorithm on a set of 100 graph instances generated according to the SBM and show the results in Fig. 6, where we also compare it to the spectral clustering from [47]. One can see that the Glauber dynamics generally ends up with $P_{\mathrm{RCUT}}(\Pi_{\mathrm{test}}) > P_{\mathrm{RCUT}}(\Pi_{\mathrm{true}})$, whereas the spectral clustering procedure provides a solution that has a smaller $P_{\mathrm{RCUT}}$ but is closer to the ground truth.
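A minimal sketch of how an SBM instance with the sizes and densities quoted above could be sampled. It assumes that each pair of nodes is connected independently with the probability given by the density matrix. The function name `sample_sbm` and the seed are hypothetical.

```python
import numpy as np

def sample_sbm(sizes, p_matrix, rng):
    """Return a symmetric 0/1 adjacency matrix and the ground truth labels."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    # probs[i, j] = edge probability between the clusters of nodes i and j
    probs = np.asarray(p_matrix)[labels][:, labels]
    # independent draws for pairs i < j only, then symmetrize (no self-loops)
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    adj = (upper | upper.T).astype(int)
    return adj, labels

rng = np.random.default_rng(1)
adj, labels = sample_sbm([50, 150], [[0.1, 0.02], [0.02, 0.1]], rng)

big = labels == 1
intra_density = adj[np.ix_(big, big)].sum() / (150 * 149)   # ordered pairs
inter_density = adj[np.ix_(~big, big)].sum() / (50 * 150)
```

The dense n × n probability matrix is fine for 200 nodes. A graph with hundreds of thousands of nodes would need a sparse sampler instead.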
Next, let us evaluate the performance of the clustering based on the potential $P_\alpha$, see (7). Unlike $P_{\mathrm{NCUT}}$ and $P_{\mathrm{RCUT}}$, the potential $P_\alpha$ has no singularities caused by empty clusters. Hence, the final partition can have fewer clusters than $|\Sigma|$. Let us first restrict the number of clusters by setting the set of cluster indices to $\Sigma = \{0, 1\}$. In this context, we have two natural choices for the initial coloring of a graph: either, as before, we choose colors uniformly at random, or we assign the same color to all nodes. We tested both settings on a set of 100 SBM graph instances. The best results are obtained with α = 0.05 and β = 10. If we assign clusters at random at initialization, the process may not converge to a good coloring. Assigning the same initial color to all the nodes leads to better results. Fig. 7a and b show the evolution on the same graph with different initial colorings. The average E after 20 iterations is 0.033 for random-cluster initialization and 0.006 for single-cluster initialization, while the standard spectral clustering, i.e., the continuous relaxation of the NCUT [33], provides a result with average $E(\Pi_{\mathrm{true}}, \Pi_{\mathrm{test}}) = 0.025$. We can conclude that, with single-cluster initialization, the $P_\alpha$-based clustering significantly outperforms the spectral clustering in terms of accuracy. However, it depends on the parameter α that determines the penalty for large clusters. If α is too small, the uniform coloring becomes the ground state, as already indicated in Proposition 2. If α is too large, the obtained clusters will be of roughly the same size but may not represent the real community structure.
We can also try to detect the real number of clusters if we choose a large set of cluster indices $\Sigma$. Here we test the case when initially all nodes receive different colors, i.e., $|\Sigma| = |V|$. We discovered that the final clustering consists of 9 or 10 clusters on average, most of which contain very few nodes. See Fig. 7c for an example of such a clustering process.
To prevent the problem described in the previous paragraph, we modify $P_\alpha$ to $P_{\alpha,\gamma}$, see Eq. (11), by introducing a penalty term proportional to the number of non-empty clusters. The potential $P_{\alpha,\gamma}$ depends on parameters α and γ that determine penalties for disparate clusters and for the total number of them, respectively. We tested the respective Glauber dynamics on the same set of random instances of SBM with parameters α = 0.05, γ = 5, β = 10, and $|\Sigma| = |V| = 200$. We ran the process for 20 iterations and averaged the coloring over the last 10 of them. We obtained the following results: 2 clusters were determined in every graph instance and the average $E(\Pi_{\mathrm{true}}, \Pi_{\mathrm{test}})$ is 0.0057. The average $E(\Pi_{\mathrm{true}}, \Pi_{\mathrm{test}})$ for the spectral clustering is 0.0252.
To further validate the method based on the $P_{\alpha,\gamma}$ potential, we tried it on sets of graphs with different clustering structures, using the same algorithm parameters α, γ, and β. On a set of 100 homogeneous Erdős-Rényi random graphs of 200 nodes with edge density 0.1, our algorithm ended up with a uniform coloring on 99 of them, and on one graph it finished with 2 clusters, where the smaller one contains only two nodes. Given a set of 100 graph instances of SBM with clusters of 50, 150, and 200 nodes, the algorithm correctly determined the number of clusters in each graph and provided on average $E(\Pi_{\mathrm{true}}, \Pi_{\mathrm{test}}) = 0.006$, while spectral clustering provided on [...] and 3 clusters for the others. The average $E(\Pi_{\mathrm{true}}, \Pi_{\mathrm{test}})$ becomes 0.0185. The average $E(\Pi_{\mathrm{true}}, \Pi_{\mathrm{test}})$ of the spectral clustering on the same set of graphs is 0.0375.

[Figure caption fragment: the x-axis corresponds to the node index and the y-axis to the iteration number; different colors correspond to different clusters.]
The above results can be further improved by using the generalized covariance matrix. The application of the generalized covariance matrix will be illustrated in some of the following network examples.

Real-world network with ground truth: Karate club
Consider the popular example of the social network of the Zachary karate club (see Fig. 8). In his study [49], Zachary observed 34 members of a karate club over a period of 2 years. Due to a disagreement that developed between the administrator of the club and the club's instructor, two new clubs appeared, associated with the instructor (node 1) and the administrator (node 34), of sizes 16 and 18, respectively.
The authors of [15] divide the network into two groups of roughly equal size using modularity and a hierarchical clustering tree. They show that this split corresponds almost perfectly to the actual division of the club members following the break-up. Only one node, node 3, is classified incorrectly by the method of [15].
Let us first apply the Myerson value approach to the karate club network. To perform the analytic study, let us start from the ground truth partition [49]:

$$R_3 = \{1, 2, 3, 4, 5, 6, 7, 8, 11, 12, 13, 14, 17, 18, 20, 22\},$$
$$L_3 = \{9, 10, 15, 16, 19, 21, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34\}.$$

By using subindex 3, we emphasize the importance of player 3. By enumerating all simple paths and using formula (4), we find the Myerson value for player 3 in coalition $R_3$ and in coalition $L_3 \cup \{3\}$. We have plotted both $Y_3(g|R_3)$ and $Y_3(g|L_3 \cup \{3\})$ as functions of r in Fig. 9. If r is smaller than 0.231, node 3 has no incentive to move from coalition $R_3$ to coalition $L_3$. Recall that the modularity-based method of [15] would displace player 3 into the wrong coalition $L_3$.
It is also interesting to investigate the imputations of the other two border nodes, 9 and 10. If we plot the imputations for node 10, $Y_{10}(g|L_3)$ and $Y_{10}(g|R_3 \cup \{10\})$ (see Fig. 10), we observe that, as for node 3, for smaller values of r (i.e., for r < 0.363) node 10 has no incentive to leave coalition $L_3$, whereas for values of r greater than 0.363 node 10 has an incentive to change coalition.
As is clear from Fig. 11, node 9 has no incentive to leave coalition $L_3$ for any value of r. Thus, we can conclude that the ground truth partition [49] is internally stable according to the Myerson value approach if r < 0.231. This has a nice intuitive interpretation: humans cannot easily count long paths, and consequently one needs to apply heavy discounting to mimic humans' decisions.
Let us now apply the hedonic game approach with Glauber dynamics to the karate club network. We started from a random partition into two clusters and ran the algorithm using the potential (7) with α = 0.046, which corresponds to 1/3 of the edge density, and β = 20. The algorithm stabilizes after around 5 iterations, and the mean error after 10 iterations in 100 runs was 20.0%, which corresponds to roughly 7 misclassified nodes. However, the partitioning results differ significantly from run to run.
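The value of α and the number of misclassified nodes quoted above can be checked with a few lines of arithmetic. The edge count of 78 is the figure for the standard Zachary dataset. It is not stated in the text and is an assumption of this sketch.

```python
n_nodes = 34      # club members observed by Zachary
n_edges = 78      # assumed: edge count of the standard karate club dataset

# edge density = edges / number of node pairs
density = n_edges / (n_nodes * (n_nodes - 1) / 2)
alpha = density / 3               # one third of the edge density

# a mean error of 20.0% expressed as a number of nodes
nodes_wrong = 0.20 * n_nodes

print(f"density = {density:.3f}, alpha = {alpha:.3f}, nodes = {nodes_wrong:.1f}")
```

Under this assumption the density is about 0.139, so one third of it rounds to the α = 0.046 used in the experiment, and 20% of 34 nodes is 6.8, i.e., roughly 7 nodes.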
By applying the spectral clustering algorithm from [47] to the Zachary karate club network, we obtain an average error of 25.8%.
To reduce the variance of the Glauber dynamics and hence the clustering error, we computed the empirical generalized covariance matrix M for the results of 10 independent runs of 10 iterations of the Glauber dynamics and then extracted the community structure using the PCA algorithm. Only node 9, a border node, was misclassified. Applying the generalized covariance matrix on top of the Gibbs sampling helps to obtain high-quality results consistently.
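The averaging step can be sketched as follows. The sketch assumes that the per-run matrix has entry 1 when two nodes share a cluster and 0 otherwise, which matches the "how often" interpretation but may differ from the exact definition used for M. The PCA step is likewise only one plausible reading: nodes are split by the sign of their projection on the leading principal direction. The function names and the toy samples are hypothetical.

```python
import numpy as np

def empirical_cooccurrence(samples):
    """Average over runs of the indicator 'nodes i and j share a cluster'."""
    samples = np.asarray(samples)                       # shape (runs, n)
    same = samples[:, :, None] == samples[:, None, :]   # shape (runs, n, n)
    return same.mean(axis=0)

def two_way_split(matrix):
    """Split nodes by the sign of their score on the first principal direction."""
    centered = matrix - matrix.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered)   # rows of vt are principal directions
    scores = centered @ vt[0]
    return (scores > 0).astype(int)

samples = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],   # same partition with the labels swapped
    [0, 0, 1, 1, 1, 1],   # node 2 misplaced in this run
]
M = empirical_cooccurrence(samples)
split = two_way_split(M)
```

Because the matrix records only whether two nodes share a cluster, runs that name the clusters differently can be averaged directly, and a node misplaced in a minority of runs is pulled back to its usual group.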
We would like to note that applying the generalized covariance matrix to the spectral clustering method from [47] does not significantly improve its results, since spectral clustering gives less noisy but more biased results than the hedonic game approach.

Real-world network with ground truth: Dolphins
Consider now the Dolphins social network from [50]. This network, presented in Fig. 12, was constructed from observations of a community of 62 bottlenose dolphins over a period of 7 years from 1994 to 2001. The nodes in the network represent the dolphins, and the ties between nodes represent associations between dolphin pairs occurring more often than expected by chance.

[...]

As was the case for the Dolphins network, we could not apply the Myerson value approach to the football network because of the difficulty in enumerating all simple paths. In contrast, the hedonic game approach can easily be applied.
One of the main advantages of the hedonic game approach with P α,γ potential (11) is the fact that it does not require the number of clusters as a parameter. Let us test on the football network, which consists of 12 clusters, how the hedonic game approach with P α,γ potential can perform without a priori knowledge of the number of clusters.
We set the following parameters of the potential: α = 0.093 (edge density) and γ = 10. We ran the Glauber dynamics with a random initial partition into 20 clusters and the inverse noise β = 10 for 20 iterations. The clustering dynamics stabilizes after around 10 iterations. We performed 50 independent runs. The average number of detected clusters was 10.22 and the average percentage of misclassified nodes was 13.5%.
As with the Zachary and Dolphins networks, we computed the empirical generalized covariance matrix of the clustering results. Since our goal was to determine the clusters without providing their number to the algorithm, we used a simple threshold-based clustering algorithm instead of the PCA algorithm: we built a weighted graph using the generalized covariance matrix as its adjacency matrix and removed the edges with [...] The resulting graph contained 13 connected components, i.e., we identified 13 clusters in the initial network, which is quite close to the ground truth value of 12. The percentage of misclassified nodes is 6.9%, which shows that the generalized covariance matrix significantly improved the quality of the clustering results.
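A threshold-based extraction of this kind can be sketched with connected components. The rule used here (keep an edge when its weight is at least the threshold) and the threshold value 0.5 are assumptions made for illustration. The function name `threshold_clusters` and the toy matrix are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def threshold_clusters(matrix, threshold):
    """Clusters = connected components of the graph of edges with weight >= threshold."""
    keep = np.asarray(matrix) >= threshold
    np.fill_diagonal(keep, False)        # ignore self-loops
    n_clusters, labels = connected_components(
        csr_matrix(keep.astype(int)), directed=False
    )
    return n_clusters, labels

# Toy matrix: nodes 0-2 co-occur often, as do nodes 3-4.
M = np.array([
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.7, 0.2, 0.1],
    [0.8, 0.7, 1.0, 0.3, 0.1],
    [0.1, 0.2, 0.3, 1.0, 0.9],
    [0.0, 0.1, 0.1, 0.9, 1.0],
])
n_clusters, labels = threshold_clusters(M, threshold=0.5)
```

The number of clusters is an output of this procedure rather than an input, which is the property the experiment relies on.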

Large real-world network without ground truth: Co-authorships in Math-Net.ru
To test the scalability and efficiency of the hedonic game approach, we have chosen to cluster a fairly large social network. We have crawled the site Math-Net.ru, the Russian Mathematical Portal, for the co-authorship graph [51]. We further extracted the giant connected component of this co-authorship network, which includes 41,840 authors. We have applied the hedonic game with the potential (7) and run the modified Glauber dynamics using a round-robin node schedule with random permutation. Twenty iterations of the modified Glauber dynamics run for about 2 min on an Intel Core Duo 1.6 GHz processor with 5 GB of RAM. We have initialized the process with a single cluster and restricted the number of clusters to ten. We have again observed stabilization of the modified Glauber dynamics at 15-20 iterations. Recall that one iteration of the modified Glauber dynamics requires O(|E|) operations, which is quite a reasonable cost for sparse networks, as most real-world networks are. We have chosen a high value of β, which corresponds to a nearly greedy algorithm. First, we have run the modified Glauber dynamics with α corresponding to the average edge density. This has led to unbalanced clusters, see Table 1. By inspecting the clusters, we have observed that nearly all academicians (aka leaders of scientific schools) have been assigned to the single largest cluster. However, when we increased α tenfold, the clustering became much more balanced and the academicians were distributed more evenly among the clusters.

Large synthetic SBM network with ground truth
To continue testing the scalability and efficiency of the hedonic game approach, and in particular to confirm the rapid convergence of the Glauber dynamics to a good solution, we consider a large stochastic block model graph with known communities. Specifically, we have generated an SBM with two clusters of 50,000 and 150,000 nodes. We have generated the intra-cluster links with probability 0.0002 and the inter-cluster links with probability 0.00005. We ran the Glauber dynamics associated with the potential (7), setting α = 0.0001 and β = 10. In a typical run, after 7 iterations, only 97 nodes from the smaller cluster were misclassified to the larger cluster and only 65 nodes from the larger cluster were misclassified to the smaller cluster. It is not surprising that, by the "gravity" effect, the larger cluster attracted more nodes. Seven iterations of the Glauber dynamics are a small cost for partitioning a 200,000-node network.

Conclusion and future research
We have presented two cooperative game theory-based approaches for network partitioning. The first approach is based on the Myerson value for graph-constrained cooperative games, whereas the second approach is based on hedonic games, which explain coalition formation. We find the second approach especially interesting, as it gives a very natural way to tune the clustering resolution and generalizes the modularity, ratio cut, and normalized cut-based approaches. Within the hedonic games framework, we have proposed two new methods that regularize the clustering resolution particularly well and help to adjust the level of granularity. We have shown that the normalized cut and ratio cut methods can be modified to avoid requiring the number of clusters. All approaches that can be represented as hedonic games with potentials can be implemented very efficiently using Gibbs sampling with Glauber dynamics and the generalized covariance matrix. The application of the generalized covariance matrix significantly improves the quality and stability of the clustering results. Our research plans are to test and compare our methods on more social networks and to study analytically the convergence rate of Gibbs sampling.