Spheres of legislation: polarization and most influential nodes in behavioral context

Game-theoretic models of influence in networks often assume the network structure to be static. In this paper, we allow the network structure to vary according to the underlying behavioral context. This leads to several interesting questions on two fronts. First, how do we identify different contexts and learn the corresponding network structures using real-world data? We focus on the U.S. Senate and apply unsupervised machine learning techniques, such as fuzzy clustering algorithms and generative models, to identify spheres of legislation as context and learn an influence network for each sphere. Second, how do we analyze these networks to gain insight into the role played by the spheres of legislation in constructs like polarization and most influential nodes? To this end, we apply both game-theoretic and social network analysis techniques. In particular, we show that the game-theoretic notion of most influential nodes brings out strategic aspects of interactions, such as bipartisan grouping, that structural centrality measures fail to capture.


Spheres of legislation
We use an unsupervised machine learning technique, namely fuzzy clustering, to assign bills to different spheres of legislation based on the bill subjects. We learn the linear influence game (LIG) models, analyze influence networks, compute equilibria, and find most influential senators for each sphere separately. By doing so, we are able to examine differences and make comparative judgments across the spheres. We first describe how we prepare the data for clustering.

Preparing congressional roll-call data
Our model relies on data obtained from the @unitedstates project's Congress repository (https://github.com/unitedstates/congress), a public domain program that allows easy access to official congressional data from the Congressional Research Service (CRS). In particular, we use bill data and roll-call data. Roll-call data contain senators' "yea," "nay," or abstaining votes, while bill data include a list of subjects incident to the bill, among other attributes. These 820 subjects range from "Abortion" to "Zimbabwe," and a multitude of subjects describes each bill. Additionally, each bill is assigned a single "top term," the broad subject which best describes the bill out of 23 possible top-level subjects. We use the roll-call data to represent senator voting behavior, and bill data to extract bill topics.
Working with the combined data from multiple terms presents a troubling problem for graph-based analysis: senators come and go. Seats in the United States Senate often change during elections, when constituents have the chance to re-elect or replace incumbent senators. In the middle of a term, if a senator leaves their seat, a successor is appointed until the state can hold a special election to find a democratically elected replacement. In the 2016 election at the onset of the 115th congress, seven senate seats were changed; during the course of the 115th congress, due to cabinet appointments by President Trump, scandals, and a death, the senate saw seven more changes.
When a senator is not present for a vote, they neither influence nor can be influenced by other senators' votes during that roll call. Some senators in our dataset never once overlap with another; one left the senate before the other even joined. To reduce the number of these cases, we combined non-permanent senators under the following circumstances, given a departing senator A and an incoming senator B:
1. Senator A does not run during an election, and senator B of the same party is elected to replace them.
2. Senator A voluntarily or involuntarily steps down, and senator B of the same party is appointed as their replacement.
In these circumstances, we assume that the incoming senators behave similarly to the departing senators. In other circumstances, such as when a senator loses their seat to a member of the opposing party, we keep both senators in the dataset. Changes in senate membership, and the operations undertaken to reduce the number of total senators, are described in Table 1.
Additionally, learning the LIG model requires data to be in the form of two discrete values: 1 (yea) or −1 (nay). When a senator is not present for a vote, either because they were absent on that day or because they were not yet holding office, we fill in the missing data with the mean vote of their party. 1
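The imputation step can be sketched as follows. This is a minimal illustration, not the code used in this work. Because the LIG model needs votes in {1, −1}, the sketch assumes that the party mean is mapped back to a discrete vote by its sign, with ties broken toward yea; the function name `impute_party_mean` and the toy data are hypothetical.

```python
import numpy as np

def impute_party_mean(votes, parties):
    """Fill missing votes (NaN) using the senator's party on the same roll call.

    votes:   array of shape (senators, roll_calls) with entries 1, -1, or NaN.
    parties: one party label per senator.
    Assumption: the party mean is discretized by its sign (ties go to yea).
    """
    votes = np.asarray(votes, dtype=float)
    parties = np.asarray(parties)
    filled = votes.copy()
    for party in np.unique(parties):
        rows = parties == party
        block = votes[rows]
        present = ~np.isnan(block)
        counts = present.sum(axis=0)
        sums = np.where(present, block, 0.0).sum(axis=0)
        # Mean over present party members; 0 when nobody from the party voted.
        party_mean = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
        party_vote = np.where(party_mean >= 0, 1.0, -1.0)
        filled[rows] = np.where(present, block, party_vote)
    return filled.astype(int)

nan = np.nan
votes = [[1, -1, nan],
         [1, nan, -1],
         [nan, -1, -1],
         [-1, 1, nan],
         [-1, nan, 1]]
parties = ["D", "D", "D", "R", "R"]
filled = impute_party_mean(votes, parties)
print(filled)
```

Observed votes are never overwritten; only the missing entries take the party value.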

Clustering algorithm
We seek to split the bills into a small number of broad categories, each of which encompasses many bills. Each bill has been tagged with a "top term" by @unitedstates. The top term corresponds to congress.gov's tag of "policy area." According to congress.gov, "one Policy Area term, which best describes an entire measure, is assigned to every public bill or resolution." The policy area vocabulary consists of 32 terms. 2 However, these top terms/policy areas are too specific to be used as clusters on their own. In fact, making each top term its own cluster would result in some clusters containing only one bill and others containing a hundred. This would be problematic because the "outcome space" of LIGs is exponential in size, and as a result, learning LIGs requires a relatively large amount of data.
Rather than manually re-categorizing bills, we took a statistical clustering approach to grouping, based on a bill's assigned "top term" in addition to all subjects it contains. For each data point, we assigned each possible subject a weight: 0 if missing, 1 if present, or 10 if it is the "top term." By including both measures of subjects (top and regular), we produce more meaningful categories than using top terms or bill subject lists alone.
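The weighting scheme above can be illustrated with a short sketch. The four-entry vocabulary and the helper name `bill_feature_vector` are hypothetical, and the sketch assumes that top terms and regular subjects share a single vocabulary index.

```python
import numpy as np

def bill_feature_vector(subjects, top_term, vocabulary, top_weight=10.0):
    """Encode one bill: 0 if a subject is missing, 1 if present, 10 if it is the top term."""
    index = {subject: k for k, subject in enumerate(vocabulary)}
    vec = np.zeros(len(vocabulary))
    for subject in subjects:
        vec[index[subject]] = 1.0
    vec[index[top_term]] = top_weight  # the top term overrides the regular weight
    return vec

# Hypothetical toy vocabulary; the real data has 820 subjects.
vocabulary = ["Abortion", "Armed forces", "Taxation", "Zimbabwe"]
vec = bill_feature_vector(["Armed forces", "Taxation"], "Armed forces", vocabulary)
print(vec)
```

Stacking one such vector per bill gives the data matrix passed to the clustering algorithm.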
In data science, K-Means (KM) is often used as a simple yet effective clustering algorithm [38]. In KM, n data points are partitioned into k clusters based on their Euclidean distance from cluster centers. In each iteration, every data point is assigned a cluster based on the closest centroid; then, the centroid of each cluster is reset to the average position of each data point within that cluster. The process repeats until centroid positions converge. The problem of choosing k is left up to the researcher; generally, k is chosen by trial-and-error. Cluster membership in KM is crisp, meaning that each data point belongs to one and only one cluster. While effective at producing distinct clusters, KM is not ideal for our purposes because bills often belong to multiple clusters. For example, a bill about increasing defense spending is about national security as well as economics.
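For reference, the KM iteration described above can be sketched in a few lines of NumPy. This is an illustrative implementation with random initialization from the data points (an assumption; other initializations are common), not a library routine.

```python
import numpy as np

def k_means(X, k, n_iter=100, seed=0):
    """Plain K-Means: assign each point to its closest centroid, then recompute centroids."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final crisp assignment with respect to the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids

X = [[0, 0], [0, 1], [10, 10], [10, 11]]
labels, centroids = k_means(X, k=2)
print(labels)
```

Each point receives exactly one label, which is the crisp membership that makes KM a poor fit for bills spanning several topics.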
The Fuzzy C-Means (FCM) clustering algorithm addresses this problem. FCM is an extension of KM which allows for overlaps in clusters [3,47]. The objective function in FCM is largely the same as in KM, with the addition of membership values $w_{ij}$ and a fuzzifier $m$. Membership values describe how closely each data point $i$ belongs to cluster $j$. The fuzzifier changes membership values: as $m$ approaches 1, the clusters become crisp ($w_{ij} \in \{0, 1\}$), and higher values of $m$ result in fuzzier clusters. The FCM algorithm produces a list of cluster centers, describing the position of each centroid, as well as the fuzzy partition matrix, describing the membership degree of each bill to every cluster.
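A compact sketch of the standard FCM updates is given below for illustration. It follows the textbook alternating updates (centers as membership-weighted means, memberships from inverse distance ratios) and is not necessarily the implementation used here; the function name and the toy data are hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, c, m=1.3, n_iter=200, tol=1e-6, seed=0):
    """Textbook FCM. Returns (centers, U), where U[i, j] is the membership of point i in cluster j."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # memberships of each point sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)               # avoid division by zero at a center
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

X = [[0, 0], [0, 1], [10, 10], [10, 11]]
centers, U = fuzzy_c_means(X, c=2, m=1.3)
print(np.round(U, 3))
```

The exponent 2/(m − 1) shows why m = 1 itself is excluded: memberships only approach the crisp values 0 and 1 as m decreases toward 1.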
Iterating over a range of values, we found that c = 4 clusters and a fuzzifier of m = 1.3 resulted in clusters which were relatively distinct, had intuitive descriptions, and contained an adequate number of bills for machine learning. Additionally, we experimented with the threshold values for cluster membership and settled on 0.15. That is, a bill is considered a member of a cluster if its membership value is above 0.15. Table 2 describes the results of our chosen FCM parameters. Each cluster is assigned a shorthand name describing its contents and is called a sphere of legislation in this paper. We next describe the model.
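The membership threshold can be illustrated as follows. The partition matrix below is made up for the example; only the 0.15 cutoff comes from the text.

```python
import numpy as np

def assign_spheres(U, threshold=0.15):
    """A bill belongs to every cluster in which its membership value exceeds the threshold."""
    return np.asarray(U) > threshold

# Hypothetical fuzzy partition matrix: 3 bills (rows) by 4 clusters (columns).
U = np.array([[0.90, 0.05, 0.03, 0.02],
              [0.40, 0.10, 0.30, 0.20],
              [0.02, 0.02, 0.50, 0.46]])
members = assign_spheres(U)
bills_per_sphere = members.sum(axis=0)
print(bills_per_sphere, members.sum())
```

Because a bill can exceed the threshold in several clusters, the per-sphere counts can add up to more than the number of bills.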

The LIG model
We represent the senate influence network as a linear influence game (LIG) [29,30], one type of 2-action graphical game [35]. Nodes represent senators, or players, and are connected by directed edges. Edge weights represent the influence exerted by the source node upon the target. Influence weights can be negative, positive, or zero. The directed edges are allowed to be asymmetric, meaning that nodes A and B may exert different levels of influence on each other. Additionally, nodes have a threshold level, which represents "stubbornness." Nodes with thresholds further from zero are more resistant to change. Absent influences, a node with negative threshold is predisposed to adopting action 1 (yea vote), and a node with positive threshold is predisposed to −1 (nay vote). The matrix of influence weights $W \in \mathbb{R}^{n \times n}$ and the threshold vector $b \in \mathbb{R}^n$ constitute the LIG model. The action $x_i \in \{1, -1\}$ chosen by each node $i$ is the outcome of the model, as described below in game-theoretic terms.
Each node's best response to other nodes' actions depends on the net incoming influence and the node's threshold. When the total incoming influence from nodes playing 1 minus the total incoming influence from nodes playing −1 exceeds the node's threshold level, that node's best response is 1. If it falls below the threshold, the best response is −1; in the case of a tie, the node is indifferent and can play either. Note that the best responses of the nodes are interdependent. A vector of mutual best responses of all the nodes is a stable outcome of the model, formally known as a pure strategy Nash equilibrium (PSNE). It is stable because no node has any incentive to deviate from it. The LIG model adopts PSNE to represent stable collective outcomes from a complex network of influence. Before formally defining the technical terms, we illustrate the model using an example.
Example. Fig. 1 illustrates the LIG model with a simple, 4-node example. Note that the LIG model allows edges of opposite polarities between two nodes. This is not shown in this example for simplicity. As explained in Fig. 1, A and B playing 1 and C and D playing −1 is a PSNE, whereas all nodes playing 1 is not a PSNE.
As shown for node A in the above example, the process of adding up incoming influences from nodes playing 1, then subtracting influences from nodes playing −1, and finally comparing the result with the threshold value is succinctly captured by the influence function defined in Definition 3.1. The best response calculation (e.g., node A's best response is to play 1 if the total weighted influence on A exceeds its threshold) can be done using the payoff function defined in Definition 3.2. Finally, PSNE is formally defined in Definition 3.3. In the following formal definitions, we use the same notation as [30].

Definition 3.1 (Influence function [30]) The influence function of each individual $i$, given others' actions $x_{-i}$, is defined as

$$f_i(x_{-i}) \equiv \sum_{j \neq i} w_{ij} x_j - b_i$$

where for any other individual $j$, $w_{ij} \in \mathbb{R}$ is a weight parameter quantifying the "influence factor" that $j$ has on $i$, and $b_i \in \mathbb{R}$ is a threshold parameter for $i$'s level of "tolerance."

Here, individuals receive influences from other players and have an influence threshold of their own, which accounts for their own resistance to external influence. The influence function $f_i$ calculates the weighted sum of incoming influences on $i$, as described in the paragraph above Definition 3.1, and subtracts $i$'s threshold from it.

Table 2 Summary of four spheres of legislation: shorthand names and descriptions for each of the spheres of legislation identified by the FCM algorithm are shown here. The sum of the number of bills across all spheres (965) is greater than the total number of bills (722) because membership is fuzzy. Spheres 1 and 2 are relatively distinct from the rest, while Spheres 3 and 4 share a large number of bills
We next define the payoff of each player. The payoff function is one of the main ingredients of any game-theoretic model.

Definition 3.2 (Payoff function [30]) For an LIG, we define the payoff function $u_i(x_i, x_{-i}) \equiv x_i f_i(x_{-i})$, where $x_{-i}$ denotes the vector of a joint action of all players except $i$ and $f_i$ is defined in Definition 3.1.

Fig. 1 A four-node LIG is shown here. The directed edges are labeled with influence levels. Any absence of a directed edge implies an influence level of 0. Threshold values of 0 (for simplicity) are shown with a connector to each node. We assume binary actions {1, −1}. In this game, nodes A and B playing 1 and nodes C and D playing −1 is a pure strategy Nash equilibrium (PSNE). To see this, consider node A first. We add up the incoming influences from those nodes (in this case, B) that are playing 1 and then subtract from it the influences coming from nodes (in this case, C and D) playing −1. We get 1 − (−2 − 1.5) = 4.5, which is the total weighted influence on A. Since 4.5 is greater than A's threshold of 0, A's best response is 1. Similarly, it can be shown that B, C, and D's best responses are 1, −1, −1, respectively, and therefore, this is a PSNE. Similarly, nodes A and B playing −1 and C and D playing 1 is another PSNE. As a negative example, all nodes playing 1 is not a PSNE. To see this, consider node A. The total weighted influence on A is 1 + (−2) + (−1.5) = −2.5, which is less than A's threshold of 0. Therefore, A's best response is to play −1, which violates the mutual best response condition for PSNE

Phillips et al. Comput Soc Netw (2021) 8:14
The payoff function quantifies the preferences of the players based on the actions of other players. Given the action of all other individuals $x_{-i}$ and influence function $f_i(x_{-i})$, an individual will prefer to choose either 1 or −1 as follows. When $f_i(x_{-i})$ is negative, $x_i = -1$ will result in a positive payoff; when $f_i(x_{-i})$ is positive, $x_i = 1$ will result in a positive payoff. An action chosen in this fashion, so as to maximize the payoff, is defined as a best response.
Example. For the LIG shown in Fig. 1, when A and B play 1 and C and D play −1, A's payoff is 1 × 4.5 = 4.5. In this scenario, A is playing its best response because if A were to play −1, A's payoff would have been −4.5. As another example, when everyone plays 1, A's payoff is 1 × (−2.5) = −2.5. Here, A is not playing its best response because A could have gotten a payoff of 2.5 by switching to action −1. Note that the payoff of a node does depend on the node's own action.
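The calculations for node A can be reproduced with a short sketch. Only the three influence weights on A (1 from B, −2 from C, −1.5 from D) and A's threshold of 0 are taken from the running example; the function names are illustrative.

```python
def influence(incoming_weights, others_actions, threshold):
    """Weighted sum of the other players' actions minus the node's own threshold."""
    return sum(w * x for w, x in zip(incoming_weights, others_actions)) - threshold

def payoff(own_action, incoming_weights, others_actions, threshold):
    """Payoff of playing own_action: the action times the influence function."""
    return own_action * influence(incoming_weights, others_actions, threshold)

def best_responses(incoming_weights, others_actions, threshold):
    """Set of payoff-maximizing actions; both actions are returned in the case of a tie."""
    f = influence(incoming_weights, others_actions, threshold)
    if f > 0:
        return {1}
    if f < 0:
        return {-1}
    return {1, -1}

w_A = [1.0, -2.0, -1.5]   # influence on A from B, C, D
b_A = 0.0

print(payoff(1, w_A, [1, -1, -1], b_A))    # B plays 1; C and D play -1
print(payoff(1, w_A, [1, 1, 1], b_A))      # everyone plays 1
print(best_responses(w_A, [1, 1, 1], b_A))
```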
We next define pure-strategy Nash equilibrium (PSNE) of an LIG. PSNE is one of the most central solution concepts in game theory. A PSNE signifies everyone playing their best responses simultaneously.

Definition 3.3 (Pure-strategy Nash equilibrium [30]) A pure-strategy Nash equilibrium (PSNE) of an LIG $G$ is an action assignment $x^* \in \{-1, 1\}^n$ that satisfies the following condition. Every player $i$'s action $x_i^*$ is a simultaneous best response to the actions $x_{-i}^*$ of the rest.
Example. In our running example (Fig. 1), nodes A and B playing 1 and nodes C and D playing −1 is a PSNE because it can be verified that every player is playing their best response simultaneously. As another example, nodes A and B playing −1 and C and D playing 1 is also a PSNE. As shown in Fig. 1, all nodes playing 1 cannot be a PSNE.
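For small games, the PSNE condition can be checked by brute-force enumeration, as in the sketch below. Only the first row of the weight matrix (the influences on A) is taken from the running example; the rows for B, C, and D are hypothetical values chosen so that the three outcomes discussed above behave as described, and they should not be read as the weights of Fig. 1.

```python
from itertools import product

# W[i][j] is the influence of node j on node i; node order is A, B, C, D.
W = [[0.0, 1.0, -2.0, -1.5],    # from the running example
     [1.0, 0.0, -1.0, -1.0],    # hypothetical
     [-2.0, -1.0, 0.0, 1.0],    # hypothetical
     [-1.5, -1.0, 1.0, 0.0]]    # hypothetical
b = [0.0, 0.0, 0.0, 0.0]

def is_psne(x, W, b):
    """True if every player's action is a best response to the others' actions."""
    n = len(x)
    for i in range(n):
        f_i = sum(W[i][j] * x[j] for j in range(n) if j != i) - b[i]
        if x[i] * f_i < 0:      # switching actions would strictly improve i's payoff
            return False
    return True

def all_psne(W, b):
    """Enumerate all 2^n joint actions; feasible only for small n."""
    return [x for x in product((1, -1), repeat=len(b)) if is_psne(x, W, b)]

equilibria = all_psne(W, b)
print(equilibria)
```

Enumeration takes time exponential in the number of players, so it is only an illustration for toy games of this size.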
We adopt PSNE as the notion of stable outcomes arising from a network of influence. We are interested in questions like how the network changes based on the spheres of legislation and what impact the spheres have on polarization and most influential nodes. For these, we learn the networks using the spheres data.

Machine learning
We use Honorio and Ortiz's machine learning algorithm to instantiate an LIG from raw roll-call data [26]. The goal of the algorithm is to capture as much of the ground-truth data as possible as PSNE (the empirical proportion of equilibria), without having so many total PSNE (the true proportion of equilibria) that the model is meaningless. For example, if all influence weights and threshold levels are 0 (i.e., W = 0, b = 0), then all $2^n$ possible joint actions among n players would be PSNE, trivially covering all observed voting data. However, this is undesirable as it has no predictive power at all. Therefore, we would like to maximize the empirical proportion of equilibria while minimizing the true proportion. The following is a gist of Honorio and Ortiz's machine learning algorithm, which results from a very lengthy proof [26].

Learning algorithm
To balance the true and empirical proportions of equilibria, the learning algorithm uses a generative mixture model that picks a joint action which is either a PSNE or non-PSNE of an LIG model $G$ with probabilities $q$ and $1 - q$, respectively. Of course, our goal is to learn the game $G$. Let $\mathcal{NE}(G)$ denote the set of PSNE of $G$ and $D = \{x^{(1)}, x^{(2)}, ..., x^{(m)}\}$ be the dataset of $m$ voting instances. The empirical proportion of equilibria, $\hat{\pi}(G)$, is the fraction of data captured as PSNE of $G$. This is formally defined as follows, where $\mathbb{1}[\cdot]$ is the indicator function returning 1 if the condition is true, 0 otherwise:

$$\hat{\pi}(G) \equiv \frac{1}{m} \sum_{l=1}^{m} \mathbb{1}\left[x^{(l)} \in \mathcal{NE}(G)\right]$$

The true proportion of equilibria, denoted by $\pi(G)$, is the fraction of all joint actions among $n$ players that are PSNE, regardless of their existence in the voting instance data. This can be expressed as:

$$\pi(G) \equiv \frac{|\mathcal{NE}(G)|}{2^n}$$

Given a set of voting instances $D$, the average log-likelihood of the probabilistic generative model can be written as follows. Here, KL stands for the Kullback-Leibler divergence [11,Ch 2]:

$$\hat{\mathcal{L}}(G, q) = KL(\hat{\pi}(G) \,\|\, \pi(G)) - KL(\hat{\pi}(G) \,\|\, q) - n \log 2$$

Leaving the rigorous mathematical proof [26] aside, we can intuitively see how maximizing the above log-likelihood achieves maximization of the empirical proportion of equilibria $\hat{\pi}(G)$ relative to the true proportion of equilibria $\pi(G)$. For this, note that the first term above, $KL(\hat{\pi}(G) \,\|\, \pi(G))$, is maximized by a game $G$ that makes $\hat{\pi}(G)$ as big as possible while making $\pi(G)$ as small as possible. In other words, the game should capture as much of the data as possible as PSNE while keeping its total number of PSNE as small as possible.
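The quantities above can be made concrete with a small numerical sketch. It assumes the mixture model picks uniformly among PSNE with probability q and uniformly among non-PSNE with probability 1 − q, and it checks that the KL-based expression for the average log-likelihood, including the constant n log 2, matches a direct computation. The PSNE set and the four voting instances are made up for the example.

```python
import math

def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), with 0 log 0 = 0 (assumes 0 < q < 1)."""
    total = 0.0
    for a, c in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0:
            total += a * math.log(a / c)
    return total

def empirical_proportion(data, psne):
    """Fraction of observed joint actions that are PSNE."""
    return sum(x in psne for x in data) / len(data)

def true_proportion(psne, n):
    """Fraction of all 2^n joint actions that are PSNE."""
    return len(psne) / 2 ** n

def avg_log_likelihood(data, psne, n, q):
    pi_hat = empirical_proportion(data, psne)
    pi = true_proportion(psne, n)
    return bernoulli_kl(pi_hat, pi) - bernoulli_kl(pi_hat, q) - n * math.log(2)

def direct_avg_log_likelihood(data, psne, n, q):
    """Average log-probability of the data under the uniform mixture model."""
    total = 0.0
    for x in data:
        p = q / len(psne) if x in psne else (1.0 - q) / (2 ** n - len(psne))
        total += math.log(p)
    return total / len(data)

n = 4
psne = {(1, 1, -1, -1), (-1, -1, 1, 1)}               # hypothetical PSNE set
data = [(1, 1, -1, -1), (1, 1, -1, -1), (-1, -1, 1, 1), (1, 1, 1, 1)]
pi_hat = empirical_proportion(data, psne)
pi = true_proportion(psne, n)
print(pi_hat, pi, avg_log_likelihood(data, psne, n, q=pi_hat))
```

In this toy case the likelihood is highest when q equals the empirical proportion, in line with the role of the second KL term.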
Furthermore, the second term, $-KL(\hat{\pi}(G) \,\|\, q)$, becomes 0 when $q = \hat{\pi}(G)$. This indicates that the optimal mixture parameter $q$ is $\hat{\pi}(G)$. This leaves learning $G$ to maximize $KL(\hat{\pi}(G) \,\|\, \pi(G))$ as the main task because we are maximizing the log-likelihood over all choices of $G$ and $q$. The main challenge here is dealing with $\pi(G)$ due to the hardness of computing PSNE [30]. However, it can be shown that with high probability, maximizing a lower bound of the log-likelihood is equivalent to maximizing $\hat{\pi}(G)$ over all choices of $G$. This is equivalent to minimizing $1 - \hat{\pi}(G)$, which leads to the following loss minimization formulation:

$$\min_{W, b} \; \frac{1}{m} \sum_{l=1}^{m} \max_i \; \ell\left( x_i^{(l)} \left( \sum_{j \neq i} w_{ij} x_j^{(l)} - b_i \right) \right)$$

Above, the loss function $\ell$ represents the errors in best responses. It is easy to explain the above using the 0/1 loss function $\ell(z) \equiv \mathbb{1}[z < 0]$. Whenever any player in the $l$-th voting instance does not play its best response, $\max_i \ell(\cdot) = 1$ for that instance. When all players play their best responses, $\max_i \ell(\cdot) = 0$. For practical purposes of optimization, instead of the 0/1 loss function, a continuous loss function like the logistic loss function is used.
The final optimization problem is the following:

$$\min_{W, b} \; \frac{1}{m} \sum_{l=1}^{m} \max_i \; \ell\left( x_i^{(l)} \left( \sum_{j \neq i} w_{ij} x_j^{(l)} - b_i \right) \right) + \rho \|w\|_1$$

Here, $m$ is the number of bills, $\ell$ is the typical logistic loss function $\ell(z) = \log(1 + e^{-z})$, and $\rho$ is an $\ell_1$ regularization parameter controlling the number of edges through $\|w\|_1$. That is, we prefer sparser networks if the solution quality is not degraded too much. We solve the above optimization for each sphere of legislation and obtain an influence network. While doing this, we rigorously cross-validate to avoid overfitting or underfitting, as described in the next section.
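To make the objective concrete, the sketch below evaluates a max-over-players loss with both the 0/1 loss and the logistic loss for a fixed pair (W, b). It only evaluates the objective and is not Honorio and Ortiz's solver. It assumes the l1 penalty runs over the off-diagonal entries of W, and the weight matrix is a hypothetical four-node example.

```python
import numpy as np

def margins(W, b, X):
    """z[l, i] = x_i^(l) * (sum over j != i of w_ij * x_j^(l) - b_i)."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)                   # no self-influence
    X = np.asarray(X, dtype=float)
    return X * (X @ W.T - np.asarray(b, dtype=float))

def zero_one_objective(W, b, X):
    """Fraction of voting instances in which some player is not best-responding."""
    return float((margins(W, b, X) < 0).any(axis=1).mean())

def regularized_logistic_objective(W, b, X, rho):
    """Average max-over-players logistic loss plus rho times the l1 norm of the edge weights."""
    loss = np.logaddexp(0.0, -margins(W, b, X))   # log(1 + exp(-z)), computed stably
    W0 = np.array(W, dtype=float)
    np.fill_diagonal(W0, 0.0)
    return float(loss.max(axis=1).mean() + rho * np.abs(W0).sum())

# Hypothetical game; only the first row matches the running example.
W = [[0.0, 1.0, -2.0, -1.5],
     [1.0, 0.0, -1.0, -1.0],
     [-2.0, -1.0, 0.0, 1.0],
     [-1.5, -1.0, 1.0, 0.0]]
b = [0.0, 0.0, 0.0, 0.0]
X = [[1, 1, -1, -1],     # a PSNE of this game
     [1, 1, 1, 1]]       # not a PSNE

print(zero_one_objective(W, b, X))
print(regularized_logistic_objective(W, b, X, rho=0.01))
```

With the 0/1 loss the objective equals one minus the fraction of data captured as PSNE, while the logistic version is continuous and therefore amenable to standard optimization.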

Cross-validation and model selection for LIG
To make use of the $\ell_1$-regularized model, we must choose a regularization parameter ρ. High values of ρ assign a higher penalty to the number of edges in the graph and result in a sparser graph, while low values of ρ assign a lower penalty and result in a denser graph. While models learned with low values of ρ fit the training data more closely, there is a risk of overfitting (that is, "memorizing" the data), which results in poor predictive performance on new data.
Additionally, the number of edges must be taken into consideration because the problem of computing equilibria is NP-hard [29,30]. In fact, it is likely that an extremely complex model would have so many edges that equilibria computation would never finish within a reasonable time-frame of several days. However, an exceedingly low number of edges would lead to an under-fit model that could not be generalized to new data. Therefore, we must pick a ρ value which strikes a balance between computation time and the risks of over- and under-fitting.
We use cross-validation (CV) to determine the effectiveness of a given ρ value. In CV, a process essential to most machine learning applications, data are partitioned into two sets: training and validation. The model is trained using the training set and then employed to make predictions against the validation set. The performance of the model is measured by the error in the training and validation set. When a model is overfit, validation error will be significantly higher than training error. When a model is under-fit, both validation and training error will be high. In CV, researchers adjust the parameters of the machine learning algorithm to create the best model which neither underfits nor overfits the data.
With large datasets, training and validation sets are often created by splitting the data in half, or holding out some smaller proportion of the data. 3 However, the four datasets generated by the clustering method are too small to form informative predictions if they are further reduced by this straightforward partitioning. Instead, we used k-fold CV, which leverages re-sampling to form useful insights on small datasets. In k-fold CV, the dataset is randomized and split into k partitions. In one run of the k-fold CV, one of the k sets is chosen as the validation set, while the remaining k − 1 sets are combined to form the training set. On the next run, a different set is chosen as the validation set, and the others are used to train the model. Measures of accuracy and error from each run are averaged across the k runs. Choosing a value of k is arbitrary, but k = 10 is often used in research applications, colloquially known as 10-fold CV. We ran 10-fold CV on each sphere with 0 < ρ < 0.01, tracking three measures of model performance:
1. Number of edges in the training set graph
2. Best response (BR) error, or the percentage of senators not playing their best response, in training and validation sets
3. q, the proportion of votes recorded as PSNE, in training and validation sets.
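The k-fold procedure and the three tracked measures can be sketched as follows. The learning step itself is omitted: in practice an LIG would be fit on each training split, whereas the sketch evaluates a fixed, hypothetical (W, b) only to show how the measures are computed. All function names are illustrative.

```python
import numpy as np

def k_fold_indices(m, k=10, seed=0):
    """Shuffle m sample indices and yield (train, validation) index arrays for each of k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(m), k)
    for v in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != v])
        yield train, folds[v]

def margins(W, b, X):
    """z[l, i] = x_i * f_i(x_-i) for instance l; negative means i is not best-responding."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    X = np.asarray(X, dtype=float)
    return X * (X @ W.T - np.asarray(b, dtype=float))

def count_edges(W, tol=1e-8):
    """Measure 1: number of nonzero off-diagonal influence weights."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    return int((np.abs(W) > tol).sum())

def br_error(W, b, X):
    """Measure 2: fraction of (instance, senator) pairs not playing a best response."""
    return float((margins(W, b, X) < 0).mean())

def psne_fraction(W, b, X):
    """Measure 3: fraction of instances that are PSNE of the model."""
    return float((margins(W, b, X) >= 0).all(axis=1).mean())

# Hypothetical four-node model and three voting instances.
W = [[0.0, 1.0, -2.0, -1.5],
     [1.0, 0.0, -1.0, -1.0],
     [-2.0, -1.0, 0.0, 1.0],
     [-1.5, -1.0, 1.0, 0.0]]
b = [0.0, 0.0, 0.0, 0.0]
X = np.array([[1, 1, -1, -1], [1, 1, 1, 1], [-1, -1, 1, 1]])

for train_idx, val_idx in k_fold_indices(len(X), k=3):
    # A learner (not shown) would fit W and b on X[train_idx] here.
    print(count_edges(W), br_error(W, b, X[val_idx]), psne_fraction(W, b, X[val_idx]))
```

Averaging the per-fold values over the k runs gives the curves tracked for each candidate ρ.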
We chose ρ values, shown in Table 3, based on the following goals:
1. The graph is sparse enough to efficiently compute equilibria
2. The model neither overfits nor under-fits the data (i.e., BR error is low, and the differences between training and validation sets for BR error and q are low)
3. The proportion of observed roll-call votes that are PSNE (q) according to the learned model is high
We next present the cross-validation results for Sphere 1.
Cross-validation on Sphere 1 (Security & Armed Forces). As shown in Fig. 2, the number of edges drastically decreases until ρ = 0.000367 and then begins to decrease at a slower rate, reaching a reasonable number of edges between values of 0.002424 and 0.003455. BR error in both the training and validation set remains low until ρ ≥ 0.004 and then begins to increase, showing that the model performs well until that point. Until ρ = 0.001512, the drastic difference between training and validation q values shows that the model is overfit, and the regression to q = 0 when ρ > 0.007014 shows that the model is under-fit. Between values of 0.002154 and 0.003455, all metrics are within an acceptable range.
While we leave the detailed cross-validation results for the other spheres to Appendix B, there are many similarities among these results. Across all spheres, when ρ = 0, the learned model is essentially memorizing the training data: training error is 0, validation error is relatively high, and the proportion q of data captured as PSNE is drastically higher for the training set than the validation set. This is the overfitting regime. As ρ increases, validation and training errors begin to converge, as do the validation and training q values. At higher ρ values, validation and training errors are both prohibitively high and the learning enters the under-fitting regime. Table 3 summarizes the ρ values that we have chosen according to the three criteria listed above. We use these values of ρ to produce the LIG models used throughout the rest of the paper.

Fig. 2 [...]. We perform tenfold CV for 0 < ρ < 0.01. The plots for the number of edges, best response errors, and the proportion q of data that are PSNE are shown here

[...] cross-border) edges, influence weights and thresholds, and modularity measures. We begin with cross-party edges.

Cross-party edges
The boundary between the two parties is interesting for studying polarization. Even though negative edges more often occur at the boundary, the connectivity between the two parties varies a lot according to the spheres of legislation. These are depicted in Figs. 5 and 6 for Spheres 1 and 2, respectively (others are in Appendix F.4). Figure 6 shows the cross-party edges in Sphere 2 (Economics & Finance), which starkly contrast with those of Sphere 1 (Security & Armed Forces) shown in Fig. 5.

Similarly, examining inter-party edges reveals that Sphere 3 (Energy & Infrastructure) is also very polarized. While there are many edges between both parties in this network, about 70% of them are negative. Positive influences come from a few sources, again including the centrist Senator Collins. Incongruously, prominent right-wing senator Tom Cotton (R-AR) also exhibits positive influences with Democratic senators. However, most other far-left or far-right leaning senators, including Sanders (I-VT) and Cruz (R-TX), only exhibit negative influences with the opposite party.
Sphere 4 (Public Welfare)'s inter-party edges strike a balance between the polarities exhibited by the previous three spheres. There are slightly more positive edges (9) than negative edges (7), but still a low number of edges overall. Again, there are positive influences between Maine senators King (I-ME) and Collins (R-ME), but also positive influences between Senator McConnell and Democratic-caucus senators King (I-ME) and Tester (D-MT).
Overall, each sphere exhibits some level of polarization, but some spheres are far more polarizing than others. Some senators are present in every sphere's inter-party boundary, whether for positive or negative influences. Maine Senators Collins (R) and King (I) often share positive influences with each other, as well as other senators. Senator Lee (R-UT), a conservative libertarian, always exhibits negative edges with members of the other party, although in Sphere 1, he also shares positive influences with senators Harris (D-CA) and Feinstein (D-CA). Meanwhile, left-wing icon Bernie Sanders (I-VT) exhibits the equivalent behavior, with only negative cross-party edges in all spheres except Sphere 1. These results suggest that Sphere 1 (Security & Armed Forces) is least polarized, whereas Sphere 2 (Economics & Finance) is highly polarized.

Influence weights and thresholds
We now take a closer look at the influence weights and thresholds of the machine learned models, beyond just the cross-party edges. [...] Republican-to-Democrat (note that the edges are directed). In each plot, the histograms for the four spheres are superimposed for the purpose of comparison. For the intra-party edges (D-D and R-R), Spheres 2, 3, and 4 have very similar histograms and they are different from the histogram of Sphere 1 (Security & Armed Forces). At the peak, the number of intra-party edges in Sphere 1 is dominated by the other spheres. However, for higher edge weights, Sphere 1 dominates the other spheres. This indicates that there are stronger D-D and R-R influences in Sphere 1 compared to the other spheres, which in turn may indicate more polarization in Sphere 1. Interestingly, if we look at the cross-party edges (D-R and R-D), we can see that Sphere 1 again dominates the other spheres in the positive influence weights regime. Note that in the bottom row of Fig. 7, the peak of Sphere 3 dominates that of Sphere 1, but Sphere 3's peak is in the negative influence regime, whereas Sphere 1's peak is in the positive influence regime. 4 All of these indicate that there are more positive influences within and across the two parties in Sphere 1 compared to the other spheres, which contributes to Sphere 1 being less polarized.

Of course, the influence weights cannot be read alone without considering thresholds because the game-theoretic model accounts for both of these in predicting stable outcomes. Recall that the threshold magnitude signifies stubbornness or resistance to influence. More positive threshold values resist positively weighted influences by leaning to play −1 in the presence of (1) positive influence from those playing 1 and (2) negative influence from those playing −1 (in both cases, a neighbor's action times the influence from that neighbor is positive). More negative threshold values resist negatively weighted influences in a similar fashion.
Figure 8 shows the threshold histograms for the two parties. The most interesting aspect of these histograms is that for both Democratic and Republican senators, the threshold distribution is "flatter" in Sphere 1 (Security & Armed Forces) compared to the other spheres. This indicates that for both parties, the thresholds are more "uniformly distributed" in Sphere 1 than in the other spheres. In contrast, in Spheres 2 (Economics & Finance), 3 (Energy & Infrastructure), and 4 (Public Welfare), the threshold values of each party are concentrated in one region, which indicates the similarity among the senators belonging to the same party. Together with negative cross-party edges and positive intra-party edges, this contributes to polarization in these spheres. While Fig. 8 shows the histogram of each party for different spheres, Fig. 9 makes a comparison of the histograms of the two parties for each sphere separately. The contrast between the two parties is not as remarkable as the contrast among the four spheres for any party.
As a final note, we emphasize that the threshold values on their own lack sufficient predictive power. In fact, the main component of the LIG model is the interdependence among the senators' actions through the influence structure. Having said that, if a sphere is overwhelmingly dominated by bills sponsored by one of the two parties, then it is possible that the machine learning algorithm would assign low threshold values to the senators of that party (that is, those senators would be predisposed to voting yea). 5 Even then, the influence weights would play a role in predicting the stable outcomes. Investigating this issue using sponsorship and co-sponsorship data is an interesting future direction.

Modularity
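Furthermore, a formal study of polarization rooted in network science produces similar results. Modularity [23,41,42] has been widely used as a measure of polarization in networks. We apply the following definition of modularity derived for directed networks with signed weights [20]:

$$Q = \frac{1}{2w^{+} + 2w^{-}} \sum_i \sum_j \left[ w_{ij} - \left( \frac{w_i^{+,out} \, w_j^{+,in}}{2w^{+}} - \frac{w_i^{-,out} \, w_j^{-,in}}{2w^{-}} \right) \right] \delta(C_i, C_j)$$

where $w_{ij} = w_{ij}^{+} - w_{ij}^{-}$ with $w_{ij}^{+} = \max(0, w_{ij})$ and $w_{ij}^{-} = \max(0, -w_{ij})$, and $2w^{\pm}$ is the total weight of all positive or negative edges, expressed by $\sum_i \sum_j w_{ij}^{\pm}$. Furthermore, $w_i^{\pm,out}$ is the weighted out-degree $\sum_k w_{ik}^{\pm}$ and $w_j^{\pm,in}$ is the weighted in-degree $\sum_k w_{kj}^{\pm}$. The Kronecker delta function $\delta(C_i, C_j)$ is 1 if $i$ and $j$ belong to the same party; it is 0 otherwise.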
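A sketch of this computation is shown below, assuming the standard signed, directed modularity in which weights are split into positive and negative parts, out-degrees are row sums, and in-degrees are column sums (following the index convention in the text). When a network has no positive (or no negative) edges, the corresponding expected-weight term is taken to be zero; that convention and the two-node example are assumptions of the sketch.

```python
import numpy as np

def signed_directed_modularity(W, communities):
    """Modularity for a directed network with signed weights and a given partition."""
    W = np.asarray(W, dtype=float)
    c = np.asarray(communities)
    pos = np.maximum(W, 0.0)       # positive parts of the weights
    neg = np.maximum(-W, 0.0)      # magnitudes of the negative weights
    same = c[:, None] == c[None, :]

    def expected(M):
        total = M.sum()            # total positive (or negative) weight
        if total == 0:
            return np.zeros_like(M)
        # out-degree of i (row sum) times in-degree of j (column sum)
        return np.outer(M.sum(axis=1), M.sum(axis=0)) / total

    numerator = ((W - (expected(pos) - expected(neg))) * same).sum()
    return float(numerator / (pos.sum() + neg.sum()))

# Two nodes that influence each other negatively.
W = [[0.0, -1.0],
     [-1.0, 0.0]]
q_split = signed_directed_modularity(W, ["D", "R"])    # nodes in different parties
q_merged = signed_directed_modularity(W, ["D", "D"])   # nodes in the same party
print(q_split, q_merged)
```

Placing mutually antagonistic nodes in different communities raises the score, which is the behavior that makes modularity usable as a polarization measure here.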
Applying this definition, we obtain the following modularity scores for the four spheres of legislation, respectively: 0.7861, 0.8904, 0.8724, and 0.8857 (see Table 5). This shows that Sphere 1 (Security & Armed Forces) is least polarized and Spheres 2 (Economics & Finance), 3 (Energy & Infrastructure), and 4 (Public Welfare) are much more polarized.
It is important to note that modularity does not always indicate polarization. As Guerra et al. [22] show, there are networks that exhibit community structure despite not being polarized. However, in our case, we are not investigating whether Congress is polarized or not. Polarization in Congress is already a settled matter [15]. We are rather investigating to what degree Congress is polarized based on the spheres of legislation. Furthermore, our analysis of cross-party edges ("Cross-party edges" section) resonates with Guerra et al.'s main idea that in a polarized network, the nodes at the border are on average more connected inside their own community than outside.

Most influential nodes in context
There exist a number of centrality measures that are derived from a structural analysis of networks [31]. However, our model is behavioral, where nodes adopt their best responses to each other. In a strictly game-theoretic model of behavior, a set of nodes will be called most influential with respect to achieving a desirable stable outcome if their choice of actions leads the whole system of influence to that desirable stable outcome [29,30]. Here, a crucial aspect is a desirable stable outcome, represented by a PSNE. For example, let us say that our desirable stable outcome is to pass a bill by a 100-0 vote. A set of senators will be called most influential if their voting together influences every other senator to also vote for the bill, thereby having the desirable stable outcome as the unique PSNE outcome. This concept can be extended to other types of desirable stable outcomes like blocking a bill unanimously, passing a bill with at least 60 votes, forcing/avoiding a filibuster, etc. When there are multiple most influential sets, we naturally prefer smaller sets of most influential nodes. The above concept of most influential nodes is centered around stable or PSNE outcomes. As we will see in "Computing most influential nodes" section, it requires computation of all PSNE. We next outline how we compute all PSNE for each sphere of legislation.

PSNE computation
Once the LIG is instantiated by the machine learning algorithm (see "Machine learning" section), we can compute the set of all PSNE using the algorithm described in [30]. This is a backtracking search algorithm which takes advantage of the graph's structure. We give a brief overview below.
The algorithm begins by selecting the node with the highest out-degree (the node which directly influences the most other nodes) and assigns it the action 1. It progressively selects new nodes and assigns them the action 1 until all nodes are assigned actions without any contradiction (indicating a PSNE) or it encounters a contradiction that guarantees that there is no PSNE with the actions assigned so far. It then revisits the most recent node and changes its action from 1 to −1. After this, the algorithm again tries to make progress. In general, at any stage of the algorithm, we have a partial joint action, which is the action (1 or −1) of each node selected so far. If some node in the network is not playing its best response, the partial joint action cannot lead to a PSNE. When this occurs, the algorithm tries a different action for the most recently selected node v if it has not already done so. If trying a different action for v still leads to a contradiction, the algorithm backtracks by deselecting v and changing the action of the node that had been selected before v. When every node is playing its best response with respect to the others, we have reached a PSNE. Importantly, the algorithm always tries to reach a contradiction so that it can reduce the overall computation time by pruning large parts of the search tree. This process repeats until all possible PSNE have been found.
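The search described above can be sketched as follows. This is a simplified illustration, not the algorithm of [30]: it assumes the standard LIG influence function (the weighted sum of incoming actions minus the threshold), orders nodes by out-degree once at the start, and prunes a branch as soon as an assigned node can no longer be playing a best response regardless of how the unassigned nodes act. The names `all_psne`, `W`, and `b` are illustrative.

```python
def all_psne(W, b):
    """Enumerate all PSNE of an LIG by backtracking with pruning.

    W[u][v] is the influence of node u on node v; b[v] is the threshold of v.
    Actions are 1 or -1; 0 marks a node that is not yet assigned.
    """
    n = len(b)
    # Visit nodes in decreasing order of out-degree (nonzero outgoing edges).
    order = sorted(
        range(n),
        key=lambda u: -sum(1 for v in range(n) if v != u and W[u][v] != 0),
    )
    x = [0] * n
    found = []

    def consistent():
        # An assigned node contradicts the partial joint action if even the
        # most favorable completion cannot make its action a best response.
        for i in range(n):
            if x[i] == 0:
                continue
            fixed = sum(W[j][i] * x[j] for j in range(n) if j != i and x[j] != 0) - b[i]
            slack = sum(abs(W[j][i]) for j in range(n) if j != i and x[j] == 0)
            if x[i] == 1 and fixed + slack < 0:
                return False
            if x[i] == -1 and fixed - slack > 0:
                return False
        return True

    def search(depth):
        if depth == n:
            found.append(tuple(x))  # every node plays a best response
            return
        node = order[depth]
        for action in (1, -1):  # try action 1 first, then -1
            x[node] = action
            if consistent():
                search(depth + 1)
        x[node] = 0  # deselect the node before backtracking

    search(0)
    return found


# Two senators who positively influence each other: both-yea and both-nay are stable.
print(all_psne([[0, 1], [1, 0]], [0, 0]))
```

The pruning test uses simple bounds on the influence each assigned node can still receive; the heuristics in [30] are more elaborate, so this sketch should be read only as an illustration of the backtracking idea.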
For each sphere, we ran the algorithm on Bowdoin College's high-performance computing (HPC) grid. The numbers of PSNE found for each sphere's LIG, given our chosen ρ values, are summarized in Table 4. Note that the number of PSNE is a tiny fraction of the $2^{103}$ possible joint actions. These sets of all PSNE are necessary to compute the most influential senators, which we describe next.

Computing most influential nodes
Algorithmically, the most influential nodes problem asks for selecting a minimum set of nodes, such that when they choose their actions according to the desirable stable outcome (e.g., voting yea when the desirable stable outcome is passing a bill unanimously), the desirable stable outcome becomes the only possible PSNE. An approximation algorithm for computing most influential senators was given by Irfan and Ortiz [30], which produces a directed acyclic graph (DAG). The algorithm requires precomputation of all PSNE, which is a provably hard problem [30]. We apply Irfan and Ortiz's PSNE computation algorithm to the LIG for each sphere of legislation. Having computed all the PSNE, we then compute the DAG representing most influential sets of nodes. Figures 10 and 11 show the results of the most influential nodes algorithm for Spheres 1 and 3, respectively, where the desirable stable outcome is to achieve the largest number of yea votes possible in any PSNE (that is, to gain the most support possible from the legislative body according to our model). The way to read Figs. 10 and 11 is to inspect each DAG and find a top-to-bottom path. Each of these paths gives a most influential set.
The sets of most influential senators in each sphere support the inferences gained from analyzing the LIG networks. As illustrated in Fig. 10, in Sphere 1 (Security & Armed Forces), 4 Republicans and 4 Democrats comprise a set of 8 most influential senators. In other words, 8 senators and, more importantly, the balanced bipartisan groups of 8 senators shown in Fig. 10 are sufficient to generate the maximum possible support for a bill in Sphere 1. As shown in Fig. 11, in Sphere 3 (Energy & Infrastructure), 5 Republicans and 6 Democrats comprise a set of 11. This suggests that Sphere 3 is more polarized than Sphere 1, since it requires a larger body of influencing senators. The DAGs for the other spheres are shown in Appendix E.
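To illustrate the selection problem, the sketch below picks a set of nodes greedily, given a precomputed list of PSNE and a desired PSNE. It is a plain greedy hitting-set heuristic under stated assumptions, not the DAG-producing approximation algorithm of [30]: at each step it chooses the node whose desired action rules out the largest number of remaining undesired PSNE. The function and variable names are hypothetical.

```python
def greedy_influential_set(psne_list, target):
    """Greedily choose nodes whose target actions single out `target`.

    psne_list: all PSNE of the game, each a tuple of actions (1 or -1).
    target: the desirable PSNE. Returns node indices in the order chosen.
    """
    target = tuple(target)
    # PSNE that still agree with the target on every chosen node.
    remaining = [tuple(y) for y in psne_list if tuple(y) != target]
    chosen = []
    while remaining:
        candidates = [i for i in range(len(target)) if i not in chosen]
        # Pick the node that disagrees with the target in the most remaining PSNE.
        best = max(
            candidates,
            key=lambda i: sum(1 for y in remaining if y[i] != target[i]),
        )
        chosen.append(best)
        remaining = [y for y in remaining if y[best] == target[best]]
    return chosen


psne = [(1, 1, 1), (-1, -1, -1), (1, -1, -1), (-1, 1, -1)]
print(greedy_influential_set(psne, (1, 1, 1)))
```

In the toy example, fixing the third node to yea eliminates every other PSNE, so that single node forms a most influential set for the all-yea outcome.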

Game-theoretic vs. structural centrality measures
In the above game-theoretic formulation of most influential nodes, we find that each set of most influential senators across all spheres is comprised of an (almost) equal number of Democrats and Republicans. This signifies the need for bipartisan support to guarantee passing a bill with the maximum possible support under the PSNE constraints. As we show next, this also happens to be a distinguishing feature between game-theoretic and structural measures. Table 5 shows various centrality measures and other quantities computed for each sphere. First, measures like diameter, average shortest path length, and clustering coefficient reveal some, but not many, differences among the spheres. The network diameter of Sphere 3 (Economics & Finance) is 4, and the network diameter of every other sphere is equal to 5. The average shortest path lengths across all four spheres are similar to one another, ranging between 2.2295 and 2.5476. Being close to half the size of the network diameter, these values suggest that most nodes in the network are well connected, though not all. The average clustering coefficient is a measure of the density of triangles in a network. In more polarized networks, we might expect this value to be high because senators who are closely aligned on partisan issues would be well connected with each other. In each sphere, the average clustering coefficient is similar, but lower in Spheres 1 and 3 (0.0187 and 0.0174, respectively) than in Spheres 2 and 4 (0.0206 and 0.0218, respectively). These measures, however, do not give a direct indication of polarization, at least not as much as the modularity measure. We discussed the modularity values in "Polarization in context" section.
We now focus on the widely applied structural measures of centrality. For each sphere, we show the top 10 most central senators with respect to four centrality measures: degree, closeness, betweenness, and eigenvector. The simplest measure is degree centrality, or the number of nodes each node is connected to (normalized by the maximum possible degree, N − 1, or 102 in our case). The next is closeness centrality, or how close a node is, on average, to every other node in the network. The third is betweenness centrality, which measures how often the node lies on the shortest paths between pairs of other nodes. The final one is eigenvector centrality, which has a self-referential definition accounting for the centrality of a node's neighbors.
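As a small illustration of the first two measures, the sketch below computes degree and closeness centrality on an unweighted, undirected graph given as an adjacency dictionary. This is a simplifying assumption made for the example: how edge directions and signed weights were handled for Table 5 is not specified here, and the function names are illustrative.

```python
from collections import deque


def degree_centrality(adj):
    """Number of neighbors of each node, normalized by N - 1."""
    n = len(adj)
    return {u: len(nbrs) / (n - 1) for u, nbrs in adj.items()}


def closeness_centrality(adj):
    """Reciprocal of the average shortest-path distance to reachable nodes."""
    result = {}
    for source in adj:
        # Breadth-first search gives shortest-path lengths in hops.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


# A path a - b - c: the middle node is the most central under both measures.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(degree_centrality(path))
print(closeness_centrality(path))
```

Both measures depend only on the structure of the graph, which is why they cannot reflect thresholds or best responses in the way the game-theoretic notion does.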
Most notably, these centrality measures do not capture the strategic aspects of behavior. Throughout most measures, Republican senators are overrepresented, comprising the majority of the top ten most central nodes. In contrast, the game-theoretic measure gives a balanced coalition between Democrats and Republicans. This is important because when networks are polarized, achieving a desirable stable outcome requires support from both sides.

Toward richer models: ideal point models with social interactions
We also apply a richer model of influence recently proposed by Irfan and Gordon [28] that extends the LIG model by incorporating ideal points of senators and polarities of bills. Their work showed the value of combining game-theoretic and statistical models for studying strategic interactions in context, but they assume the network to be fixed, regardless of the bill context. We use their model and allow the network to change based on the spheres of legislation. We also perform an analysis of the networks learned.
We start with an overview of how Irfan and Gordon's model [28] builds on the political science literature on ideal point models [8,13,32,45,46,48]. Ideal point models are predictive statistical models that assign each senator i an ideal point $p_i$ signifying the senator's legislative position. Usually, more negative values of $p_i$ mean a more liberal position and more positive values mean a more conservative one. Similarly, each bill l is also assigned a polarity $a_l$ signifying the position of the bill on the liberal-to-conservative spectrum. There is a third model parameter called the popularity $r_l$ of bill l, representing the fraction of senators supporting the bill. The ideal point model in its most basic form defines the probability of senator i supporting bill l using the logistic function σ: $p(x_{i,l} = \text{yea} \mid p_i, a_l, r_l) = \sigma(p_i a_l + r_l)$.
The ideal point model captures the interdependence among the senators using the $r_l$ term. However, this term is an aggregate measure quantified by the number of senators voting yea on bill l. In ideal point models with social interactions, Irfan and Gordon expand this aggregate measure by considering how the individual senators are voting and how their votes influence each other [28]. The resulting model is game-theoretic with the following influence function.
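Following the description of this model in Appendix A.1, where the product of the senator's ideal point and the bill's polarity is added to the otherwise unchanged LIG influence, the function can be sketched as below. The symbol $f_{i,l}$ and the indexing convention $w_{ji}$ (the influence of senator j on senator i) are assumptions made here for illustration; the exact form is given in [28].

```latex
f_{i,l}(x_{-i}) \;=\; \sum_{j \neq i} w_{ji}\, x_j \;-\; b_i \;+\; p_i\, a_l
```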
Here, other than the new terms l, $p_i$, and $a_l$ defined in the previous paragraph, the rest of the terms are the same as those in Definition 3.1. Using the above influence function, the richer game-theoretic model is defined in the same fashion as in "The LIG model" section. As a cautionary note, the way Irfan and Gordon's model [28] combines networks with ideal points makes it difficult to disentangle the two. Analyzing the networks alone may be inconclusive because ideal points also supply the model with predictive power. Moreover, the machine learning algorithm learns these two components simultaneously. With this caveat in mind, we give an analysis of the networks and the ideal points learned.
Analysis of influence networks. Figs. 3, 4). Second, a closer look at the cross-party edges shows that there are a lot more negative edges between the two parties under this richer model than there are under the LIG model. We show the cross-party edges for Spheres 1 and 2 in Figs. 14 and 15, respectively (others are in Appendix F.4). These two differences can be attributed to using ideal points to discriminate the behaviors of opposing senators.
Polarization metric based on modularity. The modularity framework discussed in "Polarization in context" section yields scores of 0.5392, 0.6801, 0.6887, and 0.6229. We have also applied a recently proposed measure called polarization index [39]. Inspired by the electric dipole moment, the polarization index is measured from an opinion distribution, where opinions propagate from a set of elite entities (e.g., influential politicians and media accounts on Twitter) to listener entities (e.g., ordinary individuals on Twitter). The measure is based on opinion distribution (as opposed to dynamics).
Here, we apply it to the machine learned ideal point distribution. 6 We use the following definition of the polarization index $\mu$, where A represents the difference between the fractions of Republicans and Democrats and $gc^{+}$ and $gc^{-}$ represent the gravity centers of the Republican and Democratic senators' ideal points, respectively: [40] presented before due to the similarity between the two definitions in our case.
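As a rough illustration of a dipole-style polarization index on ideal points, the sketch below assumes the general form used by Morales et al. [39]: one minus the difference in group sizes, multiplied by the normalized distance between the two gravity centers. The function name, the grouping by party, and the `span` normalization (2.0 corresponds to values scaled to the interval from −1 to 1) are assumptions for this example and may differ from the exact definition applied to the learned ideal points.

```python
def polarization_index(republican_points, democratic_points, span=2.0):
    """Dipole-style polarization index for two groups of ideal points.

    span is the assumed width of the ideal point scale, used to normalize
    the distance between the gravity centers (hypothetical default: 2.0).
    """
    n_rep = len(republican_points)
    n_dem = len(democratic_points)
    # Difference between the fractions of Republicans and Democrats.
    size_gap = abs(n_rep - n_dem) / (n_rep + n_dem)
    # Gravity centers: mean ideal point of each party.
    gc_plus = sum(republican_points) / n_rep
    gc_minus = sum(democratic_points) / n_dem
    distance = abs(gc_plus - gc_minus) / span
    return (1.0 - size_gap) * distance


print(polarization_index([1.0, 0.5], [-1.0, -0.5]))
```

When the two parties are of nearly equal size, the first factor is close to 1 and the index reduces to a scaled distance between the mean ideal points of the two parties.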
We conclude this section by reiterating an earlier point. Investigating the influence networks and ideal points separately does not give us the complete picture, since the model combines these two components together to make predictions. Therefore, we should also combine them in a meaningful way to infer polarization. We leave this as future work. We also leave open an exploration of the most influential nodes problem under this richer model.

Concluding remarks and research outlook
In this paper, we have studied the linear influence game (LIG) model in the context of four spheres of legislation. We have done a thorough network analysis of the machine learned models for each sphere. Our analysis shows that contrary to the popular notion that the U.S. Congress is overly polarized these days, the measure of polarization varies according to the spheres of legislation. In fact, the two opposing parties tend to come together when dealing with bills in Sphere 1 (Security & Armed Forces). Therefore, the notion of polarization should be contextualized with respect to the spheres of legislation.
We have also shown that across all the spheres, the LIG model predicts that a set of most influential senators consists of a bipartisan coalition (which also differentiates game-theoretic and structural centrality measures). Despite this shared property among the four spheres, the number of senators required to form a most influential set varies. Sphere 1 happens to require the least number of senators in its most influential set to achieve a desirable outcome of garnering the maximum support possible for a bill (under PSNE constraints). Again, this signifies that Sphere 1 is least polarized among the four spheres.
In sum, the consideration of different spheres of legislation reveals interesting aspects of polarization and most influential senators in Congress. Building upon this study, following are some interesting future directions.
1. The most pressing task is to fully explore the ideal point model with social interactions [28] for different spheres of legislation. We have briefly touched upon it in "Toward richer models: ideal point models with social interactions" section. However, as we have mentioned in that section, finding a behavioral definition of polarization that can meaningfully combine different constituent parts of the model, such as ideal points of senators, polarity of bills, influence weights, and threshold values, remains an open problem.
2. In a similar vein, the notion of context provides another interesting direction. In this paper, we use spheres of legislation as a contextual platform for learning, analyzing, and comparing influence networks. The main idea here is that depending on the sphere, the influence network would be different. In contrast, the ideal point model with social interactions also has a contextual element in terms of polarities of bills and ideal points of senators, but it keeps the network fixed. How we can synthesize these two diverging ways of capturing context and thereby give a deeper meaning to context remains open.
3. A detailed comparative study of the most influential nodes for different spheres under the richer model [28] is another interesting direction. In particular, what happens to the balanced, bipartisan composition of most influential sets under the LIG model (see "Most influential nodes in context" section) when we incorporate additional contextual parameters like ideal points and polarities?
4. On the computational front, Irfan and Gordon [28] showed promising results on improving the time required to compute all PSNE. Extending those results to the spheres of legislation setting is another promising direction. It would also be interesting to investigate why their model leads to drastic improvement in computational time.
5. Considering different modeling frameworks is yet another exciting direction. A particularly promising framework is probabilistic graphical models (PGMs). Whereas we are currently constructing the spheres of legislation first and then learning the LIG models for each sphere, PGMs may allow us to do both at the same time. This approach would not require us to split the data. Finally, exploring the recently proposed semi-supervised learning for studying polarization [25] in game-theoretic settings is also an interesting direction.
[Figure caption fragment: The distance between the mean ideal points of the two parties is 1.235, which shows more polarization compared to Sphere 1 shown in Fig. 16.]
In addition to the above open directions in the context of legislative chambers, the LIG model may also be applied to other settings where network-connected individuals exhibit influence or behavioral interdependence. Some examples in the public health domain are smoking [7] and obesity [6]. Other promising areas include smart electricity grids, vaccination, and the adoption of microfinance.

Appendix A: literature review
We first review the literature on models and algorithms. We then review the literature on polarization.

A.1 Models and algorithms
While the study of influence in networks is very broad [16], we focus on models and algorithms for game-theoretic settings. Irfan and Ortiz propose Linear Influence Games (LIGs) [30], a type of 2-action graphical game [35]. In an LIG, every node (or player) represents an individual with a binary action (1 or −1) and a threshold level representing their "stubbornness." There is an underlying network structure among the nodes.
The weight of each edge (u, v) represents the amount of influence that node u exerts on node v. These influence weights may be positive or negative, 7 and are not required to be symmetric (meaning that node u may exert more or less influence on node v than it receives). The best response of a node depends on its threshold level and the net influence on it. The net influence is found by calculating (1) the sum of all incoming influences from nodes playing action 1 and (2) the sum of all incoming influences from nodes playing action −1, and then subtracting the second sum from the first. If this net influence exceeds the node's threshold, the best response for that node is 1; if it falls below the threshold, the best response is −1. In the case of a tie, the node is indifferent between the two actions.
Instantiating an LIG, then, requires a matrix of influence weights $W \in \mathbb{R}^{n \times n}$ and a threshold vector $b \in \mathbb{R}^{n}$. Each outcome of an LIG is a joint action x, which is a vector of the actions of all players. For every individual player i, $x_{-i}$ is the vector of all actions except the action of i. Each player i has an influence function $f_i(x_{-i}) = \sum_{j \neq i} w_{ji} x_j - b_i$. A joint action $x^*$ is a pure-strategy Nash equilibrium (PSNE) when every individual is playing their best response $x^*_i$, that is, when no player has an incentive to unilaterally deviate from their chosen action. With the United States Senate as an example, each node is an individual senator, and each edge is the influence that a senator has upon another senator. A senator will vote yea (1) if their threshold has been met given all incoming influences from other senators, or nay (−1) if not. When all senators are playing their best responses in x, the system is stable, and the network is in PSNE. The LIG model is further explained in "The LIG model" section.
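The definitions above translate directly into a few lines of code. The sketch below assumes the influence function is the weighted sum of the other players' actions minus the player's threshold, with `W[u][v]` holding the influence of node u on node v; the function names and the tiny two-senator game are illustrative only.

```python
def influence(W, b, x, i):
    """Net incoming influence on player i minus i's threshold."""
    return sum(W[j][i] * x[j] for j in range(len(x)) if j != i) - b[i]


def best_responses(W, b, x, i):
    """Actions of player i that are best responses to the others in x."""
    f = influence(W, b, x, i)
    if f > 0:
        return {1}
    if f < 0:
        return {-1}
    return {1, -1}  # tie: the player is indifferent


def is_psne(W, b, x):
    """True if every player in joint action x plays a best response."""
    return all(x[i] in best_responses(W, b, x, i) for i in range(len(x)))


# Two senators with mutual positive influence and zero thresholds.
W = [[0.0, 0.6], [0.4, 0.0]]
b = [0.0, 0.0]
print(is_psne(W, b, [1, 1]))    # both vote yea
print(is_psne(W, b, [1, -1]))   # they split
```

Checking a given joint action is cheap, as the example shows; the difficulty discussed in the text lies in enumerating all PSNE among exponentially many joint actions.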
While the matrix of influence weights and the vector of threshold values necessary for instantiating an LIG could be generated manually for very small instances, Honorio and Ortiz develop a method of instantiating an arbitrarily large LIG from raw, binary-action data via machine learning [26]. Only voting records are made available to the learning program; no other information is involved. Given these data, the program generates the influence weights W and thresholds b which define a game G. The program seeks to instantiate an LIG where a high proportion of real-world data is accurately reflected as PSNE, without allowing so many PSNE that any joint action would be an equilibrium. Finding the number of ground-truth joint actions represented as PSNE is computationally easy, but computing the total number of PSNE in a game is NP-hard, and therefore infeasible on large datasets. By making a number of simplifying assumptions, they approximate the problem using convex loss minimization. In this formulation, the parameters of the game are chosen so that the average error (the proportion of ground-truth joint actions which are not reflected as PSNE) is minimized. This algorithm is explained in "Machine learning" section.
The majority of research in analyzing and predicting legislative votes has not been in the game theory space. Rather, roll-call data are most often used in ideal point models, which estimate the ideal point of a legislator on a scale from liberal to conservative extremes. Clinton et al. proposed Bayesian methods for ideal point estimation, which can be solved using Markov Chain Monte Carlo (MCMC) simulations [9]. In contrast to prior methods, this MCMC-calculated Bayesian method is computationally efficient at large scale; other methods required small populations or made statistical compromises in order to be feasible. Regardless of the methods used, ideal points range on an arbitrary scale from negative to positive. In practice, a negative ideal point represents a "liberal" polarity, while a positive ideal point represents a "conservative" polarity. Their work is widely cited in later ideal point models which expand upon the original concept.
While the importance of roll-call data is widely recognized, it is also recognized that each vote is a member of a broader context with important characteristics. Gerrish and Blei extend the traditional ideal point model, which relies solely on roll-call data, to account for the topics of bills [18]. Using a Latent Dirichlet Allocation (LDA) topic model, Gerrish and Blei integrate bill topics and political tone into their ideal point model. LDA topic models identify patterns in words, but labeling and interpreting these patterns are left up to the researchers. These bill topics may be, for example, national recognition ("people", "month", "recognize", "history", "week", and "woman") or healthcare ("care", "applicable", "coverage", "hospital", and "eligible"). They find that the model performs especially well when bills have bipartisan support or disapproval, or when bills face clearly partisan support and disapproval, but loses accuracy when bills receive mixed, nonpartisan support. Topic modeling is not the only method of inferring bill topics: the Congressional Research Service (CRS) assigns subject codes to every bill, out of close to a thousand possible codes. In an ensuing study on ideal point models, Gerrish and Blei note that using CRS subject codes rather than an LDA topic model also provided a good basis for their ideal point model [19].
In a recent paper, Irfan and Gordon add context to the LIG models [28]. By combining social interactions and context, they develop a model which performs better than the purely behavioral model. They learn the ideal points of each senator while learning parameters for the LIG, and account for disparities in polarity across bill topics by utilizing the subject codes of each bill. They expand the influence function of every senator i to include the ideal point $p_i$ of that senator and the polarity $a_l$ of a bill l. The product of these two terms is added to the otherwise unchanged influence. When the signs of the polarity of the bill and the ideal point of the senator are the same (e.g., −1.5 and −0.5, meaning that both are liberal leaning), the signs cancel, increasing senator i's payoff for voting yea; when they differ, a negative value is added, decreasing senator i's payoff for voting yea.
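The sign effect described above can be checked with a short numerical sketch. It assumes the extended influence is the LIG term (the weighted sum of the other senators' actions minus the threshold) plus the product of the ideal point and the polarity; the function name and the zero-influence toy network are illustrative, and the exact parameterization is the one given in [28].

```python
def influence_with_context(W, b, x, i, p_i, a_l):
    """LIG influence on senator i plus the ideal point times polarity term."""
    social = sum(W[j][i] * x[j] for j in range(len(x)) if j != i)
    return social - b[i] + p_i * a_l


# No social influence and zero thresholds, to isolate the contextual term.
W0 = [[0.0, 0.0], [0.0, 0.0]]
b0 = [0.0, 0.0]
x0 = [1, 1]

# Liberal senator (p_i = -1.5) and liberal bill (a_l = -0.5): the term is positive.
print(influence_with_context(W0, b0, x0, 0, -1.5, -0.5))
# Same senator and a conservative bill (a_l = 0.5): the term is negative.
print(influence_with_context(W0, b0, x0, 0, -1.5, 0.5))
```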
Some researchers have taken other approaches to modeling congressional behavior. Woon utilizes both ideal points and game-theoretic concepts to analyze how bill sponsorship and co-sponsorship affect the content senators write in a bill [51]. Woon argues that, when sponsoring a bill, legislators balance two opposing forces. One pushes them toward writing median language because they want a bill to pass without complications, and the other toward writing highly polarized language because they wish to signal their beliefs to their constituents. As such, a legislator L will propose a bill with location y within a one-dimensional policy space. They also consider that another legislator, P, will be pivotal in allowing a bill's passage. That pivotal senator may choose either y or the status quo, q. P's choice is known as the policy outcome and is denoted by x. The passage of a bill depends on senator L and P's utility functions, which consider the distance between x and the ideal points of L and P, respectively. In addition, L's utility function also considers the weight w that L places on being close to y, which is known as L's position-taking. Woon extends the model to account for co-sponsorship of other legislators, each with their own utility functions. While our research focuses on legislative votes rather than policy proposals, Woon's research affirms the validity of combining contextual data and game-theoretic models, and puts forth bill sponsorship and co-sponsorship as another direction of future research.
Bill sponsorship and co-sponsorship are not the only means by which legislators may signal their preferences for a bill prior to voting. Desmarais et al. build upon prior bill co-sponsorship research to introduce co-participation in press events, called the joint press events network, as an indicator for voting behavior [14]. Using linear regression, they show a statistically significant positive relationship between press event co-participation and roll-call votes. While not focused on the computational aspects of congressional research, this study highlights the observation that "[l]egislation is often the end product of a lengthy collaborative effort." Studies like this attempt to uncover ostensibly hidden mechanisms within that lengthy effort. This contrasts starkly with the behavioral, game-theoretic approach, which makes no assumptions about the underlying mechanism or process, viewing them instead as a "black box". This lack of assumptions is one of the key benefits of the game-theoretic approach.
Recently, a group of mathematicians took a very different approach to analyzing congressional voting networks from roll-call data. Glonek et al. introduce the Graph Labeling Semi-Supervised (GLaSS) method [25], a random-walk-based graph labeling method. They model both the House and Senate (from 1935-2017, in different trials) as a graph from roll-call data, where nodes are Democratic or Republican legislators (other parties are ignored), and their labels correspond to their parties. While every senator's party affiliation is known for validation purposes, the only labeled nodes in the graph are the Democratic and Republican party leaders; all other nodes are unlabeled. With the GLaSS method, those nodes are labeled based on the expected time to absorption in a discrete-time Markov chain (DTMC), where absorption states are labeled nodes (i.e., party leaders) and transient states are unlabeled nodes (i.e., other senators). By comparing the labels generated by the GLaSS method to the ground-truth labels of legislators, they measure polarization in Congress. When party affiliation can be accurately predicted by voting trends, Congress is more polarized; when there is some uncertainty, it is less so. Their results show that the U.S. Congress has become remarkably polarized in the past decade, with the model able to accurately predict every senator's affiliation in each term of Congress since 2007. In contrast to Glonek et al.'s stochastic process-based approach, we model strategic interactions among senators in a game-theoretic fashion that allows us to infer joint behavioral outcomes. Additionally, Glonek et al.'s method relies on a model of binary party affiliation and considers nodes as labeled only by party affiliation rather than named as individual senators, which prevents further analysis of the model's network structure.

A.2 Literature review: polarization
While modularity [23,41,42] has been widely used as a measure of polarization in networks, it is often not a definitive measure. Guerra et al. present a novel metric based on the edges incident on the boundary nodes [22]. Like most other metrics of polarization, their metric is also structural in the sense that it does not take into account potentially different network structures among the same population induced by different behavioral contexts. One of the main goals of this paper is to analyze polarization within behavioral context.
Closely related to this paper is Waugh et al.'s work on polarization in Congress [52]. They first compute a weighted network among the members of Congress by counting how many times each pair of members voted the same way. They then compute the modularity of this network as a measure of polarization. Their work can be contrasted with McCarty et al.'s ideal point-based approach [40], where the absolute value of the difference in mean ideal points of the two parties serves as a measure of polarization. In fact, our approach may be mistaken as a combination of these two approaches. First, we do compute influence networks among the senators, but these networks are learned from behavioral data. Moreover, there are positive as well as negative edge weights in our networks, whereas Waugh et al.'s networks have only non-negative edge weights by definition [52]. Second, the richer model [28] which we use combines influence networks with ideal points in such a way that we cannot talk about either networks or ideal points in isolation from the other.
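For contrast with the learned influence networks, the two baseline constructions mentioned above can be sketched in a few lines. The code assumes votes are coded as 1 (yea), −1 (nay), and 0 (abstain or absent), that abstentions never count as agreement, and that exactly two parties are present; these conventions and the function names are assumptions for illustration, not details taken from [52] or [40].

```python
def agreement_counts(votes):
    """Co-voting network: A[i][j] counts roll calls where i and j voted alike.

    votes[m][k] is the vote of member m on roll call k (1, -1, or 0).
    """
    n = len(votes)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            same = sum(1 for a, c in zip(votes[i], votes[j]) if a != 0 and a == c)
            A[i][j] = A[j][i] = same  # symmetric and non-negative by construction
    return A


def party_mean_gap(ideal_points, party):
    """Absolute difference between the mean ideal points of two parties."""
    groups = {}
    for point, label in zip(ideal_points, party):
        groups.setdefault(label, []).append(point)
    mean_a, mean_b = [sum(v) / len(v) for v in groups.values()]
    return abs(mean_a - mean_b)


votes = [[1, 1, -1], [1, -1, -1], [-1, -1, 1]]
print(agreement_counts(votes))
print(party_mean_gap([-1.0, -0.5, 0.5, 1.0], ["D", "D", "R", "R"]))
```

The agreement counts are symmetric and non-negative, which is exactly the point of contrast with the signed, directed weights of the learned influence networks.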
Zhang et al. [53] study polarization in the U.S. Congress, the same setting as ours. However, their study is based on co-sponsorship networks, which are observed from data. In contrast, ours is based on networks of influence, which have been learned using roll-call and bill-text data. Furthermore, one of the central aspects of our work is to show that polarization in the Senate varies according to the spheres of legislation. We do not touch on the rise in polarization in the Senate over time, which by now is a well-settled matter [15].
Behavioral aspects of polarization among political parties have been studied before, but at an empirical level. Garcia et al. analyze multiplex networks consisting of comments, likes, and supports levels among multiple political parties in Switzerland [17]. In contrast, ours is a model-based approach where polarization can be considered an inference question.
At a broader level, there have been numerous studies on political polarization. The edited volume by Hopkins and Sides [27] presents a comprehensive treatise from three different perspectives: why American politics is polarized, how it became polarized, and what we can do about it (including whether the alternatives are any better). As a specific example, Conover et al. [10] give evidence of polarization on Twitter based on retweet networks. Interestingly, the opposite happens in mention networks (where ideologically opposing individuals mention each other to start conversations).
Not surprisingly, Twitter provides a trove of data that has been used in several other studies. Notably, Morales et al. [39] give a framework to estimate a polarization index using a model of opinion generation. Unlike other generative models of opinion propagation [49], their focus is on the distribution of opinions and not the dynamics of opinions. We briefly reviewed their model in the "Toward richer models" section. One major difference between Morales et al.'s work and ours is how we get to the behavioral distribution (or PSNE in our case). In our models, we do not have predefined elite and listener nodes and do not perform DeGroot-style iterative updates [12]. Furthermore, the complexity of interdependent actions in a PSNE and the multiplicity of PSNE make a direct application of the polarization index to our setting challenging (see Footnote 6).
Whereas Morales et al. apply the polarization index to a case study of tweets in the aftermath of Venezuelan leader Hugo Chávez's death, their basic idea has been generalized to any Twitter topic by Garimella et al. [21]. Of course, there are methodological differences between the two studies. Garimella et al.'s random walk-based algorithm to measure polarization is promising for large-scale networks. In contrast to these studies, we use machine learning to learn the networks of influence from voting data. Also, our behavioral model is strictly game-theoretic.
There has also been some interesting work on the behavioral choice of individuals in a polarized environment. Bakshy et al. [4] use large-scale Facebook data to show that the consumption of politically "hard content" is largely controlled by individuals' own choices and not by algorithmically fed news rankings.
On the computational side, algorithmic approaches to polarization extend beyond modularity. Al Amin et al. [1] give a matrix factorization-based algorithm to uncover polarization in Twitter networks.

Appendix B: Detailed cross-validation results
In this section, we describe the cross-validation results on learning LIG models for Spheres 2, 3, and 4.
Cross-validation on Sphere 2 (Economics & Finance). The number of edges decreases smoothly, reaching a reasonable number of edges when 0.002424 ≤ ρ ≤ 0.006236. Best response error converges for both training and validation sets around ρ = 0.001512, and remains at an acceptably low value until ρ ≥ 0.007017. Until around ρ = 0.001512, the high values of training q relative to validation q show that the model is overfit, and when ρ > 0.006236, q's regression to 0 shows that the model is under-fit. Within this range, both training and validation q are relatively high, at around 0.22. The acceptable range, then, is between 0.002424 and 0.006236 (Fig. 18).
Cross-validation on Sphere 3 (Energy & Infrastructure). The number of edges again decreases smoothly, and is reasonable when 0.002728 < ρ < 0.005541. Best response error for training and validation sets converges and remains low when 0.001914 < ρ < 0.006236. As the large difference between training q and validation q illustrates, the model is overfit until ρ ≥ 0.001061. When ρ > 0.005541 and the q value starts to decrease, the model is under-fit. The acceptable range is between 0.002728 and 0.005541 (Fig. 19).
Cross-validation on Sphere 4 (Public Welfare). The number of edges again decreases smoothly, and is reasonable when 0.002728 < ρ < 0.005541. Best response error for training and validation sets converges and remains low when 0.001914 < ρ < 0.006236. The high training q relative to validation q shows that the model is overfit until ρ ≥ 0.001701, and remains steady until the model begins to become under-fit when ρ ≥ 0.006236. The acceptable range is between 0.002728 and 0.005541 (Fig. 20).
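The acceptable ranges reported in this appendix can be read as the intersection of the ranges of ρ that pass each diagnostic: a reasonable number of edges, low best response error, no overfitting, and no under-fitting. The following is a minimal sketch of that reading, not the selection procedure used in the paper. The `Interval` type and `intersect` helper are hypothetical, and the distinction between open and closed endpoints is ignored for simplicity. The numeric thresholds are the ones quoted in the text for Spheres 2 and 4.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Interval:
    """A range of the regularization parameter rho (endpoints treated as closed)."""
    lo: float
    hi: float


def intersect(intervals: Iterable[Interval]) -> Optional[Interval]:
    """Return the range of rho satisfying every criterion, or None if empty."""
    intervals = list(intervals)
    lo = max(iv.lo for iv in intervals)
    hi = min(iv.hi for iv in intervals)
    return Interval(lo, hi) if lo <= hi else None


INF = float("inf")

# Sphere 2 (Economics & Finance), thresholds as quoted in the text.
sphere2_criteria = [
    Interval(0.002424, 0.006236),  # reasonable number of edges
    Interval(0.001512, 0.007017),  # best response error stays low
    Interval(0.001512, INF),       # no longer overfit
    Interval(0.0, 0.006236),       # not yet under-fit
]

# Sphere 4 (Public Welfare), thresholds as quoted in the text.
sphere4_criteria = [
    Interval(0.002728, 0.005541),  # reasonable number of edges
    Interval(0.001914, 0.006236),  # best response error stays low
    Interval(0.001701, INF),       # no longer overfit
    Interval(0.0, 0.006236),       # not yet under-fit
]

sphere2_range = intersect(sphere2_criteria)
sphere4_range = intersect(sphere4_criteria)
```

Under this reading, the edge-count criterion is the binding constraint in both spheres, so the intersection coincides with the acceptable ranges stated above.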

F.3 Visualization of networks
In this section, we show the networks for Spheres 3 and 4 learned under the ideal point model with social interactions (Figs. 37, 38). The networks for the first two spheres have been shown in Figs. 12, 13 within the main body of the paper.

F.4 Visualization of cross-party edges
In this section, we show Graphviz visualizations of the edges that connect members of opposite parties within the top 40% of all edges when we apply the richer model to Spheres 3 and 4 (Figs. 39, 40). Figures 14 and 15, shown in the main body of the paper, illustrate the cross-party edges for Spheres 1 and 2, respectively.
Table 7 shows the results of network analysis under the ideal point model with social interactions. This table can be compared with Table 5 shown in the main body of the paper. Table 8 shows the distance between the mean Democratic and Republican ideal points for each of the four spheres.
Following are the notable additions to the main body of the paper: the "Preparing congressional roll-call data" section; an LIG example and an example for each definition in the "The LIG model", "Cross-validation and model selection for LIG", "Influence weights and thresholds", and "PSNE computation" (pure-strategy Nash equilibria computation) sections; and many figures. In addition, there are light revisions throughout the conference version, but the following parts went through significant revision: the "Learning Algorithm", "Game-theoretic vs. structural centrality measures", and "Towards richer models: ideal point models with social interactions" sections. We also include an Appendix with a detailed literature review and many figures and tables.
Betweenness (