Node-weighted centrality: a new way of centrality hybridization

Centrality measures have proved to be a salient computational-science tool for analyzing networks over the last two to three decades, aiding many problems in computer science, economics, physics, and sociology. With the increasing complexity and vividness of network analysis problems, there is a need to modify the existing traditional centrality measures. Weighted centrality measures usually consider weights on the edges and assume the weights on the nodes to be uniform. One of the main reasons for this assumption is the difficulty of mapping the nodes to their corresponding weights. In this paper, we propose a way to overcome this limitation by hybridizing the traditional centrality measures. The hybridization takes one centrality measure as a mapping function to generate weights on the nodes and then uses these node weights in other centrality measures to obtain a better ranking in complex settings.

are termed hybrid centralities, and a few of them are summarized in the section on related work.
In the last two decades, a major portion of the interdisciplinary work has evolved by simply applying these measures to extract information from underlying network data. Each of the proposed centrality measures ranks nodes in a network based on a specific structural attribute, which makes it application specific. For this reason, choosing an appropriate centrality for a given application has been a key issue in network analysis. A major portion of the research in this direction is concerned with selecting the best of the available measures for a particular application or defining a new measure that outperforms the existing ones.
Most of the centrality measures proposed in the literature were first defined for unweighted graphs, i.e., all of the nodes and all of the edges were assumed homogeneous at the beginning of the centrality computation. We refer to these measures as unweighted centrality measures. Once the existence, necessity, and importance of weights on the edges were recognized, centrality measures for unweighted graphs were extended to edge-weighted centrality measures. These measures take the weights on the edges into consideration when ranking the nodes, but still assume equal weights on the nodes. A substantial part of present-day research on the analysis of weighted networks considers only edge weights to determine the topological significance of nodes [5].
We call a network with nonuniform weights on both edges and nodes a fully weighted network. Several such networks surround us. The weights on the nodes in a fully weighted network can be understood as a mapping of the characteristics or attributes of the nodes to real values. At times, these weights can also depend on the structure around the nodes. Let us illustrate how weights on the nodes arise with the help of some popular networks. Weights on the edges of these networks have already been discussed above. Therefore, here we discuss only the existence of nonuniform weights on the nodes.
1. Friendship/social networks: in this type of network, nodes represent persons and edges represent the friendship relationships among the considered set of persons. Here, weights on the nodes can be understood as a mapping of wealth, power, education level, or some other attribute of the persons. Notably, two persons with identical attributes are highly unlikely to exist, and therefore the attributes of each person can be mapped to a different real value by an application-specific mapping.
2. Public transit networks: road networks, train networks, metro rail networks, and airline networks are some of the major public transit networks. In these networks, nodes represent locations (places) and edges represent direct travel connections between the locations. Weights on the nodes can be understood as a mapping of the population, frequency of commuters, development status, popularity for tourism, etc., of the location represented by a node.
3. Communication networks: in communication networks, nodes represent the communicating devices and links represent the direct (wired or wireless) connections between these devices. Weights on the nodes in such networks can be understood as the role, cost, or location of these devices.
In most of the studies done so far on fully weighted networks, weights on the edges were considered while weights on the nodes were completely ignored. Only a little work has considered the weights on nodes in real-world networks [6][7][8][9]. The main contribution of this paper is to motivate the analysis of networks while considering the weights on the nodes. To overcome the challenge of determining the weights on the nodes, we propose to use appropriate centrality measures to generate the weights and then to use these weights in node-weighted centrality measures to analyze a given network. We give two applications based on this principle of hybridization.
In the next section, we introduce the basic notation and define the traditional unweighted centrality measures. Two toy examples that motivate the necessity of considering weights on the nodes are discussed after that. Next, node-weighted centrality measures and related works are summarized. Afterwards, new hybridizations of centrality measures, based on the definitions of node-weighted centrality measures, are introduced for solving two complex computational problems in networks. The experimental comparison between the newly proposed hybrid centrality measures and traditional centrality measures on real-world networks is included in the respective sections on the problems. Finally, we conclude and discuss prospective future directions.
This paper is an expanded and extended version of the results that appeared in CSoNet 2019 [1]. The conference version, titled "Hybrid Centrality Measures for Service Coverage Problem", discussed a hybridization of centrality measures to solve the service coverage problem based on the formulation of node-weighted centrality measures. In this version, we extend this type of hybridization to solve another problem, the spread maximization problem. Accordingly, the structure of the paper has been revised, and this version focuses on motivating the hybridization of centrality measures based on the formulation of node-weighted centrality measures. Applying the proposed measures to more than one problem exhibits the potential of this kind of hybridization and motivates several open directions in this area.

Unweighted centrality measures
This section first defines the basic notation used throughout the paper. Then, we briefly summarize the traditional centrality measures: degree, closeness, betweenness, and eigenvector. We also briefly describe harmonic centrality, which is highly correlated with closeness centrality.
Let $G = (V, E)$ be an undirected unweighted network, where $V$ is the set of nodes and $E$ is the set of links. Let the number of nodes, i.e., the cardinality of $V$, be $|V| = n$ and the number of links, i.e., the cardinality of $E$, be $|E| = m$. Let $A$ be the $n \times n$ adjacency matrix of $G$, where the entry $a_{i,j}$ denotes whether there is a link between node $i$ and node $j$: $a_{i,j} = 1$ if such a link exists, and $a_{i,j} = 0$ otherwise. Let $d_{i,j}$ be the geodesic distance, i.e., the length of a shortest path, between node $i$ and node $j$. Next, we briefly discuss some of the widely used centrality measures.
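The notation above can be made concrete with a small sketch in plain Python. The four-node path graph and the helper name `geodesic_distances` are assumptions made for illustration only; they do not come from the paper.

```python
from collections import deque

# Hypothetical undirected graph on nodes 0..3 (the path 0-1-2-3).
edges = [(0, 1), (1, 2), (2, 3)]
n = 4

# Adjacency matrix A: a[i][j] = 1 if a link joins i and j, else 0.
a = [[0] * n for _ in range(n)]
for i, j in edges:
    a[i][j] = 1
    a[j][i] = 1  # undirected network, so A is symmetric

def geodesic_distances(a, source):
    """Breadth-first search: hop distance from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, linked in enumerate(a[u]):
            if linked and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

m = sum(map(sum, a)) // 2  # every link is stored twice in A
d = {i: geodesic_distances(a, i) for i in range(n)}  # d[i][j] is the geodesic distance
```

Breadth-first search suffices here because the network is unweighted, so every link has length one.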
• Degree: Freeman's [10] degree centrality considers a node's importance to be proportional to its degree, i.e., the number of links incident to that node. Mathematically, the degree centrality of a node u is
$$DC(u) = \sum_{v \in V \setminus \{u\}} a_{u,v}.$$
Degree centrality can be normalized by dividing the above expression by $n - 1$.
In a social network, the degree centrality of a node represents that node's popularity. A higher-degree node has many followers/friends, which shows the strength of the node. Lower-degree nodes are actors who are nearly isolated from the population of the network and are the least popular.
• Closeness: Freeman's [10] closeness centrality considers a node's importance to be inversely proportional to the sum of its distances to the other nodes. Mathematically, the closeness centrality of a node u is
$$CC(u) = \frac{1}{\sum_{v \in V \setminus \{u\}} d_{u,v}}.$$
Closeness centrality can be normalized by multiplying the above expression by $n - 1$. The concept of closeness centrality was first given by Freeman [10] for social networks, but it has existed for a long time as the status of a node [11]. The closeness centrality of a node reflects the node's average distance to the other nodes, i.e., the expected time that a piece of information starting from an arbitrary node in the network takes to reach that node, or vice versa. A node with high closeness centrality is updated very early when information is spreading, whereas nodes with low closeness centrality have to wait longer for the information to reach them.
• Harmonic: harmonic centrality [5,12] considers a node's importance in a network to be proportional to the sum of the inverses of its distances from the other nodes. Mathematically, the harmonic centrality of a node u is
$$HC(u) = \sum_{v \in V \setminus \{u\}} \frac{1}{d_{u,v}}.$$
Freeman's [10] closeness centrality is not applicable to disconnected networks. Harmonic centrality is highly correlated with closeness centrality [12] and works well on disconnected and directed networks.
• Decay: this centrality works on the same principle as harmonic centrality, but instead of attenuating the contribution of a node in inverse proportion to its distance, it attenuates it exponentially [3]. Mathematically, the decay centrality of a node u is
$$DKC(u) = \sum_{v \in V \setminus \{u\}} \delta^{d_{u,v}},$$
where δ lies between 0 and 1 and is called the decay parameter.
• Betweenness: Freeman's [10] betweenness centrality considers a node's importance to be proportional to the number of times that node occurs on the shortest paths between all possible pairs of nodes in a network. Mathematically, the betweenness centrality of a node u is
$$BC(u) = \sum_{s, t \in V,\; s \neq u \neq t} \frac{\sigma_{st}(u)}{\sigma_{st}},$$
where $\sigma_{st}$ is the total number of shortest paths from node s to node t and $\sigma_{st}(u)$ is the number of those shortest paths that pass through node u. Betweenness centrality can be normalized by dividing by $\frac{(n-1)(n-2)}{2}$. The concept of betweenness centrality was first proposed independently by Anthonisse [13] and Freeman [14]. The betweenness centrality of a node represents the node's brokerage power, i.e., its control over the flow passing through the network, under the assumption that information flows along shortest paths. A node with high betweenness centrality controls a major fraction of the flow passing through the network, while a node with low betweenness centrality has nearly no such control.
• Eigenvector: Bonacich's [15] eigenvector centrality considers a node's importance in a network to be proportional to the sum of the importance of its neighboring nodes. Mathematically, the eigenvector centrality of a node u can be written as
$$EC(u) = \frac{1}{\lambda} \sum_{v \in V} a_{u,v}\, EC(v),$$
where λ is the largest eigenvalue of the adjacency matrix A. The eigenvector centrality of a node represents the power of that node's neighbors in the network. A node with high eigenvector centrality is directly connected to important nodes in the network, and vice versa.
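The degree and distance-based measures above can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's implementation (which uses NetworkX): the star graph, the function names, and the default decay parameter 0.5 are assumptions. Betweenness and eigenvector centrality are left out to keep the sketch short.

```python
from collections import deque

def bfs(adj, s):
    """Hop distances from s to every node reachable from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def degree(adj, u):
    return len(adj[u])

def closeness(adj, u):
    # Assumes a connected graph; otherwise some distances are undefined.
    d = bfs(adj, u)
    return 1.0 / sum(d[v] for v in adj if v != u)

def harmonic(adj, u):
    d = bfs(adj, u)
    return sum(1.0 / d[v] for v in d if v != u)  # unreachable nodes contribute 0

def decay(adj, u, delta=0.5):
    d = bfs(adj, u)
    return sum(delta ** d[v] for v in d if v != u)

# Hypothetical star graph: hub 0 linked to leaves 1..4.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
scores = {
    u: (degree(star, u), closeness(star, u), harmonic(star, u), decay(star, u))
    for u in star
}
```

On this star the hub scores highest on all four measures, which is the expected behavior when no node weights are involved.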

Toy examples
Consider the example networks given in Fig. 1. Let us assume that the first network in Fig. 1 is a road network connecting the cities of a state, where the size of a node represents its weight (in this case, population). City F is highly populated, while the population of every other city is several times smaller than that of city F. The average-sum facility location problem asks for a node in a given network that is at the minimum average distance from all other nodes. If we attempt to solve this problem to install a facility for the whole population of the state, the solution is the node with the highest closeness centrality, assuming equal population in each city. By the definition of closeness centrality, the answer is city C, but this city is not really suitable for the whole population. The appropriate answer is city F, where most of the population of the state already resides. Now suppose that the goal is to find a city from which the maximum population is at exactly one hop. Degree centrality solves this problem in unweighted networks, but here it fails: it ranks city C as the most central, whereas, because of the nonuniform distribution of population, city E is the correct answer.
Next, let us assume that the second example network given in Fig. 1 is a communication network where the size of a node denotes its weight (in this case, importance/vitality). We assume that communication happens through shortest paths and that the importance of a communication is a function of the importance of the nodes between which it happens. Here, we take this function to be the product of the importance of the communicating nodes. The goal is to find the node that controls the most important communications in the network. Betweenness centrality, a tool directly used for this type of problem, answers node C, neglecting the importance attribute. In reality, node J is the answer because it controls the communication between the most important nodes in the given network. We note from the above that the existing centrality measures neglect node weights and are therefore prone to analyzing fully weighted networks incorrectly. Hence, there is a need to upgrade the current definitions of unweighted or edge-weighted centrality measures.
Fig. 1 Two counterexample networks, with size of nodes representing weights on the nodes
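The effect described in the first toy example can be reproduced on a small made-up instance. The sketch below does not use the network of Fig. 1, which is not reproduced in the text: the path A-B-C-D-F, the population values, and the helper `total_distance` are all hypothetical.

```python
from collections import deque

def bfs(adj, s):
    """Hop distances from s to every node reachable from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical road network (a path A-B-C-D-F) with made-up populations.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "F"], "F": ["D"]}
population = {"A": 1, "B": 1, "C": 1, "D": 1, "F": 100}

def total_distance(adj, u, weight=None):
    """Sum of distances from u to all other nodes, optionally weighted per node."""
    d = bfs(adj, u)
    return sum((weight[v] if weight else 1) * d[v] for v in adj if v != u)

# Unweighted: equivalent to picking the node with the highest closeness.
best_unweighted = min(adj, key=lambda u: total_distance(adj, u))
# Weighted: every inhabitant counts, so populous cities pull the facility closer.
best_weighted = min(adj, key=lambda u: total_distance(adj, u, population))
```

In this instance the unweighted criterion selects the middle city C, while the population-weighted criterion selects F, mirroring the argument made for the road network.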

Node-weighted centrality
This section summarizes centrality measures that take the weights on the nodes into consideration while giving equal priority to the edges in a given network. These fall in the category of node-weighted centrality measures. We restrict ourselves to these measures to avoid the complexity of including weights on both edges and nodes. Once these measures are understood, we recommend that readers combine them with the edge-weighted centrality measures given in [5] to derive the definitions of fully weighted centrality measures. Recall that $G = (V, E)$ is an undirected unweighted graph. We add an extra element, weights on the nodes, defined as a function $W : V \rightarrow \mathbb{R}$, where $\mathbb{R}$ is the set of real numbers. Let $W_x$ be the weight of node x. We do not use $W_x$ directly in the definitions; in its place, we use a function f of $W_x$ (or a function of $W_x$ and $W_y$, depending on the number of parameters) without loss of generality. This gives us the flexibility to tune the function of the node weights according to our needs. In this paper, we take $f(W_x) = W_x$ for simplicity. To normalize the new centrality measures, we divide by the maximum possible value that any node can score in the formulas given below. We start with degree centrality.
• Node-weighted degree centrality: in [6], weights on nodes are considered and the definition of degree centrality is modified to accommodate them. Abbasi and Hossain [6] considered centrality scores as weights on the nodes. Following this idea, the node-weighted degree centrality of a node u is calculated as
$$NWDC(u) = \sum_{v \in V \setminus \{u\}} f(W_v)\, a_{u,v}.$$
This measure assigns higher importance to those nodes that are in the immediate neighborhood of highly weighted nodes. The next two measures extend the consideration to all the nodes when computing the eminence of a node.
• Node-weighted closeness/harmonic centrality: to target wider applicability, we define node-weighted harmonic centrality (which can also be used in place of closeness centrality, as the two are highly correlated). The node-weighted harmonic centrality of a node u in a network is defined as
$$NWHC(u) = f(W_u) + \sum_{v \in V \setminus \{u\}} \frac{f(W_v)}{d_{u,v} + 1}.$$
This measure depends on two factors: the weight of the node u under consideration and the effective weights of the other nodes according to their distances from node u. It assigns a higher value to nodes that have high weights and are closer to nodes with high weights. We refer to this measure as the harmonically attenuated node-weighted centrality measure.
• Node-weighted decay: the node-weighted decay centrality of a node u in a network is defined as
$$NWDKC(u) = f(W_u) + \sum_{v \in V \setminus \{u\}} \delta^{d_{u,v}} f(W_v),$$
where δ lies between 0 and 1. Here, too, the computation of importance depends on the same two factors as in node-weighted harmonic centrality, but the contribution of the weights of the other nodes decays exponentially with distance. Weighted decay assigns a higher value to nodes that have high weights and are very close to nodes with high weights. We refer to this measure as the exponentially attenuated node-weighted centrality measure.
• Node-weighted betweenness centrality: in [16], a factor denoting the importance of the communication between a pair of nodes is multiplied into the core formula when computing betweenness centrality. The node-weighted betweenness centrality of a node u is defined as
$$NWBC(u) = \sum_{x, y \in V,\; x \neq u \neq y} f(W_x, W_y)\, \frac{\sigma_{xy}(u)}{\sigma_{xy}},$$
where $f(W_x, W_y)$ maps the weights on nodes x and y to a real value denoting the importance of the flow between x and y.
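The node-weighted degree, harmonic, and decay measures can be sketched directly from their descriptions, taking f(W) = W as in the text. This is a minimal sketch under assumptions: the three-node path, its weights, the function names, and the convention that a node contributes its own weight unattenuated while a node at distance d is attenuated by 1/(d + 1) or δ^d are all illustrative choices. Node-weighted betweenness is omitted because it requires a modified shortest-path counting routine.

```python
from collections import deque

def bfs(adj, s):
    """Hop distances from s to every node reachable from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def nw_degree(adj, w, u):
    """Sum of the weights of u's immediate neighbours."""
    return sum(w[v] for v in adj[u])

def nw_harmonic(adj, w, u):
    """Own weight plus harmonically attenuated weights of the other nodes."""
    d = bfs(adj, u)
    return w[u] + sum(w[v] / (d[v] + 1) for v in d if v != u)

def nw_decay(adj, w, u, delta=0.5):
    """Own weight plus exponentially attenuated weights of the other nodes."""
    d = bfs(adj, u)
    return w[u] + sum(w[v] * delta ** d[v] for v in d if v != u)

# Hypothetical path a-b-c with made-up node weights.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
w = {"a": 4.0, "b": 1.0, "c": 2.0}

by_degree = max(adj, key=lambda u: nw_degree(adj, w, u))
by_harmonic = max(adj, key=lambda u: nw_harmonic(adj, w, u))
```

Here the degree variant favors the middle node b, which neighbors both heavy nodes, while the harmonic variant favors the heavy end node a, because a node's own weight is counted in full.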
Eigenvector centrality is a measure for which it is still open how to include the effect of node weights. Even if we start the eigenvector centrality computation with a vector comprising the weights on the nodes, at convergence we arrive at the same eigenvector every time. This is because the computation of this centrality depends only on the adjacency matrix, and the eigenvector is a property of this matrix alone. One workaround is to follow the idea given in [9], where the βth power of the weight of each node is multiplied by its unweighted centrality to compute the node-weighted centrality. Here, β takes a value from the interval [−1, 1].
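The observation that the starting vector does not influence the result can be checked numerically. The sketch below runs a plain power iteration from two different positive starting vectors. The graph, the starting weights, and the function name are assumptions; a connected, non-bipartite graph is chosen deliberately so that unshifted power iteration converges.

```python
def power_iteration(adj, start, iterations=200):
    """Repeatedly apply x <- A x and rescale so that the entries sum to 1."""
    x = dict(start)
    for _ in range(iterations):
        nxt = {u: sum(x[v] for v in adj[u]) for u in adj}
        total = sum(nxt.values())
        x = {u: val / total for u, val in nxt.items()}
    return x

# Hypothetical connected, non-bipartite graph: triangle 0-1-2 plus pendant node 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

# Start once from uniform values and once from made-up node weights.
uniform = power_iteration(adj, {u: 1.0 for u in adj})
weighted = power_iteration(adj, {0: 10.0, 1: 0.1, 2: 3.0, 3: 50.0})

# Largest entry-wise difference between the two converged vectors.
gap = max(abs(uniform[u] - weighted[u]) for u in adj)
```

Both runs settle on the same vector up to numerical precision, so seeding the iteration with node weights alone does not yield a node-weighted eigenvector centrality.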
In this paper, we do not explore these measures in the experimental sections. We plan to carry out a thorough application-oriented study of the above measures in the near future. We present the definitions here for completeness with respect to the major traditional centrality measures.

Related work
Centrality measures are tools for finding the application-specific importance of a node. The unweighted centrality measures mentioned earlier are the most widely used, but for complex problems and applications these measures are inefficient. In such cases, a combination of these centrality measures produces better analysis than using them individually. In a recent study [6], the authors proposed a new set of hybrid centralities: degree-degree, degree-closeness, and degree-betweenness. They noticed that the newly defined centrality measures differ from the traditional centrality measures. On a real-world co-authorship network, they found that all the newly defined hybrid centrality measures are significantly correlated with authors' performance (sum of citations of authors' h-index). A similar study [7] was done on weighted collaboration networks, and three new sets of collaborative centrality measures were proposed. The traditional collaborative centrality measures for an author node (h-index, a-index, g-index) were used to propose the new centrality measures. The newly defined measures were highly correlated with the traditional performance measures of scholars (publication count, citation count, h-index, a-index, g-index). Zhang et al. [17] proposed a new hybrid centrality measure for a node's importance in a satellite communication network. This measure is also based on a combination of closeness and betweenness centrality, but it punishes the betweenness importance with a different factor. Qiao et al. [18] proposed a hybrid page scoring algorithm based on degree centrality, betweenness centrality, closeness centrality, and the PageRank algorithm. Lee and Djauhari [19] proposed a hybrid centrality measure that is a linear combination of the degree, closeness, betweenness, and eigenvector centrality measures. The proposed measure was used to find the most significant or influential stocks.
A hybrid centrality based on a linear combination of degree centrality and cohesion centrality was proposed by Qiu et al. [20] and further used for community detection by Li-Qing et al. [21]. Wang et al. [22] proposed a hybrid centrality measure based on a combination of degree, the degree of neighbors, and betweenness. In another study, Buechel and Buskens [23] analyze a hybrid centrality model as a combination of extended closeness (a variation of closeness for disconnected networks), betweenness, and degree measures. None of the above studies attempts to solve the service coverage problem.

The service coverage problem
This section covers the first of the two applications considered in this paper, the service coverage problem (SCP). We present two new hybrid measures to solve SCP, a type of complex facility location problem. We define these measures for networks without weights on the edges. They can easily be extended to edge-weighted versions of hybrid centrality measures by following the ideas for edge-weighted centrality measures given in [5].
Flow networks are networks in which information, people, commodities, etc., move from one node to another by traversing the links. Each node starts with a limited bandwidth of resources for conveying the flow, and these resources are prone to collapse, degrade, or diminish over time because of the node's continuous participation in fostering the flow. For this reason, such networks require uninterrupted maintenance/replenishment service on a regular basis to function properly. Keeping this in mind, service facilities are installed at nodes that meet the service demands of other nodes from time to time.
After a node requests more resources or revival after a failure, a service team is sent from the service center to meet the demand. The response-time is defined as the time between the request for service and the start of the service work at the demanding node. The response-time depends on the distance between the node requesting the service and the node hosting the service station. This holds under the assumption that the service centers have sufficient resources to meet the demands of other nodes. The response-efficiency of a node is inversely related to the response-time, i.e., when the response-time is lowest, the node is said to be the most response-efficient and the most suitable for installing a service station. A node with a higher response-time has a smaller response-efficiency and is not an appropriate choice for installing a service station.
Given an unweighted network, the objective of the service coverage problem (SCP) is to find the best-suited node for installing a service station such that the expected response-time is reduced. In other words, the goal is to find a node with the highest expected response-efficiency, i.e., the service should reach nodes that request service often faster than it reaches nodes with moderate or infrequent demands. Therefore, service centers should be installed closer to the nodes with frequent demands.
Randomly choosing a node to install service facility is certainly not the best solution. It is because the randomly picked node may be very far from the nodes with higher demand. In such a case where a node with higher load fails, severe damage would have already been caused before the maintenance service reaches it. Thus, we require a measure to evaluate the importance of nodes for the candidacy to install stations covering service requests.
Betweenness centrality is used to predict the importance of nodes in a number of flow-based networks, e.g., power grid networks, public transit networks, gas line networks, and communication networks. The betweenness scores of the nodes in such networks have been treated as the load on the nodes in many studies. Several models for understanding and replicating cascading phenomena have been proposed [24][25][26]. A few of these models observe that the failure of a node with a high load may cause a severe cascading failure and hence result in the breakdown of the whole network. After a node fails, the best way to reduce the damage is to recover the failed node as soon as possible. We cannot prevent a node from failing, but we can put a maintenance mechanism on some of the nodes in the network.

Formulation
In the service coverage problem in flow networks, the load on the nodes (when no other information is provided) can be assumed to be proportional to the betweenness centrality of the nodes. This is because a node with a large flow through it is expected to degrade faster than other nodes, and if such a node shuts down, a large amount of flow will be affected. Thus, we take the probability of node j requesting a maintenance service over a fixed time interval as
$$P[\text{node } j \text{ requests service}] = \frac{BC(j)}{\sum_{k \in V} BC(k)},$$
where $BC(j)$ is the betweenness centrality of node j. Let $X_{ij}$ be the response-efficiency of node i in meeting a service demand from node j if the service station is installed on node i. Certainly, $X_{ij}$ depends on the distance between the node requesting the service (node j) and the node with the service station (node i). Harmonic decay can be considered in applications where the nodes are more robust and the service requirement is not urgent, while exponential decay might be a more appropriate simplified model in applications such as real-time systems, where service requests need to be met urgently.
If the response-efficiency decays harmonically in a given application, i.e., the node is maximally efficient (efficiency value = 1) for itself, half as efficient (efficiency value = 0.5) for each of its neighbors, and so on, then we can formulate the response-efficiency as $X_{ij} = \frac{1}{d_{i,j} + 1}$. Recall that $d_{u,v}$ denotes the distance, i.e., the length of a shortest path, from node u to node v in the given network. Let $\chi_i$ be the total response-efficiency of node i if the service station is installed on node i. Then,
$$\chi_i = \sum_{j \in V} X_{ij}.$$
The expected response-efficiency of node i, $E[\chi_i]$, is computed by taking the expectation over that node's response-efficiency in servicing all the nodes in the network:
$$E[\chi_i] = \sum_{j \in V} X_{ij}\, P[\text{node } j \text{ requests service}] = \frac{1}{\sum_{k \in V} BC(k)} \sum_{j \in V} \frac{BC(j)}{d_{i,j} + 1}.$$
We define the following hybrid centrality measure based on the above formulation:
• Harmonically attenuated-betweenness centrality (HABC): this measure is based on the harmonic attenuation of importance with respect to distance. The harmonically attenuated-betweenness centrality of a node u is defined as
$$HABC(u) = BC(u) + \sum_{v \in V \setminus \{u\}} \frac{BC(v)}{d_{u,v} + 1},$$
where $BC(u)$ is the unweighted betweenness centrality of node u. This hybrid measure assigns a value that is calculated from an averaging function of distances and betweenness centrality. It assigns a higher value to nodes that have high betweenness and are closer to other nodes with high betweenness centrality. This measure solves SCP in flow networks where the response-efficiency decays harmonically.
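The link between the expected response-efficiency and HABC can be checked on a small example. In the sketch below, the five-node path and the function names are assumptions; the betweenness values 0, 3, 4, 3, 0 are the unnormalized scores of that path when each unordered pair of nodes is counted once.

```python
from collections import deque

def bfs(adj, s):
    """Hop distances from s to every node reachable from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical 5-node path 0-1-2-3-4 with its unnormalized betweenness scores.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
bc = {0: 0.0, 1: 3.0, 2: 4.0, 3: 3.0, 4: 0.0}

total_load = sum(bc.values())
p = {j: bc[j] / total_load for j in adj}  # request probability of node j

def expected_efficiency(adj, p, i):
    """E[chi_i] with X_ij = 1 / (d_ij + 1); assumes a connected graph."""
    d = bfs(adj, i)
    return sum(p[j] / (d[j] + 1) for j in adj)

def habc(adj, bc, u):
    """Own betweenness plus harmonically attenuated betweenness of the others."""
    d = bfs(adj, u)
    return bc[u] + sum(bc[v] / (d[v] + 1) for v in adj if v != u)

best = max(adj, key=lambda u: habc(adj, bc, u))
```

Under these assumptions HABC(u) equals the expected response-efficiency multiplied by the constant total load, so both quantities rank the nodes identically.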
In some complex systems, response-time plays a crucial role in the functioning of the network, and the response-efficiency decays exponentially. In such networks, the response-efficiency decays faster, i.e., the node is maximally efficient (efficiency value = 1) for itself, while its efficiency decays by a multiplicative factor α (efficiency value = α) for each of its neighbors, and so on. Here, α is a severity factor that lies in the interval [0, 1]. Then we can formulate the response-efficiency as $X_{ij} = \alpha^{d_{i,j}}$. Similar to the previous analysis, the expected efficiency of a node i as a service station, when the efficiency decays exponentially by a factor α, is
$$E[\chi_i] = \frac{1}{\sum_{k \in V} BC(k)} \sum_{j \in V} \alpha^{d_{i,j}}\, BC(j).$$
We define the following hybrid centrality measure based on the above formulation:
• Exponentially attenuated-betweenness centrality (EABC): this measure is based on an exponential attenuation in the power of distance. The exponentially attenuated-betweenness centrality of a node u is defined as
$$EABC(u) = BC(u) + \sum_{v \in V \setminus \{u\}} \alpha^{d_{u,v}}\, BC(v),$$
where $BC(u)$ is the betweenness centrality of node u and α is the attenuation factor that is raised to the power of the distance. It is used to sharply punish the service time. This hybrid measure assigns a value that is calculated from the betweenness centrality of the nodes and an exponential function of the distances. It assigns a higher value to nodes that have high betweenness and are very close to other nodes with high betweenness centrality. This centrality measure is a sharp-attenuation variant of harmonically attenuated-betweenness centrality. It solves the service coverage problem when the service is required on a very urgent basis.
Let m be the number of links and n be the number of nodes in a given network. The overall time to compute the proposed hybrid centrality measures is $O(mn)$ in unweighted graphs and $O(mn + n^2 \log n)$ in weighted graphs. This is due to the time complexity of computing betweenness centrality [27]. Efficient algorithms for updating and estimating betweenness centrality in dynamic and large networks are discussed in [28][29][30].
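The full pipeline for EABC on an unweighted graph can be sketched as follows. This is a plain-Python illustration rather than the paper's NetworkX implementation: the betweenness routine follows the standard accumulation scheme with one breadth-first search per source, which is where the O(mn) bound comes from. The five-node path and all function names are assumptions.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest s-v paths
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                  # one BFS per source
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was seen twice

def bfs(adj, s):
    """Hop distances from s to every node reachable from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def eabc(adj, bc, u, alpha):
    """Own betweenness plus exponentially attenuated betweenness of the others."""
    d = bfs(adj, u)
    return bc[u] + sum(bc[v] * alpha ** d[v] for v in d if v != u)

# Hypothetical 5-node path 0-1-2-3-4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
bc = betweenness(adj)
scores_050 = {u: eabc(adj, bc, u, 0.5) for u in adj}
scores_075 = {u: eabc(adj, bc, u, 0.75) for u in adj}
```

A smaller α discounts distant load more sharply, which matches the urgent-service setting described above.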

Experimental results for service coverage problem
In this section, we discuss the simulation results on various real-world networks. First, we discuss the experimental setup. Then, we describe the data sets used for experimentation. Next, we compare the traditional centrality measures, degree (DC), closeness (CC), betweenness (BC), and eigenvector (EC), with the proposed hybrid centrality measures [HABC, EABC (α = 0.5), EABC (α = 0.75)] on the considered networks. The experiments are performed on a Linux system with 8 GB RAM and an Intel Core i5-8250U CPU at 1.6 GHz. The implementation is done in Python 3.6.7 with the help of the NetworkX library.
Considered real-world data sets. The proposed solution to the service coverage problem discussed in this paper hybridizes betweenness centrality with closeness/harmonic and decay centrality. Betweenness centrality is first used to compute the load on each node; therefore, we have selected 21 real-world flow networks. We provide a brief summary of these networks in Table 1, and [31][32][33][34][35][36][37][38][39] can be consulted for a detailed description of the networks. We have considered various types of transport networks, energy networks, an internet peer-to-peer network, etc. The columns of Table 1 give the name of the network instance, the number of nodes (n), the number of edges (m), the average degree of the network (Avg. deg.), the density of the network, the average clustering coefficient (Ĉ), the degree assortativity, the size of the maximum clique, and the network type, respectively.
Simulation. We have conducted simulations to evaluate the performance of the various traditional centrality measures and the hybrid measures proposed in this paper; the results are summarized in Table 2. The performance of a centrality measure is evaluated by computing the expected response-time in terms of the average distance of the service-requesting nodes from the top central node according to that centrality measure. For each network instance given in Table 1, we have emphasized in bold in Table 2 the best expected response-time obtained when a service maintenance center is installed at the node picked according to the considered centrality measures.
Betweenness centrality has been used as one of the best measures to map loads on nodes in flow networks [24]. A node with a higher load will request services more frequently. Therefore, we consider the probability of a node requesting services to be proportional to the betweenness centrality of that node. The expected response-time is computed over ⌈n/10⌉ service requests, where n is the number of nodes in the respective real-world network. The first column in the table contains the label of the considered real-world network instance. The next columns list the expected response-time of the traditional centrality measures (BC, CC, DC, EC) and the proposed hybrid measures (HABC, EABC(α = 0.5), and EABC(α = 0.75)).
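The evaluation procedure described above can be sketched as follows. This is a minimal sketch under assumptions, not the authors' experimental code: the function name, the use of Python's `random.choices` for load-proportional sampling, the 30-node path, and the artificial load vector are all hypothetical. In the paper, the load would be the betweenness centrality of each node.

```python
import math
import random
from collections import deque

def bfs(adj, s):
    """Hop distances from s to every node reachable from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def expected_response_time(adj, load, center, rng):
    """Average hop distance from `center` to ceil(n/10) sampled requesters.

    Requesting nodes are drawn with probability proportional to `load`.
    Assumes a connected graph and at least one positive load value.
    """
    nodes = list(adj)
    k = math.ceil(len(nodes) / 10)
    requesters = rng.choices(nodes, weights=[load[v] for v in nodes], k=k)
    d = bfs(adj, center)
    return sum(d[v] for v in requesters) / k

# Hypothetical 30-node path; all demand is artificially placed on node 29.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < 30] for i in range(30)}
load = {i: 1.0 if i == 29 else 0.0 for i in adj}

rng = random.Random(0)  # fixed seed so repeated runs draw the same requests
t_far = expected_response_time(adj, load, 0, rng)    # center at the far end
t_near = expected_response_time(adj, load, 29, rng)  # center at the demand
```

With all demand on one node the outcome is deterministic, which makes the sketch easy to check; with a realistic load vector the result varies with the random draws.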
It is evident from Table 2 that no traditional measure consistently finds an ideal service center, whereas for every considered network at least one of the proposed hybrid centrality measures attains the minimum average response-time. In some cases, such as Madrid Metro, Paris Metro, Osaka Metro, Seoul Metro, and Tokyo Metro, the proposed measures even outperform all the traditional measures.
Node ranks. Here, we discuss the ranking of nodes by the various centrality measures. Table 3 contains the ranking list of the top three central nodes using the proposed hybrid measures and the traditional centrality measures on the considered real-world networks given in Table 1. Due to space constraints, only the top three central nodes are presented in the paper. The first column of the table contains the label of the considered real-world network instance. The next columns, in groups of three, list the top three central nodes according to the proposed hybrid measures (HABC, EABC(α = 0.5), and EABC(α = 0.75)) and the traditional centrality measures (BC, CC, DC, EC).
Finding only the most central node might not be useful when the top central node cannot be used as a service facility due to some constraints. In that case, finding the top-k potential centers is important. It is evident from the experimental results that the rankings by the proposed measures reflect a hybridization between closeness centrality and betweenness centrality.
We have written the top-ranking node in bold for a few entries in the table to exhibit the above phenomenon. For some networks, the top-ranking node as per HABC is the same as that of BC but not that of CC, and vice versa. The rankings on two networks (Minnesota Road and Moscow Metro) also provide evidence that the top-ranking nodes under the proposed centrality measures can differ from those of both closeness and betweenness centrality. In addition to these two, another network (Openflights) shows that the proposed measures also rank nodes differently from each other.
It is evident that no single standard centrality measure is a perfect substitute for our proposed centrality measures. As our measures follow directly from the requirements of the Service Coverage Problem, it is evident that current centrality measures are not sufficient to solve the problem adequately.
We present Spearman's rank correlation coefficient, Kendall's rank correlation coefficient and Fagin's intersection metric [40] in Tables 4, 5 and 6, respectively, to evaluate the correlation between the traditional centrality measures and the proposed hybrid centrality measures. We compute Fagin's rank intersection for the top-1000 ranks in each network; if the number of nodes is less than or equal to 1000, the rank intersection is computed over all nodes in the network. The first column in each table contains the label of the considered real-world network instance. The next four columns contain the rank correlation between HABC and the traditional centrality measures (BC, CC, DC, EC). Similarly, the following four columns contain the rank correlation values between EABC(α = 0.5) and BC, CC, DC, EC, respectively. The last four columns contain the rank correlation between EABC(α = 0.75) and the traditional centrality measures.
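To make the three comparison metrics concrete, the following is a minimal sketch under simplifying assumptions. Each ranking is assumed to be a list of the same nodes ordered from most to least central, with no ties. Fagin's intersection metric is written here in its overlap (similarity) form, the average over depths i = 1..k of the fraction of nodes shared by the two top-i lists; for equal-length prefixes this equals one minus the corresponding distance form. The function names are hypothetical, and whether the reported values use exactly this form is an assumption.

```python
def spearman(order_a, order_b):
    """Spearman's rho for two rankings of the same items, assuming no ties."""
    n = len(order_a)
    pos_b = {item: i for i, item in enumerate(order_b)}
    d2 = sum((i - pos_b[item]) ** 2 for i, item in enumerate(order_a))
    return 1 - 6 * d2 / (n * (n * n - 1))


def kendall_tau(order_a, order_b):
    """Kendall's tau: (concordant - discordant) pairs over all pairs, no ties."""
    n = len(order_a)
    pos_b = {item: i for i, item in enumerate(order_b)}
    seq = [pos_b[item] for item in order_a]
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            if seq[i] < seq[j]:
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def fagin_intersection(order_a, order_b, k=1000):
    """Average overlap of the top-i prefixes for i = 1..k (similarity form)."""
    k = min(k, len(order_a), len(order_b))
    seen_a, seen_b = set(), set()
    total = 0.0
    for i in range(k):
        seen_a.add(order_a[i])
        seen_b.add(order_b[i])
        total += len(seen_a & seen_b) / (i + 1)
    return total / k


# Hypothetical rankings of four nodes by two different measures.
by_hybrid = ["w", "x", "y", "z"]
by_traditional = ["x", "w", "y", "z"]
print(spearman(by_hybrid, by_traditional))
print(kendall_tau(by_hybrid, by_traditional))
print(fagin_intersection(by_hybrid, by_traditional))
```

The quadratic pair loop in `kendall_tau` is chosen for readability only; rankings with ties, which are common for degree-based measures, would need tie-aware variants of the first two coefficients.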
The experimental results in Tables 4 and 5 for Spearman's and Kendall's rank correlation coefficients make it evident that the ranking by the proposed hybrid measures differs from that of the traditional centrality measures. The average correlation coefficient between the proposed measures and the traditional measures is positive but very small. For most of the rank correlation computations, the standard deviation is larger than the average coefficient value, owing to several negative correlation coefficients. Based on the average rank correlation coefficient values, HABC is best correlated with CC and then BC among the traditional measures. EABC(α = 0.75) exhibits a similar pattern, as its decay rate is slower than that of EABC(α = 0.5). EABC(α = 0.5) is best correlated with CC and then EC among the traditional measures. The proposed measures are least correlated with DC. The high Fagin's intersection values are due to the ranks being restricted to the top 1000. Therefore, the existing traditional measures do not provide the solution to the service coverage problem, and the top-ranked nodes by the proposed hybrid centrality measures are more appropriate in this application.

Spread maximization in complex contagion
In this section, we present the second application. Contagions are the phenomena of diseases, behaviors, trends, information or ideas spreading across networks. Contagions can be classified as simple or complex. Simple contagions can be transmitted by a single infected individual to others. Complex contagions, however, require multiple exposures to infect an individual. Studies such as [41] and [42] indicate that, unlike diseases, which spread as simple contagions, the spread of ideas, trends, behavior, influence and information in a social network can be more accurately modeled as a complex contagion. A recent study [43] on real-world networks noted that influence maximization is not due to the nodes in higher cores. We consider complex contagions first in a linear threshold diffusion model setting [44,45]. Linear threshold models are among the classical diffusion models. The linear threshold model considered in this paper assumes that a fixed proportion of the neighbors of an individual need to be infected in order to transmit the infection to that individual. It is also referred to as the relative threshold model. Other models consider thresholds that are deterministic yet distinct for each individual. Another deterministic threshold model fixes the threshold to a constant number that does not depend on the neighborhood size. In the stochastic setting, thresholds are drawn uniformly at random from an interval for each individual.
We also consider two stochastic models of diffusion, independent cascade and stochastic linear threshold. In the independent cascade model every edge in the network is associated with a random value which represents the probability of diffusion across the edge. The stochastic linear threshold model is a linear threshold model in a stochastic setting as explained earlier.
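As an illustration of the independent cascade model just described, the following is a minimal sketch under stated assumptions. Each newly infected node gets a single chance to infect each susceptible neighbor, succeeding with the probability attached to that edge; the per-edge probabilities are drawn uniformly at random here purely for the example, and the function name and toy graph are hypothetical.

```python
import random


def independent_cascade(adj, prob, seeds, rng):
    """One run of the independent cascade model; returns the infected set.

    prob[(u, v)] is the probability that an infected u infects v. Each
    newly infected node tries each susceptible neighbor exactly once.
    """
    infected = set(seeds)
    frontier = sorted(seeds)  # sorted for a reproducible visiting order
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in infected and rng.random() < prob[(u, v)]:
                    infected.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return infected


# Hypothetical toy graph with made-up edge probabilities.
adj = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
rng = random.Random(0)
prob = {(u, v): rng.random() for u in adj for v in adj[u]}

# The source and its neighborhood are taken as infected at time zero,
# and the spread is averaged over 100 independent runs.
source = "a"
seeds = {source, *adj[source]}
runs = [len(independent_cascade(adj, prob, seeds, rng)) for _ in range(100)]
average_spread = sum(runs) / len(runs)
print(average_spread)
```

Averaging over repeated runs is needed because a single run of a stochastic model says little about the expected number of infections from a given seed.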
The total number of people infected can depend upon where the contagion starts, the thresholds of the individual nodes, etc. There can be multiple sources in a network that can start a contagion, and the choice among them matters especially for complex contagions, which have a hard time jumping across communities [46]. Hence, identifying an ideal origin of a trend, behavior or idea can be helpful if we want the contagion to spread to the maximum number of people. This has real-world benefits, such as in social marketing campaigns.
In this section, we consider the problem of finding a seed node that could spread a behavior, advertisement, idea, trend or influence to the maximum number of nodes. It is referred to as the spread maximization problem. In a complex contagion, a single node might not have the strength to even start the propagation of the infection; therefore, it is assumed that the neighborhood of the seed node is infected together with the seed node at time step zero.

Formulation
In this section, we formulate hybrid centrality measures for the above stated problem of spread maximization in complex contagions. When choosing a source for an infection, several factors come to mind. The most important of these is that the node must have a high enough degree to expose multiple individuals to the infection. It also helps if the node is close to other nodes with high degree, so that they can spread the infection further. Hence, to measure the potential of a node to act as a top spreader, we use its degree as its node weight. We present the following two hybrid measures to identify ideal sources for the spread of complex contagions in the relative threshold model, leveraging a hybridization similar to the one used in the previous section.
• Harmonically attenuated degree centrality (HADC): in this measure, we attenuate the node weights harmonically with respect to distance. Formally, for a node u, HADC(u) is expressed in terms of DC(v), the degree centrality of node v, and m, the number of links in the graph.
• Exponentially attenuated degree centrality (EADC): in this measure, we attenuate the node weights exponentially with respect to distance. Formally, for a node u, EADC(u) is expressed in terms of α, a parameter that lies between 0 and 1, m, the number of links in the graph, and DC(v), the degree centrality of node v.
Let m be the number of links and n be the number of nodes in a given network. The overall time to compute the proposed hybrid centrality measures is O(mn) in unweighted graphs and O(nm + n² log n) in weighted graphs, which is the time complexity of computing all-pairs shortest paths. In the real world, each node has its own threshold, which may be deterministic or stochastic in nature and varies from individual to individual. Finding the best spreader node is possible in O(mn) time if the threshold values at all nodes are fixed to some constant value or proportion. Yet, for a different threshold value, the best spreader may change, which requires re-computation. The proposed measures do not require the threshold value to be known beforehand. From the simulation results in the next section, it is evident that these measures have performed the best in most cases across different threshold values.
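The two attenuation schemes can be illustrated with the sketch below. It rests on assumptions that should be read as such: the harmonic variant is assumed to divide the degree of v by d(u, v) + 1, the exponential variant is assumed to multiply it by α raised to d(u, v), raw degree is used as the node weight, and the normalization involving m is left out (a constant factor would not change the ranking). These forms and the function names are illustrative and are not taken from the formal definitions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def attenuated_degree(adj, alpha=None):
    """Sum of degree weights of reachable nodes, attenuated by distance.

    alpha=None    -> harmonic attenuation, degree / (d + 1)    (assumed form)
    0 < alpha < 1 -> exponential attenuation, degree * alpha**d (assumed form)
    One BFS per node gives O(mn) time overall on an unweighted graph.
    """
    degree = {v: len(adj[v]) for v in adj}
    scores = {}
    for u in adj:
        total = 0.0
        for v, d in bfs_distances(adj, u).items():
            if alpha is None:
                total += degree[v] / (d + 1)
            else:
                total += degree[v] * alpha ** d
        scores[u] = total
    return scores


# Hypothetical toy network: the path a - b - c - d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
hadc_like = attenuated_degree(adj)
eadc_like = attenuated_degree(adj, alpha=0.5)
print(max(hadc_like, key=hadc_like.get))
print(max(eadc_like, key=eadc_like.get))
```

Note that neither score takes a threshold as input, which matches the observation above that the measures do not need the threshold value in advance.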

Experimental results for spread maximization problem
In this section, we discuss the simulation results for the spread maximization problem on various real-world networks. First, we discuss the experimental setup. Then, we describe the data sets used for experimentation. Next, we provide a comparison of the traditional centrality measures, degree (DC), closeness (CC), betweenness (BC) and eigenvector (EC), with the proposed hybrid centrality measures (HADC, EADC(α = 0.25), EADC(α = 0.5), EADC(α = 0.75)) on the considered networks.
Considered real-world data sets. The proposed solution to the spread maximization problem discussed in this paper hybridizes degree centrality within closeness/harmonic and decay centrality. Because the considered problem is related to the spread of trends, behaviors, influences, etc., we have picked 15 moderate-size real-world social networks for simulations. We provide a brief summary of these networks in Table 7; a detailed description of the networks can be found in [31,36]. The columns of Table 7 list the name of the network instance, the number of nodes (n), the number of edges (m), the average degree of the network (Avg. deg.), the density of the network, the average clustering coefficient (Ĉ), the degree assortativity and the size of the maximum clique, respectively.
Simulation. We have conducted simulations to evaluate the performance of various traditional centrality measures and the hybrid measures proposed in this paper for the spread maximization problem; the results are summarized in Tables 8, 9, 10 and 11. The simulation is carried out as follows. The top-ranked node for every measure is taken as the source of the infection. The source and its neighborhood are infected at time t = 0. At each time step, the infection is propagated as per the diffusion model being used. The contagion ends if it is unable to infect any susceptible node in a time step. The total number of infected nodes at the end of the contagion is the spread-score of the source node in the simulation. The performance of a centrality measure is evaluated by computing the spread-score of the top central node as per that centrality measure.
For the deterministic threshold model, the spread-scores of the top central nodes according to the traditional and the proposed hybrid centrality measures, in all considered real-world networks and for threshold values in the interval (0, 1) in steps of 0.1, are plotted in Fig. 2. The figure shows the number of infections on the y-axis and the threshold values on the x-axis. As a specific example, consider the plot for Twitter Lists. The number of infections dips sharply when the threshold increases from 0.1 to 0.2. The worst performing measure at the threshold of 0.2 is EC. When the threshold increases from 0.2 to 0.3, we see another dip in the number of infections, with DC and BC performing almost equally, whereas BC was superior at the 0.2 threshold. CC and the proposed centrality measures outperform the other measures across the rest of the thresholds. The main aim of these plots is to identify the interval of threshold values over which the spread-scores of the various centrality measures differ the most. It is clear from the plots that, for most of the considered real-world networks, the spread-scores vary mostly in the sub-interval [0.3, 0.6]. Yet, the spread-scores of the different measures cannot be read off and compared precisely from these plots. Therefore, the simulation results for threshold values 0.3, 0.4, 0.5, and 0.6, i.e., thresholds of 30%, 40%, 50% and 60%, are presented in Tables 8, 9, 10, and 11, respectively. In each of these tables, the first column shows the network name, and the second to fifth columns show the spread-scores of betweenness, closeness, degree and eigenvector centrality, respectively. The other half of the table, from the sixth column onward, shows the scores of the proposed hybrid measures.
Namely, the sixth column shows the spread-score for HADC, and the remaining columns show the spread-scores for EADC at the considered values of α. We have emphasized in bold the best spread-score in Tables 8, 9, 10, 11, 12 and 13 for each network instance given in Table 7 over the various models, when the seed node is picked according to the considered centrality measures. As seen in Table 8, at a lower threshold, CC performs better than the other traditional measures for most networks. However, as we increase the threshold, we observe that DC starts performing better than the other traditional measures, as shown in Table 9. Since our measures are a hybrid of both, they are able to maintain high scores across the thresholds. As an example, in Table 9 for Twitter Lists, the proposed measures have results similar to CC.
Apart from the deterministic linear threshold model, we have also evaluated the performance of the proposed measures as well as the traditional measures using stochastic diffusion models, namely independent cascade (IC) and stochastic linear threshold (LT). The results of these experiments are shown in Tables 13 and 12, respectively. The results shown in the tables are averages taken over 100 independent iterations of the diffusion models. From Table 12, we see that HADC and EADC with α = 0.75 have the highest number of expected infections across most networks. For some networks, such as soc-pages-publicfigure, they outperform all of the traditional measures. In the network Twitter Lists, CC is the top performing traditional measure, while in the network soc-pages-media, DC and EC are the top performing traditional measures. In both cases, HADC and EADC with α = 0.75 perform optimally, as they are a type of hybridization between CC and DC. In the case of the IC model, as shown in Table 13, we see that no measure performs consistently across the networks. As none of the traditional measures is able to perform consistently across the networks, a hybridization between them may lead to sub-optimal results.
We present Spearman's rank correlation coefficient, Kendall's rank correlation coefficient and Fagin's intersection metric [40] to evaluate the correlation between the traditional centrality measures and the proposed hybrid centrality measures in Tables 14, 15 and 16, respectively. From the Spearman's and Kendall's rank correlation coefficients, we observe that our measures have a small positive correlation with the traditional measures. The correlation is small enough to conclude that the proposed measures rank nodes differently from the traditional measures. The correlations are positive and almost the same for all traditional measures. This indicates that our measures share some characteristics with them and are able to switch between traditional measures to perform optimally across networks and thresholds. Fagin's intersection metric has a higher value because only the top-1000 ranked nodes are considered when computing the metric.
Table 3 Ranking list of the top three central nodes using the proposed hybrid measures and the traditional centrality measures on the considered real-world networks given in Table 1
Table 4 Spearman's rank correlation between the proposed and the traditional centrality measures on the considered real-world networks given in Table 1
Table 5 Kendall rank correlation coefficient (Kendall's Tau) between the proposed and the traditional centrality measures on the considered real-world networks given in Table 1
Table 6 Fagin's intersection metric between the proposed and the traditional centrality measures on the considered real-world networks given in Table 1
Table 14 Spearman's rank correlation between the proposed and the traditional centrality measures on the considered real-world networks given in Table 7
Table 15 Kendall rank correlation between the proposed and the traditional centrality measures on the considered real-world networks given in Table 7
Table 16 Fagin's intersection metric between the proposed and the traditional centrality measures on the considered real-world networks given in Table 7

Conclusion
In this paper, we have proposed new hybrid centrality measures based on closeness (harmonic) and decay measures. The first hybridization is used to solve the service coverage problem in flow networks, where the demand for services is assumed to be proportional to the betweenness centrality. The proposed measures can also be used in another application that requires installing a facility farthest from the failure-prone nodes. The solution in this case will be the node with the minimum expected response-efficiency. The second hybridization attempts to find the most suitable node for spreading information to the maximum fraction of the population under a complex contagion. The proposed hybridizations of centrality measures for both applications are based on the formulation of node-weighted centrality measures. The experimental results on several real-world networks show that the proposed measures perform relatively better than the individual traditional measures and also rank nodes differently. Although the reference centrality measures considered in this paper were chosen due to the nature of the considered applications, the framework allows other measures to be used as required. Possible future directions include analyzing the proposed measures on various other real-world networks and, in place of betweenness and degree centrality, hybridizing other measures specific to particular applications. The analysis of real-world networks in which non-uniform weights are given at the nodes, using node-weighted centrality measures, is another open direction. Another interesting direction is using hybrid centrality measures in multi-layer networks [47]. In multi-layer networks, we may use one layer of the network to compute node weights to be used for hybridization in other layers.