# Network-based indices of individual and collective advising impacts in mathematics

*Computational Social Networks*
**volume 7**, Article number: 1 (2020)

## Abstract

Advising and mentoring Ph.D. students is an increasingly important aspect of the academic profession. We define and interpret a family of metrics (collectively referred to as “*a*-indices”) that can potentially be applied to “ranking academic advisors” using the academic genealogical records of scientists, with the emphasis on taking into account not only the number of students advised by an individual, but also subsequent academic advising records of those students. We also define and calculate the extensions of the proposed indices that account for student co-advising (referred to as “adjusted *a*-indices”). In addition, we extend some of the proposed metrics to ranking universities and countries with respect to their “collective” advising impacts, as well as track the evolution of these metrics over the past several decades. To illustrate the proposed metrics, we consider the social network of over 200,000 mathematicians (as of July 2018) constructed using the Mathematics Genealogy Project data.

## Introduction

In recent years, universities and other research institutions have placed considerable emphasis on assessing and enhancing the productivity of their faculty. One aspect that has been traditionally deemed important in these efforts is the number and quality of a researcher’s publications. The popular metrics of publication productivity include various quantities based on an individual’s citation record (e.g., total number of citations, weighted citations, *i*10-index, *h*-index, etc.), typically accounting for the “prestige” measures of publication outlets (e.g., journal impact factors, 5-year impact factors, SNIP, CiteScore, etc.). However, besides publication output, another (possibly equally important) aspect of success in the academic profession is associated with advising and mentoring Ph.D. students. One can argue that a successful academician is not only the one who publishes many highly cited articles, but also the one who successfully advises students, and further, whose students in turn become successful academic advisors, thus ensuring the continuity and prosperity of an academic discipline. Indeed, in the modern era, many universities emphasize the importance of effective mentorship and post-graduation academic productivity of their Ph.D. students.

This paper contributes to a systematic network-based analysis of large-scale Ph.D. student advising data. We define and interpret a family of new network-based metrics (collectively referred to as “*a*-indices”) that can be used for “ranking academic advisors” using the academic genealogical records of scientists. We rely on the well-known web-based Mathematics Genealogy Project resource, which has collected a vast amount of data on Ph.D. student advising records in mathematics-related fields.

Due to its popularity and public availability, the MathGenealogy dataset has been used as a testbed in several previous studies. The basic characteristics of the MathGenealogy network snapshot from 2011, as well as those of the underlying network of countries, were presented in [1]. In [2], the authors analyzed the performance of students of those individuals who were near the beginning versus near the end of their academic careers and revealed interesting insights. Another study [3] used the data on Ph.D. degrees granted after 1973 to compose a network of universities, in which some universities were labeled as strong sources (“authorities”) of Ph.D. production, while others were labeled as strong destinations (“hubs”). The authors of [4] presented a comprehensive analysis of the MathGenealogy network with respect to the classification of mathematics-related subjects, as well as the most influential countries in terms of Ph.D. graduate output. Further, they revealed the major “families” of mathematicians that originated in certain root nodes (“fathers” of mathematics’ genealogical families) in the different “eras” covered by the project data. A new concept of eigenvector-based centrality was defined and tested on the MathGenealogy network in [5]. In [6], the authors proposed the so-called “genealogical index” for measuring individuals’ advising records. As will be seen below, one of the indices proposed in this paper can be viewed as a special case of the “genealogical index” proposed in [6].

This paper takes a further step towards studying and ranking academic advising impact using the MathGenealogy social network. The emphasis of this study is on taking into account not only the number of students advised by an individual but also the subsequent academic advising records of those students, while providing metrics that are easy to calculate, understand, and interpret. It should also be noted that this study does not aim to explicitly compare the proposed indices with other metrics or results available in the aforementioned related literature. However, we believe that the presented approaches and results provide a new perspective on this interesting subject and further demonstrate the utility of social network analysis tools in the considered context.

The paper is organized as follows. In the next section, we briefly describe the MathGenealogy dataset and provide its basic characteristics along with definitions and notations that will be used in the paper. Next, we define and interpret the family of “*a*-indices” that we propose for ranking academic advisors. We then extend these definitions to take into account co-advising. Finally, we present the results obtained on the most recent snapshot of the MathGenealogy dataset, as well as investigate the evolution of individual and collective *a*-indices over the past several decades.

## Data description, notations, and basic characteristics of MathGenealogy network

To facilitate further discussion, we first describe the MathGenealogy dataset and provide its basic characteristics, as well as define graph-theoretic concepts that will be used in the paper.

### Data description

The data were collected from the Mathematics Genealogy Project website^{Footnote 1} using web-crawler software. The dataset contains records of nearly 231,000 mathematicians (as of July 2018). The information for each mathematician in the database includes name, graduation year, university, country, Ph.D. thesis topic and its subject classification, as well as the list of students advised by this individual. These data allowed us to construct the directed network of advisor–advisee relationships.

### Related graph-theoretic concepts

Due to the fact that the considered dataset is a directed network, it is represented by a directed acyclic graph \(G=(N,\mathcal {A})\), with a set of *n* nodes, *N* = \(\left\{ 1,\ldots , n\right\}\), and a set of *m* arcs (links) \(\mathcal {A}\), where the mathematicians are represented by the nodes of the graph, and the relation “*i* is an advisor of *j*” is represented by an arc from *i* to *j*. The in-degree (\(\text{deg}^{\text{in}}(i)\)) and out-degree (\(\text{deg}^{\text{out}}(i)\)) of node *i* are the numbers of the arcs coming into and going out of node *i*, respectively. Clearly, the in-degree of node *i* is the number of this individual’s Ph.D. dissertation advisors (equal to one for many nodes in the network, although a substantial fraction of nodes do have higher in-degrees), whereas the out-degree of node *i* is the number of Ph.D. students that this individual has successfully graduated. Node *j* is said to be reachable from node *i* if there exists a directed path from *i* to *j*. The number of links in the shortest path from *i* to *j* is referred to as the distance between these nodes and denoted by *d*(*i*, *j*) (\(d(i,j)=+\infty\) if there is no such path). A group of nodes is said to form a weakly connected component if any two nodes in this group are connected via a path and no other nodes are connected to the group nodes, where the directions of arcs in a path are ignored.

The *harmonic centrality* of node *i* is defined as \(C_h(i) = \sum _{j \in N} \frac{1}{d(i,j)}\) [7, 8]. The *decay centrality* of node *i* is defined as \(C_d(i) = \sum _{j \in N} \delta ^{d(i,j)}\) [9, 10], where the parameter \(\delta \in (0,1)\) is user-defined, although it is often set at \(\delta =1/2\), which is the value used in this study (it is assumed that \(1/d(i,j)=\delta ^{d(i,j)}=0\) if \(d(i,j)=+\infty\)).
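As an illustration of the two centralities defined above, the following is a minimal sketch rather than the implementation used in this study. It assumes the network is stored as a dictionary mapping each advisor to a list of students (the name `adj` and the toy data are hypothetical), computes directed distances by breadth-first search, and skips the node itself (distance zero) so that the term \(1/d(i,i)\) never arises.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest directed distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def harmonic_centrality(adj, i):
    # Unreachable nodes never appear in the distance map, so they contribute 0.
    return sum(1.0 / d for d in bfs_distances(adj, i).values() if d > 0)


def decay_centrality(adj, i, delta=0.5):
    # delta = 1/2 is the value used in this study.
    return sum(delta ** d for d in bfs_distances(adj, i).values() if d > 0)


# Hypothetical toy network: advisor -> list of students.
toy = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["F"]}
print(harmonic_centrality(toy, "A"))  # 2*1 + 2*(1/2) + 1*(1/3)
print(decay_centrality(toy, "A"))     # 2*(1/2) + 2*(1/4) + 1*(1/8)
```

In the toy network, node A has two descendants at distance 1, two at distance 2, and one at distance 3, which gives the sums shown in the comments.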

### Basic characteristics of MathGenealogy network

The retrieved network had 12,263 weakly connected components, with the giant weakly connected component having 208,526 nodes and 238,212 arcs (thus containing about 90% of all the nodes in the network). All the computational results presented below were obtained for this giant component. Further in the text, we will use the term “network” implying this giant weakly connected component.
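Restricting the analysis to the giant weakly connected component can be sketched as follows. This is an illustrative example under stated assumptions, not the procedure actually used to process the data: the arc list `arcs` is hypothetical toy data, and nodes without any arcs are not represented at all.

```python
from collections import defaultdict, deque


def weakly_connected_components(arcs):
    """Group nodes into components while ignoring arc directions."""
    undirected = defaultdict(set)
    for u, v in arcs:
        undirected[u].add(v)
        undirected[v].add(u)
    seen, components = set(), []
    for start in undirected:
        if start in seen:
            continue
        seen.add(start)
        component, queue = [], deque([start])
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in undirected[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(component)
    return components


# Hypothetical (advisor, student) arcs forming two components.
arcs = [("A", "B"), ("A", "C"), ("D", "C"), ("X", "Y")]
components = weakly_connected_components(arcs)
giant = set(max(components, key=len))
# Both endpoints of an arc lie in the same component, so testing one suffices.
giant_arcs = [(u, v) for u, v in arcs if u in giant]
print(len(components), sorted(giant), giant_arcs)
```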

The analysis of many basic characteristics of an earlier snapshot of this network was conducted in [1]. Since such analysis is not the main focus of this study, we report only some of these basic characteristics for the most recent snapshot that are relevant to the material presented in this paper. The distribution of out-degrees in this network is presented in Fig. 1. As one can observe, it does resemble a power law, although it is not a “pure” power law, which is consistent with observations for many other real-world networks [11].

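The out-degree correlation for all “tail-head” (or, “advisor–student”) pairs of nodes corresponding to all arcs (directed links) in the considered directed network was calculated as follows. Consider an ordered list of all directed links \(l \in \{1, \ldots , |\mathcal {A}|\}\) in the network, let *i* and *j* be the head and tail nodes of link *l*, and let \(\text{deg}_l^{\text{out}}(i)\) and \(\text{deg}_l^{\text{out}}(j)\) be their out-degrees, respectively. Thus, we have an array of size \(|\mathcal {A}|\) of head nodes (denote the average out-degree of all nodes in this array by \(\overline{\rm{deg}^{\rm{out}}(i)}\)) and an array of size \(|\mathcal {A}|\) of tail nodes (denote the average out-degree of all nodes in this array by \(\overline{\rm{deg}^{\rm{out}}(j)}\)). Then, the out-degree correlation (also sometimes referred to as the out-assortativity) can be calculated as the Pearson correlation coefficient of these two arrays:

\[ \frac{\sum _{l=1}^{|\mathcal {A}|} \left( \text{deg}_l^{\text{out}}(i) - \overline{\rm{deg}^{\rm{out}}(i)} \right) \left( \text{deg}_l^{\text{out}}(j) - \overline{\rm{deg}^{\rm{out}}(j)} \right) }{\sqrt{\sum _{l=1}^{|\mathcal {A}|} \left( \text{deg}_l^{\text{out}}(i) - \overline{\rm{deg}^{\rm{out}}(i)} \right) ^2} \, \sqrt{\sum _{l=1}^{|\mathcal {A}|} \left( \text{deg}_l^{\text{out}}(j) - \overline{\rm{deg}^{\rm{out}}(j)} \right) ^2}} \]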

The value of the out-degree correlation for this network was found to be approximately 0.055. This implies that on average there is a very minor correlation between the mentorship productivity of an advisor and a student. Therefore, we believe that in the proposed metrics and rankings of academic advisors it makes sense to “reward” those prolific advisors whose students are also successful academic mentors.
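For concreteness, the out-degree correlation over all arcs can be sketched as below, assuming the standard Pearson form applied to the out-degrees of the two endpoints of every arc. The function name and the toy arc list are hypothetical, and the sketch assumes that both arrays have non-zero variance.

```python
from collections import Counter
from math import sqrt


def out_degree_correlation(arcs):
    """Pearson correlation of endpoint out-degrees, taken over all arcs."""
    out_deg = Counter(u for u, _ in arcs)
    x = [out_deg[u] for u, _ in arcs]  # out-degree of the advisor on each arc
    y = [out_deg[v] for _, v in arcs]  # out-degree of the student (0 if none)
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical (advisor, student) arcs.
toy_arcs = [("A", "B"), ("A", "C"), ("A", "D"),
            ("B", "E"), ("B", "F"), ("C", "G")]
r = out_degree_correlation(toy_arcs)
print(round(r, 4))
```

Because the Pearson coefficient is symmetric in its two arguments, the result does not depend on which endpoint of an arc is labeled as the head and which as the tail.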

As for the in-degree distribution, it is not surprising that the majority of the nodes have in-degree equal to one. However, the network contains over 30,000 nodes with in-degree greater than one, which means that a substantial fraction (about 15%) of the mathematicians in the dataset had more than one Ph.D. advisor. Therefore, it is important to take into account the effects of co-advising, which is why we define “adjusted” versions of the proposed metrics (indices).

## Advising impact metrics

In this section, we define four metrics (“*a*-indices”) that we believe are appropriate for quantifying an individual’s advising impact, with a focus on taking into account the mentoring success of an individual’s students (going beyond just the number of the Ph.D. students that an individual has graduated). One way to address this is to consider the numbers of students and students-of-students, whereas another approach is to take into account all the academic descendants of an individual. These considerations are reflected in the following definitions.

**Definition 1**

(*a*-*index*) The *a*-index^{Footnote 2} of an individual *i* is the largest integer number *n* such that the individual *i* has advised *n* students (Ph.D. graduates) each of whom has advised at least *n* of their own students (Ph.D. graduates). Equivalently, this is the largest number *n* of out-neighbors of node *i* in the directed network such that each of these neighbors has out-degree of at least *n*.
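Definition 1 can be sketched in a few lines, in the same way the *h*-index is usually computed. This is an illustrative sketch only: the dictionary `adj` (advisor to list of students) and the toy data are hypothetical, and each student is assumed to appear at most once in an advisor's list.

```python
def a_index(adj, i):
    """Largest n such that i has n students who each advised at least n students."""
    counts = sorted((len(adj.get(s, ())) for s in adj.get(i, ())), reverse=True)
    a = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            a = rank
        else:
            break
    return a


# Hypothetical toy network: A advised B, C, D, who advised 2, 3, and 0 students.
toy = {"A": ["B", "C", "D"], "B": ["E", "F"], "C": ["G", "H", "I"]}
print(a_index(toy, "A"))  # 2
print(a_index(toy, "B"))  # 0
```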

**Definition 2**

(\(a_\infty\)-*index*) The \(a_\infty\)-index of an individual *i* is the total number of their academic descendants, computed as the number of distinct nodes that are reachable from node *i* through a directed path.
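A minimal sketch of Definition 2, assuming the same hypothetical advisor-to-students dictionary `adj`: a graph traversal collects every node reachable from *i*, so a descendant reached along several paths (for instance, a co-advised student) is counted once.

```python
def a_infinity(adj, i):
    """Number of distinct nodes reachable from i along directed paths."""
    seen, stack = set(), [i]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    seen.discard(i)  # no effect in an acyclic network; kept as a safeguard
    return len(seen)


# Hypothetical toy network in which D was co-advised by B and C.
toy = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(a_infinity(toy, "A"))  # 4
```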

**Definition 3**

(\(a_1\)-*index*) The \(a_1\)-index of an individual *i* is the harmonic centrality of the corresponding node *i* in the directed network: \(a_1 (i) = C_h(i) = \sum _{j \in N} \frac{1}{d(i,j)}\).

**Definition 4**

(\(a_2\)-*index*) The \(a_2\)-index of an individual *i* is the decay centrality (with \(\delta = \frac{1}{2}\)) of the corresponding node *i* in the directed network: \(a_2 (i) = C_d(i) = \sum _{j \in N} \frac{1}{2^{d(i,j)}}\).
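To see how the four definitions differ, consider a hypothetical advisor *i* (not taken from the dataset) who has advised two students, each of whom has advised two students of their own, with no co-advising and no further descendants. Node *i* then has two descendants at distance 1 and four at distance 2, so that

\[ a(i) = 2, \qquad a_\infty (i) = 2 + 4 = 6, \qquad a_1(i) = 2 \cdot 1 + 4 \cdot \frac{1}{2} = 4, \qquad a_2(i) = 2 \cdot \frac{1}{2} + 4 \cdot \frac{1}{4} = 2. \]

Each additional generation adds its full size to \(a_\infty\), but only a fraction \(1/d\) of its size to \(a_1\) and \(1/2^{d}\) to \(a_2\), where *d* is the generation's distance from *i*.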

It can be seen from Definitions 1–4 that the *a*-index is a measure of the most “immediate” advising impact of an individual, which takes into account their advising success simultaneously with the advising success of their students.^{Footnote 3} Note that the *a*-index is similar to the *h*-index well-accepted for citations record evaluation; however, it turns out that it is rather hard to achieve a double-digit value of the *a*-index over one’s academic career due to the fact that graduating a Ph.D. student is generally a less frequent event than publishing a paper. As it can be seen in Table 1, the highest *a*-index value in the considered dataset is 12 (achieved by only four mathematicians). Note that a relevant study [6] reported only one mathematician with the value of *a*-index (\(g_{(1)}\) measure in their terminology) equal to 12. Overall, the *a*-index may be applicable as a metric of the advising impact for middle- to late-career academic scientists.

Note that the *a*-index can be extended in a straightforward fashion to reflect a more “long-term” advising impact of an individual by considering third, fourth, etc., generations of an individual’s students as it was proposed in the definition of the “genealogical index” in [6]. However, the main issue with this approach is that close to 100% of the mathematicians in the considered dataset would have zero values of such index, which would not allow one to effectively rank advisors’ long-term impacts using this metric.

Therefore, in order to provide more practically usable quantifications of “long-term” advising impacts of individuals, especially for those scientists who are in the late stages of their careers and for those who have lived and worked centuries ago, we propose the \(a_1\), \(a_2\), and \(a_\infty\) indices. The \(a_\infty\)-index essentially assigns equal weights to all the academic descendants of an individual, whereas the \(a_1\) and \(a_2\) indices prioritize (with different weights) the immediate (directly connected) students and students-of-students while still giving an individual some credit for more distant descendants. Possible practical interpretations of these indices are as follows.

The \(a_\infty\)-index is appropriate for ranking the “root nodes” of the mathematics genealogy network, that is, nodes with zero in-degrees, which essentially correspond to “fathers” of mathematics’ “genealogical families”, such as those described in [4]. It is not practically significant to calculate this index for nodes with non-zero in-degree values, since their predecessors in the network would obviously have higher values of this index. Thus, the \(a_\infty\) index is interesting primarily from the perspective of history of mathematics, although it can certainly be calculated very easily for any contemporary mathematician.

On the other hand, the \(a_1\)-index and \(a_2\)-index do not necessarily possess the aforementioned property of the \(a_\infty\)-index: the values of these indices may be higher for contemporary mathematicians than for the “fathers” of genealogical families due to the fact that an individual’s immediate students and any other early-generation students receive higher weights than do any distant descendants. These indices are based on the well-known concepts of harmonic and decay centralities, which makes them easy to calculate and interpret, and hence, attractive from a practical perspective. These indices can be applied to an academic advisor from any era, thus providing a universal tool for assessing the academic advising impact. However, it is still likely that the advisors in the late stages of their careers would have higher values of these indices (especially the \(a_1\)-index, which gives higher weights to distant descendants than the \(a_2\)-index does) than those in early-to-mid-stages of their careers. This is not surprising, since these indices are designed to assess the long-term advising impact beyond the number of immediate students.

Further, note that there are several natural extensions of these definitions. First, all of these indices can be adjusted by taking into account the effects of co-advising, that is, giving a special treatment to the cases when multiple individuals have advised the same student *j* (that is, with node *j* having multiple incoming links). These particular extensions are addressed in greater detail in the next section. Second, the *a*-index can also be defined for a specific country or university (similarly to the *h*-index of a journal among citation metrics), that is, considering the respective country or university as a “super-node”, with the outgoing links directed to all the Ph.D. graduates ever produced (or produced during a specific time frame) by this country or university, respectively. The resulting *collective* advising impact values for universities and countries, based on the MathGenealogy dataset, will also be presented below.
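The “super-node” construction for a collective *a*-index can be sketched as follows. The mapping `graduate_institution` (graduate to degree-granting university or country), the dictionary `adj`, and all toy values are hypothetical, and this is one possible reading of the construction rather than the exact procedure used for the results reported later.

```python
def collective_a_index(graduate_institution, adj, institution):
    """a-index of a super-node whose out-neighbors are the institution's graduates."""
    counts = sorted(
        (len(adj.get(person, ()))
         for person, inst in graduate_institution.items()
         if inst == institution),
        reverse=True,
    )
    a = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            a = rank
        else:
            break
    return a


# Hypothetical data: where each graduate obtained the Ph.D., and whom they advised.
graduate_institution = {"B": "U1", "C": "U1", "D": "U1", "E": "U2"}
adj = {"B": ["x1", "x2"], "C": ["y1", "y2", "y3"], "D": ["z1"], "E": ["w1"]}
print(collective_a_index(graduate_institution, adj, "U1"))  # 2
print(collective_a_index(graduate_institution, adj, "U2"))  # 1
```

Restricting `graduate_institution` to graduates from a given time frame would give the time-limited variant mentioned above.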

## Advising impact metrics adjusted for co-advising

In this section, we define the extensions of our basic indices (Definitions 1–4) to handle the cases of co-advising, that is, the situations where one Ph.D. student was advised by more than one individual. It makes practical sense to introduce these definitions due to the fact that a substantial fraction of the individuals in the considered dataset were advised by more than one advisor. The basic assumption that we make in the definitions below is that the credit for advising such a student is split equally among the co-advisors (i.e., if there are *n* listed co-advisors for a student, then each of the co-advisors receives 1 / *n* credit for graduating the student).

### Adjusted \(a_\infty\), \(a_1\), \(a_2\) indices

The definitions of \(a_\infty\), \(a_1\), \(a_2\) indices can be modified to take into account co-advising as follows.

**Definition 5**

(*adjusted* \(a_\infty\)*-index*) The adjusted \(a_\infty\)-index of an individual *i* is the total number of their academic descendants weighted by the reciprocals of their in-degrees, that is, \(a_{\infty , adj}(i) = \sum _{j \in N} \frac{1}{deg^{in}(j)}\mathbb {1}_{\{d(i,j)< +\infty \}}\), where \(\mathbb {1}_{\{d(i,j)< +\infty \}}\) is the indicator function corresponding to the condition that node *j* is reachable from node *i* through a directed path.

**Definition 6**

(*adjusted* \(a_1\)*-index*) The adjusted \(a_1\)-index of an individual *i* is defined as \(a_{1, adj} (i) = \sum _{j \in N} \frac{1}{ deg^{in}(j)}\frac{1}{d(i,j)}\).

**Definition 7**

(*adjusted* \(a_2\)*-index*) The adjusted \(a_2\)-index of an individual *i* is defined as \(a_{2,adj} (i) = \sum _{j \in N} \frac{1}{ deg^{in}(j)} \frac{1}{2^{d(i,j)}}\).

As one can clearly see from these definitions, the values of these adjusted indices are always less than or equal to the respective values of their “regular” counterparts, as common sense would suggest.
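Definitions 5–7 differ from their regular counterparts only in the factor \(1/deg^{in}(j)\), so all three adjusted indices can be accumulated in a single pass over the distances from node *i*. The sketch below is illustrative only: `adj` (advisor to list of students) and the toy data are hypothetical, and node *i* itself is excluded from the sums.

```python
from collections import Counter, deque


def adjusted_indices(adj, i):
    """Return (adjusted a_inf, adjusted a_1, adjusted a_2) of node i."""
    in_deg = Counter(s for students in adj.values() for s in students)
    dist, queue = {i: 0}, deque([i])
    while queue:  # breadth-first search gives shortest directed distances
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    a_inf = a_1 = a_2 = 0.0
    for j, d in dist.items():
        if d == 0:
            continue  # skip node i itself
        weight = 1.0 / in_deg[j]  # credit is split equally among co-advisors
        a_inf += weight
        a_1 += weight / d
        a_2 += weight / 2 ** d
    return a_inf, a_1, a_2


# Hypothetical toy network in which D was co-advised by B and C.
toy = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(adjusted_indices(toy, "A"))
```

In this toy network the regular indices of A are \(a_\infty = 4\), \(a_1 = 2 + 1/2 + 1/3\), and \(a_2 = 1 + 1/4 + 1/8\). The adjusted values are smaller because the co-advised node D carries weight 1/2.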

### Adjusted *a*-index

The above definition of *a*-index can also be modified to take into account co-advising, although this extension is not as straightforward as those in the previous subsection. The “adjusted *a*-index” of node *i* can be calculated as follows:

1. Calculate the “adjusted” out-degree of node *i*: \(deg_{adj}^{out}(i) = \sum _{j: (i,j) \in \mathcal {A}} \frac{1}{deg^{in}(j)}\). Clearly, this value can be fractional and is reduced to simply the out-degree of node *i* if none of the students of the corresponding individual *i* were co-advised.
2. Compute and sort the adjusted out-degrees (defined as indicated above) of all nodes \(\{ j:(i,j) \in \mathcal {A} \}\) in the non-increasing order. Denote this sorted array as \(D_1, D_2, \ldots\) and let \(D_k\) be the *k*th element of this array such that *k* is the largest integer satisfying \(\lceil D_k \rceil \ge k\). Calculate \(\min \{ D_k, k\}\).
3. Calculate the *adjusted a-index* of node *i*, \(a_{adj}(i)\), as the minimum over the values obtained in the steps 1 and 2 above.
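The three steps above can be sketched as follows. The dictionary `adj` and the toy data are hypothetical, and one detail is an assumption of this sketch rather than part of the procedure as stated: when no *k* satisfies the condition in step 2 (for example, when none of the students of *i* has advised anyone), the step-2 value is taken to be zero.

```python
from collections import Counter
from math import ceil


def adjusted_out_degree(adj, in_deg, i):
    """Step 1: each student j contributes 1 / (number of advisors of j)."""
    return sum(1.0 / in_deg[j] for j in adj.get(i, ()))


def adjusted_a_index(adj, i):
    in_deg = Counter(s for students in adj.values() for s in students)
    own = adjusted_out_degree(adj, in_deg, i)
    # Step 2: adjusted out-degrees of the students of i, in non-increasing order.
    D = sorted((adjusted_out_degree(adj, in_deg, j) for j in adj.get(i, ())),
               reverse=True)
    step2 = 0.0  # assumed value when no k qualifies
    for k, d_k in enumerate(D, start=1):
        if ceil(d_k) >= k:
            step2 = min(d_k, k)  # the last qualifying k is the largest one
    # Step 3: minimum of the values from steps 1 and 2.
    return min(own, step2)


# Hypothetical toy network: B has two advisors (A and Z), E has two (B and C).
toy = {"A": ["B", "C"], "Z": ["B"], "B": ["D", "E"], "C": ["E", "F", "G"]}
print(adjusted_a_index(toy, "A"))  # 1.5
```

Here the adjusted out-degree of A is 1/2 + 1 = 1.5, the sorted array is (2.5, 1.5), the largest qualifying *k* is 2, and \(\min \{1.5, 2\} = 1.5\). The adjusted *a*-index of A is therefore 1.5, compared with a regular *a*-index of 2.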

This computational procedure ensures that the adjusted *a*-index of any node *i* is always less than or equal to its “regular” *a*-index, whereas the possibility of fractional values of the adjusted *a*-index provides a more diverse set of its possible values. This would potentially allow one to create a more “diversified” ranking of academic advisors based on their own productivity and productivity of their students, while taking into account co-advising.

## Results for MathGenealogy dataset

In this section, we present the results obtained on the MathGenealogy network using the metrics proposed above. Figure 2 shows the distribution of the values of the *a*-index and the adjusted *a*-index over the entire network. One can observe that while the “regular” *a*-index is always an integer by definition, the adjusted *a*-index often takes fractional values, especially at the lower end of its range, thus providing a more diverse set of possible values in a ranking. Further, Table 1 provides a ranking of top academic advisors with an *a*-index of at least 10, many of whom are prominent mathematicians from the nineteenth and twentieth centuries (note that none of the mathematicians who worked before the nineteenth century made it into this ranking). Their respective adjusted *a*-index values are also given in the same table for comparison. One can observe that this ranking would change if it were based on the adjusted *a*-index, thus showing that co-advising is indeed a significant factor to consider in this context.

Table 2 presents the collective advising impact rankings of universities and countries based on their respective values of *a*-index. It can be observed that universities and countries with prominent reputation in mathematics-related research fields lead these rankings, which shows that (i) not surprisingly, there is correlation between collective university-scale and country-scale research and advising impacts, and (ii) the *a*-index appears to be a realistic and appropriate metric for collective advising impact of a university or a country. Note that we do not consider adjusted *a*-index in this case (although it would be possible), since it is rare in the dataset that an individual’s co-advisors come from different universities or countries.

Figure 3 shows the distribution of regular and adjusted \(a_1\) and \(a_2\) indices in the network. It appears that both of these distributions are close to power-law, whereas the range of values of the \(a_1\)-index is larger than that of the \(a_2\)-index, which follows from the respective definitions. Tables 3 and 4 present the rankings of the top 25 advisors by regular versus adjusted \(a_1\) and \(a_2\) indices. For each index, mostly the same group of advisors appears in the regular versus adjusted index rankings, although their order slightly changes in both tables. Moreover, one can observe that the \(a_1\)-index-based ranking favors earlier generations of mathematicians (those from sixteenth, seventeenth, and eighteenth centuries), whereas the \(a_2\)-index-based ranking features mathematicians from the nineteenth and the twentieth centuries. This is a direct consequence of the impact of the different weights given by these indices to distant academic descendants of an individual.

The ranking of individuals with in-degree zero in the network (that is, “fathers” of genealogical families) by their \(a_\infty\) and adjusted \(a_\infty\)-index values is given in Table 5. The top-ranked scientist with respect to both of these indices is Sharaf al-Din al-Tusi, who lived in the twelfth century and currently has 149,942 academic descendants.

## Evolution of individual and collective *a*-indices in MathGenealogy dataset

As a natural further step in the analysis of individual and collective advising impacts using *a*-indices, we consider the dynamics of year-by-year evolution of the aforementioned indices over the past several decades. Specifically, we consider the time period starting from 1900 till 2017 (which was the last full year for which MathGenealogy data was collected in this study). The main reasons for considering only the data starting from 1900 are that (i) the growth of mathematics as a major research field occurred during the twentieth century with many new Ph.D. degrees awarded during that time frame, and (ii) the collected data itself is more reliable and complete for this most recent time frame, which makes the results on *a*-indices evolution corresponding to this time interval more interesting. We should also note that some of the plots presented below reflect the data starting from 1950, which is done for visual clarity purposes.

Figure 4 shows the year-by-year evolution of *a*-index values of the top 10 mathematicians (according to their *a*-index value at present, as indicated in Table 1) from 1900 until 2017. An interesting observation is that for most of these individuals, it took around 20–30 years to grow their *a*-index from 0 to 1 (that is, the time period from the year an individual received his/her own Ph.D. degree to the year when his/her first student successfully graduates a student of their own). Further, it took another \(\sim\)30 years to grow their respective *a*-index value from 1 to around 10. The overall time period of 50–60 years to grow the *a*-index from zero to a high value of 10 or more is on the same order of magnitude as the length of a lifetime academic career (i.e., from the receipt of a Ph.D. degree till retirement). This shows that most of these “high-impact” advisors followed a similar temporal pattern of their careers. This observation is also consistent with the intent for this index to reflect an individual’s career-long rather than short-term advising impact.
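One plausible way to obtain such a year-by-year series is to recompute the *a*-index on the part of the network that existed by a given year. The sketch below makes that assumption explicit, since how the time series were computed is not spelled out here: a student counts for year *t* only if their recorded graduation year is at most *t*, and individuals without a recorded graduation year are ignored. The names `adj` and `grad_year` and the toy data are hypothetical.

```python
def a_index_at(adj, grad_year, i, year):
    """a-index of i using only students who had graduated by `year`."""
    def students_by(person):
        # Missing graduation years are treated as "not yet graduated".
        return [s for s in adj.get(person, ())
                if grad_year.get(s, float("inf")) <= year]

    counts = sorted((len(students_by(s)) for s in students_by(i)), reverse=True)
    a = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            a = rank
        else:
            break
    return a


# Hypothetical toy genealogy with graduation years.
adj = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
grad_year = {"A": 1900, "B": 1910, "C": 1915,
             "D": 1930, "E": 1935, "F": 1940, "G": 1945}
series = [a_index_at(adj, grad_year, "A", y) for y in (1920, 1930, 1945)]
print(series)  # [0, 1, 2]
```

In this toy example the index of A first becomes positive in 1930, when A's first student graduates a student of their own, mirroring the 0-to-1 transition discussed above.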

It should also be noted that the “outliers” in this plot are E.E. Kummer and K.T.W. Weierstrass, who received their own Ph.D. degrees substantially earlier than the other individuals in this list, and their *a*-indices were already equal to 6 in 1900. Interestingly, both of their *a*-index values have “saturated” at 11 around 1920 and have not changed since then. This is most likely due to the fact that all of their direct descendants (students) finished their own academic careers by that time; therefore, they did not produce any more students after that, which means that the respective *a*-index cannot increase anymore. Thus, the *a*-index is a good measure of an individual’s lifetime advising impact; however, it does not reflect any further advising impact that an individual might achieve after the end of his/her own and his/her students’ academic careers. On the other hand, an individual’s \(a_1\), \(a_2\), and \(a_{\infty }\) indices clearly can grow indefinitely, even decades or centuries after the end of one’s career (as it will be illustrated below). Therefore, a long-term advising impact may need to be evaluated by considering a combination of metrics (such as the indices defined in this paper) rather than by taking into account only one metric.

It is also worth mentioning that a *collective* *a*-index of a university or a country does not exhibit the “saturation” behavior that was mentioned above for an *individual* *a*-index. Indeed, a university or a country would typically keep producing Ph.D. graduates indefinitely (unless a university/country ceases to exist). Figures 5 and 6 illustrate the evolution of collective *a*-index values corresponding to top universities and countries (according to their current *a*-index values as shown in Table 2). For visual clarity, these plots are shown starting from 1950 rather than 1900.

For universities’ collective *a*-index values, there were several lead changes during 1900–1950 (not pictured), with Princeton being top-ranked for most of the 1950s and 1960s (briefly overtaken by the University of Chicago in the mid-1950s), whereas in 1968 Harvard took the top-ranked position, which it has held ever since. It should also be noted that Stanford has made a big jump from number 10 to number 4 in the *a*-index ranking during the past half-century. As for the evolution of countries’ *a*-indices, the United States passed Germany as number one in the collective *a*-index ranking in 1956 and has held this top position since then.

Note that the collective *a*-indices of most universities and countries that are depicted in Figs. 5 and 6 have not changed since the 1990s. This may be explained by the fact that it becomes harder and harder to increase the *a*-index when it has already reached high values (similarly to what happens to the *h*-index in research citations). Another factor may be that not all Ph.D. graduates from the past 10–20 years have been added to the MathGenealogy dataset yet. However, as mentioned above, this “temporary” saturation behavior of collective *a*-indices is not the same as the one we observed for individual *a*-indices, since the production of Ph.D. graduates by a university or a country is not limited by the lengths of academic careers of individual advisors.

Further, we consider the evolution of \(a_1\) and \(a_2\) indices (along with their adjusted versions) of the top advisors listed in Tables 3 and 4. The respective plots are shown in Figs. 7, 8, 9, and 10. Note that the evolution of \(a_{\infty }\)-index is not depicted here, since the plots corresponding to all top advisors according to this index exhibit a highly similar pattern and thus would look indistinguishable in a figure.

Interestingly, from Figs. 7 and 8 one can observe that both the \(a_1\) and the adjusted \(a_1\) index values for all of the top advisors were very close to each other up to around 1970, which is a very recent date compared to the dates of their respective careers. However, in the past 3–4 decades, these indices have increased substantially and started to spread over a broader range of values, approximately between 6000 and 10,000. The ranking of advisors according to both the \(a_1\) and the adjusted \(a_1\) index has been stable over the past decades, with S.D. Poisson holding the top spot.

As for the evolution of \(a_2\) and adjusted \(a_2\) indices, there has been much more diversity and changes in the ranking of top advisors over the past decades (compared to \(a_1\)-index). As noted above, the \(a_2\)-index gives lower weights to distant descendants of an individual, which results in a lower order of magnitude of this index compared to the \(a_1\)-index. Nevertheless, despite a narrower range of values for this index, there have been several changes in the ranks of top advisors according to this index in the twentieth century (although many of these mathematicians worked in the nineteenth century). Notably, D. Hilbert has assumed the top spot in the \(a_2\)-index-based ranking only in the 1990s despite the fact that he received his own Ph.D. degree more than 100 years prior. Thus, the \(a_2\)-index can be viewed as a meaningful metric of an individual’s advising impact that lasts well beyond the end of one’s career and still can increase considerably in subsequent decades or even centuries.

As a concluding remark of this section, we should note that all of the aforementioned results should be viewed in light of the fact that the MathGenealogy dataset is not necessarily complete for the considered time interval, and some Ph.D. graduates (as well as some Ph.D. graduation year information, as mentioned above) may not have been added to the database yet. This may lead to discrepancies between the results presented here and those that may be obtained in future studies when more entries are added to the database. Nevertheless, the considered dataset is still rather large and comprehensive, and the presented results reveal interesting temporal patterns of the proposed individual and collective advising indices.

## Concluding remarks

We proposed a family of network-based advising impact metrics (*a*-indices) that are easy to calculate and interpret, and provided a flexible framework for quantifying the advising impacts of individuals from different “eras” and stages of their academic careers, as well as the collective advising impacts of countries and universities. Although we illustrated our approaches on the MathGenealogy dataset only, they are applicable to other scientific domains where comprehensive advisor–student datasets may become available.

Because we focus on the advising impact beyond the number of immediate students of an individual, this approach is not intended for measuring the advising impacts of early-career scientists (simply calculating an out-degree for “young” advisors would still be a viable option). However, one may argue that the true impact of an academic advisor becomes evident towards the later stages of a career, when one’s students achieve their own advising success. Therefore, we believe that these indices can be used in practical settings, for instance, by universities in order to quantify and promote individual and collective advising successes of their faculty members. This study shows the applicability of network-based techniques for these purposes. As one possible direction of future research, it could be of interest to look at “groups of influential advisors”, for instance, using optimization-based techniques that identify “central” groups of nodes in a network [12].

It should also be noted that this study is not intended to establish direct comparisons or preferences among different metrics of advising impact, including those proposed here or those proposed in other related studies. Instead, we believe that long-term individual or collective advising impact should be considered in the context of an “ensemble” of various quantitative metrics, including the proposed *a*-indices. Similarly to debates regarding citation indices (e.g., whether the *h*-index or some other quantitative metric of citations is the most appropriate to measure citation impact), there is no definitive answer to the question about the “best” metric for advising impact. We hope that this study will stimulate further research in this direction.

## Availability of data and materials

The dataset analyzed in this study is publicly available online at http://www.genealogy.ams.org/.

## Notes

To be more consistent with the notation for the rest of the “*a*-indices” defined here, one may denote this index as \(a_0\)-index; however, for simplicity, throughout the paper we will call this metric the “*a*-index” (which may be viewed as an analogy to the *h*-index widely used as a citation metric).

Of course, the out-degree of a node, that is, the number of advised students, is the simplest measure that assesses immediate advising impact; however, it is not in the scope of this study as it does not reflect the advising impact of descendants.

## References

1. Arslan E, Gunes MH, Yuksel M. Analysis of academic ties: a case study of mathematics genealogy. In: GLOBECOM Workshops (GC Wkshps), 2011. IEEE, 2011. p. 125–9.

2. Malmgren RD, Ottino JM, Amaral LAN. The role of mentorship in protégé performance. Nature. 2010;465(7298):622.

3. Myers SA, Mucha PJ, Porter MA. Mathematical genealogy and department prestige. Chaos. 2011;21(4):041104.

4. Gargiulo F, Caen A, Lambiotte R, Carletti T. The classical origin of modern mathematics. EPJ Data Sci. 2016;5(1):26.

5. Taylor D, Myers SA, Clauset A, Porter MA, Mucha PJ. Eigenvector-based centrality measures for temporal networks. Multiscale Model Simul. 2017;15(1):537–74.

6. Rossi L, Freire IL, Mena-Chalco JP. Genealogical index: a metric to analyze advisor–advisee relationships. J Inform. 2017;11(2):564–82.

7. Boldi P, Vigna S. Axioms for centrality. Internet Math. 2014;10(3–4):222–62.

8. Marchiori M, Latora V. Harmony in the small-world. Physica A. 2000;285(3–4):539–46.

9. Jackson MO. Social and economic networks. Princeton: Princeton University Press; 2010.

10. Tsakas N. On decay centrality. BE J Theor Econ. 2016;19:1–18.

11. Broido AD, Clauset A. Scale-free networks are rare. Nat Commun. 2019;10(1):1017.

12. Veremyev A, Prokopyev OA, Pasiliao EL. Finding groups with maximum betweenness centrality. Optim Methods Softw. 2017;32(2):369–99.

## Acknowledgements

This material is based upon work supported by the AFRL Mathematical Modeling and Optimization Institute.

### Further information

A preliminary version of this paper appeared in: Chen X., Sen A., Li W., Thai M. (eds) Computational Data and Social Networks, Proceedings of CSoNet 2018, Lecture Notes in Computer Science, vol 11280, pp. 437–449, Springer, 2018.

## Funding

The work of V. Boginski and A. Veremyev was supported in part by the U.S. Air Force Research Laboratory (AFRL) Award FA8651-16-2-0009. The work of A. Semenov was funded in part by the AFRL European Office of Aerospace Research and Development (Grant FA9550-17-1-0030).

## Author information

### Contributions

A.S. conducted the experiments. All authors contributed to the analysis of the results and to writing the paper. All authors read and approved the final manuscript.

## Ethics declarations

### Competing interests

The authors declare that they have no competing interests.

## Additional information

### Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Semenov, A., Veremyev, A., Nikolaev, A. *et al.* Network-based indices of individual and collective advising impacts in mathematics. *Comput Soc Netw* **7**, 1 (2020). https://doi.org/10.1186/s40649-019-0075-0

DOI: https://doi.org/10.1186/s40649-019-0075-0

### Keywords

- Social networks
- Big data
- Scientific advising impact
- *a*-Indices
- Mathematics Genealogy Project