Factorization threshold models for scale-free networks generation
Akmal Artikov^{1, 3},
Aleksandr Dorodnykh^{1},
Yana Kashinskaya^{1, 2} and
Egor Samosvat^{3}
Received: 21 February 2016
Accepted: 4 August 2016
Published: 22 August 2016
Abstract
Background
Several models for producing scale-free networks have been suggested; most of them are based on the preferential attachment approach. In this article, we suggest a new approach for generating scale-free networks with an alternative source of the power-law degree distribution.
Methods
The model derives from matrix factorization methods and geographical threshold models that were recently proven to show good results in generating scale-free networks. We associate each node with a vector having latent features distributed over a unit sphere and with a weight variable sampled from a Pareto distribution. We join two nodes by an edge if they are spatially close and/or have large weights.
Results and conclusion
The network produced by this approach is scale-free and has a power-law degree distribution with an exponent of 2. In addition, we propose an extension of the model that allows us to generate directed networks with tunable power-law exponents.
Background
Most social, biological, topological and technological networks display distinct nontrivial topological features, demonstrating that connections between the nodes are neither purely regular nor purely random [1]. Such systems are called complex networks. One of the well-known and well-studied classes of complex networks is scale-free networks, whose degree distribution P(k) follows a power law \(P(k) \sim k^{-\alpha }\), where \(\alpha \) is a parameter whose value is typically in the range \(2< \alpha < 3\). Many real networks have been reported to be scale-free [2].
Generating scale-free networks is an important problem because they usually have useful properties, such as high clustering [3], robustness to random attacks [4] and easily achievable synchronization [5]. Several models for producing scale-free networks have been suggested; most of them are based on the preferential attachment approach [1]. This approach forces existing nodes of higher degree to gain newly added edges more rapidly, in a “rich-get-richer” manner. This paper offers a model with an alternative explanation of the scale-free property.
Our approach is inspired by matrix factorization, a machine learning method that has been used successfully for link prediction [6]. The main idea is to approximate a network adjacency matrix by a product of matrices V and \(V^T\), where V is the matrix of nodes’ latent features vectors. To create a generative model of scale-free networks, we sample latent features V from some probabilistic distribution and try to generate a network adjacency matrix. Two nodes are connected by an edge if the dot product of their latent features exceeds some threshold. This threshold condition is influenced by the geographical threshold models that are applied to scale-free network generation [7]. Because of the methods used (adjacency matrix factorization and threshold condition), we call our model the factorization threshold model.
A network produced in such a way is scale-free and follows a power-law degree distribution with an exponent of 2, which differs from the results for basic preferential attachment models [8–10], where the exponent equals 3. We also suggest an extension of our model that allows us to generate directed networks with a tunable power-law exponent.
This paper is organized as follows. “Related work” section provides information about related works that inspired us. The formal description of our model in the case of an undirected fixed size network is presented in “Model description” section, which is followed by a discussion of how to generate growing networks. In “Generating sparse networks” section, the problem of making resulting networks sparse is considered. “Degree distribution” section shows that our model indeed produces scale-free networks. Extensions of our model, which allow us to generate directed networks with a tunable power-law exponent and some other interesting properties, are discussed in “Model modifications” section. “Conclusion” section concludes the paper.
Related work
In this section, we consider related works that encouraged us to create a new model for complex networks generation.
Matrix factorization
Matrix factorization is a group of algorithms where a given matrix R is factorized into two smaller matrices Q and P such that: \(R \approx Q^TP\) [11].
There is a popular approach in recommendation systems which is based on matrix factorization [12]. Assume that users express their preferences by rating some items; this can be viewed as an approximate representation of their interests. Combining known ratings, we get a partially filled matrix R, and the idea is to approximate unknown ratings using matrix factorization \(R \approx Q^TP\). A geometrical interpretation is the following. The rows of matrices Q and P can be seen as latent features vectors \(\vec {q}_i\) and \(\vec {p}_u\) of items and users, respectively. The dot product \((\vec {q}_i, \vec {p}_u)\) captures an interaction between a user u and an item i, and it should approximate the rating of the item i by the user u: \(R_{ui} \approx (\vec {q}_i, \vec {p}_u)\). Mapping each user and item to latent features is treated as an optimization problem of minimizing the distance between R and \(Q^TP\), which is usually solved using stochastic gradient descent (SGD) or alternating least squares (ALS) methods.
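The SGD variant mentioned above can be sketched in a few lines of NumPy. This is only an illustration under our own assumptions, not the procedure of [12]: the function name `factorize_sgd`, the learning rate, the regularization weight and the toy ratings are all hypothetical, and the latent vectors are stored as rows so that `R[u, i]` is approximated by the dot product of `P[u]` and `Q[i]`.

```python
import numpy as np

def factorize_sgd(R, observed, k=2, lr=0.02, reg=0.0, epochs=2000, seed=0):
    """Fit R[u, i] ~ P[u] . Q[i] on the observed entries with plain SGD.

    R        : (n_users, n_items) array of ratings
    observed : boolean array of the same shape, True where a rating is known
    k        : number of latent features (hypothetical default)
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = 0.1 * rng.standard_normal((n_users, k))  # user latent vectors (rows)
    Q = 0.1 * rng.standard_normal((n_items, k))  # item latent vectors (rows)
    pairs = np.argwhere(observed)                # (u, i) index pairs of known ratings
    for _ in range(epochs):
        rng.shuffle(pairs)                       # visit known ratings in random order
        for u, i in pairs:
            err = R[u, i] - P[u] @ Q[i]          # residual of the squared-error loss
            p_old = P[u].copy()
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

# Toy usage: a rank-1 rating matrix with one rating treated as unknown.
R = np.outer([0.5, 1.0, 1.5], [1.0, 2.0])
observed = np.ones_like(R, dtype=bool)
observed[2, 1] = False
P, Q = factorize_sgd(R, observed)
print(np.round(P @ Q.T, 2))  # predictions for all user/item pairs
```

With more latent features than the data require, the prediction for the hidden entry is not unique, which is one reason practical systems add regularization (`reg` here).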
Furthermore, matrix factorization was suggested to be used for link prediction in networks [6]. Link prediction refers to the problem of finding missing or hidden links which probably exist in a network [13]. In [6] it is solved via matrix factorization: a network adjacency matrix A is approximated by a product of the matrices V and \(V^T\), where V is the matrix of nodes’ latent features.
Geographical threshold models
Geographical threshold models were recently proven to have good results in scale-free networks generation [7]. We are going to briefly summarize one variation of these models [14].
First, an exponential distribution of weights with the inverse scale parameter \(\lambda \) has been studied. This distribution of weights leads to scale-free networks with a power-law exponent of 2: \(P(k) \propto k^{-2}\). It is interesting that the exponent of the power law does not depend on \(\lambda \), d and \(\beta \) in this case. Second, a Pareto weight distribution with scale parameter \(w_0\) and shape parameter a has been considered. In this case, a tunable power-law degree distribution has been achieved: \(P(k) \propto k^{-1 - \frac{a \beta }{d} }\).
There are other variations of this approach: uniform distribution of coordinates in the \(d\)-dimensional unit cube [15], lattice-based models [16, 17] and even networks embedded in fractal space [18].
Model description
We study matrix factorization theoretically by turning it from a trainable supervised model into a generative probabilistic model. When matrix factorization is used in machine learning, the adjacency matrix A is given and the goal is to train the model by tuning the matrix of latent features V in such a way that \(A \approx V^T V\). In our model, we do the reverse: latent features V are sampled from some probabilistic distribution and we generate a network adjacency matrix A based on \(V^T V\).

The network has n nodes, and each node is associated with a d-dimensional latent features vector \(\vec {v_i}\).

Each latent features vector \(\vec {v_i}\) is a product of a weight \(w_i\) and a direction \(\vec {x_i}\).

Directions \(\vec {x_i}\) are i.i.d. random vectors uniformly distributed over the surface of the \((d-1)\)-sphere.

Weights are i.i.d. random variables distributed according to a Pareto distribution with the following density function f(w):$$\begin{aligned} f(w) = \frac{a}{w_0} {\left( \frac{w_0}{w}\right) }^{a + 1}\; (w \ge w_0). \end{aligned}$$(2)

An edge between nodes i and j appears if the dot product of their latent features vectors \((\vec {v_i}, \vec {v_j})\) exceeds a threshold parameter \(\theta \).
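The five rules above translate almost directly into a sampler. The following is a minimal sketch rather than the authors' implementation: the function name and default parameter values are hypothetical, directions are drawn by normalizing Gaussian vectors (a standard way to obtain a uniform distribution on the sphere), and Pareto weights are drawn by inverse-transform sampling.

```python
import numpy as np

def sample_factorization_threshold_graph(n, d=3, a=2.0, w0=1.0, theta=3.0, seed=0):
    """Sample one undirected graph from the fixed-size model described above.

    Returns the boolean adjacency matrix, the weights and the directions.
    """
    rng = np.random.default_rng(seed)
    # Directions: normalized Gaussian vectors are uniform on the (d-1)-sphere.
    x = rng.normal(size=(n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    # Weights: if U is uniform on (0, 1], then w0 * U**(-1/a) has density f(w).
    u = 1.0 - rng.random(n)
    w = w0 * u ** (-1.0 / a)
    # Latent vectors v_i = w_i * x_i and their pairwise dot products.
    v = w[:, None] * x
    gram = v @ v.T
    # Edge between i and j when the dot product exceeds the threshold.
    adj = gram > theta
    np.fill_diagonal(adj, False)  # no self-loops
    return adj, w, x

adj, w, x = sample_factorization_threshold_graph(n=1000)
degrees = adj.sum(axis=1)
print("edges:", int(adj.sum()) // 2, "max degree:", int(degrees.max()))
```

The dense Gram matrix costs quadratic memory in n, which is acceptable for small experiments but would need a blocked or spatially indexed variant for large networks.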
We have defined our model for fixed size networks, but in principle, it can be generalized to the case of growing networks. The problem is that, with a fixed threshold \(\theta \), the network with high probability approaches a complete graph as its size tends to infinity, whereas real networks are usually sparse.
Therefore, to introduce growing factorization threshold models, we use a threshold function \(\theta := \theta (n)\) which depends on the number of nodes n in the network. Then for every network size n we have the same parameters except for the threshold \(\theta \). This means that at every step, when a new node is added to the graph, some of the existing edges are removed. In the next section, we look for threshold functions which lead to sparse networks.
To preserve readability of the proofs, we consider only the case \(d = 3\), because proofs for higher dimensions can be derived in a similar way. However, we give not only mean-field approximations but also strict probabilistic proofs, which to the best of our knowledge have not yet been given for geographical threshold models and can likely be applied in other works as well.
Generating sparse networks
The aim of this section is to model sparse growing networks. To do this, we need to find a proper threshold function.
Analysis of the expected number of edges
Let M(n) denote the number of edges in the network of size n. To find its expectation, we need the two following lemmas.
Lemma 1
Lemma 2
To improve readability, we moved the proofs of Lemmas 1 and 2 to Appendix.
The next theorem shows that the expected number of edges in our model can follow any growth rate that is less than quadratic.
Theorem 1
Proof It is easy to check that \(P_{e}\) is a continuous function of \(\theta \). The intermediate value theorem states that \(P_{e}(\theta )\) takes any value between \(P_{e}(\theta = 0) = 1/2\) and \(P_{e}(\theta = \infty ) = 0\) at some point within the interval.
Since \(R(n) = o(n^2)\) and positive, there exists N such that for all \(n \ge N\), \(0< R(n) < \frac{1}{2} \times \frac{n(n-1)}{2}\).
It means that the equation \(\mathrm {E}M(n) = R(n)\) is feasible for all \(n \ge N\). \(\square \)
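An empirical counterpart of this existence argument: for one fixed sample of latent vectors, the number of edges is a non-increasing step function of \(\theta \), so a threshold hitting a prescribed edge count can be read off the sorted pairwise dot products. The sketch below is our own illustration and works with a realized edge count, not with the expectation \(\mathrm {E}M(n)\) used in Theorem 1; the function name is hypothetical.

```python
import numpy as np

def threshold_for_edge_count(v, target_edges):
    """Return a threshold theta that yields exactly `target_edges` edges.

    v is an (n, d) array of latent vectors. Assumes all pairwise dot products
    are distinct, which holds with probability one for continuous samples.
    """
    n = v.shape[0]
    iu = np.triu_indices(n, k=1)            # each unordered pair once
    dots = np.sort((v @ v.T)[iu])[::-1]     # pairwise dot products, descending
    m = int(target_edges)
    if not 0 < m < dots.size:
        raise ValueError("target_edges must lie strictly between 0 and n(n-1)/2")
    # Any value between the m-th and (m+1)-th largest dot product works.
    return 0.5 * (dots[m - 1] + dots[m])

rng = np.random.default_rng(0)
v = rng.normal(size=(40, 3))
theta = threshold_for_edge_count(v, 50)
print(theta)
```

If the target exceeds the number of pairs with a positive dot product, the returned threshold is negative; this mirrors the restriction in the proof that \(R(n)\) stay below half the number of pairs when \(\theta \ge 0\).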
Taking into account Theorem 1, we obtain parameters for the linearithmic and linear growths of the expected number of edges.
Theorem 2
Theorem 3
Suppose that the growth of the model is sublinearithmic: \({\frac{\mathrm {E}M(n)}{n\ln n} = o(1)}\), then \({\frac{n^{\frac{1}{a}}}{\theta (n)} = o(1)}\).
Concentration theorem
In this section, we find the variance of the number of edges and prove the concentration theorem.
Proofs of the following lemmas can be found in the Appendix.
Lemma 3
Lemma 4
Combining these results, we get the following theorem, which will be needed to prove the concentration theorem.
Theorem 4
Theorem 5
Combining Theorems 2, 3 and 5, we obtain the following corollary.
Corollary 1

The threshold function \(\theta (n)\) equals \(D n^{\frac{1}{a}}\)

\(\frac{n}{\mathrm {E}M(n)} = O(1)\) and \(\frac{\mathrm {E}M(n)}{n\ln n} = o(1)\)
In this way, we have proved that the number of edges in the graph does not deviate much from its expected value. It means that if the expected number of edges grows linearithmically or sublinearithmically, the actual number of edges grows at the same rate.
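The growth of the edge count under the threshold function \(\theta (n) = D n^{\frac{1}{a}}\) can be inspected numerically. The snippet below is a hedged sketch, not a verification of the theorems: the function name, the values of a, \(w_0\), D and the network sizes are arbitrary choices of ours, and no particular output is claimed.

```python
import numpy as np

def count_edges(n, a=2.0, w0=1.0, D=1.0, d=3, seed=0):
    """Sample one graph with theta(n) = D * n**(1/a) and return its edge count."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)   # uniform directions
    w = w0 * (1.0 - rng.random(n)) ** (-1.0 / a)    # Pareto weights
    theta = D * n ** (1.0 / a)                      # growing threshold
    v = w[:, None] * x
    return int(np.triu(v @ v.T > theta, k=1).sum()) # count each pair once

for n in (250, 500, 1000, 2000):
    m = count_edges(n)
    # Ratios to n and to n*ln(n) make the growth rate easier to judge.
    print(n, m, round(m / n, 3), round(m / (n * np.log(n)), 3))
```

Averaging over several seeds per size gives a rough picture of how tightly the edge count concentrates around its mean.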
Degree distribution
In this section, we show that our model follows a power-law degree distribution with an exponent of 2 and give two proofs. The first is a mean-field approximation, which is usually applied for quickly checking hypotheses. The second one is a strict probabilistic proof. To the best of our knowledge, it has not yet been considered in the context of the geographical threshold models.
Theorem 6
Mean-field approximation
Note that we have not used the conditions on k(n) and \(\theta (n)\) yet; they are needed to estimate residual terms in the rigorous proof.
Note that regardless of the shape parameter of the Pareto distribution of weights, we always generate networks with a degree distribution following a power law with an exponent equal to 2. In the next section, we modify our model to change the exponent of the degree distribution and some other properties of the resulting networks.
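One way to probe the claimed exponent on simulated degrees is a maximum-likelihood fit of the tail. The sketch below uses the continuous estimator \(\hat{\alpha } = 1 + m / \sum \ln (x_i / x_{\min })\) discussed in [2]; applying it to integer degrees is only an approximation, and the function name and the cut-off `x_min` are our own assumptions.

```python
import numpy as np

def powerlaw_exponent_mle(values, x_min):
    """Continuous MLE of alpha for a density proportional to x**(-alpha), x >= x_min."""
    tail = np.asarray(values, dtype=float)
    tail = tail[tail >= x_min]               # keep only the tail above the cut-off
    if tail.size == 0:
        raise ValueError("no observations at or above x_min")
    return 1.0 + tail.size / np.log(tail / x_min).sum()

# Sanity check on synthetic data with a known exponent of 2:
# if U is uniform on (0, 1], then 1 / U has density x**(-2) on [1, inf).
rng = np.random.default_rng(0)
sample = 1.0 / (1.0 - rng.random(100_000))
print(powerlaw_exponent_mle(sample, x_min=1.0))
```

For a simulated network, `values` would be the vector of node degrees and `x_min` a degree above which the power-law regime is assumed to hold; the estimate is sensitive to that choice.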
Model modifications
In this section, we will show how to modify our model to get new properties and how these modifications will affect the degree distribution.
Directed network
Theorem 7
Proof Here is a proof for the out-degree distribution. The case of the in-degree distribution is similar.
With \(\alpha = \beta \) this model turns into the undirected case with a power-law exponent equal to 2, which agrees with Theorem 6.
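A directed sampler can be sketched as follows. The connection rule used here, an edge from i to j when \(w_i^{\alpha } w_j^{\beta } (\vec {x_i}, \vec {x_j}) \ge \theta \), is inferred from the quantity \(\frac{\theta }{w^\alpha (w')^\beta }\) that appears in the proofs below and should be read as an assumption; the function name and default values are hypothetical.

```python
import numpy as np

def sample_directed_graph(n, alpha, beta, theta, a=2.0, w0=1.0, d=3, seed=0):
    """Sample a directed graph; adj[i, j] is True for an edge from i to j.

    Assumed rule: w_i**alpha * w_j**beta * (x_i, x_j) >= theta.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)   # uniform directions
    w = w0 * (1.0 - rng.random(n)) ** (-1.0 / a)    # Pareto weights
    cos = x @ x.T                                   # dot products of directions
    scale = np.outer(w ** alpha, w ** beta)         # scale[i, j] = w_i^alpha * w_j^beta
    adj = scale * cos >= theta
    np.fill_diagonal(adj, False)                    # no self-loops
    return adj

adj = sample_directed_graph(1000, alpha=1.5, beta=0.5, theta=5.0)
out_degree = adj.sum(axis=1)   # edges leaving each node
in_degree = adj.sum(axis=0)    # edges entering each node
print(int(out_degree.max()), int(in_degree.max()))
```

Setting `alpha == beta` makes the rule symmetric in i and j, so the adjacency matrix is symmetric and the graph can be treated as undirected.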
Functions of dot product
In our model, because of the condition \({w_i w_j (\vec {x_i}, \vec {x_j}) \ge \theta \ge 0}\), node \(\vec {v_i}\) can be connected to node \(\vec {v_j}\) only if the angle between \(\vec {x_i}\) and \(\vec {x_j}\) is less than \(\pi /2\). This constraint on the possible neighbors of a node restricts the scope of our model.
Theorem 8
Short scheme of proof
Here is the scheme of proof for the out-degree distribution. The case of the in-degree is similar.
 A. The first case is \([r, q] \subset (0, +\infty )\). If \(\frac{\theta }{w^\alpha (w')^\beta } \in [r, q]\), then we may invert h and the inner integral I is equal to \(2\pi \left( 1 - h^{-1}\left( \frac{\theta }{w^\alpha (w')^\beta }\right) \right) \). If \(\frac{\theta }{w^\alpha (w')^\beta } > q\), then the inequality (15) is not satisfied and \(I=0\). If \(0< \frac{\theta }{w^\alpha (w')^\beta } < r\), then the inequality (15) is satisfied for any pair of x and \(x'\), and \(I = 4\pi \), the surface area of \(S^2\).
To deal with \(P_{e}(w)\), we need to compare \(w_0\) with the boundaries of each range of \(\frac{\theta }{w^\alpha (w')^\beta }\).
 1. If \(w_0 < \frac{\theta ^{1/\beta }}{w^{\alpha /\beta } q^{1/\beta }}\), then $$\begin{aligned} P_{e}(w) &= \int\limits _{w_0}^{\frac{\theta ^{1/\beta }}{w^{\alpha /\beta } q^{1/\beta }}} 0 \mathrm {d}w' + \int\limits _{\frac{\theta ^{1/\beta }}{w^{\alpha /\beta } q^{1/\beta }}}^{\frac{\theta ^{1/\beta }}{w^{\alpha /\beta } r^{1/\beta }}} \frac{aw_0^a}{(w')^{a+1}} \frac{1}{2} \left[ 1 - h^{-1}\left( \frac{\theta }{w^\alpha (w')^\beta }\right) \right] \mathrm {d}w' \\& \quad + \int\limits _{\frac{\theta ^{1/\beta }}{w^{\alpha /\beta } r^{1/\beta }}}^{\infty } 4\pi \frac{aw_0^a}{(w')^{a+1}} \mathrm {d}w'. \end{aligned}$$
 2. If \(\frac{\theta ^{1/\beta }}{w^{\alpha /\beta } q^{1/\beta }} \le w_0 < \frac{\theta ^{1/\beta }}{w^{\alpha /\beta } r^{1/\beta }}\), then $$\begin{aligned} P_{e}(w) = \int\limits _{w_0}^{\frac{\theta ^{1/\beta }}{w^{\alpha /\beta } r^{1/\beta }}} \frac{aw_0^a}{(w')^{a+1}} \frac{1}{2} \left[ 1 - h^{-1}\left( \frac{\theta }{w^\alpha (w')^\beta }\right) \right] \mathrm {d}w' + \int\limits _{\frac{\theta ^{1/\beta }}{w^{\alpha /\beta } r^{1/\beta }}}^{\infty } 4\pi \frac{aw_0^a}{(w')^{a+1}} \mathrm {d}w'. \end{aligned}$$
 3. The last case is \(w_0 \ge \frac{\theta ^{1/\beta }}{w^{\alpha /\beta } r^{1/\beta }}\). But \(\theta (n)\) grows with n, and for big enough n this inequality will not be satisfied.
 B. The second case is \([r, q] \not \subset (0, +\infty )\), which implies \(r \le 0\). If \(\frac{\theta }{w^\alpha (w')^\beta } \in (0, q]\), then \(I=2\pi \left( 1 - h^{-1}\left( \frac{\theta }{w^\alpha (w')^\beta }\right) \right) \). If \(\frac{\theta }{w^\alpha (w')^\beta } > q\), then \(I=0\). This gives $$\begin{aligned} P_{e}(w) = \int\limits _{\max (w_0, \frac{\theta ^{1/\beta }}{w^{\alpha /\beta } q^{1/\beta }})}^{\infty } \frac{aw_0^a}{(w')^{a+1}} \frac{1}{2} \left[ 1 - h^{-1}\left( \frac{\theta }{w^\alpha (w')^\beta }\right) \right] \mathrm {d}w'. \end{aligned}$$
It remains only to show that \(P_{out}(k) = k^{-2}(1+o(1))\). But now it is easy to see that the influence of every kind of the principal parts of the integral for \(P_{e}(w)\) has been already examined in previous theorems for degree distributions. For example, $$\begin{aligned} \int\limits _{\frac{\theta ^{1/\beta }}{w^{\alpha /\beta } q^{1/\beta }}}^{\frac{\theta ^{1/\beta }}{w^{\alpha /\beta } r^{1/\beta }}} \frac{aw_0^a}{(w')^{a+1}} \frac{1}{2} \left[ 1 - h^{-1}\left( \frac{\theta }{w^\alpha (w')^\beta }\right) \right] \mathrm {d}w' = \frac{w_0^a w^{2a\alpha /\beta }}{\beta \theta ^{a/\beta }}\int\limits _{r}^{q} (1 - h^{-1}(t)) t^{a/\beta - 1} \mathrm {d}t, \end{aligned}$$ which is proportional to the one we got in Theorem 7. Therefore, we do not give additional details here. \(\square \)
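As a brief worked check of the inner integral used in case A, assume (as the examples \(e^x\) and \(x^{2m+1} + c\) suggest) that h is increasing, so that \(h((\vec {x}, \vec {x}')) \ge t\) is equivalent to \((\vec {x}, \vec {x}') \ge c\) with \(c = h^{-1}(t)\). For a fixed unit vector \(\vec {x}\), the admissible \(\vec {x}'\) form a spherical cap on \(S^2\), whose area in polar coordinates around \(\vec {x}\) is $$\begin{aligned} I = \int\limits _{0}^{2\pi } \int\limits _{0}^{\arccos c} \sin \varphi \, \mathrm {d}\varphi \, \mathrm {d}\psi = 2\pi (1 - c). \end{aligned}$$ Dividing by the total area \(4\pi \) of \(S^2\) gives the factor \(\frac{1}{2} \left[ 1 - h^{-1}(t) \right] \) that multiplies the Pareto density in the integrands for \(P_{e}(w)\).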
For example, the described class of functions contains functions like \(e^x\) and \({x^{2m+1} + c}\), \({m \in \mathbb {N}}\), for a proper constant c.
Of course, this small class is not the only one whose functions h(x) have no influence on the degree distribution. For example, it is easy to show that \(h(x) = x^{2m}, m \in \mathbb {N}\) also has this property. In that case, the proof differs only in the computation of \(P_{e}(w)\).
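The effect of applying a function h to the dot product can be illustrated on two antipodal nodes. The rule assumed in this sketch, \(w_i^{\alpha } w_j^{\beta } \, h((\vec {x_i}, \vec {x_j})) \ge \theta \), is our reading of the inequality used in the proof above, and the function name and numerical values are hypothetical.

```python
import numpy as np

def connect_with_h(w, x, h, theta, alpha=1.0, beta=1.0):
    """adj[i, j] is True when w_i**alpha * w_j**beta * h((x_i, x_j)) >= theta."""
    cos = np.clip(x @ x.T, -1.0, 1.0)   # guard against rounding outside [-1, 1]
    adj = np.outer(w ** alpha, w ** beta) * h(cos) >= theta
    np.fill_diagonal(adj, False)        # no self-loops
    return adj

# Two nodes with opposite directions (angle pi) and equal weights.
x = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])
w = np.array([2.0, 2.0])

print(connect_with_h(w, x, lambda c: c, theta=1.0)[0, 1])       # plain dot product
print(connect_with_h(w, x, lambda c: c ** 2, theta=1.0)[0, 1])  # h(x) = x^2
print(connect_with_h(w, x, np.exp, theta=1.0)[0, 1])            # h(x) = e^x
```

With the plain dot product the product is \(2 \cdot 2 \cdot (-1) = -4\), below the threshold, whereas \(h(x) = x^2\) gives 4 and \(h(x) = e^x\) gives \(4/e \approx 1.47\), so both connect the pair: such functions lift the restriction to angles below \(\pi /2\).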
Conclusion
In our work, we suggest a new model for scale-free networks generation, which is based on matrix factorization and has a geographical interpretation. We formalize it for fixed size and growing networks. We prove and validate empirically that the degree distribution of the resulting networks obeys a power law with an exponent of 2.
We also consider several extensions of the model. First, we study the case of a directed network and obtain a power-law degree distribution with a tunable exponent. Then, we apply different functions to the dot product of latent features vectors, which gives us modifications with interesting properties.
Further research could focus on the deep study of latent features vectors distribution. It seems that not only a uniform distribution over the surface of the sphere should be considered because, for example, cities are not uniformly distributed over the surface of Earth. Besides, we want to try other distributions of weights.
Declarations
Authors' contributions
This work is the result of a close joint effort in which all authors contributed almost equally to defining and shaping the problem definition, proofs, algorithms, and manuscript. The research would not have been conducted without the participation of any of the authors. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
 1. Albert R, Barabási AL. Statistical mechanics of complex networks. Rev Mod Phys. 2002;74(1):47.
 2. Clauset A, Shalizi CR, Newman ME. Power-law distributions in empirical data. SIAM Rev. 2009;51(4):661–703.
 3. Colomer-de-Simon P, Boguñá M. Clustering of random scale-free networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2012;86:026120 (preprint arXiv:1205.2877).
 4. Callaway DS, Newman ME, Strogatz SH, Watts DJ. Network robustness and fragility: percolation on random graphs. Phys Rev Lett. 2000;85(25):5468.
 5. Moreno Y, Pacheco AF. Synchronization of Kuramoto oscillators in scale-free networks. EPL (Europhys Lett). 2004;68(4):603.
 6. Menon AK, Elkan C. Link prediction via matrix factorization. In: Gunopulos D, Hofmann T, Malerba D, Vazirgiannis M, editors. Machine learning and knowledge discovery in databases: European conference, ECML PKDD 2011, Athens, September 5–9, 2011, Proceedings, Part II. Berlin: Springer; 2011. p. 437–52.
 7. Hayashi Y. A review of recent studies of geographical scale-free networks. arXiv preprint physics/0512011; 2005.
 8. Barabási AL, Albert R. Emergence of scaling in random networks. Science. 1999;286(5439):509–12.
 9. Bollobás B, Riordan O, Spencer J, Tusnády G, et al. The degree sequence of a scale-free random graph process. Random Struct Algorithms. 2001;18(3):279–90.
 10. Holme P, Kim BJ. Growing scale-free networks with tunable clustering. Phys Rev E. 2002;65(2):026107.
 11. Lee DD, Seung HS. Algorithms for non-negative matrix factorization. In: Advances in neural information processing systems; 2001. p. 556–562.
 12. Koren Y, Bell R, Volinsky C. Matrix factorization techniques for recommender systems. Computer. 2009;8:30–7.
 13. Liben-Nowell D, Kleinberg J. The link-prediction problem for social networks. J Am Soc Inf Sci Technol. 2007;58(7):1019–31.
 14. Masuda N, Miwa H, Konno N. Geographical threshold graphs with small-world and scale-free properties. Phys Rev E. 2005;71(3):036108.
 15. Morita S. Crossovers in scale-free networks on geographical space. Phys Rev E. 2006;73(3):035104.
 16. Rozenfeld AF, Cohen R, ben-Avraham D, Havlin S. Scale-free networks on lattices. Phys Rev Lett. 2002;89(21):218701.
 17. Warren CP, Sander LM, Sokolov IM. Geography in a scale-free network model. Phys Rev E. 2002;66(5):056105.
 18. Yakubo K, Korošak D. Scale-free networks embedded in fractal space. Phys Rev E. 2011;83(6):066111.