In e-commerce, link prediction is often a subtask for recommending items to users.
It is also used to identify hidden groups of terrorists and criminals in security-related applications.
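Consider a network $G = (V, E)$, where $V$ represents the entity nodes in the network and $E \subseteq V \times V$ represents the set of "true" links across entities in the network.
In the temporal formulation, we are given a snapshot of the set of links at time $t$, and the goal is to infer the set of true links at time $t+1$.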
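In the binary classification formulation, link prediction approaches learn a classifier that maps each potential link to a positive or negative label.
In the probability estimation formulation, link prediction approaches learn a model that maps each potential link to a probability of existence.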
The task of link prediction has attracted attention from several research communities ranging from statistics and network science to machine learning and data mining.
In statistics, generative random graph models such as stochastic block models propose an approach to generate links between nodes in a random graph.
For social networks, Liben-Nowell and Kleinberg proposed link prediction models based on different graph proximity measures.[2]
Several statistical models have been proposed for link prediction by the machine learning and data mining community.
For example, Popescul et al. proposed a structured logistic regression model that can make use of relational features.[8]
For more information on link prediction, refer to the surveys by Getoor et al.[9] and Yu et al.[10]
Several link prediction approaches have been proposed, including unsupervised approaches such as similarity measures computed on the entity attributes, random walk and matrix factorization based approaches, and supervised approaches based on graphical models and deep learning.[11]
Topology-based methods broadly assume that nodes with similar network structure are more likely to form a link.
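The common neighbors measure counts the neighbors that two nodes share. Writing $N(x)$ for the set of neighbors of node $x$, it is computed as follows: $CN(x, y) = |N(x) \cap N(y)|$. A weakness of this approach is that it does not take into account the relative number of common neighbors.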
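The common-neighbors count, and one way of addressing its weakness, can be illustrated with a minimal Python sketch. The toy graph, the dictionary-of-sets representation, and the function names are assumptions made for illustration. The normalized variant shown is the Jaccard coefficient, which is one common way to account for the relative number of shared neighbors.

```python
def common_neighbors(adj, x, y):
    """Count the neighbors shared by nodes x and y."""
    return len(adj[x] & adj[y])


def jaccard(adj, x, y):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[x] | adj[y]
    if not union:
        return 0.0  # both nodes are isolated; define the score as 0
    return len(adj[x] & adj[y]) / len(union)


# Hypothetical undirected toy graph: each node maps to its set of neighbors.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
    "e": set(),
}

cn_bd = common_neighbors(adj, "b", "d")  # b and d share the neighbors a and c
jc_bd = jaccard(adj, "b", "d")
cn_ac = common_neighbors(adj, "a", "c")
jc_ac = jaccard(adj, "a", "c")
```

In this toy graph both pairs have the same raw count, but the normalized score is higher for the pair whose neighborhoods overlap completely, which is the distinction the raw count cannot make.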
This captures a two-hop similarity, which can yield better results than simple one-hop methods.
The powers of the adjacency matrix $A$ indicate the presence (or absence) of links between two nodes through intermediaries.
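If $C_{\mathrm{Katz}}(i)$ denotes the Katz centrality of a node $i$, then mathematically: $C_{\mathrm{Katz}}(i) = \sum_{k=1}^{\infty} \sum_{j=1}^{n} \alpha^{k} (A^{k})_{ji}$, where $\alpha$ is an attenuation factor and $n$ is the number of nodes. Note that the above definition uses the fact that the element at location $(i, j)$ of $A^{k}$ reflects the total number of walks of length $k$ between nodes $i$ and $j$.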
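The Katz computation can be sketched with NumPy as follows. This is an illustrative sketch rather than a reference implementation: the attenuation factor `alpha`, the function names, and the three-node path graph are assumptions. It relies on the standard closed form of the weighted sum over powers of the adjacency matrix, $(I - \alpha A)^{-1} - I$, which is valid only when `alpha` is smaller than the reciprocal of the largest absolute eigenvalue of the adjacency matrix.

```python
import numpy as np


def katz_matrix(A, alpha):
    """Return sum_{k>=1} alpha^k A^k using the closed form (I - alpha*A)^-1 - I."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))
    if alpha * spectral_radius >= 1.0:
        raise ValueError("alpha must be smaller than 1 / spectral radius of A")
    identity = np.eye(n)
    return np.linalg.inv(identity - alpha * A) - identity


def katz_centrality(A, alpha):
    """Centrality of node i: sum over j of entry (j, i), i.e. the column sums."""
    return katz_matrix(A, alpha).sum(axis=0)


# Hypothetical path graph 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

pair_scores = katz_matrix(A, alpha=0.2)     # entry (i, j) scores the pair (i, j)
centrality = katz_centrality(A, alpha=0.2)  # one value per node
```

For link prediction, the entry `pair_scores[i, j]` can serve as a score for the candidate link between nodes `i` and `j`, since it accumulates walks of every length between them with weights that shrink as the walks get longer.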
Node-similarity methods predict the existence of a link based on the similarity of the node attributes.
After normalizing the attribute values, computing the cosine between the two vectors is a good measure of similarity, with higher values indicating higher similarity.
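A minimal sketch of the cosine measure on two attribute vectors, using NumPy. The function name and the example vectors are assumptions, and any normalization of the raw attribute values is assumed to have happened beforehand.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between attribute vectors u and v."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    if denom == 0.0:
        return 0.0  # at least one vector is all zeros; define the score as 0
    return float(u @ v / denom)


# Hypothetical, already-normalized attribute vectors for three nodes.
node_a = [0.9, 0.1, 0.4]
node_b = [0.8, 0.2, 0.5]
node_c = [0.0, 1.0, 0.0]

sim_ab = cosine_similarity(node_a, node_b)
sim_ac = cosine_similarity(node_a, node_c)
```

Under this measure, the pair with the higher score would be considered the more likely link.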
Graph embeddings also offer a convenient way to predict links.
One can then use other machine learning techniques to predict edges on the basis of vector similarity.
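As a rough illustration of embedding-based scoring, the sketch below derives node vectors from a truncated singular value decomposition of the adjacency matrix and ranks unlinked pairs by the dot product of their vectors. The choice of embedding method, the dimension, the toy graph, and all names are assumptions; learned embeddings could be substituted for the factorization step without changing the scoring step.

```python
import numpy as np

# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0


def svd_embedding(A, dim):
    """Embed each node as a row vector built from the top `dim` singular directions."""
    U, s, _ = np.linalg.svd(A)
    return U[:, :dim] * np.sqrt(s[:dim])


Z = svd_embedding(A, dim=2)
scores = Z @ Z.T  # dot-product similarity between every pair of node vectors

# Rank the node pairs that are not yet linked, most similar first.
candidates = [(i, j) for i in range(n) for j in range(i + 1, n) if A[i, j] == 0]
ranked = sorted(candidates, key=lambda pair: scores[pair], reverse=True)
```

The ranked list could be thresholded directly, or the pairwise vectors could be passed as features to a separate classifier.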
A probabilistic relational model (PRM) specifies a template for a probability distribution over databases.
Hinge-loss Markov random fields (HL-MRFs) are created by a set of templated first-order logic-like rules, which are then grounded over the data.
While probabilistic soft logic (PSL) can incorporate local predictors, such as cosine similarity, it also supports relational rules, such as triangle completion in a network.
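To make the triangle completion rule concrete, the sketch below evaluates one grounding of a rule of the form Link(a, b) and Link(b, c) implies Link(a, c) on soft truth values in [0, 1], using the Łukasiewicz relaxation on which hinge-loss potentials are based. It is a hand-written illustration, not the PSL software's API: the function name, the rule weight, and the truth values are assumptions.

```python
def triangle_rule_penalty(link_ab, link_bc, link_ac, weight=1.0):
    """Weighted distance to satisfaction of: Link(a,b) AND Link(b,c) -> Link(a,c).

    Under the Lukasiewicz relaxation, the body's truth value is
    max(0, link_ab + link_bc - 1), and the rule is violated by the amount
    the body exceeds the head, which simplifies to the hinge below.
    """
    return weight * max(0.0, link_ab + link_bc - 1.0 - link_ac)


# Hypothetical soft truth values for one grounding of the rule.
violated = triangle_rule_penalty(1.0, 1.0, 0.0)   # both body links hold, head absent
satisfied = triangle_rule_penalty(1.0, 1.0, 1.0)  # triangle is closed
partial = triangle_rule_penalty(0.9, 0.8, 0.2)    # soft values, partly violated
inactive = triangle_rule_penalty(0.3, 0.4, 0.0)   # body too weak to trigger the rule
```

Inference then amounts to choosing truth values for the unobserved links that minimize the sum of such penalties over all groundings, which is why a strongly supported open triangle pushes the missing link's value upward.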
Markov logic networks (MLNs) are defined by templated first-order logic-like rules, which are then grounded over the training data.
MLNs are able to incorporate both local and relational rules for the purpose of link prediction.[15]
R-Models (RMLs) are neural network models created to provide a deep learning approach to the link weight prediction problem.[17]
A common application of link prediction is improving similarity measures for collaborative filtering approaches to recommendation.
Link prediction is also frequently used in social networks to suggest friends to users.
Some authors have used context information in network structured domains to improve entity resolution.