A method for pruning dense networks to highlight key linksRelationships among a set of elements are often represented as a square matrix with entries representing the relations between all pairs of the elements.
Relations such as distances, dissimilarities, similarities, relatedness, correlations, co-occurrences, conditional probabilities, etc., can be represented by such matrices.
Such data can also be represented as networks with weighted links between the elements.
Such matrices and networks are extremely dense and are not easily apprehended without some form of data reduction or pruning.
A pathfinder network results from applying a pruning method that removes weaker links from a (usually dense) network according to the lengths of alternative paths (see below).
[1][2][3] It is used as a psychometric scaling method based on graph theory and used in the study of expertise, education,[4] knowledge acquisition, mental models,[5] and knowledge engineering.
It is also employed in generating communication networks,[6] software debugging,[7] visualizing scientific citation patterns,[8] information retrieval, and other forms of data visualization.
It helps to simplify the collection of connections involved which is valuable in data visualization and in comprehending essential relations among the elements represented in the network.
Several psychometric scaling methods start from pairwise data and yield structures revealing the underlying organization of the data.
Data clustering and multidimensional scaling are two such methods.
Network scaling represents another method based on graph theory.
Pathfinder networks are derived from matrices of data for pairs of entities.
Because the algorithm uses distances, similarity data are inverted to yield dissimilarities for the computations.
The links in the network will be undirected if the proximities are symmetrical for every pair of entities.
If the proximities are not symmetrical for every pair, the links will be directed.
is simply the sum of the distances of the links in the path.
A link is pruned if its distance is greater than the minimum distance of paths between the nodes connected by the link.
Efficient methods for finding minimum distances include the Floyd–Warshall algorithm (for
Both of the parameters have the effect of decreasing the number of links in the network as their values are increased.
The network with the minimum number of links is obtained when
With ordinal-scale data (see level of measurement), the
would result from any positive monotonic transformation of the proximity data.
require data measured on a ratio scale.
parameter can be varied to yield the desired number of links in the network or to focus on more local relations with smaller values of
Essentially, pathfinder networks preserve the shortest possible paths given the data.
will be the minimum spanning tree for the links defined by the proximity data if a unique minimum spanning tree exists.
includes all of the links in any minimum spanning tree.
Here is an example of an undirected pathfinder network derived from average ratings of a group of biology graduate students.
Further information on pathfinder networks and several examples of the application of PFnets to a variety of problems can be found in the references.
Three papers describing fast implementations of pathfinder networks: (The two variants by Quirin et al. are significantly faster.