Lancichinetti–Fortunato–Radicchi benchmark

[1] The advantage of the benchmark over other methods is that it accounts for the heterogeneity in the distributions of node degrees and of community sizes.

[2] The node degrees and the community sizes are distributed according to a power law, with different exponents.

The benchmark assumes that both the degree and the community size have power law distributions with different exponents,

is the number of nodes and the average degree is

This parameter controls the fraction of edges that are between communities.

[2] Thus, it reflects the amount of noise in the network.

all links are between nodes belonging to different communities.

[3] One can generate the benchmark network using the following steps.

Step 1: Generate a network with nodes following a power law distribution with exponent

to get desired average degree is

Step 3: Generate community sizes from a power law distribution with exponent

The sum of all sizes must be equal to

The minimal and maximal community sizes

Then, each node is randomly assigned to a community.

As long as the number of neighboring nodes within the community does not exceed the community size a new node is added to the community, otherwise stays out.

In the following iterations the “homeless” node is randomly assigned to some community.

If that community is complete, i.e. the size is exhausted, a randomly selected node of that community must be unlinked.

Step 5: Implement rewiring of nodes keeping the same node degrees but only affecting the fraction of internal and external links such that the number of links outside the community for each node is approximately equal to the mixing parameter

The communities of randomly chosen nodes in each iteration follow a

distribution that represents the probability that a randomly picked node is from the community

Consider a partition of the same network that was predicted by some community finding algorithm and has

The similarity of these two partitions is captured by the normalized mutual information.

the benchmark and the detected partitions are identical, and if