Peer-to-peer file sharing (P2P) systems like Gnutella, KaZaA, and eDonkey/eMule have become extremely popular in recent years, with an estimated user population in the millions.
[1] Users of file sharing networks, such as eMule and Gnutella, are subject to monitoring of their activity.
[1] Clients may also share their private files to the network without notice due to inappropriate settings.
[2] Much is known about the network structure, routing schemes, performance load and fault tolerance of P2P systems in general.
[citation needed] Ping and Pong messages are used to detect new nodes that a client can link to. The actual file download is performed by opening a TCP connection and using the HTTP GET mechanism.
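As a rough illustration of the download step, the sketch below builds the bytes of an HTTP GET request that a client might send over the TCP connection it opens to the serving node. The function name `build_download_request`, the `/get/<index>/<name>` path layout, the URL-encoding of the file name, and the header choices are assumptions made for illustration and are not taken from this text.

```python
from urllib.parse import quote


def build_download_request(file_index: int, file_name: str, host: str, port: int) -> bytes:
    """Build a minimal HTTP GET request for a shared file.

    Assumption: the file is addressed as /get/<index>/<name>; the exact
    path layout and headers used by real clients may differ.
    """
    path = f"/get/{file_index}/{quote(file_name)}"
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}:{port}",
        "Connection: close",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Hypothetical example values; 192.0.2.10 is a documentation-only address.
request = build_download_request(42, "my song.mp3", "192.0.2.10", 6346)
```

The resulting bytes would be written to a TCP socket connected to the serving node, which makes the downloader's IP address directly visible to that node.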
If a query arrives with a search string that matches one of the files held by the leaves, the ultrapeer replies, pointing to the specific leaf.
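The ultrapeer behavior described here can be sketched as a small index from leaves to their shared file names. This is a toy model under stated assumptions: the class `Ultrapeer`, its methods, and the keyword-containment matching rule are hypothetical and do not reproduce any real client's matching logic.

```python
from collections import defaultdict


class Ultrapeer:
    """Toy ultrapeer that indexes the files shared by its leaves."""

    def __init__(self):
        # leaf address (ip, port) -> list of shared file names
        self.leaf_files = defaultdict(list)

    def register_leaf(self, leaf_addr, file_names):
        """Record the file names a leaf reports as shared."""
        self.leaf_files[leaf_addr].extend(file_names)

    def handle_query(self, search_string):
        """Return (leaf_addr, file_name) pairs whose name contains every keyword.

        Assumption: matching is a case-insensitive keyword containment test.
        """
        keywords = search_string.lower().split()
        if not keywords:
            return []
        hits = []
        for leaf_addr, names in self.leaf_files.items():
            for name in names:
                lowered = name.lower()
                if all(keyword in lowered for keyword in keywords):
                    hits.append((leaf_addr, name))
        return hits


ultrapeer = Ultrapeer()
ultrapeer.register_leaf(("192.0.2.5", 6346), ["Holiday Photos.zip", "song.mp3"])
ultrapeer.register_leaf(("192.0.2.9", 44121), ["lecture notes.pdf"])
hits = ultrapeer.handle_query("holiday photos")
```

Because each hit names the leaf that holds the file, the reply itself discloses which address shares which content.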
An academic study performed the following experiment: at NYU, a regular Gnucleus software client was connected to the Gnutella network as a leaf node, with the distinctive listening TCP port 44121.
In less than 15 minutes, the crawler found the IP address of the Gnucleus client at NYU with the unique port.
[citation needed] Gnucleus on Windows uses the Ethernet MAC address as the 6 lower bytes of the GUID.
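To show why this matters for tracking, the sketch below recovers a MAC address from a 16-byte GUID. It assumes that the "6 lower bytes" are the last six bytes of the GUID as a byte sequence; the actual byte order used by the client is not stated here, and the helper name `mac_from_guid` is hypothetical.

```python
def mac_from_guid(guid: bytes) -> str:
    """Extract a MAC address from a 16-byte GUID.

    Assumption: the MAC occupies the last six bytes of the GUID.
    """
    if len(guid) != 16:
        raise ValueError("a GUID must be exactly 16 bytes long")
    mac_bytes = guid[-6:]
    return ":".join(f"{byte:02x}" for byte in mac_bytes)


# Hypothetical GUID: ten arbitrary leading bytes followed by a made-up MAC.
example_guid = bytes(range(10)) + bytes.fromhex("001a2b3c4d5e")
print(mac_from_guid(example_guid))  # prints 00:1a:2b:3c:4d:5e
```

Since a MAC address stays fixed for a given network interface, two GUIDs that yield the same MAC can be linked to the same machine even if its IP address changes.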
[citation needed] The monitoring facility of Gnutella reveals an abundance of precious information on its users.
[citation needed] Some Gnutella users have a small look-alike set, which makes it easier to track them by knowing this very partial information.
Although the use of a hash function is intended to improve privacy, an academic study showed that the query content can easily be exposed by a dictionary attack: collaborating ultrapeers can gradually collect common search strings, calculate their hash values, and store them in a dictionary.
When a hashed query arrives, each collaborating ultrapeer can check it for matches in the dictionary and expose the original string accordingly.
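The dictionary attack can be sketched in a few lines. SHA-1 from Python's `hashlib` is used here only as a stand-in, since the hash function the network actually applies to queries is not specified in this text; the class `CollaboratingUltrapeer`, the helper `hash_query`, and the lowercasing step are likewise assumptions for illustration.

```python
import hashlib


def hash_query(search_string: str) -> str:
    """Hash a search string (stand-in hash: SHA-1 over the lowercased text)."""
    return hashlib.sha1(search_string.lower().encode("utf-8")).hexdigest()


class CollaboratingUltrapeer:
    """Collects known search strings and reverses hashed queries by lookup."""

    def __init__(self):
        # hash value -> original search string
        self.dictionary = {}

    def learn(self, search_string: str) -> None:
        """Add a commonly seen search string to the dictionary."""
        self.dictionary[hash_query(search_string)] = search_string

    def expose(self, hashed_query: str):
        """Return the original string if its hash is known, else None."""
        return self.dictionary.get(hashed_query)


peer = CollaboratingUltrapeer()
for common in ["free music", "holiday photos", "tax return 2005"]:
    peer.learn(common)

incoming = hash_query("holiday photos")  # what the ultrapeer would receive
recovered = peer.expose(incoming)        # "holiday photos"
```

The attack works because the hash is deterministic and unsalted: anyone who can guess a search string can compute the same hash value, so popular queries offer little protection.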