Distributed web crawling

Such systems may allow users to voluntarily offer their own computing and bandwidth resources for crawling web pages.

Newer projects attempt a less structured, more ad hoc form of collaboration by enlisting volunteers who join the effort using, in many cases, their home or personal computers.

LookSmart is the largest search engine to use this technique, through its Grub distributed web-crawling project.

After downloading crawled web pages, the volunteer client compresses them and sends them back, together with a status flag (e.g. changed, new, down, redirected), to the powerful central servers.
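The report format described above can be sketched as follows. This is a minimal illustration, not Grub's actual protocol: the function names (`classify`, `package`), the use of gzip compression, JSON packaging, and SHA-256 content hashing are all assumptions chosen for clarity.

```python
import gzip
import hashlib
import json

# Status flags as listed in the article; "unchanged" is an assumed
# fourth outcome for pages whose content hash has not moved.
STATUS_NEW = "new"
STATUS_CHANGED = "changed"
STATUS_DOWN = "down"
STATUS_REDIRECTED = "redirected"

def classify(url, body, http_status, known_hashes):
    """Assign a status flag to a fetched page.

    known_hashes maps URLs to the content hash seen on a previous crawl;
    this is a hypothetical client-side cache used for illustration.
    """
    if body is None or http_status >= 500:
        return STATUS_DOWN
    if 300 <= http_status < 400:
        return STATUS_REDIRECTED
    digest = hashlib.sha256(body).hexdigest()
    if url not in known_hashes:
        return STATUS_NEW
    return STATUS_CHANGED if known_hashes[url] != digest else "unchanged"

def package(url, body, status):
    """Compress the page body and bundle it with its status flag
    into a JSON payload ready to upload to the central server."""
    return json.dumps({
        "url": url,
        "status": status,
        # hex-encode the gzip bytes so they survive JSON transport
        "content": gzip.compress(body).hex() if body is not None else None,
    })
```

A client loop would call `classify` after each fetch, then `package` the result and POST it to the coordinator; the central servers decompress the content and update their index and hash cache.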

According to the FAQ of Nutch, an open-source search engine, the bandwidth savings from distributed web crawling are not significant, since "A successful search engine requires more bandwidth to upload query result pages than its crawler needs to download pages...".