My cron-driven alter ego ran several scripts on the Trolltech web server. This is about one I wrote in 1996 and ran every half-hour to make sure that Qt-related subjects were well covered by the search engines. At the time, some search requests didn't give results as good as they could have, and I thought the likely reason was that the search engines hadn't crawled the right pages.
But the pages tended to be in the Trolltech referrer log, or if not, then some other page very close to them was.
So I wrote a cron script to watch for new referrers in our Apache logs, and whenever it saw one, it did the following.
First, get rid of spam (yes, even then there were spam pages). The test was simple: At most x% of the text could be links, the page should mention one of a set of keywords, the page could contain at most y links, and at least one link had to point to Trolltech.
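The four checks are easy to sketch in Python. This is not the original script, just an illustration: the thresholds (standing in for x and y), the keyword list, and the names `_LinkStats` and `looks_legitimate` are all made up for the example, and the link ratio is measured in characters of visible text, which is only one way to read "x% of the text".

```python
from html.parser import HTMLParser


class _LinkStats(HTMLParser):
    """Collects link targets plus the amount of text inside and outside links."""

    def __init__(self):
        super().__init__()
        self.hrefs = []        # every href found on an <a> tag
        self.text = []         # lower-cased text fragments, for keyword search
        self.text_chars = 0    # characters of visible text overall
        self.link_chars = 0    # characters of visible text inside <a>...</a>
        self._in_link = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link += 1
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self._in_link -= 1

    def handle_data(self, data):
        stripped = data.strip()
        self.text.append(stripped.lower())
        self.text_chars += len(stripped)
        if self._in_link:
            self.link_chars += len(stripped)


def looks_legitimate(html, keywords=("qt", "trolltech"),
                     max_link_ratio=0.5, max_links=100, target="troll.no"):
    """Apply the four spam checks; all default values are placeholders."""
    stats = _LinkStats()
    stats.feed(html)
    stats.close()
    if stats.text_chars == 0:
        return False
    body = " ".join(stats.text)
    return (
        stats.link_chars / stats.text_chars <= max_link_ratio  # not mostly links
        and any(k.lower() in body for k in keywords)           # mentions a keyword
        and len(stats.hrefs) <= max_links                      # not a link farm
        and any(target in h for h in stats.hrefs)              # links to Trolltech
    )
```

A page that is nothing but links fails the first check even if one of those links points to troll.no, which is the case the ratio test is meant to catch.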
If the page passed that test, the script tried to clean up the URL a bit (delete session cookies, delete index.html). Next it tried to locate a higher-level index page linking to the candidate page and other related pages (since submitting an index page gave the search engine more to work with).
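Here is a rough Python sketch of that cleanup step, again an illustration and not the original code. It reads "session cookies" as session identifiers carried in the query string, which is an assumption, and the parameter names in `SESSION_PARAMS` are guesses. `parent_candidates` only lists the higher-level URLs one might look at; a real version would still have to fetch each one and check that it links to the candidate page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names; the real list is not known.
SESSION_PARAMS = {"sid", "session", "sessionid", "phpsessid"}


def clean_url(url):
    """Drop session-like query parameters, a trailing index.html and any fragment."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    path = parts.path
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(query), ""))


def parent_candidates(url):
    """List the directory URLs above a page, nearest first, ending at the root."""
    parts = urlsplit(clean_url(url))
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    while segments:
        segments.pop()
        path = "/" + "/".join(segments)
        if segments:
            path += "/"
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates
```

For example, `clean_url("http://example.com/qt/index.html?sid=abc123&page=2")` gives `http://example.com/qt/?page=2`.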
Finally, the script would submit either the payload page or the index page to AltaVista, HotBot, Lycos and a fourth engine whose name I've forgotten. I don't think it was Google; Google came later.
It worked very well. Searches for Qt-related subjects gave better results than before, and yes, the search engines saw more links to troll.no. The script ran until shortly before I left Trolltech in 2001. By that time Google had learned to crawl well, and the script lay unused and forgotten until I found it today, while going through and wiping my old hard disks. (Update: The reason I don't know the name of the fourth search engine is that I had put the submission URLs in a configuration file.)
Update: The fourth was, of course, Excite, whose existence I had quite forgotten.