Foil Search Engine Snoops

From Wired How-To Wiki


On March 17, 2006, a federal judge in San Jose, Calif., ordered Google to partially comply with a subpoena from the Justice Department, which was seeking search-engine records to support its defense of the Child Online Protection Act, or COPA. U.S. District Judge James Ware denied the DOJ's request for a sample of one million search terms, but ordered Google to hand over a sample of 50,000 URLs returned as search results.

The DOJ had requested the data from all of the major search-engine providers, including Microsoft, AOL and Yahoo, but Google was the only one to fight it.

Privacy advocates praised the decision for keeping a lid on search terms, which can identify searchers even in the absence of any other information.

In August 2006, AOL researchers posted online 20 million search terms representing three months' worth of queries from some 658,000 users of its search service. The researchers had stripped out user names and other personally identifying material, but the search terms alone were sufficient to reveal the identities of the searchers in at least some cases (see FAQ: AOL's Search Gaffe and You, http://www.wired.com/politics/security/news/2006/08/71579). Under criticism, AOL pulled down the data.

To better protect consumer privacy, search engine companies have begun to limit how long they store search data.

Still, for those worried about what companies or federal investigators might do with such records in the future, here's a primer on how search logs work, and how to avoid being writ large within them.

Why do search engines save logs of search terms?

Search companies use logs and data-mining techniques to tune their engines and deliver focused advertising, as well as to create cool features such as Google Zeitgeist. They also use them to help with local searches and return more relevant, personalized search results.

How does a search engine tie a search to a user?

If you have never logged in to a search engine's site, or a partner service like Google's Gmail offering, the company probably doesn't know your name. But it connects your searches through a cookie, which carries a unique identifying number. Using its cookies, Google can remember all searches made from your browser. It might also link searches by a user's IP address.
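To make the linking concrete, here is a minimal sketch of how log entries could be grouped into per-browser profiles. It is not any search company's actual code: the log layout, the cookie IDs, the IP addresses and the queries are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical log records: (cookie_id, ip_address, query). All values are invented.
LOG = [
    ("c-1001", "192.0.2.10", "cheap flights to denver"),
    ("c-2002", "198.51.100.7", "knee pain remedies"),
    ("c-1001", "192.0.2.10", "denver divorce lawyers"),
    ("c-1001", "203.0.113.5", "jobs near 80202"),
]

def group_queries(log, key_index):
    """Group queries by the chosen field (0 = cookie ID, 1 = IP address)."""
    profiles = defaultdict(list)
    for record in log:
        profiles[record[key_index]].append(record[2])
    return dict(profiles)

by_cookie = group_queries(LOG, 0)  # one list of queries per browser cookie
by_ip = group_queries(LOG, 1)      # one list of queries per network address

for cookie_id, queries in by_cookie.items():
    print(cookie_id, queries)
```

In this toy data the cookie keeps all three of one browser's queries together even after its IP address changes, which is why the cookie is the stronger link of the two.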

How long do cookies last?

It varies. In July 2007, Google shortened its cookie lifetime, replacing an expiration date in the year 2038 with one two years out. Microsoft stores information for 18 months, unless users specify otherwise. Ask.com offers an anonymous search option for users, promising to store no information whatsoever.
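For a rough sense of scale, the arithmetic can be sketched as follows. The issue date and the exact 2038 expiry date are assumptions chosen for illustration, and two years is approximated as 730 days.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical moment at which a browser first receives the cookie.
issued = datetime(2007, 7, 16, tzinfo=timezone.utc)

# Old policy: a fixed expiry far in the future (the exact date is an assumption).
old_expiry = datetime(2038, 1, 17, tzinfo=timezone.utc)

# New policy: two years, approximated here as 730 days.
new_expiry = issued + timedelta(days=730)

def lifetime_years(start, end):
    """Return the span between two datetimes in (approximate) years."""
    return (end - start).days / 365.25

old_years = lifetime_years(issued, old_expiry)
new_years = lifetime_years(issued, new_expiry)
print(f"old policy: {old_years:.1f} years, new policy: {new_years:.1f} years")
```

Under these assumptions the old cookie would have outlived most computers, while the new one lapses after about two years.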

What if you sign in to a service?

If you sign in on Google's personalized homepage or Yahoo's homepage, the companies can then correlate your search history with any other information, such as your name, that you give them.

Why should anyone worry about the government requesting search logs or bother to disguise their search history?

Some people simply don't like the idea of their search history being tied to their personal lives. Others don't know what the information could be used for, but worry that the search companies could find surprising uses for that data that may invade privacy in the future.

For example, if you use Google's Gmail and web optimizing software, the company could correlate everyone you've e-mailed, all the websites you've visited after a search and even all the words you misspell in queries.

What's the first thing people who worry about their search history should do?

Cookie management helps. Those who want to avoid a permanent record should delete their cookies at least once a week. Other options might be to obliterate certain cookies when a browser is closed and avoid logging in to other services, such as web mail, offered by a search engine.
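Browsers handle this through their own settings, but the same two habits can be sketched with Python's standard http.cookiejar module. This is only an illustration: the cookie names and the .example domains are hypothetical, and the jar lives in memory rather than in a real browser profile.

```python
from http.cookiejar import Cookie, CookieJar

def make_cookie(name, value, domain, session_only):
    """Build a cookie by hand; every name and domain used here is hypothetical."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=False,
        expires=None if session_only else 2147483647,  # far-future timestamp
        discard=session_only,
        comment=None, comment_url=None, rest={},
    )

jar = CookieJar()
jar.set_cookie(make_cookie("search_id", "abc123", ".search.example", session_only=False))
jar.set_cookie(make_cookie("mail_session", "xyz", ".mail.example", session_only=True))
jar.set_cookie(make_cookie("news_prefs", "en", ".news.example", session_only=False))

# Remove every cookie set by the (hypothetical) search engine's domain.
jar.clear(domain=".search.example")

# Drop session-only cookies, roughly what happens when a browser is closed.
jar.clear_session_cookies()

remaining = sorted(cookie.name for cookie in jar)
print(remaining)
```

Only the cookie from the unrelated news site survives both steps, so the search engine would have to issue a fresh identifier on the next visit.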

How do you do that with your browser?

In Firefox, you can go into the privacy preference dialog and open Cookies. From there you can remove your search engine cookies and click the box that says: "Don't allow sites that set removed cookies to set future cookies."

In Safari, try the free and versatile PithHelmet plug-in. You can let some cookies in temporarily, decide that some can last longer or prohibit some sites, including third-party advertisers, from setting cookies at all.

While Internet Explorer's tools are not quite as flexible, you can manage your cookies through the Tools menu.

Have search histories ever been used to prosecute someone?

Robert Petrick was convicted in November 2005 of murdering his wife, in part based on evidence that he had googled the words "neck," "snap" and "break." But police obtained his search history from an examination of his computer, not from Google.

Can I see mine?

Usually, no. But if you want to trace your own Google search history and see trends, and you don't mind if the company uses the information to personalize search results, you can sign up for Google's beta search history service.

Could search histories be used in civil cases?

Certainly. Google may well be fighting the government simply on principle -- or, as court papers suggest, to keep outsiders from using Google's proprietary database for free. But a business case can also be made that if users knew the company regularly turned over their records wholesale to the government, they might curtail their use of the site.

A related question is whether Google or any other search engine would fight a subpoena from a divorce attorney, or protest a more focused subpoena from local police who want information on someone they say is making methamphetamines.

What if I want more anonymity than simply deleting my cookie when I'm searching?

If you are doing any search you wouldn't print on a T-shirt, consider using Tor, The Onion Router. An EFF-sponsored service, Tor helps anonymize your web traffic by bouncing it between volunteer servers. It masks where your traffic originates and makes it easier to evade filters, such as those installed by schools or repressive regimes.

The service has its drawbacks. While it can be very useful for a journalist in China, browsing can be slower and show greater latency because of the extra stops the data makes and a general dearth of servers.

Is Tor perfectly anonymous?

No. Computers leak data. Tor, combined with the Privoxy proxy server (which comes bundled with Tor), reduces some of that leakage, but still isn't foolproof. Even so, when used with Firefox, Tor and Privoxy can provide a mostly anonymous web browsing experience.
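For readers who script their searches rather than use a browser, a minimal sketch of pointing Python's standard urllib at a local Privoxy instance might look like this. It assumes Tor and Privoxy are already running on the same machine and that Privoxy listens on 127.0.0.1:8118, its usual default; the search URL in the comment is hypothetical. It does nothing about the other leaks mentioned above.

```python
import urllib.request

# Assumed local Privoxy address; 8118 is Privoxy's usual default port.
PRIVOXY_URL = "http://127.0.0.1:8118"

def build_proxied_opener(proxy_url=PRIVOXY_URL):
    """Return an opener that sends HTTP and HTTPS requests through the proxy."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(proxy)

opener = build_proxied_opener()

# The network call is left commented out: it only works with Tor and Privoxy running.
# html = opener.open("http://search.example/?q=test", timeout=30).read()
```

Requests made through this opener go to the local proxy first instead of straight to the search engine, so the site sees the address of a Tor exit server rather than your own.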

Are there other options?

Anonymizer offers a limited free browsing service and sells software, both of which are supposed to protect your anonymity, but have had serious performance issues. There are other proxy servers on the internet, but you have to judge for yourself whether you trust them, and some websites actively block anonymous browsing.

Answers were compiled with the generous assistance of security consultant Adam Shostack and hacker Jacob Appelbaum.


This page was last modified 20:35, 28 January 2008 by amyatwired. Based on work by Anonymous user(s) of Wired How-To Wiki.

All text and artwork shared under a Creative Commons License.
 