
Google

From Wikipedia, the free encyclopedia.

Google is an Internet search engine that not only indexes the World Wide Web, but also caches the web pages themselves. It also indexes pictures on the web, Usenet newsgroups and news sites. As of 2003, it was the most popular search engine, handling upwards of 80% of all internet searches through its website and clients like Yahoo! and AOL.

The popularity of Google is shown by the fact that nowadays the verb "to google" is often used for "doing a web search".

The company

Google was established by Larry Page and Sergey Brin, two Stanford Ph.D. students who developed the theory that a search engine based on a mathematical analysis of the relationships between websites would produce better results than the basic techniques then in use. Convinced that the pages with the most links to them from other sites must be the most relevant ones, they decided to test this thesis as part of their studies, and laid the foundation for their search engine. They founded their company in September 1998.

The company, headquartered in Mountain View, California, is privately held, with the major investors being the venture capital firms Kleiner Perkins Caufield & Byers and Sequoia Capital. In October 2003, while discussing a possible initial public offering, the company was approached by Microsoft about a possible partnership or merger; Google apparently rejected the offer.

Etymology

The word "Google" is a play on the word 'googol', which was coined in 1938 by Milton Sirotta, nephew of the American mathematician Edward Kasner, to refer to the number represented by 1 followed by 100 zeros. Google's use of the term reflects the company's mission to organize the immense amount of information available on the Web. Originally the search engine was called 'Googol'. When the founders presented their project to an angel investor, they received a cheque made out to 'Google'. They thought about it for a couple of weeks, then decided to open an account in the name 'Google'.

PageRank

Google uses an algorithm called PageRank to rank web pages that match a given search string. The PageRank algorithm computes a recursive figure of merit for web pages, based on the weighted sum of the PageRanks of the pages linking to them. The PageRank thus derives from human-generated links, and correlates well with human concepts of 'importance'. Previous keyword-based methods of ranking search results, used by many search engines that were once more popular than Google, would rank pages by how often the search terms occurred in the page, or how strongly associated the search terms were within each resulting page. In addition to PageRank, Google also uses other secret criteria for determining the ranking of pages on result lists.
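
The recursive definition above can be illustrated with a small iterative computation. The following Python sketch is not Google's implementation; it is a minimal example on a hypothetical four-page web. The function name, the damping factor of 0.85, the fixed iteration count and the even redistribution of rank from pages without outgoing links are all assumptions made for the illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    `damping` and `iterations` are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on in equal parts to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "c" is linked to by three pages, "d" by none.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy example the page with the most incoming links ends up with the highest rank, and the page that nobody links to ends up with the lowest.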

Servers

Google employs a farm of more than 10,000 GNU/Linux computers to answer search requests and to index the web. The indexing is performed by a program ("googlebot") which periodically requests new copies of the web pages it already knows about. The links in these pages are examined to discover new pages to be added to its database. The index database and web page cache are several terabytes in size.
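
The link-following step described above can be sketched in a few lines of Python. This is not the googlebot itself, only a minimal illustration of discovering new pages from the links in known ones. The names LinkExtractor, crawl and FAKE_WEB are hypothetical, and the "web" is an in-memory dictionary so that the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URLs of all <a href="..."> links in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first discovery of pages, starting from already known URLs.

    `fetch` is any function that returns a page's HTML, or None if the
    page cannot be retrieved.
    """
    known = set(seed_urls)
    queue = deque(seed_urls)
    index = {}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        index[url] = html
        parser = LinkExtractor(url)
        parser.feed(html)
        for link in parser.links:
            if link not in known:
                # A newly discovered page is queued for a later visit.
                known.add(link)
                queue.append(link)
    return index


# Hypothetical in-memory web standing in for real HTTP requests.
FAKE_WEB = {
    "http://example.test/": '<a href="/about">About</a> <a href="/news">News</a>',
    "http://example.test/about": '<a href="/">Home</a>',
    "http://example.test/news": '<a href="/archive">Archive</a>',
}
index = crawl(["http://example.test/"], FAKE_WEB.get)
print(sorted(index))
```

Starting from a single known page, the sketch finds the other two reachable pages; the link to the missing archive page is followed but yields nothing to index.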

Optimization

Since Google has become one of the most popular search engines, many webmasters have become interested in following and attempting to explain changes to the rankings of their websites.

An industry of consultants has arisen to assist websites in improving their rankings at Google, as well as other search engines. This field, called search engine optimization, attempts to discern patterns in search engine listings, and then develop a methodology for increasing rankings.

Forums can be found on the web where phenomena such as the "Google dance" are discussed. The Google dance is a period of a few days towards the end of a month when Google updates its database and ranking algorithms. Changes to the database can be observed by examining the number of results to a search such as "link:www.yahoo.com".

During the "dance" period, a site's ranking may change dramatically over a short period of time and different Google servers (e.g., www.google.com, www2.google.com, www3.google.com, www.google.co.uk, www.google.com.au etc.) may give different results for the same search. The dance period appears to coincide with the time at which the googlebot examines "stable" sites. Rapidly changing sites, highly ranked sites and news sites are examined more often, although apart from news, only minor adjustments are made to rankings during most of the month. In some cases it may take two or three months before new pages appear in search results.

The monthly searching, indexing and ranking cycle was replaced by a continuous rolling update in the summer of 2003. This change in the way Google updates its index significantly reduced the unstable results of the monthly update "dance".

One of Google's chief challenges is that as its algorithms and results have gained the trust of web users, the profit to be gained by a commercial web site in subverting those results has increased dramatically. Some search engine optimization firms have attempted to inflate their clients' Google rankings by various artificial means in order to draw more searchers to those sites. Google has apparently managed to defeat or weaken these attempts by reducing the ranking of sites known to use them.

Google publishes a set of guidelines for website owners interested in improving their rankings using legitimate optimization consultants. [1]

Google and the courts

For its efforts, Google has drawn a possibly barratrous lawsuit from a company, SearchKing, that sought to sell advertisements on pages with inflated Google rankings. Google has stated in its defense that its rankings are its constitutionally protected opinions of the web sites that it lists. [2] [3]

Google has been criticized for placing long-term cookies on users' machines, enabling it to track a user's search terms over time. However, most of Google's services can be used with cookies disabled.

A number of organizations have used the Digital Millennium Copyright Act (DMCA) of the USA to demand that Google remove references to material on other sites that they claim copyright over. Google typically handles this by removing the link as requested and including a link to the complaint in the search results. There have also been complaints that the "Google cache" feature violates copyright. However, the consensus seems to be that caching is a normal part of the functionality of the web, and that HTTP provides adequate mechanisms for requesting that caching be disabled (which Google respects; it also honors the robots.txt file).
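
As an illustration of how a crawler can honor a robots.txt file, the following Python sketch uses the standard library's urllib.robotparser module. It says nothing about how Google implements this; the robots.txt content, the user agent name "ExampleBot" and the URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt allows `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


allowed = may_fetch(ROBOTS_TXT, "ExampleBot", "http://example.test/index.html")
blocked = may_fetch(ROBOTS_TXT, "ExampleBot", "http://example.test/private/notes.html")
print(allowed, blocked)
```

A well-behaved crawler would run a check of this kind before requesting each page and skip any URL for which it returns False.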

In 2002, there were news reports that the Google search engine had been banned in China. A mirror site (in all respects, including mirrored text) called elgooG proved useful to get around the ban. The ban was later lifted, and reports indicated that it was not Google itself that was targeted. Rather, Google's cache feature allowed Chinese users to circumvent a ban on a website by visiting the cached version instead. There is also a dynamic Google mirror working as a proxy at http://www.zensur.freerk.com/google/. It is interesting to note that a better caching service is provided by http://www.archive.org/, yet this site was not banned.

Other tools

Google also has a Usenet archive called Google Groups (formerly an independent site known as Dejanews), an experimental machine translation service, and an image search function (called "Google Images").

Google introduced a beta release (a product of the so-called Google Labs) of an automated news compilation service, "Google News", in April 2002. At first, articles from 150 sources were updated hourly. By September 2002, the number of sources had been expanded to over 4000. Since then, the service has expanded to German, French, Spanish, and Italian news sites, and continuous updating has been initiated.

In April 2002, Google launched a new service called "Google Answers". Google Answers is an extension to the conventional search: rather than doing the search yourself, you pay someone else to do it. Customers ask questions and offer a price for an answer, and researchers answer them. Prices for questions range from $2 to $200. Google keeps 25% of the payment, passing the rest to the researchers, and charges an additional 50c listing fee. In May 2003 this service came out of beta, though the site hasn't attracted as many customers as hoped.
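
The fee arithmetic can be worked through with a short Python sketch. The function name is hypothetical, and the sketch assumes that the 50c listing fee is paid by the customer on top of the offered price, which the description above does not state explicitly. Amounts are kept in whole cents to avoid rounding problems.

```python
LISTING_FEE_CENTS = 50      # the additional 50c listing fee
GOOGLE_SHARE_PERCENT = 25   # Google keeps 25% of the payment
MIN_PRICE_CENTS = 200       # $2
MAX_PRICE_CENTS = 20000     # $200


def split_payment(price_cents):
    """Split an offered price into Google's and the researcher's shares.

    Assumption: the listing fee is charged to the customer in addition
    to the offered price.
    """
    if not MIN_PRICE_CENTS <= price_cents <= MAX_PRICE_CENTS:
        raise ValueError("price must be between $2 and $200")
    google_share = price_cents * GOOGLE_SHARE_PERCENT // 100
    researcher_share = price_cents - google_share
    customer_total = price_cents + LISTING_FEE_CENTS
    return {
        "google_share": google_share,
        "researcher_share": researcher_share,
        "customer_total": customer_total,
    }


cheapest = split_payment(200)
dearest = split_payment(20000)
print(cheapest)
print(dearest)
```

Under these assumptions, a $2 question costs the customer $2.50, of which the researcher receives $1.50; a $200 question costs $200.50, of which the researcher receives $150.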

In February 2003, Google acquired Pyra Labs, owner of blogger.com, a pioneering and leading blog-hosting website. At first glance, the acquisition seemed inconsistent with the general "mission" of Google. However, it was soon theorized that Google plans to use information gleaned from blog postings to improve the speed and relevance of articles contained in Google News.

Google also includes a calculator and units converter; see [4].

Books

Google Hacks from O'Reilly & Associates is a book containing tips about using Google effectively.

