Sunday, February 19, 2006

Search Engines

Over the last few months I have suggested several times building a mini-Wikipedia by using QTsaver's algorithm to extract relevant content from Wikipedia.

I demonstrated the feasibility of the project, but couldn't find support to really make it happen.

Since I believe in this proposal, and since I don't like depending on others, I decided to start it alone and see what happens. Maybe I'll get tired after a few postings, maybe someone will join the effort. We'll see…

Where to start?

Well, with what interests me most: search engines.

A search engine is a program designed to help find information stored on a computer system, such as the World Wide Web or a personal computer. It lets the user ask for content meeting specific criteria (typically items containing a given word or phrase) and retrieves a list of references that match those criteria. Search engines use regularly updated indexes to operate quickly and efficiently. Without further qualification, "search engine" usually refers to a Web search engine, which searches for information on the public Web. Other kinds include enterprise search engines, which search intranets, personal search engines, which search individual personal computers, and mobile search engines… Some search engines also mine data available in newsgroups, large databases, or open directories. Unlike Web directories, which are maintained by human editors, search engines operate algorithmically. Most web sites that call themselves search engines are actually front ends to search engines owned by other companies.
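To make the idea of an index concrete, here is a minimal sketch of an inverted index in Python. It is only an illustration, not how any particular search engine works: the sample documents and the function names build_index and search are made up for this example, and the matching rule assumed here is that a document matches when it contains every word of the query.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # Start with the documents for the first word, then narrow down.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


# Hypothetical sample documents, invented for this illustration.
documents = {
    "doc1": "archie indexes ftp archives",
    "doc2": "webcrawler indexes web pages",
    "doc3": "aliweb is an early web search engine",
}

index = build_index(documents)
print(search(index, "indexes"))
print(search(index, "web indexes"))
```

Because the index is built ahead of time, answering a query only needs a few set lookups instead of a scan over every document, which is why keeping the index up to date matters.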

Wikipedia derives 66% of its traffic from search engine referrals, and 50% of that traffic comes from Google, meaning 33% of all Wikipedia traffic is from Google referrals alone (according to Hitwise).
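The 33% figure is simply the product of the two shares quoted from Hitwise. A quick check in Python (the variable names are mine):

```python
# Shares quoted above (Hitwise figures as reported in the text).
search_share_of_all_traffic = 0.66     # traffic arriving via search engines
google_share_of_search_traffic = 0.50  # portion of that coming from Google

google_share_of_all_traffic = (
    search_share_of_all_traffic * google_share_of_search_traffic
)
print(f"{google_share_of_all_traffic:.0%}")
```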

AikiWeb Aikido Information is a comprehensive site on aikido, with essays, forums, images, reviews, columns, and other information. Chief among its notable content is its aikido dojo search engine.

In November 2000, Benjamin Cohen of CyberBritain registered the domain name "" for an MP3 search engine; his first choice, "", was taken. However, he never actually used the domain name for this purpose or ran any company under this name; in fact, the domain name was inoperative for a long time. Apple was granted a UK restricted (non-music) trademark for ITUNES on March 23, 2001, and launched its popular iTunes music store service in the UK in 2004.

On December 8, 2004, Clinton announced that he was the new spokesperson for Accoona, an internet search engine company.

Archie is a search engine designed to index FTP archives, allowing people to find specific files. The original implementation was written in 1990 by Alan Emtage, Bill Heelan, and Peter J. Deutsch, then students at McGill University in Montreal.

Search engine optimization (SEO) is a set of methods aimed at improving the ranking of a website in search engine listings. The term also refers to an industry of consultants who carry out optimization projects on behalf of clients' sites.

Using search engines, visitors can find sites in a variety of ways: via paid-for advertisements in the search engine results pages (SERPs), via third parties who are listed in the search engines, or via "organic" listings, i.e. the results the search engines present to users. SEO is primarily concerned with improving the visibility of a site in the organic search results.

Grub is a search engine based on distributed computing that was acquired by LookSmart. Users may download the grubclient software and let it run during computer idle time. The client indexes URLs and sends them back to the main grub server in a highly compressed form.

Though many believe in Grub's distributed computing system, the search engine has its share of opponents. Many state that the strength of a good search engine is not a large cache but the ability to deliver accurate, precise results to users. Loyal fans of Google state that they enjoy that search engine for its targeted results and would not switch to Grub unless its search technology were superior to Google's. Quite a few webmasters are opposed to Grub because it apparently ignores sites' robots.txt files. These files can prevent robots from caching certain areas.
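For readers who have not seen a robots.txt file, here is a small sketch of how a well-behaved crawler might consult one, using Python's standard urllib.robotparser module. The rules, the example.com URLs, the "example-crawler" user agent and the helper name allowed are all made up for illustration; this says nothing about how Grub itself handles these files.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that keeps all robots out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# parse() accepts the file's lines, so no network access is needed here.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, user_agent="example-crawler"):
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(allowed("http://example.com/index.html"))
print(allowed("http://example.com/private/notes.html"))
```

A crawler that respects the file would skip every URL for which this check returns False. Nothing enforces that, so compliance is up to the robot's author.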

A search engine that displays possible search queries as you type and shows results with a logo of the top-level domain.
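Showing possible queries as you type can be approximated with plain prefix matching over a sorted list of known queries. The sketch below is my own illustration under that assumption; the suggest function and the sample queries are invented, and a real engine would probably rank suggestions by popularity rather than alphabetically.

```python
from bisect import bisect_left


def suggest(sorted_queries, prefix, limit=5):
    """Return up to `limit` queries from a sorted list that start with `prefix`."""
    prefix = prefix.lower()
    # Binary search finds the first entry that is >= the prefix.
    start = bisect_left(sorted_queries, prefix)
    matches = []
    for query in sorted_queries[start:]:
        if not query.startswith(prefix):
            break  # sorted order: once one entry fails, all later ones fail
        matches.append(query)
        if len(matches) == limit:
            break
    return matches


# Hypothetical list of known queries, kept sorted for the binary search.
known_queries = sorted([
    "search engine",
    "search engine optimization",
    "search marketing",
    "seo",
    "snap",
])

print(suggest(known_queries, "search e"))
print(suggest(known_queries, "se", limit=3))
```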

Snap was founded by Idealab in October 2004. The name "Snap" was purchased and no longer has any connection with the earlier search engine of the same name.

Search engine marketing is a marketing method to promote a website in search engine results pages. It can be done using pay-per-click campaigns or by using search engine optimization methods.

Search Industry Explained - Four-part article that covers the past, present and future of search engine marketing, optimization and related technologies.

The Google search engine keeps a cached copy of each page it examines on the web. These copies are used by the Google indexing software, but they are also made available to Google users in case the original page is unavailable. If you click on the "Cached" link in a Google search result, you will see the web page as it looked when Google indexed it.
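The idea of a cached copy can be sketched in a few lines of Python. This is only a toy model under my own assumptions, not Google's implementation: the PageCache class, the fetch_with_fallback helper and the fake fetch function are all invented for the example.

```python
class PageCache:
    """Keeps the copy of each page as it looked when it was indexed."""

    def __init__(self):
        self._snapshots = {}

    def store(self, url, html):
        self._snapshots[url] = html

    def cached(self, url):
        # Returns None when no snapshot was ever stored for this URL.
        return self._snapshots.get(url)


def fetch_with_fallback(url, fetch, cache):
    """Try the live page first; fall back to the snapshot if it is unavailable."""
    try:
        return fetch(url), "live"
    except OSError:
        snapshot = cache.cached(url)
        if snapshot is None:
            raise
        return snapshot, "cached"


def unavailable_site(url):
    """Hypothetical fetch function that simulates a site being down."""
    raise OSError("site is down")


cache = PageCache()
cache.store("http://example.com/", "<html>page as indexed</html>")
print(fetch_with_fallback("http://example.com/", unavailable_site, cache))
```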

The first Web search engine was "Wandex", a now-defunct index collected by the World Wide Web Wanderer, a web crawler developed by Matthew Gray at MIT in 1993. Another very early search engine, Aliweb, also appeared in 1993, and still runs today. The first "full text" crawler-based search engine was WebCrawler, which came out in 1994. Unlike its predecessors, it let users search for any word in any web page, which has since become the standard for all major search engines. It was also the first one to be widely known by the public. Also in 1994, Lycos (which started at Carnegie Mellon University) came out and became a major commercial endeavor.

Many of the portals started out as Internet directories (notably Yahoo!) and/or search engines (Excite, Lycos, AltaVista, infoseek, and Hotbot among the old ones). The expansion of service provision was a strategy to secure the user base and lengthen the time a user stays on the portal. Services that require user registration, such as free email, customization features, and chatrooms, were considered to enhance repeat use of the portal.

AltaVista went online in 1995 and soon surpassed Lycos and Excite in popularity. It was the first-ever multilingual search engine. It was also the first major search engine to support non-Latin languages, such as Japanese or Chinese. AltaVista later extended this by introducing localized portals in many countries.

The original idea behind Ask Jeeves was the ability to answer questions asked in natural language. Ask Jeeves was the first commercial question-answering search engine for the World Wide Web. It supports a variety of user queries in plain English (natural language), as well as traditional keyword searching, and strives to be more intuitive and user-friendly than other search engines. Ask Jeeves sold the same technology used on the site to corporate entities including Dell, Toshiba, and E*Trade. That part of the business was sold to Kanisa in 2002.

Brainboost, Dogpile, Excite, ez2find, InCrawler, Ithaki, IxQuick, KillerInfo, Mamma, Metacrawler, Webcrawler, THETOPDOWN.

Excite is an Internet portal with an included search engine. It is one of the most recognized brands on the Internet, and along with Yahoo and Netscape was one of the pioneering "dotcoms" of the 1990s.

Excite offers a variety of services, including search, web-mail, stock quotes, and a customizable user homepage. The news and other content on the portal are provided by over 100 different providers.

In 1996, the company bought two search engines, Magellan and WebCrawler, and went public with an initial offering of two million shares priced at $17 USD. It gained exclusive distribution agreements with companies such as Netscape, Microsoft and Apple Computer.

