Wednesday, November 23, 2005

Micro Content Revolution

In this table I tried to sum up the differences between the currently prevailing search engines, such as Google and Yahoo, and next-generation microcontent engines such as QTSaver.

To clarify the overly terse phrases in the table, I have added paragraphs from my past postings that shed light where needed.

Authors' intention
I believe that the future of search engines is in microcontent manipulation. The current sequence of an article, from a definite beginning to a definite end, will be shattered to pieces, and the development of an argument from assumption to deduction will lose its hypnotic power. Each excerpt will have a life of its own in cyberspace and will find its place sometimes in one role, sometimes in another. The original intention of the author will be forgotten, and each new author will recycle the excerpt for his own new intention.

When people talk about relevance they are talking about Macro-Content relevance. Macro-Content is almost always a mixture of relevant and irrelevant content. When you're looking for a monkey in a certain zoo you usually get many other animals and many other zoos.
Computers are simple. Language is complicated.
Computers look for a match between words in a query and words in a result. They don't care about the meaning of these words.
The simplest case is when there is one unique word with one unique meaning… Then there is the case of one meaning that has many words (synonyms)… In the case where many meanings share one word (homonyms), for example 'apple' from the tree and 'Apple' the company, the computer can find a right match, a wrong match, or both. If it finds only right ones there is no problem, but if it finds wrong ones the user is either frustrated or confused.
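To make the point about blind word matching concrete, here is a minimal sketch in Python. It is not how any real search engine or QTSaver works; the function name keyword_match and the three sample sentences are my own assumptions for illustration. It only shows that a matcher comparing words, not meanings, returns both senses of 'apple'.

```python
def keyword_match(query, documents):
    """Return the documents that share at least one word with the query.

    Hypothetical helper: it compares lowercase words only and knows
    nothing about what the words mean.
    """
    query_words = set(query.lower().split())
    return [doc for doc in documents
            if query_words & set(doc.lower().split())]


# Made-up sample sentences, one per sense of the word plus an unrelated one.
documents = [
    "The apple fell from the tree",
    "Apple released a new computer",
    "The monkey ate a banana",
]

matches = keyword_match("apple", documents)
for doc in matches:
    print(doc)
```

Both the fruit sentence and the company sentence come back, because the matcher sees the same word in each. Telling them apart is left to the user.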

I found the terms "fair dealing" and "fair use" relevant to this issue. I found many answers to this question, and they say the same thing over and over again. Here is one of these excerpts:

You may use short, direct quotations without the need to obtain written
permission from the copyright holder provided that you give proper credit to
author and sources. We define 'fair use' as excerpts under 400 words (or
series of excerpts totaling fewer than 800 words as long as no single
excerpt is longer than 300 words) from one work. If extensive, longer
extracts are being used, you must obtain permission if they amount to 10% or
more of the original work. If the whole work is being used, e.g. a poem,
written permission is required.

Michael Arnzen wrote about The Work-for-Hire Plagiarist:

There's been a spate of job listings coming in from student plagiarists looking
to hire professionals to write their papers for them...

It seems that there is a potential market here for QTSaver, which delivers similar results without being a plagiarist.

Automatic multi queries (Pileups)
Since QTSaver sometimes retrieves too few excerpts and at other times an avalanche of them, I started experimenting with a Pileup Query. It takes the first "suggestion to refine your query" and adds it to the query phrase, then takes the first suggestion from the second results page, and so on until I get 18 excerpts. This might bring some homogeneity to the retrieval, and the user will get used to receiving 18 excerpts for each query.
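The Pileup loop can be sketched roughly as follows. This is my own reading of the idea and not QTSaver's actual code: pileup_query, fake_search, the max_rounds safety limit, and the choice to skip duplicate excerpts are all assumptions. The search argument stands for any function that returns a list of excerpts and a list of refinement suggestions for a query.

```python
TARGET_EXCERPTS = 18  # the fixed number of excerpts mentioned above


def pileup_query(query, search, target=TARGET_EXCERPTS, max_rounds=10):
    """Collect excerpts, refining the query with the first suggestion
    of each results page, until `target` excerpts are gathered.

    `search(query)` is a hypothetical callable returning
    (excerpts, suggestions). `max_rounds` is an assumed safety limit.
    """
    collected = []
    seen = set()
    for _ in range(max_rounds):
        excerpts, suggestions = search(query)
        for excerpt in excerpts:
            if excerpt not in seen:          # assumed: skip duplicates
                seen.add(excerpt)
                collected.append(excerpt)
                if len(collected) == target:
                    return collected
        if not suggestions:                  # nothing left to refine with
            break
        query = f"{query} {suggestions[0]}"  # add the first suggestion
    return collected


def fake_search(query):
    """Stand-in search engine for the demo: 7 excerpts and 1 suggestion
    per page, labelled by the number of words in the query."""
    n_words = len(query.split())
    excerpts = [f"excerpt {n_words}-{i}" for i in range(7)]
    suggestions = [f"refinement{n_words}"]
    return excerpts, suggestions


result = pileup_query("monkey", fake_search)
print(len(result))
```

With the stand-in engine, the first two rounds give 14 excerpts and the third round fills the remaining 4, so the loop stops at exactly 18. If the suggestions run out first, the function returns whatever it has gathered so far.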

Filtering by place, by person or company, etc.
There will be several views to choose from:

· View by country
· View by date
· View by person or company etc.

In order to rearrange the microcontents, future search engines will first need to unify the scattered answers into one document and then retrieve the "views" according to special algorithms.
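As a rough illustration of the two steps, unify first and then retrieve a view, here is a small Python sketch. The "special algorithms" are not specified in this post, so plain grouping by a field stands in for them. The field names (country, date, entity), the function view_by, and the sample excerpts are all hypothetical.

```python
from collections import defaultdict

# Step 1 (assumed form): the scattered answers unified into one list,
# each excerpt tagged with the fields the views need. Made-up data.
unified = [
    {"text": "excerpt A", "country": "France", "date": "2005-11-01", "entity": "Google"},
    {"text": "excerpt B", "country": "Israel", "date": "2005-10-15", "entity": "Yahoo"},
    {"text": "excerpt C", "country": "France", "date": "2005-10-15", "entity": "Google"},
]


def view_by(excerpts, field):
    """Step 2: rearrange the unified excerpts into a view grouped by
    `field`, with the group keys in sorted order."""
    groups = defaultdict(list)
    for excerpt in excerpts:
        groups[excerpt.get(field, "unknown")].append(excerpt["text"])
    return dict(sorted(groups.items()))


by_country = view_by(unified, "country")
by_date = view_by(unified, "date")
by_entity = view_by(unified, "entity")
print(by_country)
```

The same unified list serves every view. Only the grouping field changes, which is why the unification has to come first.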

User feedback (more frustration, less frustration)
My hall of frustration

Why QTSaver?
- In order to reduce search engine frustration.

Why frustration?
- Because we want to find information and we cannot find it. In our imagination we translate our failure into survival terms:
I will not finish my term paper
so they will kick me out of school
so I will never find a decent job
so I will never find a spouse
so I will never have kids…
