You can find many results, but even before reading through 15 million results, you see when Einstein was actually born:
By clicking on “Show details” you can find how Google knows this particular information:
Other examples look less promising, like:
The first answers are actually rather good, as they ultimately answer my question. However, a closer look at the words that have been highlighted suggests that this is thanks to a good keyword match, not because the query was processed as natural language.
You can try different questions about this remarkable scientist, such as:
- How long did Albert Einstein live?
- How old was Albert Einstein when he died?
- Did Albert Einstein have a sister?
- Was Albert Einstein Jewish?
You will indeed find answers to these questions in the documents that Google shows as results, most of them thanks to Wikipedia or Answers.com, by the way.
But the point here is that these matches have been found thanks to a great keyword-based search algorithm, not because of a true natural-language search like in the first example, “When was Albert Einstein born?”, where we didn’t even see a link to any particular page.
This is very interesting, especially because “how long did Albert Einstein live?” and “how old was Albert Einstein when he died?” are basically rephrasings of the same question.
Companies like Wolfram Alpha have been working with the concept of “fact computing” for a while now, and the secret behind it is brute force.
They have a huge team of people entering “facts” (many extracted from Wikipedia or similar sources) into the system using some form of proprietary knowledge editor.
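To make the idea of hand-entered “facts” concrete, here is a minimal sketch in Python. It is not how Google or Wolfram Alpha actually work; the `FACTS` table, the `PATTERNS` list and the `answer` function are all hypothetical names invented for illustration. The point it shows is that a fact-based system has to be told explicitly that two different phrasings ask for the same thing.

```python
import re
from datetime import date

# Hypothetical hand-entered facts: (entity, attribute) -> value.
FACTS = {
    ("albert einstein", "birth_date"): date(1879, 3, 14),
    ("albert einstein", "death_date"): date(1955, 4, 18),
}

# Hypothetical question templates. Two phrasings map to the same attribute.
PATTERNS = [
    (re.compile(r"when was (?P<entity>.+) born\?*$"), "birth_date"),
    (re.compile(r"how long did (?P<entity>.+) live\?*$"), "age_at_death"),
    (re.compile(r"how old was (?P<entity>.+) when (?:he|she|they) died\?*$"),
     "age_at_death"),
]


def age_at_death(entity):
    """Derive the age at death from the two stored dates."""
    born = FACTS[(entity, "birth_date")]
    died = FACTS[(entity, "death_date")]
    # Subtract one year if the last birthday had not been reached yet.
    not_reached = (died.month, died.day) < (born.month, born.day)
    return died.year - born.year - int(not_reached)


def answer(question):
    """Return a stored or derived fact, or None if no template matches."""
    q = question.strip().lower()
    for pattern, attribute in PATTERNS:
        match = pattern.match(q)
        if not match:
            continue
        entity = match.group("entity")
        if attribute == "age_at_death":
            has_dates = ((entity, "birth_date") in FACTS
                         and (entity, "death_date") in FACTS)
            return age_at_death(entity) if has_dates else None
        return FACTS.get((entity, attribute))
    return None


print(answer("How long did Albert Einstein live?"))
print(answer("How old was Albert Einstein when he died?"))
print(answer("Did Albert Einstein have a sister?"))
```

In this sketch the two rephrased questions give the same answer only because somebody wrote a template for each of them, and the question about the sister gets nothing at all because nobody entered that fact or that template. That is the brute force in action.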
The question is: can this “semantic search” be exported to the enterprise? Can the enterprise benefit from these new developments from Google, through its Google Search Appliance or some other product?
I think this is not quite going to happen.
First of all, entering “facts” about a particular enterprise using a particular knowledge-editing tool is simply not practical or economical. By the time those “facts” have been entered into the system, reality has already changed, making them rather obsolete.
Secondly, data like “the age of”, “the father of” or “the height of” are rather simple facts that are easy to edit and maintain, but corporations handle much more complex “facts”, like:
- “standard procedure to approve a new loan from a foreign customer” or
- “policy to travel with pets and domestic animals on our airplanes”
These “facts” should be found when users ask questions like “can I give a loan to this Spanish citizen?” for the first one or “can cats fly in your planes?” for the second one.
Neither of these questions would be answered with this level of accuracy by “semantic search” as Google understands it. They call for a much more tailored approach that relies on a huge lexicon and business dictionaries to make the appropriate matches.
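As a rough illustration of what matching through a lexicon means, here is a toy sketch in Python. It is not Inbenta's actual method or any real product's API. The `LEXICON` table, the concept names and the functions are hypothetical, and a real business dictionary would be far larger. It maps the words of a question and of each document title to shared concepts, and picks the title with the most concepts in common.

```python
import re

# Hypothetical business lexicon: surface word -> concept.
LEXICON = {
    "cat": "PET", "cats": "PET", "dog": "PET", "dogs": "PET",
    "pet": "PET", "pets": "PET", "animals": "PET",
    "fly": "AIR_TRAVEL", "travel": "AIR_TRAVEL",
    "plane": "AIRCRAFT", "planes": "AIRCRAFT", "airplanes": "AIRCRAFT",
    "loan": "LOAN",
    "give": "APPROVE", "approve": "APPROVE",
    "spaniard": "FOREIGN_CUSTOMER", "spanish": "FOREIGN_CUSTOMER",
    "foreign": "FOREIGN_CUSTOMER",
}

# The two corporate "facts" used as examples in this post.
DOCUMENTS = [
    "standard procedure to approve a new loan from a foreign customer",
    "policy to travel with pets and domestic animals on our airplanes",
]


def words(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def shared_keywords(question, document):
    """Plain keyword overlap: words that appear literally in both texts."""
    return set(words(question)) & set(words(document))


def concepts(text):
    """Map every known word of the text to its concept in the lexicon."""
    return {LEXICON[w] for w in words(text) if w in LEXICON}


def best_match(question):
    """Return the document sharing the most concepts, or None if none do."""
    asked = concepts(question)
    score, document = max(
        ((len(asked & concepts(d)), d) for d in DOCUMENTS),
        key=lambda pair: pair[0],
    )
    return document if score > 0 else None


question = "can cats fly in your planes?"
print(shared_keywords(question, DOCUMENTS[1]))  # no literal word in common
print(best_match(question))
```

The question about cats shares no literal word with the pet policy, so a pure keyword match has nothing to work with, while the concept match finds it through PET, AIR_TRAVEL and AIRCRAFT. All the effort sits in building and maintaining the lexicon for each business.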
Are you still keen to make your Google Search Appliance really Natural Language Based and Semantic?
I would rather try software like Inbenta Semantic Connector.
Besides, that is the first result you will find when you search for “semantic gsa” on Google.com!
Founder and CEO of Inbenta