Thursday, July 24, 2008

Knol



Inspired by an entry in Digital Inspiration, I took a look at Knol, which is meant to be a competitor to Wikipedia.

In general I think neither the system nor the idea is bad; in fact, I like them. However, it is early and there are things to work on.

It was the content, though, that surprised me the most. I had to search for more than 10 concepts before I found a non-medical entry.

I believe that the tech community is more active than the biomedical community in using the Internet and Web 2.0 resources. However, I haven't seen many entries from the tech community. Maybe it is only because Google has chosen experts in medicine as beta testers.

Anyhow, it is worth taking a look, and we now have yet another source from which to search and retrieve information.

Tuesday, July 15, 2008

Using Google for everything?


The other day one of my colleagues told me that he uses Google for everything, and I guess he is not the only one.

My question is: is that really effective? Just as we don't wear the same shoes to go to the beach as to play football, we should use different Internet search resources depending on our objective.

For example, if we are looking for a specific document, list-result engines such as Google or PubMed are fast and effective. We'll probably find the document within the first 3 results. However, if our goal is to know what has been published about something, let's say the gene BRCA2, those types of search resources are much less useful. They are time consuming, and we will probably have to run more than one query to get a clear picture of the information available. In those cases, search systems that apply clustering technologies, such as Vivisimo or Exalead, or the ones that apply NLP (natural language processing), are more effective.
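
To make the idea of clustered results a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how Vivisimo or Exalead actually work: the titles are made up, the stopword list is arbitrary, and each title is simply filed under the term it shares with the most other titles.

```python
from collections import Counter, defaultdict

# Arbitrary stopword list, chosen for this illustration only.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "for", "with", "to"}


def tokens(title, query):
    """Lowercase words of a title, minus stopwords and the query term."""
    words = [w.strip(".,:;()").lower() for w in title.split()]
    return {w for w in words if w and w not in STOPWORDS and w != query.lower()}


def cluster_results(titles, query):
    """Group titles under the term they share with the most other titles."""
    doc_freq = Counter()
    for title in titles:
        doc_freq.update(tokens(title, query))

    clusters = defaultdict(list)
    for title in titles:
        terms = tokens(title, query)
        if not terms:
            clusters["other"].append(title)
            continue
        # The most widely shared term wins; ties are broken alphabetically.
        label = min(terms, key=lambda t: (-doc_freq[t], t))
        clusters[label].append(title)
    return dict(clusters)


# Made-up result titles, for illustration only.
titles = [
    "BRCA2 mutations in breast cancer",
    "Breast cancer risk and BRCA2",
    "BRCA2 and DNA repair",
    "DNA repair pathways with BRCA2",
]

for label, group in cluster_results(titles, "BRCA2").items():
    print(label, "->", group)
```

Instead of one long list, the reader gets a couple of labelled groups and can decide which one to open first. A real clustering engine does far more than this, but the change in what the user sees is the point.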

Therefore, don't use Google or PubMed for everything. Choose the right tool for the job.

Sunday, July 13, 2008

Text mining vs Search

A really interesting topic came up recently at the text analytics forum: the difference between a text mining solution and a Google search solution.

When Stanley Kubrick made his movie 2001: A Space Odyssey in 1968,
Artificial Intelligence (AI) research was creating really high
expectations. Why didn't AI reach those expectations? Maybe computer
capabilities were overestimated, or researchers didn't realize the
tremendous amount of information and heuristics that human beings take
into account in daily communication. The human comprehension process
is complex; it is the result of relating information from the
environment, previous knowledge and the new input of the conversation.

For decades now, computer scientists have been trying to organize data
in a way that can be heavily analyzed by computers, so that we can
extract information from it (data mining). Organizing data in
databases (structured in fields and with restricted values) to apply
statistical analyses, find correlations and make decisions is now a
well-established process. In fact, the risk analyses of insurance
companies or banks are based on this kind of technology.
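
As a small illustration of why structured data is easy for a computer to analyze, here is a Python sketch that computes a Pearson correlation between two fields. The records, field names and values are invented for the example; they are not real insurance data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented records: every row has the same fields with numeric values.
records = [
    {"age": 20, "claims": 4},
    {"age": 30, "claims": 3},
    {"age": 40, "claims": 2},
    {"age": 50, "claims": 1},
]

ages = [r["age"] for r in records]
claims = [r["claims"] for r in records]

# In this toy data claims fall steadily as age rises, so r is close to -1.
print(round(pearson(ages, claims), 3))
```

Because every record has the same fields, pulling out two columns and comparing them takes a couple of lines. Free text offers no such columns to start from.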

When it comes to free-text data, these analyses become more difficult.
The data is not stored and organized a priori for a computer to
analyze it; it is organized for a human to understand it. Text mining
techniques try to organize the text so that it can be analyzed with
computer algorithms. Text mining technologies study words in their
context, their meaning and their role in the sentence. "Flying planes
are dangerous" is not the same as "Flying planes is dangerous".
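
The two sentences above can be used for a tiny experiment in Python. This is a toy sketch under my own assumptions, not a real parser: the stopword list and the subject rule are hand-written for this one sentence pattern. It shows that a keyword view sees the two sentences as identical, while looking at the role of the words tells them apart.

```python
# Small words a simple keyword index would typically drop (assumed list).
STOPWORDS = {"is", "are", "the", "a"}


def keywords(sentence):
    """Bag of keywords, roughly what a simple keyword index would keep."""
    return {w.lower() for w in sentence.split() if w.lower() not in STOPWORDS}


def subject_of(sentence):
    """Toy rule for this sentence pattern only: verb agreement picks the subject."""
    words = sentence.lower().split()
    if "are" in words:
        return "planes"  # plural verb: the planes (that fly) are dangerous
    if "is" in words:
        return "flying"  # singular verb: the act of flying planes is dangerous
    return None


a = "Flying planes are dangerous"
b = "Flying planes is dangerous"

print(keywords(a) == keywords(b))    # same keywords in both sentences
print(subject_of(a), subject_of(b))  # different subjects
```

A real text mining system would use proper part-of-speech tagging and parsing instead of a hard-coded rule, but the contrast is the same: the keywords match while the meaning does not.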

The main difference between text mining and search technologies is
this: when we search, we are trying to find something somewhere; when
we apply text mining technologies, we are trying to get a better
understanding of whatever we are searching for. Therefore, if we know
what we are looking for (where can I get a certain type of shoes?),
search technologies such as Google's are really efficient. If we want
to understand the impact of an antibiotic on our body and the
environment, search technologies will make us read too much and leave
us with a narrower picture; text mining technologies would be more
efficient.

Wednesday, July 9, 2008

The more results the better?

Robert B. Cialdini uses the term judgmental heuristics in his book Influence: Science and Practice to describe the mental shortcuts we build to deal with an increasingly complex and fast-moving environment. One of the examples he gives is expensive=good.

Many people seem to follow this same rule when it comes to evaluating results from search engines.

In the years I have been presenting search engine products, it was fairly common to see people surprised or annoyed when the system I was presenting retrieved fewer results than the one they were currently using. However, they didn't seem to mind when, with a different query, the system I presented was the one offering the higher number of results. They were probably following the judgmental heuristic of more results=better.

With the amount of information available, its accessibility and the daily use of information retrieval technology, we should have a different heuristic in mind, one more like more results=more time. But I guess the fear of missing relevant results keeps this more results=better shortcut from vanishing completely.


Friday, July 4, 2008

The first


After long reflection I decided to commit to becoming a "blogger". I read a lot of blogs and I'm always amazed at the energy bloggers have to keep up and give us a piece of their knowledge every week or every day. I want to share some of that as well with whoever wants to read, learn and discuss.

This blog will try to comment on and discuss Web technologies in the biomedical arena, which is my field of expertise.

I hope to meet my expectations (so far I will be my only reader) and keep up with the commitment of writing first and writing interesting things second.

Let the show begin.