Sunday, August 01, 2004

I'm back!

Back from my two-day vacation! It was a pretty cool trip, except for the mosquito part (I must be tasty: I got eaten alive while my friend barely got any attention from them!) and the 'Oh no, I don't have a lift back to town, so let's hope someone picks me up on the highway while it's RAINING' part :) It took 4 hours instead of 2... Even with these inconveniences, it was totally worth it. Wolves howling at night… could we ask for anything more?

Let’s get back on topic. Today’s rant is about GenBank (well, the protein and nucleotide sections of Entrez). Right now, this database is growing exponentially, which is a good thing. Except for the “our search engine isn’t worth squat” thing. Try searching for something really common, let’s say GFP. First result?

PREDICTED: Gallus gallus similar to ataxin-1 ubiquitin-like interacting protein (LOC426353), mRNA

Hmmm... not very relevant. Out of the 2990 results, the true GFP is at position 2348. I’m not joking… and it’s basically the same story for every search, especially for HIV-1 clones (let’s say NL4.3). There is no option to sort by relevance. It’s a big, exponentially expanding mess. Couldn’t they sort with a Google-like algorithm, with the most linked entry being the most important? I hope they’ll fix it pretty soon, because it’s really a pain to use…
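To make the "most linked entry first" idea a bit more concrete, here is a rough Python sketch. It is not how Entrez works and it is much cruder than what Google actually does: it simply counts how many cross-references point at each hit and sorts on that count. The function name, the entry IDs and the link list are all made up for illustration.

```python
from collections import Counter


def rank_by_inbound_links(hits, links):
    """Order search hits so the most linked-to entries come first.

    hits  -- entry IDs matching the query, in the engine's original order
    links -- (source_id, target_id) pairs, i.e. cross-references between records
    Both are hypothetical inputs; no real database API is used here.
    """
    hit_set = set(hits)
    # Count only the links that point at one of the hits.
    inbound = Counter(target for _, target in links if target in hit_set)
    # sorted() is stable, so entries with equal counts keep their original order.
    return sorted(hits, key=lambda entry: -inbound[entry])


# Made-up example: three hits for a "GFP" query and a few cross-references.
hits = ["predicted_chicken_mrna", "gfp_reference", "gfp_variant"]
links = [
    ("paper_a", "gfp_reference"),
    ("paper_b", "gfp_reference"),
    ("paper_c", "gfp_variant"),
    ("paper_d", "unrelated_entry"),
]

ranked = rank_by_inbound_links(hits, links)
print(ranked)
```

With these made-up links, the entry everybody refers to ends up on top and the unlinked predicted mRNA drops to the bottom, which is all I'm asking for.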

