Friday, December 10, 2004

GECKO: a complete large-scale gene expression analysis platform

Fresh bioinformatics / microarray news from BMC Bioinformatics (whose RSS feed can be found on the right sidebar): GECKO is a free, open source application designed to analyze massive amounts of data coming from high-throughput microarray experiments. I haven't had the chance to try it yet; the binary isn't available for download but will be very soon, according to the abstract. You can check the SourceForge project page to download the binary and source once they are ready.

At first glance, it seems to do everything costly programs (like GeneSpring, ~$3000/year) can do, and then some. Is the future of bioinformatics open source? Only time will tell, but if everything developed by the open source bioinformatics movement is as feature-complete as this particular piece of software, the industry doesn't stand a chance... Here's the abstract for details and features.

GECKO: a complete large-scale gene expression analysis platform
Joachim Theilhaber, Anatoly Ulyanov, Anish Malanthara, Jack Cole, Dapeng Xu, Robert Nahf, Michael Heuer, Christoph Brockel and Steven Bushnell

BMC Bioinformatics 2004, 5:195 doi:10.1186/1471-2105-5-195

Published 10 December 2004

Gecko (Gene Expression: Computation and Knowledge Organization) is a complete, high-capacity centralized gene expression analysis system, developed in response to the needs of a distributed user community.

Based on a client-server architecture, with a centralized repository of typically many tens of thousands of Affymetrix scans, Gecko includes automatic processing pipelines for uploading data from remote sites, a data base, a computational engine implementing ~50 different analysis tools, and a client application. Among available analysis tools are clustering methods, principal component analysis, supervised classification including feature selection and cross-validation, multi-factorial ANOVA, statistical contrast calculations, and various post-processing tools for extracting data at given error rates or significance levels. On account of its open architecture, Gecko also allows for the integration of new algorithms. The Gecko framework is very general: non-Affymetrix and non-gene expression data can be analyzed as well. A unique feature of the Gecko architecture is the concept of the Analysis Tree (actually, a directed acyclic graph), in which all successive results in ongoing analyses are saved. This approach has proven invaluable in allowing a large (~100 users) and distributed community to share results, and to repeatedly return over a span of years to older and potentially very complex analyses of gene expression data.

The Gecko system is being made publicly available as free software. In totality or in parts, the Gecko framework should prove useful to users and system developers with a broad range of analysis needs.
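
To make the Analysis Tree idea from the abstract a bit more concrete, here is a minimal Python sketch of a directed acyclic graph of saved results. It is not Gecko's code or data model: the `AnalysisNode` class, its fields, and the node names are all hypothetical, chosen only to show how a result with several parents (the reason it is a DAG rather than a tree) can be traced back through every earlier step.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AnalysisNode:
    """One saved result in a hypothetical analysis DAG (not Gecko's actual model)."""
    name: str
    tool: str
    parents: list[AnalysisNode] = field(default_factory=list)

    def lineage(self) -> list[str]:
        """Return names of all ancestors plus this node, parents before children."""
        ordered: list[str] = []
        seen: set[int] = set()

        def visit(node: AnalysisNode) -> None:
            # A node reachable through several paths is only recorded once.
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            ordered.append(node.name)

        visit(self)
        return ordered


# Illustrative analysis: two tools run on the same normalized data,
# and a final report combines both results.
scans = AnalysisNode("raw_scans", tool="upload")
normalized = AnalysisNode("normalized", tool="normalization", parents=[scans])
clusters = AnalysisNode("clusters", tool="clustering", parents=[normalized])
anova = AnalysisNode("anova", tool="ANOVA", parents=[normalized])
report = AnalysisNode("report", tool="post-processing", parents=[clusters, anova])

print(report.lineage())
# → ['raw_scans', 'normalized', 'clusters', 'anova', 'report']
```

Because every intermediate result stays in the graph, anyone coming back to `report` later can see exactly which steps produced it, which is presumably what makes the approach handy for sharing and revisiting old analyses.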
