Bioinformatics

The emergence of the genomics era is redefining biomedical research and, to an even larger extent, cancer research. Rather than focusing on individual genes and proteins, scientists can now explore a substantial component of the expressed genome and proteome. Thus bioinformatics, which is the convergence of biology, information science and computation, will play a critical role in the future of cancer biology.

The large volume of high-throughput data we have generated has led us to develop expertise in bioinformatics approaches to data analysis.

Initially, we explored mathematical and statistical models for the meta-analysis of gene expression microarray data generated by different laboratories, often on different experimental platforms (e.g., spotted cDNA microarrays or Affymetrix arrays) (Cancer Research 62:4427). This proved to be a powerful approach for validating gene expression signatures and nominating biomarkers in silico through inter-study validation of results. For example, it led us to identify AMACR as a tissue biomarker of prostate cancer and allowed us to extend our EZH2 findings to other cancers, such as breast cancer.
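To make the idea of inter-study validation concrete, the sketch below combines per-study p-values for a single gene using Fisher's method, a standard way of pooling evidence from independent tests. It is an illustration only and is not necessarily the statistical model used in the cited work. The function name and the example p-values are hypothetical, and the calculation assumes the studies are independent.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent per-study p-values with Fisher's method.

    Hypothetical helper for illustration; assumes the studies are independent.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(pvalues)
    # Fisher statistic X = -2 * sum(ln p_i); under the null hypothesis it
    # follows a chi-square distribution with 2k degrees of freedom.
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2

    # For even degrees of freedom the chi-square tail has a closed form:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


if __name__ == "__main__":
    # Hypothetical p-values for one gene's differential expression,
    # one value per independent study (not real data).
    per_study = [0.04, 0.01, 0.20]
    print(fisher_combined_pvalue(per_study))
```

A gene that is only marginally significant in each individual study can still reach a small combined p-value, which is the intuition behind validating a signature across studies and platforms.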

More recently, our work on microarray meta-analysis inspired us to develop Oncomine, a cancer microarray compendium and integrated data-mining platform (www.oncomine.org). Oncomine is aimed at cancer biologists who may have little bioinformatics expertise, with the goal of making publicly available tumor gene expression datasets more accessible. Oncomine contains data from:

As of August 2004, Oncomine has more than 2,000 registered users worldwide, a large fraction of whom are repeat users. At least 100 distinct users access the database daily.

We have since used this cancer microarray compendium to identify two meta-signatures (PNAS 101:9309): a universal meta-signature of cancer, a set of genes generally over-expressed in cancers regardless of tissue of origin, and a de-differentiation meta-signature, a gene expression pattern shared by aggressive cancers.

We are now exploring common pathway and transcriptional network characteristics in human tumors using the data housed in Oncomine.