UCSC Genome Bioinformatics

Genocoding Project

Genomic Text Indexing: scanning papers for genomic identifiers and mapping them to the human genome. We currently recognize DNA and protein sequences, SNPs, chromosome bands and gene symbols.
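
As a rough illustration of what recognizing identifiers in paper text can look like, here is a minimal Python sketch. It is not the project's actual code: the patterns, the 20-base minimum length and the `find_identifiers` name are all assumptions made for this example. It covers only three of the identifier types listed above (DNA sequences, SNPs, bands). Gene symbols and protein sequences would need a dictionary lookup or other logic that a short regular expression cannot provide, and mapping hits to the genome is a separate step not shown here.

```python
import re

# Illustrative patterns only; these are assumptions, not the project's rules.
PATTERNS = {
    # A run of at least 20 unambiguous bases (the threshold is arbitrary).
    "dna": re.compile(r"\b[ACGT]{20,}\b"),
    # dbSNP-style identifiers such as rs1042522.
    "snp": re.compile(r"\brs[0-9]{3,}\b"),
    # Cytogenetic bands such as 17p13.1 or Xq28.
    "band": re.compile(
        r"\b(?:[1-9]|1[0-9]|2[0-2]|X|Y)[pq][0-9]{1,2}(?:\.[0-9]{1,2})?\b"
    ),
}


def find_identifiers(text):
    """Return (kind, matched_text, start_offset) tuples sorted by position."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((kind, match.group(0), match.start()))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    sample = "The variant rs1042522 lies in 17p13.1 near ACGTACGTACGTACGTACGTAC."
    for kind, value, offset in find_identifiers(sample):
        print(kind, value, offset)
```

Patterns this simple will produce false positives on real articles (for example, reagent codes that happen to look like bands), so a production system would need additional filtering.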

Project details »

Open letter to publishers

In February 2012 we sent out a letter to various publishers.

Letter »

Press Coverage

Richard Van Noorden wrote a news article in Nature in 2012. See also the editorial. We have also been mentioned in a Guardian article about text mining.

Read main article »


You can follow the progress of permission requests here.

Read more »

Example results

The current Elsevier/PMC data in the UCSC Genome Browser, zoomed in on the EGF gene.

UCSC Genome Browser »

Current crawler status

We have been crawling documents since June 2012.

Current status »


We often hear similar questions from publishers and are trying to address them here.

Read more »

Classification results

Articles assigned to biological databases

Read more »