Generating research data is easier than ever before, but interpreting and analyzing it is still hard, and getting harder as the volume increases. This is especially true of genomics. Sequencing the whole genome of a single person produces more than 100 gigabytes of raw data, and a million genomes will add up to more than 100 petabytes. In 2003, the Human Genome Project was completed after 15 years and $3 billion. Today, it takes closer to one day and $1,000 to sequence a human genome.
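The storage estimate above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal (SI) units and uses the 100-gigabyte lower bound from the text; the variable names are illustrative only.

```python
# Decimal (SI) units are assumed: 1 GB = 10**9 bytes, 1 PB = 10**15 bytes.
GB = 10**9
PB = 10**15

# Lower bound from the text: "more than 100 gigabytes" of raw data per genome.
bytes_per_genome = 100 * GB
genome_count = 1_000_000

total_bytes = bytes_per_genome * genome_count
total_petabytes = total_bytes / PB

print(f"{total_petabytes:.0f} PB")  # prints "100 PB"
```

Because each genome produces more than 100 gigabytes, the real total for a million genomes is somewhat above this figure.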
This abundance of new information carries great potential for research and human health -- and requires new standards, policies and technology. That's why Google has joined the Global Alliance for Genomics and Health. The Alliance is an international effort to develop harmonized approaches to enable responsible, secure, and effective sharing of genomic and clinical information in the cloud with the research and healthcare communities, meeting the highest standards of ethics and privacy. Members of the Global Alliance include leading technology, healthcare, research, and disease advocacy organizations from around the world.
To contribute to the genomics community and help meet the data-intensive needs of the life sciences, we are introducing:
- a proposal for a simple web-based API to import, process, store, and search genomic data at scale
- a preview implementation of the API built on Google's cloud infrastructure, including sample data from public datasets like the 1000 Genomes Project
- a collection of in-progress open-source sample projects built around the common API
Interoperability: One API, Many Apps
Any of the apps at the top (one graphical, one command-line, and one for batch processing) can work with information in any of the repositories at the bottom (one using cloud-based storage and one using local files). As the ecosystem grows, all developers and researchers benefit from each individual developer's work.
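The layering described above can be illustrated with a small sketch. Everything in it is hypothetical: the `VariantRepository` interface, the class and function names, the record fields, and the made-up sample records are assumptions chosen for illustration, not the proposed API. An in-memory class stands in for a cloud-backed repository, and a JSON-lines file stands in for local storage.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def _in_range(record: Record, chromosome: str, start: int, end: int) -> bool:
    """True if the record lies on `chromosome` within the half-open range [start, end)."""
    return record["chromosome"] == chromosome and start <= record["position"] < end


class VariantRepository(ABC):
    """Hypothetical common interface that every repository implements."""

    @abstractmethod
    def search(self, chromosome: str, start: int, end: int) -> List[Record]:
        ...


class InMemoryRepository(VariantRepository):
    """Stand-in for a cloud-backed repository (no network calls are made here)."""

    def __init__(self, records: Iterable[Record]) -> None:
        self._records = list(records)

    def search(self, chromosome: str, start: int, end: int) -> List[Record]:
        return [r for r in self._records if _in_range(r, chromosome, start, end)]


class LocalFileRepository(VariantRepository):
    """Repository backed by a local file with one JSON record per line."""

    def __init__(self, path: str) -> None:
        self._path = path

    def search(self, chromosome: str, start: int, end: int) -> List[Record]:
        with open(self._path, encoding="utf-8") as handle:
            records = [json.loads(line) for line in handle if line.strip()]
        return [r for r in records if _in_range(r, chromosome, start, end)]


def count_variants(repo: VariantRepository, chromosome: str, start: int, end: int) -> int:
    """A batch-style 'app': it depends only on the shared interface."""
    return len(repo.search(chromosome, start, end))


def list_variants(repo: VariantRepository, chromosome: str, start: int, end: int) -> str:
    """A command-line-style 'app': formats matches as text, one per line."""
    return "\n".join(
        f'{r["chromosome"]}:{r["position"]} {r["ref"]}>{r["alt"]}'
        for r in repo.search(chromosome, start, end)
    )


# Made-up records for illustration only; they are not real genomic data.
SAMPLE_RECORDS: List[Record] = [
    {"chromosome": "1", "position": 100, "ref": "A", "alt": "G"},
    {"chromosome": "1", "position": 250, "ref": "C", "alt": "T"},
    {"chromosome": "2", "position": 100, "ref": "G", "alt": "A"},
]


def write_jsonl(records: Iterable[Record], path: str) -> None:
    """Write records to `path`, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "variants.jsonl")
        write_jsonl(SAMPLE_RECORDS, path)
        # Both apps run unchanged against both repositories.
        for repo in (InMemoryRepository(SAMPLE_RECORDS), LocalFileRepository(path)):
            print(type(repo).__name__, count_variants(repo, "1", 0, 1000))
            print(list_variants(repo, "1", 0, 1000))
```

Neither app knows which repository it is talking to, so adding a new repository or a new app requires no change to the existing ones. That is the sense in which each developer's work can benefit everyone else in the ecosystem.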
With these first steps, it is our goal to support the global research community in bringing the vision of the Global Alliance for Genomics and Health to fruition. Imagine the impact if researchers everywhere had larger sample sizes to distinguish between people who become sick and those who remain healthy, between patients who respond to treatment and those whose condition worsens, between pathogens that cause outbreaks and those that are harmless. Imagine if they could test biological hypotheses in seconds instead of days, without owning a supercomputer.
We are honored to be part of the community, working together to refine the technology and evolve the ecosystem, and aligning with appropriate standards as they arise.
How you can be involved
To request access to the API for your research, please fill out this simple form to tell us about yourself and your research interests, and we will let you know when we're ready to work with more partners.
Together with the members of the Global Alliance for Genomics and Health, we believe we are at the beginning of a transformation in medicine and basic research, driven by advances in genome sequencing and huge-scale computing. We invite you to contact us and share your ideas about how to bring data science and life science together.