Date: January 17, 2018
Town & Country San Diego, Sunrise Room, 9:30 AM – 5:30 PM.
2018 Plant and Animal Genomes conference.
Contact: contact@genomeark.org
Workshop Organizers:
Beth Shapiro, Ph.D., UC Santa Cruz, Santa Cruz, CA: bashapir@ucsc.edu
Sadye Paez, Ph.D., MSPT, MPH, Rockefeller University, NY: spaez@rockefeller.edu
Erich D. Jarvis, Ph.D., G10K Chair, Rockefeller University, NY: ejarvis@rockefeller.edu
VGP Goals:
The goal of the Reference Vertebrate Genomes Project (VGP) is to generate high-quality, error-free, near gapless, chromosome-level, phased, and annotated reference genome assemblies of at least one individual representing each of the 66,000 extant vertebrate species, and to use those genomes to address fundamental questions in biology, disease, and conservation. We will use tested methods to attempt to achieve a minimum G10K-defined genome assembly metric of: N50 contig > 1 Mb; N50 scaffold > 10 Mb; 90% of the genome assembled into chromosomes validated by at least 2 methods; an average base call quality of QV40 or higher; and haplotypes phased as much as possible. An assembly that meets this metric is called a 3.4.2.QV40 phased genome.
Following the B10K bird consortium model, we are conducting the VGP in taxonomic phases: orders (Phase 1), families (Phase 2), genera (Phase 3), and eventually all species (Phase 4). For Phase 1, the primary species selection criterion is representation of lineages with shared divergence times before and soon after the last mass extinction event 66 MYA, which results in over 260 lineages.
Analyses of these ordinal genomes will be published as consortium papers in special issues of journals with high visibility. These studies will include: a genome-scale ordinal-level vertebrate family tree; development of a more universal vertebrate gene nomenclature based on gene and genome evolution; reconstruction of the common ancestor genome of all vertebrates; determination of genomic signatures of specialized traits for each vertebrate class and species, including humans; and determination of the genetics of why some lineages are more resistant to specific diseases or extinction; among other studies.
Phase 1 is being funded by grassroots efforts among participating scientists. G10K has negotiated significantly discounted costs for this and all phases. We have an open-door policy for scientists to join the VGP, benefit from the discounts, and conduct science on the shared Genome Ark Library.
The successful outcome of Phase 1 will be used as leverage to raise the necessary funds to sequence the genomes of all ~1K vertebrate families, then all ~10K genera, and finally all ~66K species.
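For readers less familiar with the assembly metric stated in the goals above, the following is a minimal Python sketch of how the numeric thresholds could be checked. It is an illustration only, not the VGP assembly working group's pipeline: the function names (`n50`, `qv_to_error_rate`, `meets_minimum_metric`) and the example lengths are hypothetical. It uses the usual definition of N50 (the length of the shortest sequence in the smallest set of longest sequences covering at least half of the total assembled length) and the Phred definition of QV (QV = -10 log10 of the per-base error probability, so QV40 corresponds to one error per 10,000 bases). Validation by at least 2 methods and haplotype phasing are not modeled here.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths (in bases)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def meets_minimum_metric(contig_lengths, scaffold_lengths,
                         fraction_in_chromosomes, average_qv):
    """Check the numeric thresholds quoted in the VGP goals.

    Thresholds are taken from the text: contig N50 > 1 Mb, scaffold N50 > 10 Mb,
    at least 90% of the genome assigned to chromosomes, average QV >= 40.
    """
    return (
        n50(contig_lengths) > 1_000_000
        and n50(scaffold_lengths) > 10_000_000
        and fraction_in_chromosomes >= 0.90
        and average_qv >= 40
    )


if __name__ == "__main__":
    # Hypothetical toy assembly, for illustration only.
    contigs = [2_000_000, 1_500_000, 500_000]
    scaffolds = [30_000_000, 20_000_000, 5_000_000]
    print("contig N50:", n50(contigs))
    print("scaffold N50:", n50(scaffolds))
    print("meets metric:", meets_minimum_metric(contigs, scaffolds, 0.95, 42))
```

A real evaluation would read contig and scaffold lengths from the assembly files and estimate QV from read or k-mer evidence; those steps are outside the scope of this sketch.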
Mission statement for the Jan 2018 reference G10K-VGP workshop:
The mission of the 2018 G10K workshop is to advance Phase 1 of the reference Vertebrate Genomes Project (VGP), with major focuses on: developing sequencing and assembly approaches for much-improved error-free, near gapless, chromosomal-level, haplotype-phased assemblies; setting alignment and annotation standards; and setting a path for project-wide permits, databases, websites, additional funding, and publications.
Preparation:
To follow the workshop presentations and discussions more easily, please prepare in advance by reading the background information in the 3-page G10K-VGP workshop document and watching the G10K-VGP 2017 year-end presentation: https://hstream.hhmi.org/Mediasite/Play/0ecc9f46d2c942e1888747863e2d51001d Username: G10K; Password: Genomes. When using these documents and presentations, please respect the standard rules of scientific ethical conduct for credit and the G10K-VGP Embargo Data Use Policy:
Workshop Agenda
Introduction (Will start promptly at 9:30AM)
9:30-9:45 The reference VGP and goals of the 2018 G10K workshop
G10K Chair, Erich Jarvis, Rockefeller University, NY, USA
Workshop Session 1: Sequencing and Assembly (9:45AM-3:00PM)
Chair: Adam Phillippy, NIH, Bethesda, MD, USA
9:45–10:00 The VGP assembly working group: assembling high-quality reference genomes for all vertebrate orders
Adam Phillippy & Arang Rhie, NIH, Bethesda, MD, USA
10:00–10:15 Towards engineering a better diploid assembler
James Drake & Jonas Korlach, Pacific Biosciences, Palo Alto, CA, USA
10:15–10:30 Generating Arima-HiC data for polished chromosome-scale phased genome assemblies
Siddarth Selvaraj, Arima Genomics, San Diego, CA, USA
10:30–10:45 New Bionano Direct Labeling and Staining Chemistry (DLS), Chromosomal-level Maps, and Algorithm Improvements for Hybrid Scaffolding
Alex Hastie, Bionano Genomics, San Diego, CA, USA
10:45-11:00 Coffee/Tea Break
11:00–11:15 10X Genomics Supernova 2 and haplotype phasing
Deanna Church, 10X Genomics, San Francisco, CA, USA
11:15–11:30 Scaff10X: A relational matrix based algorithm for genome scaffolding using 10X Genomics data
Zemin Ning, Sanger Institute, Hinxton, UK
11:30–11:45 Phased diploid genomes using short, long, and linked reads
Michael Schatz, Johns Hopkins, MD, USA
11:45-12:00 Lessons learned from long-read sequence and assembly of primate genomes
Evan Eichler, University of Washington, Seattle, WA, USA
12:00-12:15 Evaluation of VGP assemblies with gEVAL
Kerstin Howe, Jo Wood, & Richard Durbin, Sanger Institute, Hinxton, UK
12:15-12:30 Evaluation of VGP hummingbird assemblies using tools for comparative genomics
Harris Lewin & Joana Damas, UC Davis, Davis, CA, USA
12:30-12:45 Pick up lunch food and settle at discussion tables
12:45-1:00 DNA isolation and sequencing technology challenges for the VGP
Olivier Fedrigo & Jacquelyn Mountcastle, Rockefeller University, NY, USA
1:00-1:15 DNAnexus and AWS pipeline for the VGP
Brett Hannigan, DNAnexus, San Francisco, CA, USA
1:15–3:00 Working afternoon lunch group discussion to advance sequencing and assembly approaches for the VGP ordinal Phase 1
3:00-3:15 Coffee/Tea Break
Workshop Session 2: Alignment and Annotation (3:15PM-4:30PM)
Chair: Francoise Thibaud-Nissen, NCBI, Washington DC, USA
3:15-3:30 Status of Cactus algorithm for reference-free alignment and annotation of vertebrate genomes
Benedict Paten, UCSC, Santa Cruz, CA, USA
3:30-3:45 New Ensembl annotation pipeline developed for phased VGP genomes and their meta-data
Paul Flicek, EMBL-EBI, Hinxton, UK
3:45-4:00 NCBI annotation of VGP assemblies: pre-requisites and recommendations
Francoise Thibaud-Nissen & Kim Pruitt, NCBI, Washington DC, USA
4:00-4:15 Group discussion to finalize requirements and path for annotation
4:15-4:30 Coffee/Tea Break
Workshop Session 3: Permits, Websites, Funding, Publications (4:30PM-5:30PM)
Chair: Beth Shapiro, UCSC, Santa Cruz, CA, USA
4:30-4:45 Obtaining blanket national and international VGP sample permits
Bob Murphy, University of Toronto, Ontario, Canada
4:45-5:00 VGP Website and GenomeArk database
Sadye Paez, Rockefeller University, NY, USA
5:00-5:15 Raising remaining funds and planning publications for ordinal VGP
Erich Jarvis, Rockefeller University, NY, USA
5:15-5:30 Summary of workshop outcomes
5:30 Close of G10K-VGP 2018 workshop