Whole Genome Shotgun (old)

WGS, Incomplete or Draft Genomes

This is the submission pathway for draft or incomplete genomes of prokaryotes or eukaryotes. The primary data in a WGS submission are the contigs (BACs or other clones can also be the primary data). Contigs are assembled from contiguous overlapping sequence reads. Information on how the contigs are assembled into scaffolds and/or chromosomes can be supplied in an AGP file. Annotation can be submitted at either the contig level or the scaffold/chromosome level; however, annotation is not required for a WGS submission. We recommend that assemblies of prokaryotes be annotated on the contigs. We strongly recommend that assemblies of eukaryotes, especially those with thousands of contigs, be annotated on the scaffolds or chromosomes.

Some assemblers output sequences in which contigs linked by paired-end reads are joined with runs of Ns. These sequences are scaffolds. For a WGS submission, every scaffold should be split into its individual contigs, and the corresponding assembly information should be described in an AGP file.
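
As an illustration of the splitting step, the following is a minimal Python sketch, not an official NCBI tool. It cuts one scaffold at runs of Ns and emits tab-delimited rows for an AGP file. Several things here are assumptions that do not come from this page: the function name split_scaffold, the contig naming scheme, the min_gap threshold of 10 Ns, the nine-column row layout (taken from the AGP 2.0 specification), and the gap values "scaffold", "yes" and "paired-ends". Check the AGP specification and your assembler's output before relying on any of them.

```python
import re


def split_scaffold(scaffold_id, sequence, min_gap=10):
    """Split one scaffold at runs of N; return (contigs, agp_lines).

    min_gap is a hypothetical threshold: runs of at least this many Ns
    are treated as gaps, shorter runs stay inside the contig.
    """
    # Terminal Ns carry no sequence information, so drop them first.
    sequence = sequence.strip("Nn")
    gap_re = re.compile(r"[Nn]{%d,}" % min_gap)

    # Collect alternating contig ("W") and gap ("N") pieces using
    # 0-based, half-open coordinates on the trimmed scaffold.
    pieces = []
    pos = 0
    for match in gap_re.finditer(sequence):
        if match.start() > pos:
            pieces.append(("W", pos, match.start()))
        pieces.append(("N", match.start(), match.end()))
        pos = match.end()
    if pos < len(sequence):
        pieces.append(("W", pos, len(sequence)))

    contigs = {}
    agp_lines = []
    obj_beg = 1  # AGP coordinates are 1-based and inclusive
    for part_number, (kind, start, end) in enumerate(pieces, start=1):
        length = end - start
        obj_end = obj_beg + length - 1
        if kind == "W":
            # Hypothetical naming scheme for the new contigs.
            contig_id = "%s_c%d" % (scaffold_id, len(contigs) + 1)
            contigs[contig_id] = sequence[start:end]
            row = [scaffold_id, obj_beg, obj_end, part_number, "W",
                   contig_id, 1, length, "+"]
        else:
            # Gap row; type, linkage and evidence values are assumed.
            row = [scaffold_id, obj_beg, obj_end, part_number, "N",
                   length, "scaffold", "yes", "paired-ends"]
        agp_lines.append("\t".join(str(field) for field in row))
        obj_beg = obj_end + 1
    return contigs, agp_lines


if __name__ == "__main__":
    demo_contigs, demo_agp = split_scaffold("scaf1", "ACGT" + "N" * 20 + "GGCC")
    for line in demo_agp:
        print(line)
```

The demo sequence is far shorter than a real contig and is only meant to show the shape of the output: two contig rows separated by one gap row.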

Contigs should be at least 200 bp long and should not contain terminal Ns or any Ns that represent gaps.
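
These contig requirements can be checked before submission. The sketch below is one possible pre-check in Python, not an NCBI validator. The function name check_contig and the gap_run threshold are hypothetical. In particular, a sequence alone cannot tell whether an N stands for a gap or for an ambiguous base, so the check only flags runs of 10 or more Ns as possible gaps, which is an assumed heuristic.

```python
import re

MIN_CONTIG_LENGTH = 200  # minimum contig length in bp, from the rule above


def check_contig(sequence, gap_run=10):
    """Return a list of problems found in one contig sequence.

    gap_run is an assumed threshold: a run of at least this many Ns
    is reported as a possible gap.
    """
    seq = sequence.upper()
    problems = []
    if len(seq) < MIN_CONTIG_LENGTH:
        problems.append("too short")
    if seq.startswith("N") or seq.endswith("N"):
        problems.append("terminal N")
    if re.search(r"N{%d,}" % gap_run, seq):
        problems.append("possible gap")
    return problems


if __name__ == "__main__":
    examples = {
        "ok": "ACGT" * 50,
        "short": "ACGT" * 10,
        "gapped": "ACGT" * 30 + "N" * 10 + "ACGT" * 30,
    }
    for name, seq in examples.items():
        print(name, check_contig(seq))
```

An empty list means the contig passed all three checks; anything flagged as a possible gap still needs a manual decision about whether to split it.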

Instructions for WGS genome submission

  • Submit complete organellar and viral genomes as regular GenBank records by emailing them to GenBank Submissions.
  • Submit complete, annotated genomes to GenBank as complete genomes. The most common complete genomes are those of bacteria and archaea. GenBank defines a complete genome as an annotated, gap-free sequence, so the sequences must not contain Ns that represent gaps. Any genome that contains gaps should be submitted as an incomplete genome (see above). For information about complete genomes, see the bacterial genome submission guidelines.