Use Submission Portal GenBank (SP-GenBank) to submit assembled nucleotide sequences to GenBank.
SP-GenBank accepts assembled sequences, except prokaryotic and eukaryotic genomes.
Brief submission guidelines and FAQs are below.
What You Should Expect
SP-GenBank offers four major workflows, organized by sequenced material:
(1) Prokaryote, (2) Eukaryote, (3) Virus, and (4) Synthetic construct.
Some sequence types have features automatically annotated.
Table: Overview of the types of assembled sequences to submit to SP-GenBank or elsewhere, as noted below.
Feature annotation: Automatic
Prokaryote: rRNA; rRNA-IGS
Eukaryote: rRNA; rRNA-ITS; metazoan COX1
Virus: SARS-CoV-2; Influenza; Norovirus; Dengue
Synthetic constructs: (none listed)
Feature annotation: Submitter provided
Prokaryote: protein coding genes (CDS); regulatory regions; any other genomic region; naturally occurring plasmids; operons
Eukaryote: protein coding genes (CDS); regulatory regions; any other genomic region; mRNA/cDNA; organelle genomes; ncRNAs; intergenic spacers; markers
Virus: all other assembled virus; phage; viroid sequences; viral MAGs. Includes: partial sequences, complete genomes
Synthetic constructs: All synthetic sequences that have a physical counterpart, such as cloning vectors, expression vectors, codon-optimized sequences
Checklist of what you will need for your submission:
Submitter contact details, authors, publication, data release date
Sequencing technology (e.g. Sanger, Illumina)
Nucleotide sequences
Sequence source information (e.g. organism, collection_date)
Feature annotation
Features are automatically annotated for specific submissions, including SARS-CoV-2, Influenza, Dengue virus, Norovirus, metazoan COX1, rRNA, and rRNA-ITS.
For other submission types, you will need to provide feature annotation.
Before you begin your submission, you will need to sign in to NCBI using the Log in button on the upper right side of the SP-GenBank page.
Prepare a plain text file containing sequences in FASTA format.
Each FASTA record starts with a definition line, followed by a hard return and then the sequence.
The simplest definition line consists of the ">" symbol followed immediately by a sequence_ID.
In one submission, you can upload a single FASTA file that contains up to 3,000 sequences.
However, if your sequences are long (e.g. chloroplast genomes), prepare a FASTA file with fewer sequences and make separate submissions as needed.
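The FASTA rules and the 3,000-sequence limit described above can be checked before upload. The following is a minimal sketch, not an NCBI tool: the function names are hypothetical, and it assumes the sequence_ID is the first whitespace-delimited token after ">". It parses FASTA text, rejects records without a sequence_ID, and splits the records into batches that each fit in one submission.

```python
from typing import Iterator, List, Tuple

# Per-submission limit stated in the guidelines above.
MAX_RECORDS_PER_FILE = 3000

Record = Tuple[str, str]  # (sequence_ID, sequence)


def read_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (sequence_ID, sequence) pairs.

    Assumption: the sequence_ID is the first token after ">".
    """
    records: List[Record] = []
    seq_id = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if seq_id is not None:
                records.append((seq_id, "".join(chunks)))
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("definition line has no sequence_ID")
            seq_id, chunks = tokens[0], []
        else:
            if seq_id is None:
                raise ValueError("sequence found before a definition line")
            chunks.append(line)
    if seq_id is not None:
        records.append((seq_id, "".join(chunks)))
    return records


def split_batches(records: List[Record],
                  size: int = MAX_RECORDS_PER_FILE) -> Iterator[List[Record]]:
    """Yield consecutive batches of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def to_fasta(records: List[Record]) -> str:
    """Serialize records back to FASTA text, one sequence line per record."""
    return "".join(f">{sid}\n{seq}\n" for sid, seq in records)
```

For long sequences such as chloroplast genomes, a smaller `size` can be passed to `split_batches`; the right value depends on your data and is not specified here.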
All submissions should include information about the biological or environmental source of the sequences, including organism name and descriptive details such as collection_date, geo_loc_name (geographic location), etc.
You will enter relevant source information by typing directly into a table within the Submission Portal or by uploading a tab-delimited text file.
organism: species name of the organism (e.g. Oryza sativa, Hepacivirus hominis).
You may provide a higher-level taxonomic node if you cannot identify the organism to the species level (e.g. Lactobacillaceae bacterium).
Organism names will be checked against the Taxonomy database, and for unrecognized organisms, you may be presented with additional forms to complete.
Source identifiers distinguish each sequence in a submission file.
These include:
isolate (all organism groups)
strain (prokaryotes and viruses)
clone (prokaryotes and, sometimes, synthetic constructs)
specimen_voucher (eukaryotes)
collection_date and geo_loc_name (geographic location) are required for prokaryotes, eukaryotes, and viruses.
host and/or isolation_source are required for viruses and some prokaryotes.
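The source information described above can be prepared as a tab-delimited text file with a short script. This is a sketch under stated assumptions rather than the portal's required layout: the `Sequence_ID` column header, the workflow keys, the function names, and the sample values are all hypothetical, so check the column names that the Submission Portal actually requests. The required-field check covers only the rules stated above and does not try to decide which prokaryotes need host or isolation_source.

```python
import csv
import io
from typing import Dict, List

# Required source fields per workflow, taken from the guidelines above.
# The workflow keys themselves are hypothetical labels for this sketch.
REQUIRED: Dict[str, List[str]] = {
    "prokaryote": ["collection_date", "geo_loc_name"],
    "eukaryote": ["collection_date", "geo_loc_name"],
    "virus": ["collection_date", "geo_loc_name"],
    "synthetic": [],
}


def missing_fields(row: Dict[str, str], workflow: str) -> List[str]:
    """Return the required source fields that are empty or absent in `row`."""
    missing = [name for name in REQUIRED[workflow] if not row.get(name)]
    # Viruses also need host and/or isolation_source.
    if workflow == "virus" and not (row.get("host") or row.get("isolation_source")):
        missing.append("host or isolation_source")
    return missing


def source_table(rows: List[Dict[str, str]], columns: List[str]) -> str:
    """Render rows as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, delimiter="\t",
                            lineterminator="\n", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical example row; "Sequence_ID" is an assumed column name.
example = {
    "Sequence_ID": "Seq1",
    "organism": "Oryza sativa",
    "isolate": "A1",
    "collection_date": "2024-05-01",
    "geo_loc_name": "Japan",
}
```

Calling `missing_fields(example, "eukaryote")` before writing the file gives an early warning about empty required fields, though the portal's own validation remains the final check.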
GenBank features are biological elements present in a sequence, such as coding regions, promoters, rRNAs, etc.
Annotation is the process of applying the features to your sequences.
Automatic feature annotation (no submitter features required):
Features are automatically applied for specific submission types, including SARS-CoV-2, Influenza, Dengue virus, Norovirus, metazoan COX1, rRNA, and rRNA-ITS.
You do not need to annotate features for these submissions.
Submitter provided feature annotation:
For submission types not listed above, there are three ways to provide annotation:
Create and upload protein FASTA file to annotate CDS features
After completing your submission, it will undergo processing at GenBank.
Processing includes automated validations and manual review by NCBI staff.
Various reports will be sent to the email accounts associated with the submission, including copies of your processed submission.
The amount of time to go from submitted to processed status varies.
You can monitor the status of your submission in SP-GenBank.
Some of the submission statuses are:
Error: the submission has errors that you need to fix before accession numbers can be assigned.
Processing: the submission is undergoing an initial review by our staff before assigning accession numbers.
Processing (accession numbers assigned): the submission is undergoing final review by NCBI staff.
We will contact you if we require additional information.
Processed (accession numbers assigned): the submission has gone through all processing steps at GenBank and is ready for public release.
Any changes requested after a submission reaches processed status will be incorporated into your GenBank records; however, these changes will not be shown in Submission Portal.
For information on updating processed records, see the GenBank update page.
The BankIt submission history page will be permanently discontinued when BankIt is discontinued in late 2026.
We recommend downloading any needed files before that time.
BankIt submission history is not available in Submission Portal.
Alignment import and feature propagation are not yet available in Submission Portal GenBank, so you will need to use BankIt.
Go to BankIt and click the “Submit aligned sequences in BankIt” button.
This will change as more functionality is added to Submission Portal GenBank.
If you have questions about which tool is appropriate for your data type, please contact info@ncbi.nlm.nih.gov
Generally, it is optional to provide BioProject, BioSample, and SRA accessions for GenBank submissions.
However, there are a few exceptions such as viral sequences assembled from metagenomes, where you may be asked to provide this information.
Yes. Obtain accession numbers for your BioProject, BioSample, and SRA data in advance of your SP-GenBank submission.
You may then provide the accession numbers on the Source Modifiers page while you are creating your SP-GenBank submission.
Yes, you may collaborate on your submissions by using a submission group.
Shared submission groups give multiple submitters access to submission data in the Submission Portal.
A shared submission group consists of multiple NCBI accounts within the Submission Portal, all of which have permissions-based access to submissions associated with the group.
SP-GenBank is suitable for most submissions to GenBank.
However, there are alternatives for certain scenarios:
A command-line program, table2asn, is available for large submissions to GenBank.
GenBank offers API/programmatic submissions for SARS-CoV-2 and Influenza A, B, and C virus sequences.
If you are interested in programmatic submissions for these types of sequences, review the documents below.
GenBank is the world's largest nucleotide archive containing sequences from all branches of life.
The archive is a foundation for medical and biological discovery.
Review the three GenBank submission paths below to select the appropriate option for your data.
GenBank
Use the Submission Portal-GenBank to submit assembled sequences, except for prokaryotic and eukaryotic genomes and transcriptomes. (Apr 2026: Now accepts more sequence types.)
GenBank-Genome
Submit assembled prokaryotic and eukaryotic genomes.
GenBank-TSA
Submit computationally assembled, transcribed RNA sequences (transcriptomes) after submitting reads to SRA.
Sequence Read Archive (SRA)
SRA is the largest publicly available repository of high-throughput sequencing data. The archive accepts data from all branches of life as well as metagenomic and environmental surveys.
SRA
Submit unassembled, high-throughput sequencing reads.
Other Tools
GEO
Submit RNA-seq, ChIP-seq, and other types of gene expression and epigenomics datasets.
BioProject & BioSample
Choose a tool above if submitting sequence data.
Medical Genetics & Variation Tools
Submit clinical data, small and large human genomic variants, and genotype and phenotype data.