Transcriptome Shotgun Assembly (TSA)

TSA is an open-access archive of computationally assembled transcribed RNA sequences from next-generation sequencing technologies. Unassembled reads must be submitted to the Sequence Read Archive (SRA) before you start the TSA submission.

We recommend that you download and run NCBI’s new Foreign Contamination Screen (FCS) tool before submitting your assembly, to reduce the number of post-submission corrections and improve the quality of your TSA submission. See NCBI Insights and the FCS publication for more details.

What You Should Expect

Overview

This tool is for submitting computationally assembled transcribed RNA sequences representing a transcriptome. The computationally assembled transcripts are derived from overlapping sequence reads submitted to the Sequence Read Archive (SRA). When you submit, you will need to:

Submit your sequence reads to the SRA prior to submitting your transcriptome. Note your BioProject, BioSample and SRA run accession number(s):
- BioProject (PRJNAXXXXXX)
- BioSample (SAMNXXXXXXXX)
- SRA accession number (SRRXXXXXX)
Prepare your file in ASN.1 or FASTA format and upload your data file according to the instructions.
Note that if you submit an ASN.1 format file, your data will be autopopulated in the submission workflow.
Review or provide a BioProject and BioSample that have already been registered for an SRA submission.
Select a ‘Release Date’ for your submission.
Review or provide the SRA run accession(s) for the sequence reads used to generate this assembly.
Review or provide metadata on the sequencing and assembly of the transcriptome.
Indicate whether your submission is an update to an existing submission.
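
Before starting the workflow, it can help to confirm that the accessions you noted look well-formed. The following is a minimal sketch, not an NCBI tool: it assumes the "X" placeholders above stand for digits and only recognizes the PRJNA, SAMN, and SRR prefixes shown in the text. The function name `check_accessions` is hypothetical.

```python
import re

# Assumption: the "X" placeholders in the text are digits. Only the prefixes
# shown above (PRJNA, SAMN, SRR) are recognized; the number of digits is not checked.
ACCESSION_PATTERNS = {
    "BioProject": re.compile(r"PRJNA\d+"),
    "BioSample": re.compile(r"SAMN\d+"),
    "SRA run": re.compile(r"SRR\d+"),
}


def check_accessions(bioproject, biosample, sra_runs):
    """Return a list of problems; an empty list means every accession looks well-formed."""
    problems = []
    if not ACCESSION_PATTERNS["BioProject"].fullmatch(bioproject):
        problems.append(f"BioProject looks malformed: {bioproject!r}")
    if not ACCESSION_PATTERNS["BioSample"].fullmatch(biosample):
        problems.append(f"BioSample looks malformed: {biosample!r}")
    if not sra_runs:
        problems.append("At least one SRA run accession is needed")
    for run in sra_runs:
        if not ACCESSION_PATTERNS["SRA run"].fullmatch(run):
            problems.append(f"SRA run looks malformed: {run!r}")
    return problems


if __name__ == "__main__":
    # Made-up accessions, used only to exercise the check.
    for problem in check_accessions("PRJNA000000", "SAMN00000000", ["SRR000000"]):
        print(problem)
```

A format check like this only catches typing mistakes; it does not confirm that the accessions exist or belong to your submission.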

Files

Prepare and upload your data files.

If there is no annotation, you can upload a FASTA file.
If there is annotation, you will need to create an ASN.1 or .sqn file. The submission tool will automatically scan ASN.1 or .sqn files after your upload and prepopulate any provided fields with your data.
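
For the FASTA case, a quick local sanity check can catch obvious formatting problems before upload. This is a rough sketch under stated assumptions, not the validation the submission tool performs: it assumes identifiers should be unique and that sequences should contain only IUPAC nucleotide codes. The helper names `read_fasta` and `check_fasta` are hypothetical.

```python
import io

# Assumption: sequences should contain only IUPAC nucleotide codes.
ALLOWED = set("ACGTURYSWKMBDHVN")


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA-formatted text stream."""
    seq_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            fields = line[1:].split()
            # The identifier is the first whitespace-delimited token of the header.
            seq_id = fields[0] if fields else ""
            chunks = []
        elif seq_id is not None:
            # Sequence lines that appear before any header are ignored.
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def check_fasta(handle):
    """Return a list of problems found in the FASTA stream; empty means none found."""
    problems = []
    seen = set()
    for seq_id, seq in read_fasta(handle):
        if not seq_id:
            problems.append("record with an empty identifier")
        if seq_id in seen:
            problems.append(f"duplicate identifier: {seq_id}")
        seen.add(seq_id)
        if not seq:
            problems.append(f"{seq_id}: empty sequence")
        bad = set(seq.upper()) - ALLOWED
        if bad:
            problems.append(f"{seq_id}: unexpected characters {sorted(bad)}")
    return problems


if __name__ == "__main__":
    # A tiny in-memory example; in practice, pass an open file instead.
    example = io.StringIO(">contig1 example\nACGTNN\n>contig2\nACGU\n")
    print(check_fasta(example))
```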

Learn more about data files.

Data

Review or provide the following information. If you have included the required information in the ASN.1 file, check that the auto-populated data is correct and edit it as necessary.

You will also need to indicate the release date and whether your submission is an update.

Project/Sample

Review or provide a BioProject and BioSample that have already been registered for an SRA submission.

The BioProject contains the description of the research effort and relevant grant(s), and has links to the public data. A transcriptome must belong to a BioProject, and transcriptomes sequenced as part of the same research effort can belong to a single BioProject. Use the same BioProject for the sequence reads and the transcriptome assembly made from those reads; do not create duplicate BioProjects.
The BioSample contains the source information of the sample sequenced. Use the same BioSample for the sequence reads and transcriptome assembly made from those reads; do not create duplicate BioSamples.

Primary data

Review or provide SRA run accessions (SRRXXXXXX) for the sequence reads used to create your assembly. The SRA run accessions were provided when you submitted to the Sequence Read Archive (SRA).

Assembly metadata

Review or provide the following information to submit as metadata:

Assembly method: name of the assembly algorithm(s)
Version or date program was run
Assembly name (optional)
Coverage (optional)
Description of assembly method: brief description of the assembly process
Sequencing technology or technologies
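
One way to collect this metadata before entering it in the form is a small record type. The sketch below is illustrative only: the class name, field names, and the choice to treat every field not marked "(optional)" above as required are assumptions, not part of the submission tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AssemblyMetadata:
    """Hypothetical container mirroring the metadata fields listed above."""
    assembly_method: str                 # name of the assembly algorithm(s)
    version_or_date: str                 # version, or date the program was run
    method_description: str              # brief description of the assembly process
    sequencing_technologies: List[str]   # one or more technologies
    assembly_name: Optional[str] = None  # optional
    coverage: Optional[str] = None       # optional

    def missing_fields(self):
        """Return the names of assumed-required fields that are empty."""
        missing = []
        if not self.assembly_method.strip():
            missing.append("assembly_method")
        if not self.version_or_date.strip():
            missing.append("version_or_date")
        if not self.method_description.strip():
            missing.append("method_description")
        if not any(t.strip() for t in self.sequencing_technologies):
            missing.append("sequencing_technologies")
        return missing


if __name__ == "__main__":
    # Placeholder values, used only to show the check.
    record = AssemblyMetadata(
        assembly_method="ExampleAssembler",
        version_or_date="1.0",
        method_description="",
        sequencing_technologies=["ExamplePlatform"],
    )
    print(record.missing_fields())
```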

Annotation

Annotation is optional.

If you plan to submit a transcriptome with annotation, the annotation must reflect the focus of the study and must be biologically valid. If coding regions are provided, the product names should follow the International Protein Nomenclature Guidelines.

Submit your sequence data on desktop. The desktop view allows you to easily:

Enter your information
Enter or upload metadata
Upload large source files
Review your submission

TSA FAQ

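What is the difference between a Sequence Read Archive (SRA) and Transcriptome Shotgun Assembly (TSA) submission?
SRA archives the raw, unassembled reads that act as the basis for generating the assembled transcriptome. TSA stores the assembled transcriptome.

What should I do first when submitting data via the Transcriptome Shotgun Assembly (TSA) tool?
Submit the unassembled reads to the Sequence Read Archive (SRA). This is the required first step of a TSA submission.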

GenBank

GenBank is the world's largest nucleotide archive containing sequences from all branches of life. The archive is a foundation for medical and biological discovery.

Submit assembled SARS-CoV-2, Influenza, Norovirus, Dengue virus, rRNA, rRNA-ITS, metazoan COX1, and eukaryotic nuclear mRNA sequences.
Submit genomic DNA, organelle, ncRNA, plasmids, other viruses, phages, other mRNA, and synthetic constructs.
Submit assembled prokaryotic and eukaryotic genomes.

Sequence Read Archive (SRA)

SRA is the largest publicly available repository of high-throughput sequencing data. The archive accepts data from all branches of life as well as metagenomic and environmental surveys.

Submit unassembled, high-throughput sequencing reads.

SARS-CoV-2 submission instructions

Other Tools

TSA

Submit computationally assembled, transcribed RNA sequences after submitting unassembled reads to SRA.

GEO

Submit RNA-seq, ChIP-seq, and other types of gene expression and epigenomics datasets.

BioProject & BioSample

Choose a tool above if submitting sequence data.

Medical Genetics & Variation Tools

Submit clinical data, small and large human genomic variants, and genotype and phenotype data.
