File(s) | Type | Description |
---|---|---|
merged_fasta_transdecoder_pep.fasta (4.47 GB) | FASTA | Open reading frames of the final assembly, predicted by TransDecoder. |
This is a FASTA file of protein identifications (scaffold-derived metaproteomic proteins). Samples were taken in June 2019 during R/V Atlantic Explorer cruise AE1913 in the subtropical North Atlantic, beginning at the Bermuda Atlantic Time-series Study (BATS) station in the Sargasso Sea and ending in coastal Northeast US shelf waters.
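Because the file is several gigabytes, it is usually more practical to stream it record by record than to load it whole. The following is a minimal sketch of such a reader, using only the Python standard library. The function name `iter_fasta` and the demo records are hypothetical and are not taken from the dataset; the actual header layout in the file should be checked before relying on any parsing of the header text.

```python
import io
from typing import Iterator, TextIO, Tuple


def iter_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs one record at a time.

    Only the current record is held in memory, so this works for
    multi-gigabyte files.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the last record in the file.
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-record input standing in for the real file.
demo = io.StringIO(">orf_1 example description\nMKT\nLLV\n>orf_2\nMAAG\n")
records = list(iter_fasta(demo))
```

To read the real file, the same function could be called on `open("merged_fasta_transdecoder_pep.fasta")` in place of the in-memory demo handle.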
Related data table and dataset descriptions:
Total spectral counts refer to the total number of spectra with peptide-to-spectrum matches (PSMs) that match each entry in the FASTA sequence database. This approach allows each peptide to map to multiple closely related sequences. In contrast, with exclusive spectral counts each peptide is allowed to map to only one sequence in the FASTA database, and when a peptide is found in multiple database sequences, the sequence with the most peptides mapping to it is selected (parsimony). Each approach has pros and cons: total spectral counts will double count peptides when two similar proteins are compared, whereas exclusive spectral counts will underrepresent less abundant proteins with shared peptides, favoring the homolog with the most shared peptides. Considering protein groups with shared peptides or focusing on peptide-level analyses are alternative approaches that could be constructed from these results.
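The difference between the two counting schemes can be illustrated with a toy example. This is a simplified sketch of the idea described above, not the algorithm used by the software that produced these datasets: it assumes parsimony means "assign each peptide to the protein with the most distinct peptides mapping to it", and it breaks ties alphabetically, which is an arbitrary choice made here. All peptide and protein names are hypothetical.

```python
from collections import defaultdict


def spectral_counts(psms, peptide_to_proteins):
    """Return (total, exclusive) spectral counts per protein.

    psms: dict mapping peptide -> number of matched spectra
    peptide_to_proteins: dict mapping peptide -> list of proteins containing it
    """
    # Number of distinct observed peptides that map to each protein.
    peptides_per_protein = defaultdict(int)
    for peptide in psms:
        for protein in peptide_to_proteins[peptide]:
            peptides_per_protein[protein] += 1

    total = defaultdict(int)
    exclusive = defaultdict(int)
    for peptide, n_spectra in psms.items():
        proteins = peptide_to_proteins[peptide]
        # Total: every protein containing the peptide receives its spectra.
        for protein in proteins:
            total[protein] += n_spectra
        # Exclusive: only the protein with the most mapped peptides
        # receives them (ties broken alphabetically, an assumption).
        winner = min(proteins, key=lambda p: (-peptides_per_protein[p], p))
        exclusive[winner] += n_spectra
    return dict(total), dict(exclusive)


# Hypothetical data: PEPA is shared by two similar proteins.
psms = {"PEPA": 5, "PEPB": 3, "PEPC": 2}
peptide_to_proteins = {
    "PEPA": ["protX", "protY"],
    "PEPB": ["protX"],
    "PEPC": ["protX"],
}
total, exclusive = spectral_counts(psms, peptide_to_proteins)
```

In this example the shared peptide PEPA is counted for both proteins under total counting (protX: 10, protY: 5), while under exclusive counting all of its spectra go to protX, leaving protY with no counts at all. This shows both the double counting and the underrepresentation described above.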
See "Related Datasets" section for:
* "AE1913 Protein Spectral Counts" which includes the exclusive and total spectral counts associated with proteins.
* "AE1913 Peptide Spectral Counts" which includes the individual peptides associated with these proteins (includes total spectral counts for each peptide).
CTD and other data from the same cruise are listed on deployment page AE1913: https://www.bco-dmo.org/deployment/916412
These data will become part of the Ocean Protein Portal (https://proteinportal.whoi.edu/; Saito et al., 2020).
The assembly, annotations, metatranscriptomic assembly products, the same exclusive protein spectral counts, and other useful information associated with this multi-omic analysis were published as a package at Zenodo (doi: 10.5281/zenodo.8287779).
Saito, M. A., Cohen, N. (2024) Protein identification FASTA file (scaffold-derived metaproteomic proteins) from samples taken during R/V Atlantic Explorer cruise AE1913 from the Sargasso Sea to Northeast US shelf waters in June of 2019. Biological and Chemical Oceanography Data Management Office (BCO-DMO). (Version 1) Version Date 2024-08-01 [if applicable, indicate subset used]. doi:10.26008/1912/bco-dmo.934727.1 [access date]
Terms of Use
This dataset is licensed under Creative Commons Attribution 4.0.
If you wish to use this dataset, it is highly recommended that you contact the original principal investigators (PI). Should the relevant PI be unavailable, please contact BCO-DMO (info@bco-dmo.org) for additional guidance. For general guidance please see the BCO-DMO Terms of Use document.