File(s) | Type | Description |
---|---|---|
longterm_microbe_sample_info.csv (3.45 KB) | Comma Separated Values (.csv) | Primary data file for dataset ID 855750 |
Ashworth_data_and_analysis.zip (18.51 KB) | ZIP Archive (ZIP) | README for the Filho et al. (2021) data and data analysis package (full text follows this table) |

README for the Filho et al. (2021) data and data analysis package

The code and data files described below are packaged within Ashworth_data_and_analysis.zip. References to figures and tables refer to the results paper "Structure and Long-Term Stability of the Microbiome in Diverse Diatom Cultures," Filho et al. (2021).

This data archival package contains 11 files:

1. This readme.txt file
2. Ashworth.mothurcode.txt, which contains the mothur code used to analyze the 16S sequencing data
3. Ashworth.rcode.txt, which contains the R code used for some of the statistical analysis of the mothur output
4-11. Comma-separated values spreadsheets containing the raw data used in the R analyses

Ashworth.mothurcode.txt

To execute this code, first download the fastq sequence files from the NCBI SRA archive, BioProject PRJNA706454. Run mothur from the directory containing these files, then run each line in order. You will need to adjust the paths to the Silva 16S databases for your own system; instructions on where to find these databases are on the mothur wiki page.

NOTE: a few lines of R code are included as comments in the mothur code. They require a package called SRS, which executes the ranked subsampling algorithm described in the manuscript text. The comments explain how to transfer your mothur data to R, execute the SRS code, and transfer the output back into mothur.

Throughout the mothur code, commented lines show output relevant to our data analysis. These correspond to results reported in the manuscript and can serve as guideposts if you are trying to replicate our results.

Ashworth.rcode.txt

To execute this code, set your R working directory to the location of the .csv files contained in this data archive. You should then be able to run all of the code at once, replicating our statistical analyses and re-creating our figures. Key results are included as commented lines.

NOTE: at the end of this file we have included, as commented lines, the results of our online blastn analyses, with full details on the best hits for our unidentified bacteria.

Ashworth_Culture1.csv

This file shows the relative abundances of the 10 most common OTUs in Culture 1 at each of the 4 sampled time points. The top row indicates the elapsed time since cultivation for each sample. This file is one of the inputs used to create the Muller plot in Figure 2.

Ashworth.Muller.csv

This file is the other input necessary for creating Figure 2. It only records that none of the OTUs is a lineal descendant of any other.

axes.csv

This is the main data file containing the mothur output regarding diversity, culture identity, and ordination results. Columns are as follows:

Column | Description |
---|---|
Culture | Unique diatom cultures as described in Table 1 |
Group | Code signifying which fastq files correspond to each sample |
Species | Diatom species |
Site | Specific sampling location where the culture was collected |
Locale | More coarse-grained region where the culture was collected |
Class | Diatom class |
Order | Diatom order |
Time | Time between culture isolation and DNA extraction |
MostAbundOTU | OTU number that was most abundant in each sample |
MostAbundTax | Taxonomy of the most abundant OTU |
PropMostAbund | Relative abundance of the most abundant OTU |
PropUbiquitous | Proportion of each sample made up of the 32 ubiquitous OTUs |
NumberUbiquitous | Number of the 32 ubiquitous OTUs detected in the sample |
axis1, axis2, axis3 | Coordinates for each sample from the NMDS jabund analysis |
chao, chao_lci, chao_hci | Chao index with lower and upper confidence bounds |
coverage | Estimated sequencing coverage of the sample |
sobs | Number of OTUs in the sample |
shannon, shannon_lci, shannon_hci | Shannon diversity index with lower and upper confidence bounds |
invsimpson, invsimpson_lci, invsimpson_hci | Inverse Simpson diversity index with lower and upper confidence bounds |

CorrAxes.csv

These coordinates describe the vectors of the 10 most abundant OTUs with a significant impact on the position of samples in the NMDS plot. The "Cultures" column indicates how many different cultures each OTU was detected in.

SharedOTUs.csv, SharedOTUs.astrosyne.csv, SharedOTUs.gabgab.csv

These files show the number of OTUs shared between pairs of samples. They cover, respectively, all samples, only the samples from Astrosyne radiata cultures, and only the samples from cultures originally collected at Gab Gab Beach in Guam.

ubiquitous.csv

This file shows the relative abundances of the 32 ubiquitous OTUs in each of the 15 samples. It was used to create the hierarchical clustering plot in Figure 3.
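As a quick sanity check before reusing the diversity columns of axes.csv, one might confirm that each estimate falls inside its own confidence bounds. The sketch below is not part of the archived package, which uses mothur and R. It assumes the CSV header uses exactly the column names listed in the README and that every value parses as a number. The function names `load_rows` and `interval_problems` are hypothetical, and the demo rows are synthetic placeholders, not values from the dataset.

```python
import csv
import io

# Index columns named in the axes.csv description; each is assumed to have
# companion columns ending in _lci and _hci.
INDEXES = ("chao", "shannon", "invsimpson")


def load_rows(handle):
    """Parse a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def interval_problems(rows):
    """Return (Group, index) pairs whose estimate lies outside [lci, hci]."""
    problems = []
    for row in rows:
        for name in INDEXES:
            low = float(row[name + "_lci"])
            value = float(row[name])
            high = float(row[name + "_hci"])
            if not low <= value <= high:
                problems.append((row["Group"], name))
    return problems


# Synthetic placeholder rows (NOT values from the dataset), holding only the
# columns this check needs. The second row is deliberately inconsistent.
SAMPLE = """Group,chao,chao_lci,chao_hci,shannon,shannon_lci,shannon_hci,invsimpson,invsimpson_lci,invsimpson_hci
demoA,50,45,60,2.0,1.9,2.1,5.0,4.8,5.2
demoB,40,42,55,1.5,1.4,1.6,3.0,2.9,3.1
"""

rows = load_rows(io.StringIO(SAMPLE))
print(interval_problems(rows))  # → [('demoB', 'chao')]
```

To run the same check on the real file, replace the in-memory sample with `with open("axes.csv", newline="") as f: rows = load_rows(f)`. An empty or non-numeric cell would raise `ValueError`, which is intentional here: it flags a header or formatting mismatch rather than silently skipping it.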
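The SharedOTUs files report, for each pair of samples, how many OTUs the two samples have in common. A minimal sketch of that kind of pairwise count follows. It assumes "shared" means the OTU has a non-zero abundance in both samples, which may differ from how the archived mothur code derives its numbers. The function name `shared_otu_counts` is hypothetical, and the demo counts are synthetic placeholders, not data from the package.

```python
from itertools import combinations


def shared_otu_counts(abundances):
    """Count OTUs present (abundance > 0) in both samples, for every sample pair."""
    present = {
        sample: {otu for otu, count in otus.items() if count > 0}
        for sample, otus in abundances.items()
    }
    return {
        (a, b): len(present[a] & present[b])
        for a, b in combinations(sorted(present), 2)
    }


# Synthetic placeholder counts (NOT from the dataset): sample -> {OTU label: reads}.
demo = {
    "s1": {"Otu1": 10, "Otu2": 0, "Otu3": 4},
    "s2": {"Otu1": 3, "Otu2": 7, "Otu3": 1},
    "s3": {"Otu2": 5},
}

for pair, n in shared_otu_counts(demo).items():
    print(pair, n)
# ('s1', 's2') 2
# ('s1', 's3') 0
# ('s2', 's3') 1
```

Restricting the input dictionary to a subset of samples, for example only those from one species or one collection site, gives counts analogous to the subset files.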