QIIME 2 Enables Comprehensive End‐to‐End Analysis of Diverse Microbiome Data and Comparative Studies with Publicly Available Data

Abstract QIIME 2 is a completely re‐engineered microbiome bioinformatics platform based on the popular QIIME platform, which it has replaced. QIIME 2 facilitates comprehensive and fully reproducible microbiome data science, improving accessibility to diverse users by adding multiple user interfaces. QIIME 2 can be combined with Qiita, an open‐source web‐based platform, to re‐use available data for meta‐analysis. The following basic protocol describes how to install QIIME 2 on a single computer and analyze microbiome sequence data, from processing of raw DNA sequence reads through generating publishable interactive figures. These interactive figures allow readers of a study to interact with data with the same ease as its authors, advancing microbiome science transparency and reproducibility. We also show how plug‐ins developed by the community to add analysis capabilities can be installed and used with QIIME 2, enhancing various aspects of microbiome analyses—e.g., improving taxonomic classification accuracy. Finally, we illustrate how users can perform meta‐analyses combining different datasets using readily available public data through Qiita. In this tutorial, we analyze a subset of the Early Childhood Antibiotics and the Microbiome (ECAM) study, which tracked the microbiome composition and development of 43 infants in the United States from birth to 2 years of age, identifying microbiome associations with antibiotic exposure, delivery mode, and diet. For more information about QIIME 2, see https://qiime2.org. To troubleshoot or ask questions about QIIME 2 and microbiome analysis, join the active community at https://forum.qiime2.org. © 2020 The Authors.
Basic Protocol: Using QIIME 2 with microbiome data
Support Protocol: Further microbiome analyses


INTRODUCTION
This tutorial illustrates the use of QIIME 2 (Bolyen et al., 2019) for processing, analyzing, and visualizing microbiome data. Here we use, as an example, a high-throughput 16S rRNA gene sequencing study, starting with raw sequences and producing publication-ready analyses and figures (see Basic Protocol). QIIME 2 can also process other types of microbiome data, including amplicons of other markers such as 18S rRNA, internal transcribed spacers (ITS), and cytochrome oxidase I (COI), shotgun metagenomics, and untargeted metabolomics. We will also show how to combine results from an individual study with data from other studies using the Qiita public database framework, which can be used to confirm relationships between microbiome and phenotype variables in a new cohort, or to generate hypotheses for future testing.
Typical QIIME 2 analyses can vary in many ways, depending on your experimental and data analysis goals and on how you collected the data. In this tutorial, we use the QIIME 2 command-line interface, and focus on processing and analyzing a subset of samples from the Early Childhood Antibiotics and the Microbiome (ECAM) study (Bokulich, Chung, et al., 2016). We will start with raw sequence files and use a single analysis pipeline for clarity, but note where alternative methods are possible and why you might want to use them.
For users familiar with Python programming, an application programmer interface (API) version of this protocol is also available at https: . No additional software is needed for using the API. Jupyter notebooks for both of these protocols are also available at https://github.com/qiime2/paper2/tree/master/notebooks. Finally, an enhanced interactive live version of the CLI protocol is also available at https://curr-protoc-bioinformatics.qiime2.org with all intermediate files precomputed. While we strongly encourage users to install QIIME 2 on their own computers and follow along with this tutorial, the enhanced live version provides an alternative for when time and computational resources are limited. Following along with the live version of this protocol enables users to skip any step and instead download the pre-processed output required for a subsequent step. Additionally, the live version provides simple "copy to clipboard" buttons for each code block which, unlike copying from a PDF file, retain the original formatting of the code, making it easy to paste into other environments. The enhanced live protocol will also be updated regularly with every new release of QIIME 2, unlike the published version, which will remain static at the 2019.10 version.

Acquire the data from the ECAM study
In this tutorial, we will be using QIIME 2 to perform cross-sectional as well as longitudinal analyses of human infant fecal microbiome samples. The samples we will be analyzing are a subset of the ECAM study, which consists of monthly fecal samples collected from children at birth up to 24 months of life, as well as corresponding fecal samples collected from the mothers throughout the same period. The original sequence files from this study are of the V4 region of the 16S rRNA gene that were sequenced across five separate runs (2 × 150 bp) on an Illumina MiSeq machine. To simplify and reduce the computational time required for this tutorial, we have selected the forward reads of a subset of these samples for processing. To follow along with this protocol, create a new directory, then download the raw sequences (∼700 MB) and the corresponding sample metadata file into it:

mkdir qiime2-ecam-tutorial
cd qiime2-ecam-tutorial
wget -O 81253.zip "https://qiita.ucsd.edu/public_artifact_download/?artifact_id=81253"
unzip 81253.zip
mv mapping_files/81253_mapping_file.txt metadata.tsv

Ignore the warning errors during the unzipping step; this is expected behavior. You can now delete the original zip file 81253.zip to save space.

Explore sample metadata files
In the previous step, in addition to downloading sequence data, we downloaded a set of researcher-generated sample metadata. In the context of a microbiome study, sample metadata are any data that describe characteristics of the samples that are being studied, the site they were collected from, and/or how they were collected and processed. In this example, the ECAM study metadata include characteristics like age at the time of collection, birth mode and diet of the child, the type of DNA sequencing, and other information. This is all information that is generally compiled at the time of sample collection, and thus is something the researcher should be working on prior to a QIIME 2 analysis. Suggested standards for the type of study metadata to collect, and how to represent the values, are discussed in detail in MIMARKS and MIxS (Yilmaz et al., 2011). In this article, we also include a Support Protocol on metadata preparation to help users generate quality metadata. In QIIME 2, metadata are most commonly stored as a TSV (i.e., tab-separated values) file. These files typically have a .tsv or .txt file extension.

Current Protocols in Bioinformatics

TSV files are text files used to store data tables, and the format can be read, edited, and written by many types of software, including spreadsheets and databases. Thus, it is usually straightforward to manipulate QIIME 2 sample metadata using the software of your choosing. You can use a spreadsheet program of your choice, such as Google Sheets, to edit and export your metadata files, but you must be extremely cautious about automatic, and often silent, reformatting of values by these applications. For example, the use of programs like Excel can lead to unwanted reformatting of values, insertion of invisible spaces, or sorting of a table in ways that scramble the connection between sample identifiers and the data. These problems are very common and can lead to incorrect results, including missing statistically significant patterns. See the "Metadata preparation" section in the Support Protocol below for details regarding best practices for creating and maintaining metadata files.
Detailed formatting requirements for QIIME 2 metadata files can be found at https://docs.qiime2.org/2019.10/tutorials/metadata/. Metadata files stored in Google Sheets can be validated using Keemei (Rideout et al., 2016), an open-source Google Sheets plugin available at https://keemei.qiime2.org. Once Keemei is installed, select Add-ons > Keemei > Validate QIIME 2 metadata file in Google Sheets to determine whether the metadata file meets the required formatting of QIIME 2.
Open the metadata.tsv file with your software of choice and explore the content. Take note of the column names, as we will be referring to these throughout the protocol. Cual-ID may be useful for creating sample identifiers, and the Cual-ID paper (Chase, Bolyen, Rideout, & Caporaso, 2015) provides some recommendations on best practices for creating sample identifiers for data management.

Importing DNA sequence data into QIIME 2 and creating a visual summary
The next step is to import our DNA sequence data (in this case, from the 16S rRNA gene) into QIIME 2. All data used and generated by QIIME 2, with the exception of metadata, exist as QIIME 2 artifacts and use the .qza file extension. Artifacts are zip files containing data (in the usual formats, such as FASTQ) and QIIME 2-specific metadata describing the various characteristics of the data, such as their semantic type, data file format, relevant citations for analysis steps that were performed up to this point, and the QIIME 2 steps that were taken to generate the data (i.e., the data provenance). See the Appendix (Glossary) at the end of this article for additional information. QIIME 2 allows you to import and export data at many different steps, so that you can export them to other software or try out alternative methods for particular steps.

When importing data into QIIME 2, you need to provide detail on what the data are, including the file format and the semantic type. Currently, the most common type of raw data from high-throughput amplicon sequencing is in FASTQ format. These files may contain single-end or paired-end DNA sequence reads, and will be in either multiplexed or demultiplexed format. Multiplexed files typically come as two (or three in the case of paired-end runs) files consisting of your sequences (forward and/or reverse, often but not always referred to as R1 and R2 reads, respectively) and a separate barcode file (often but not always referred to as the I1 reads). In demultiplexed format, you will have one (or two in the case of paired-end data) sequence files per sample, as the sequences have already been assigned to their designated sample IDs based on the barcode files. For the demultiplexed format, the sample name will typically be a part of the file name. In this protocol, our sequences are in single-end demultiplexed FASTQ format produced by Illumina's Casava software.
As our data are split across multiple files, to import we will need to provide QIIME 2 with the location of our files and assign them sample IDs; this is done using the manifest file. A manifest file is a user-created tab-separated values file with two columns: the first column, sample-id, holds the name you assign to each of your samples, and the second column, absolute-filepath, provides the absolute file path leading to your raw sequence files. For example: Alternatively, your sample metadata file can also be made to double as a manifest file by simply adding the absolute-filepath column to it; in this protocol we demonstrate the creation and use of a separate manifest file. You can create a manifest file in a variety of ways using your favorite text editor application. Here we use a simple bash script to create ours.
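The example manifest that originally followed this paragraph appears to have been lost in formatting. A hypothetical two-sample manifest (the sample IDs and file paths below are illustrative only, following the "12802." prefix used in the script that follows) would look like:

```tsv
sample-id	absolute-filepath
12802.sample-1	/home/user/qiime2-ecam-tutorial/per_sample_FASTQ/81253/sample-1.fastq.gz
12802.sample-2	/home/user/qiime2-ecam-tutorial/per_sample_FASTQ/81253/sample-2.fastq.gz
```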
1. Create the manifest file with the required column headers:

echo -e "sample-id\tabsolute-filepath" > manifest.tsv

2. Use a loop to insert the sample names into the sample-id column, and add the full paths to the sequence files in the absolute-filepath column:

for f in per_sample_FASTQ/81253/*.gz; do
    n=$(basename "$f")
    echo -e "12802.${n%.fastq.gz}\t$PWD/$f"
done >> manifest.tsv

3. Use the manifest file to import the sequences into QIIME 2:
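The command for step 3, and the summary step referenced below (presumably step 4, given that step 5 follows), do not appear in the text above. A sketch of what they would look like, assuming single-end reads with Phred 33-encoded quality scores (the artifact names se-demux.qza and se-demux.qzv match those used later in this protocol):

```shell
# Import the demultiplexed single-end reads described by the manifest
# (assumes Phred 33 quality encoding, as produced by Casava 1.8+):
qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path manifest.tsv \
  --output-path se-demux.qza \
  --input-format SingleEndFastqManifestPhred33V2

# Summarize the imported sequences to produce the se-demux.qzv
# Visualization explored in the next step:
qiime demux summarize \
  --i-data se-demux.qza \
  --o-visualization se-demux.qzv
```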

Alternative Pipeline
Your data may not be demultiplexed prior to importing to QIIME 2. Instructions on how to import multiplexed FASTQ files, as well as a variety of other data types, can be found online at https://docs.qiime2.org/2019.10/tutorials/importing/. With multiplexed data, you will also need to demultiplex your sequences prior to the next step. Demultiplexing in QIIME 2 can be performed using either the q2-demux plugin (https://docs.qiime2.org/2019.10/plugins/available/demux/), which is optimized for data produced using the EMP protocol (Caporaso et al., 2012), or the q2-cutadapt plugin (https://docs.qiime2.org/2019.10/plugins/available/cutadapt/), which additionally supports demultiplexing of dual-index barcodes using cutadapt (Martin, 2011).

Visualizations often contain a statistical results table, an interactive figure, one or more static images, or a combination of these. Because they do not need to be used for downstream data analysis in QIIME 2, there is a lot of flexibility in what they can contain. All QIIME 2 Results, including Visualizations and Artifacts, can be viewed by running qiime tools view, or alternatively by loading them with QIIME 2 View (https://view.qiime2.org/). QIIME 2 View does not require QIIME 2 to be installed, making it useful for sharing data with collaborators who do not have QIIME 2 installed. Try visualizing se-demux.qzv using each of these methods, then use the method you prefer for the rest of this tutorial:

qiime tools view se-demux.qzv

5. Explore the Visualization results: In the first Overview tab, we see a summary of our sequence counts followed by a per-sample breakdown. If you click on the Interactive Quality plot tab (Fig. 1), you can interact with the sequence quality plot, which shows a boxplot of the quality score distribution for each position in your input sequences.
Because it can take a while to compute these distributions from all of your sequence data (often tens of millions of sequences), a subset of your reads is selected randomly (sampled without replacement), and the quality scores of only those sequences are used to generate the box plots. By default, 10,000 sequences are subsampled, but you can control that number with --p-n on the demux summarize command. Keep in mind that because of this random subsampling, every time you run demux summarize on the same sequence data, you will obtain slightly different plots. When you hover the mouse over a boxplot for a given base position, the boxplot's data are shown in a table below the interactive plot as a parametric seven-number summary. This is a standard summary of a dataset composed of the 2nd, 9th, 25th, 50th, 75th, 91st, and 98th percentiles, and can be used as a simple check for assumptions of normality. These values describe the distribution of quality scores at that position in your subsampled sequences. You can click and drag on the plot to zoom in, or double click to zoom back out to full size. These interactive plots can be used to determine if there is a drop in quality at some point in your sequences, which can be useful in choosing truncation and trimming parameters in the next section.

Sequence quality control and feature table construction
Traditionally, quality control of sequences was performed by trimming and filtering sequences based on their quality scores (Bokulich et al., 2013), followed by clustering them into operational taxonomic units (OTUs) based on a fixed dissimilarity threshold, typically 97% (Rideout et al., 2014). Today, there are better methods for quality control that correct amplicon sequence errors and produce high-resolution amplicon sequence variants that, unlike OTUs, resolve differences of as little as one nucleotide. These "denoisers" have many advantages over traditional clustering-based methods, as discussed in Callahan, McMurdie, & Holmes (2017). QIIME 2 currently offers denoising via the DADA2 (q2-dada2) and Deblur (q2-deblur) plugins. The exact sequence variants inferred by DADA2 are referred to as amplicon sequence variants (ASVs), while those created by Deblur are called sub-OTUs (sOTUs). In this protocol, we will refer to the products of these denoisers, regardless of their method of origin, as features. The major differences in the algorithms and the motivation for these and other denoising methods are reviewed in Nearing, Douglas, Comeau, & Langille (2018) and Caruso, Song, Asquith, & Karstens (2019). According to these independent evaluations, denoising methods were consistently more successful than clustering methods in identifying true community composition, while only small differences were reported among the denoising methods. We therefore view method selection here as a personal choice that research teams should make. Some practical differences may drive selection of these methods. For instance, DADA2 includes joining of paired-end reads in its processing workflow, and is therefore simpler to use when paired-end read joining is desired, whereas Deblur users must join reads independently prior to denoising using other plugins, such as q2-vsearch's join-pairs method (Rognes, Flouri, Nichols, Quince, & Mahé, 2016).
In this tutorial, we will denoise our sequences with q2-deblur, which uses a precalculated static sequence error profile to associate erroneous sequence reads with the true biological sequence from which they are derived. Unlike DADA2, which creates sequence error profiles on a per-analysis basis, this allows Deblur to be simultaneously applied across different datasets, reflecting its design motivation to perform meta-analyses. Additionally, using a pre-defined error profile generally results in shorter runtimes.
Deblur is applied in two steps:

1. Apply an initial quality-filtering process based on quality scores. This method is an implementation of the quality-filtering approach described by Bokulich et al. (2013):

qiime quality-filter q-score \
  --i-demux se-demux.qza \
  --o-filtered-sequences demux-filtered.qza \
  --o-filter-stats demux-filter-stats.qza

2. Apply the Deblur workflow using the denoise-16S action. This method requires one parameter that is used in quality filtering, --p-trim-length, which truncates the sequences at position n. The choice of this parameter is based on the subjective assessment of the quality plots obtained from the previous step. In general, we recommend setting this value to a length where the median quality score begins to drop below 30, or 20 if the overall run quality is too low. One situation where you might deviate from that recommendation is when performing a meta-analysis across multiple sequencing runs. In this type of meta-analysis, it is critical that the read lengths be the same for all of the sequencing runs being compared, to avoid introducing a study-specific bias. In the current example dataset, our quality plot shows high quality scores along the full length of our reads; therefore, it is reasonable to truncate our reads at the 150-bp position:

qiime deblur denoise-16S \
  --i-demultiplexed-seqs demux-filtered.qza \
  --p-trim-length 150 \
  --p-sample-stats \
  --p-jobs-to-start 4 \
  --o-stats deblur-stats.qza \
  --o-representative-sequences rep-seqs-deblur.qza \
  --o-table table-deblur.qza

3. Create a visualization summary of the DeblurStats artifact with the command:

qiime deblur visualize-stats \
  --i-deblur-stats deblur-stats.qza \
  --o-visualization deblur-stats.qzv

The statistics summary (Fig. 2) provides us with information about what happened to each of the samples during the deblur process. The reads-raw column gives information on the number of reads presented to the deblur algorithm.
Because deblur works by deleting erroneous reads that it detects, the final number of reads is smaller than the starting number. The three columns that follow (fraction-artifact-with-minsize, fraction-artifact, and fraction-missed-reference) summarize the data from other columns in a convenient way. They identify potential problems with the data at an early stage. Fraction-artifact-with-minsize is the fraction of sequences detected as artifactual, including those that fall below the minimum length threshold (specified by the --p-trim-length parameter). Fraction-artifact is the fraction of raw sequences that were identified as artifactual. Fraction-missed-reference is the fraction of post-deblur sequences that were not recruited by the positive reference database. The subsequent columns provide information about the number of sequences remaining after dereplication (unique-reads-derep, reads-derep), following deblurring (unique-reads-deblur, reads-deblur), the number of hits that were recruited to the negative reference database following the deblurring process (unique-reads-hit-artifact, reads-hit-artifact), chimeric sequences detected (unique-reads-chimeric and reads-chimeric), and sequences that match or miss the positive reference database (unique-reads-hit-reference, reads-hit-reference, unique-reads-missed-reference, and reads-missed-reference). The number in the reads-hit-reference column is the final number of per-sample sequences present in the table-deblur.qza QIIME 2 artifact.

NOTE!
The shorthand "artifact" in the per-sample Deblur statistics denotes artifactual sequences (i.e., those erroneously generated as byproducts of the PCR and DNA sequencing process), not a QIIME 2 artifact (i.e., a valid data product of QIIME 2).

Visualize the representative sequences by entering:
qiime feature-table tabulate-seqs \
  --i-data rep-seqs-deblur.qza \
  --o-visualization rep-seqs-deblur.qzv

This Visualization (Fig. 3) will provide statistics and a seven-number summary of sequence lengths and, more importantly, show a sequence table that maps feature IDs to sequences, with links that allow you to easily BLAST each sequence against the NCBI nt database. To BLAST a sequence against the NCBI nt database, click the sequence and then click the View report button on the resulting page. This will be useful later in the tutorial, when you want to learn more about specific features that are important in the data set. Note that automated taxonomic classification is performed at a later step, as described below; the NCBI-BLAST links provided in this Visualization are useful for assessing the taxonomic affiliation and alignment of individual features to the reference database. Results of the "top hits" from a simple BLAST search such as this are known to be poor predictors of the true taxonomic affiliations of these features, especially in cases where the closest reference sequence in the database is not very similar to the sequence that you are using as a query.

NOTE!
By default, QIIME 2 uses MD5 hashing of a feature's full sequence to assign a feature ID. These are the 32-character strings of numbers and letters you see in the Feature ID column above. Hashing in q2-deblur can be disabled by adding the --p-no-hashed-feature-ids parameter.
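As a rough illustration of this hashing scheme (a sketch using coreutils' md5sum rather than QIIME 2 itself; the sequence fragment below is made up):

```shell
# Compute the MD5 digest of a hypothetical sequence fragment;
# QIIME 2 assigns such 32-character hexadecimal digests as feature IDs.
printf 'TACGGAGGATGCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGT' \
  | md5sum | cut -d ' ' -f 1
```

The printed string has the same shape as the IDs shown in the Feature ID column.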
5. Visualize the feature table. Note that in this step we can provide our metadata file, which then adds information about sample groups into the resulting summary output. Adding the metadata is useful for checking that all groups (e.g., a given age or sex of subject) have enough samples and sequences to proceed with analysis. This check is important because variation in the number of sequences per sample, which is typically not fully controllable, often leads to samples being dropped from the analysis because too few reads were obtained from them:

qiime feature-table summarize \
  --i-table table-deblur.qza \
  --m-sample-metadata-file metadata.tsv \
  --o-visualization table-deblur.qzv

The first Overview tab gives information about how many sequences come from each sample, histograms of those distributions, and related summary statistics. The Interactive Sample Detail tab (Fig. 4) shows a bar plot of the number of samples associated with the metadata category of interest, and the feature count in each sample is shown in the table below. Note that you can choose the metadata categories and change the sampling depth by dragging the bar or typing in the value. The Feature Detail tab shows the frequency and number of observed samples associated with each feature.

Alternative Pipeline
If traditional OTU clustering methods are desired, QIIME 2 users can perform these using the q2-vsearch plugin (Rognes et al., 2016).
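As a sketch of what that might look like (the 97% identity threshold and output artifact names below are illustrative assumptions, not part of this protocol), de novo clustering of the deblurred features could be performed with q2-vsearch's cluster-features-de-novo action:

```shell
# Cluster the deblurred features into de novo OTUs at 97% identity
# (output names table-dn-97.qza and rep-seqs-dn-97.qza are arbitrary):
qiime vsearch cluster-features-de-novo \
  --i-table table-deblur.qza \
  --i-sequences rep-seqs-deblur.qza \
  --p-perc-identity 0.97 \
  --o-clustered-table table-dn-97.qza \
  --o-clustered-sequences rep-seqs-dn-97.qza
```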

Generating a phylogenetic tree
Although microbiome data can be analyzed without a phylogenetic tree, many commonly used diversity analysis methods, such as Faith's phylogenetic diversity (Faith, 1992) and UniFrac (Lozupone & Knight, 2005), require one. To use these methods, we must construct a phylogenetic tree that allows us to consider evolutionary relatedness between the DNA sequences. QIIME 2 offers several methods for reconstructing phylogenetic trees based on features found in your data. These include several variants of traditional alignment-based methods of building a de novo tree, as well as a fragment-insertion method that aligns your features against a reference tree. It should be noted that de novo trees reconstructed from short sequences are of low quality, because the sequences do not contain enough information to give the correct evolutionary relationships over large evolutionary distances, and thus should be avoided when possible (Janssen et al., 2018). For this tutorial, we will use the fragment-insertion tree-building method as described by Janssen et al. (2018) using the sepp action of the q2-fragment-insertion plugin, which has been shown to outperform traditional alignment-based methods with short 16S amplicon data. This method aligns our unknown short fragments to full-length sequences in a known reference database and then places them onto a fixed tree. Note that this plugin has only been tested and benchmarked on 16S data against the Greengenes reference database (McDonald et al., 2012), so if you are using different data types you should consider the alternative methods mentioned below.

Tip!
Building a tree using SEPP can be computationally demanding and often has longer run times than most steps in a typical microbiome analysis pipeline. The --p-threads parameter, which (similar to the --p-jobs-to-start parameter from q2-deblur) allows this action to be performed in parallel across multiple cores, significantly reduces run time. See the developers' recommendations with regard to run-time optimization at https://github.com/qiime2/q2-fragment-insertion#expected-runtimes. As a reminder, if you are following the online version of this protocol, you can skip this step, download the output artifacts, and use those in the following steps.
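The tree-building command itself does not appear in the text above. A sketch using the sepp action (the placements output name and thread count are assumptions; insertion-tree.qza is the artifact name used later in this protocol):

```shell
# Insert the representative sequences into the Greengenes reference tree
# (insertion-placements.qza is an assumed name for the placements output):
qiime fragment-insertion sepp \
  --i-representative-sequences rep-seqs-deblur.qza \
  --p-threads 2 \
  --o-tree insertion-tree.qza \
  --o-placements insertion-placements.qza
```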
Once the insertion tree is created, you must filter your feature table so that it only contains fragments that are in the insertion tree. This step is needed because SEPP might reject the insertion of some fragments, such as erroneous sequences or those that are too distantly related to the reference alignment and phylogeny. Features in your feature table without a corresponding phylogeny will cause diversity computation to fail, because branch lengths cannot be determined for sequences not in the tree.
3. Filter your feature table by running the following: This command generates two feature tables: filtered-table-deblur.qza contains only features that are also present in the tree, while removed-table.qza contains features not present in the tree. Both of these tables can be visualized as shown in step 5 of the previous section, titled "Sequence quality control and feature table construction."
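The filtering command referenced in step 3 was omitted from the text above. A sketch using q2-fragment-insertion's filter-features action (output names follow those described in the text; the input tree name matches the SEPP step):

```shell
# Keep only features whose fragments were placed in the insertion tree;
# rejected features are written to a separate table for inspection:
qiime fragment-insertion filter-features \
  --i-table table-deblur.qza \
  --i-tree insertion-tree.qza \
  --o-filtered-table filtered-table-deblur.qza \
  --o-removed-table removed-table.qza
```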

Alternative Pipeline
If a traditional de novo phylogenetic tree is desired or required, QIIME 2 offers several methods [FastTree (Price, Dehal, & Arkin, 2010), IQ-TREE (Nguyen, Schmidt, von Haeseler, & Minh, 2015), and RAxML (Stamatakis, 2014)] to reconstruct these using the q2-phylogeny plugin (https://docs.qiime2.org/2019.10/plugins/available/phylogeny/). A tree produced by any of these alignment-based methods can be used with your original feature table without the need for the filtering that SEPP requires. However, if some of your sequences are not 16S rRNA genes, the tree will be incorrect in ways that may severely affect your results.

Visualize the phylogenetic tree:
The phylogenetic tree artifact (semantic type: Phylogeny[Rooted]) produced in this step can be readily visualized using q2-empress (https://github.com/biocore/empress) or iTOL's (Letunic & Bork, 2019) interactive web-based tool by simply uploading the artifact at https://itol.embl.de/upload.cgi. The underlying tree, in Newick format, can also be easily exported for use in your application of choice (see the "Exporting QIIME 2 data" section in the Support Protocol, below).
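For example, the Newick tree can be exported with qiime tools export (a sketch; the output directory name is arbitrary):

```shell
# Export the rooted tree from the artifact; for a Phylogeny[Rooted]
# artifact the Newick file is written as exported-tree/tree.nwk:
qiime tools export \
  --input-path insertion-tree.qza \
  --output-path exported-tree
```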


Taxonomic classification
While sequences derived from denoising methods provide us with the highest possible resolution of our features given our sequencing data, it is usually desirable to know the taxonomic affiliation of the microbes from which the sequences were obtained. QIIME 2 provides several methods to predict the most likely taxonomic affiliation of our features through the q2-feature-classifier plugin. These include both alignment-based consensus methods and Naive Bayes (and other machine-learning) methods. In this tutorial, we will use a Naive Bayes classifier, which must be trained on taxonomically defined reference sequences covering the target region of interest. Some pre-trained classifiers are available through the QIIME 2 Data Resources page (https://docs.qiime2.org/2019.10/data-resources/), and some have been made available by users on the QIIME 2 Community Contributions channel (https://forum.qiime2.org/c/community-contributions). If a pre-trained classifier suited to your region of interest or reference database is not available through these resources, you can train your own by following the online tutorial (https://docs.qiime2.org/2019.10/tutorials/feature-classifier/). In the present protocol, we will train a classifier specific to our data that (optionally) also incorporates environment-specific taxonomic abundance information to improve species inference. This bespoke method has been shown to improve classification accuracy when compared to traditional Naive Bayes classifiers, which assume that all species in the reference database are equally likely to be observed in your sample (i.e., that sea-floor microbes are just as likely to be found in a stool sample as microbes usually associated with stool).
To train a classifier using this bespoke method, we need three files: (1) a set of reference reads, (2) a reference taxonomy, and (3) taxonomic weights. Taxonomic weights can be customized for specific sample types and reference data using the q2-clawback plugin (see the alternative pipeline recommendation below), or we can obtain pre-assembled taxonomic weights from the readytowear collection (https://github.com/BenKaehler/readytowear). This collection also contains the reference reads and taxonomies required. The taxonomic weights used in this tutorial have been assembled with 16S rRNA gene sequence data using the Greengenes reference database trimmed to the V4 domain (bound by the 515F/806R primer pair, as used in the ECAM study). Here, we will use the pre-calculated taxonomic weights specific to human stool data. For other sample types, make sure to pick the appropriate weights best fit for your data, and the appropriate sequence reference database; a searchable inventory of available weights is available at https://github.com/BenKaehler/readytowear/blob/master/inventory.tsv.
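The training and classification commands themselves do not appear in the text above. A sketch, assuming the reference reads, reference taxonomy, and human-stool weights have already been downloaded from the readytowear collection under the hypothetical names ref-seqs-v4.qza, ref-tax.qza, and human-stool.qza (the classifier artifact name is also an assumption; bespoke-taxonomy.qza matches the name used in the next step):

```shell
# Train a Naive Bayes classifier that incorporates the
# environment-specific taxonomic weights (human stool):
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads ref-seqs-v4.qza \
  --i-reference-taxonomy ref-tax.qza \
  --i-class-weight human-stool.qza \
  --o-classifier gg138-v4-human-stool-classifier.qza

# Classify the representative sequences with the bespoke classifier:
qiime feature-classifier classify-sklearn \
  --i-classifier gg138-v4-human-stool-classifier.qza \
  --i-reads rep-seqs-deblur.qza \
  --o-classification bespoke-taxonomy.qza
```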
4. Visualize our taxonomies by entering the following:

qiime metadata tabulate \
  --m-input-file bespoke-taxonomy.qza \
  --m-input-file rep-seqs-deblur.qza \
  --o-visualization bespoke-taxonomy.qzv

The Visualization (Fig. 5) shows the classified taxonomic name for each feature ID, with additional information on confidence level and sequences. You can reorder the table by clicking the sorting button next to each column name. Recall that the rep-seqs.qzv Visualization that we created above allows you to easily BLAST the sequence associated with each feature against the NCBI nt database. Using that Visualization and the bespoke-taxonomy.qzv Visualization created here, you can compare the taxonomic assignments of features of interest with those from BLAST's top hit. Because these methods are only estimates, it is not uncommon to find disagreements between the predicted taxonomies. The results here will generally be more accurate than those obtained from the simple BLAST search linked from the rep-seqs.qzv Visualization.

Alternative Pipeline
To assemble your own taxonomic weights for regions not available in the readytowear inventory, follow the detailed instructions at https://forum.qiime2.org/t/using-q2-clawback-to-assemble-taxonomic-weights.

Current Protocols in Bioinformatics

Filtering data
In the following sections, we will explore the microbial communities of samples from the children only and perform various statistical tests; we therefore first need to separate these samples from those of the mothers. QIIME 2 provides numerous methods to filter your data, including total-feature-frequency-based filtering, identity-based filtering, metadata-based filtering, and taxonomy-based filtering. Filtering is performed through the q2-feature-table plugin. To separate the child samples, we will use the filter-samples action to select samples based on the metadata column "mom_or_child," where a value of "C" represents a child sample:
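The logic of this metadata-based filter can be sketched in plain Python. This is a toy illustration with hypothetical sample IDs and counts; in practice the qiime feature-table filter-samples command does this for us directly on the artifacts.

```python
# Toy metadata and feature table (sample -> feature counts); all values hypothetical.
metadata = {
    "s1": {"mom_or_child": "C"},
    "s2": {"mom_or_child": "M"},
    "s3": {"mom_or_child": "C"},
}
table = {"s1": {"ASV1": 10}, "s2": {"ASV1": 7}, "s3": {"ASV2": 3}}

# Keep only samples whose metadata value is "C" (child samples).
child_table = {s: counts for s, counts in table.items()
               if metadata[s]["mom_or_child"] == "C"}
```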

Alpha rarefaction plots
One of the first steps in a typical microbiome analysis pipeline is to evaluate the sampling depth of our samples to determine whether sufficient surveying effort has been achieved. Sampling depth will naturally differ between samples, because the number of sequences generated by current sequencing instruments is neither evenly distributed among samples nor correlated with sample biomass; therefore, to avoid bias, sampling depth must be normalized prior to analysis (e.g., before the diversity estimates described below). The methods used for normalization are an active area of research and debate (McMurdie & Holmes, 2014; Weiss et al., 2017). In this section, we will explore how sampling depth impacts alpha diversity estimates (within-sample richness, discussed in more detail below) using the alpha-rarefaction action within the q2-diversity plugin. This Visualizer computes one or more alpha diversity metrics at multiple sampling depths, in steps between 1 (optionally controlled with --p-min-depth) and the value provided as --p-max-depth. At each sampling-depth step, 10 rarefied tables will be generated by default, and the diversity metrics will be computed for all samples in the tables. The number of iterations (rarefied tables computed at each sampling depth) can be controlled with --p-iterations. Average diversity values will be plotted for each sample at each even sampling depth, and samples can be grouped based on metadata categories in the resulting visualization if sample metadata are provided with the --m-metadata-file parameter:

qiime diversity alpha-rarefaction \
  --i-table child-table.qza \
  --i-phylogeny insertion-tree.qza \
  --p-max-depth 10000 \
  --m-metadata-file metadata.tsv \
  --o-visualization child-alpha-rarefaction.qzv

Load the child-alpha-rarefaction.qzv Visualization.
The resulting Visualization (Fig. 6) has two plots. The top plot is an alpha rarefaction plot, and is primarily used to determine if the within-sample diversity has been fully captured. If the lines in the plot appear to "level out" (i.e., approach a slope of zero) at some sampling depth along the x axis, this suggests that collecting additional sequences is unlikely to result in any significant changes to our samples' estimated diversity. If the lines in a plot do not level out, the full diversity of the samples may not have been captured by our sampling efforts, or this could indicate that a lot of sequencing errors remain in the data (which are being mistaken for novel diversity).
The bottom plot in this visualization is important when grouping samples by our metadata categories. It illustrates the number of samples that remain in each group when the feature table is rarefied to each sampling depth. If a given sampling depth "d" is larger than the total frequency of a sample "s" (i.e., the number of sequences that were obtained for sample "s"), it is not possible to compute the diversity metric for sample "s" at sampling depth "d." If many of the samples in a group have lower total frequencies than "d," the average diversity presented for that group at "d" in the top plot will be unreliable because it will have been computed on relatively few samples. When grouping samples by metadata, it is therefore essential to look at the bottom plot to ensure that the data presented in the top plot are reliable. Try using the drop-down menus at the top of the plots to switch between the different calculated diversity metrics and metadata categories.
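The subsampling that underlies these rarefaction curves can be sketched in a few lines. This is a toy illustration with made-up counts and a single "observed features" metric; the real computation runs over every sample and every requested metric.

```python
import random
from statistics import mean

def observed_features(counts, depth, iterations=10, seed=0):
    """Average number of unique features seen when subsampling `depth`
    sequences without replacement, repeated `iterations` times."""
    rng = random.Random(seed)
    pool = [f for f, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        return None  # this sample is too shallow for the requested depth
    return mean(len(set(rng.sample(pool, depth))) for _ in range(iterations))

# One hypothetical sample: two abundant ASVs and two rare ones.
sample = {"ASV1": 500, "ASV2": 300, "ASV3": 5, "ASV4": 1}
curve = {d: observed_features(sample, d) for d in (10, 100, 800)}
```

Shallow depths mostly catch the abundant features; the rare ASVs only appear as the curve approaches the sample's full depth, which is why a curve that has not leveled out suggests unsampled diversity.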

As mentioned earlier, a normalization method to account for unequal sampling depth across samples in microbiome data is essential to avoid the introduction of bias. One common approach to dealing with this problem is to sample a random subset of sequences without replacement for each sample at a fixed depth (also referred to as rarefying) and discard all samples whose total read counts fall below that threshold. This approach, which is not ideal because it discards a large amount of information (McMurdie & Holmes, 2014), has nonetheless been shown to be useful for many different microbial community analyses that are otherwise dominated by sample-to-sample variation in the number of sequences obtained per sample (Weiss et al., 2017). Selecting the depth to which to rarefy samples is a subjective decision motivated by the desire to maximize the rarefying threshold while minimizing loss of samples due to insufficient coverage.
Let us consider our current dataset as an example. In the rarefaction plots above, we can see that there is a natural leveling of our diversity metrics starting at 1000 sequences/sample, with limited additional increases observed beyond 3000 sequences/sample; our target minimum sampling depth should therefore fall in this range. Now let us revisit the child-table.qzv Visualization from the Filtering data step. Select the Interactive Sample Detail tab from the top left corner, and use the Metadata Category drop-down menu to select month. Hover over each bar in the plot to see the number of samples included at each month. Now, try moving the Sampling Depth bar on the right starting from the left (zero) to the right. You will see that as the sampling depth increases we begin to rapidly lose samples, as shown by the grayed areas in the bar plot. In this dataset, the time point 0 month is better represented than the subsequent months. We would therefore ideally minimize discarding samples from the other, underrepresented months to maintain sufficient statistical power in downstream analyses. Start moving the Sampling Depth bar from zero again; this time stop at the first instance where we begin to see a loss of samples at a month other than 0. Now scroll down to the bottom of the page. The samples highlighted in red are the would-be discarded samples at that chosen sampling depth. Here we see that at a depth of exactly 3400 we are able to retain all the samples from months 6, 12, and 24, while still maintaining a minimum depth that will capture the overall signature of the alpha diversity metrics as seen in our rarefaction plots.
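Rarefying itself is conceptually simple, as this minimal sketch with hypothetical counts shows; QIIME 2 performs the equivalent operation internally whenever a sampling depth is supplied.

```python
import random

def rarefy(table, depth, seed=0):
    """Subsample each sample to `depth` sequences without replacement;
    samples with fewer than `depth` sequences are discarded."""
    rng = random.Random(seed)
    out = {}
    for sample, counts in table.items():
        pool = [f for f, n in counts.items() for _ in range(n)]
        if len(pool) < depth:
            continue  # discarded: insufficient coverage at this depth
        sub = rng.sample(pool, depth)
        out[sample] = {f: sub.count(f) for f in set(sub)}
    return out

# Hypothetical two-sample table: one deep sample, one shallow sample.
table = {"deep": {"ASV1": 3000, "ASV2": 1500},
         "shallow": {"ASV1": 1200}}
rarefied = rarefy(table, 3400)  # "shallow" (1200 reads) is dropped
```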

Alternative Pipeline
Newer methods are actively being developed that circumvent the need for rarefying by taking advantage of the compositional nature of microbiome data; we will show examples of these methods in subsequent sections. However, for some commonly used analysis tasks, no such solution yet exists.

Basic data exploration and diversity analyses
In the original ECAM study, in addition to monthly sampling, some participants were sampled multiple times in any given month. The exact day on which each sample was collected is recorded in the day_of_life column and again under the month column, with the values in the latter rounded to the nearest month. This rounding allows us to easily compare samples that were collected at roughly the same month across groups; however, it does introduce artificial replicates, as multiple samples from the same participant will be recorded under the same month. To mitigate the appearance of these false replicates and ensure that samples meet assumptions of independence, we will filter our feature table prior to group tests to include only one sample per subject per month. We have manually identified the samples that would be considered false replicates in the rounding step under the column month_replicate, and will use this column to filter our table with the filter-samples action of the q2-feature-table plugin (the exact --p-where expression depends on how month_replicate is coded in your metadata), producing the child-table-norep.qza artifact used below. We are now ready to explore our microbial communities. One simple method to visualize the taxonomic composition of samples is to visualize them individually as stacked barplots. We can do this easily by providing our feature table, taxonomy assignments, and sample metadata file to the taxa plugin's barplot action.
1. Generate the taxonomic barplot by running:

qiime taxa barplot \
  --i-table child-table-norep.qza \
  --i-taxonomy bespoke-taxonomy.qza \
  --m-metadata-file metadata.tsv \
  --o-visualization child-bar-plots.qzv

This barplot (Fig. 7) shows the relative frequency of features in each sample; you can choose the taxonomic level to display and sort the samples by a sample metadata category or by taxonomic abundance in ascending or descending order. You can also highlight a specific feature in the barplot by clicking it in the legend. The snapshot above shows a barplot at the phylum level (level 2) where samples were sorted by day. Three phyla were highlighted to show that Proteobacteria (gray) dominate

at birth, but by 6 months of age the relative abundances of Bacteroidetes (green) and Firmicutes (purple) make up the majority of the community.
While barplots can be informative with regard to the composition of our microbial communities, it is difficult to use them to disentangle meaningful signals from noise.
Many microbial ecology studies use alpha diversity (within-sample richness and/or evenness) and beta diversity (between-sample dissimilarity) to reveal patterns in the microbial diversity in a set of samples. QIIME 2's diversity analyses are available through the q2-diversity plugin, which computes a range of alpha and beta diversity metrics, applies related statistical tests, and generates interactive visualizations. The diversity metrics used in any given study should be based on the overall goals of the experiment. For a list of available diversity metrics in QIIME 2 and a brief description of the motivation behind them, we recommend reviewing the following tutorial: https://forum.qiime2.org/t/alpha-and-beta-diversity-explanations-and-commands.
In this tutorial, we will utilize the pipeline action core-metrics-phylogenetic, which simultaneously rarefies a FeatureTable[Frequency] to a user-specified depth, computes several commonly used alpha and beta diversity metrics, and generates principal coordinates analysis (PCoA) plots using the Emperor visualization tool (Vázquez-Baeza, Pirrung, Gonzalez, & Knight, 2013) for each of the beta diversity metrics. For this tutorial, we will use a sampling depth of 3400 as determined in the previous step.
2. Compute alpha and beta diversity by entering the following commands, minding the --p-n-jobs option if multi-core usage is desired:

qiime diversity core-metrics-phylogenetic \
  --i-table child-table-norep.qza \
  --i-phylogeny insertion-tree.qza \
  --p-sampling-depth 3400 \
  --m-metadata-file metadata.tsv \
  --p-n-jobs 1 \
  --output-dir child-norep-core-metrics-results

By default, several commonly used alpha and beta diversity metrics are computed by this pipeline and stored within the child-norep-core-metrics-results directory. After computing the core diversity metrics, we can begin to explore the microbial composition of the samples in the context of their metadata.

Performing statistical tests on diversity and generating interactive visualizations

Alpha diversity
We will first test for associations between our categorical metadata columns and alpha diversity. Alpha diversity asks about the distribution of features within each sample, and once calculated for all samples can be used to test whether per-sample diversity differs across conditions (e.g., samples obtained at different ages). The comparison makes no assumptions about the features that are shared between samples; two samples can have the same alpha diversity and not share any features. The rarefied SampleData[AlphaDiversity] artifact produced in the above step contains univariate, continuous values and can be tested using common non-parametric statistical tests (e.g., the Kruskal-Wallis test) with the following command:

qiime diversity alpha-group-significance \
  --i-alpha-diversity child-norep-core-metrics-results/shannon_vector.qza \
  --m-metadata-file metadata.tsv \
  --o-visualization child-norep-core-metrics-results/shannon-group-significance.qzv

Load the newly created shannon-group-significance.qzv Visualization.
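Before interpreting the results, it helps to recall what the Shannon index in shannon_vector.qza measures: per sample, H = -Σ p_i log2(p_i) over the feature frequencies p_i (QIIME 2, via scikit-bio, uses log base 2 by default). A minimal sketch with hypothetical counts:

```python
import math

def shannon(counts, base=2):
    """Shannon diversity H = -sum(p_i * log(p_i)) over non-zero frequencies."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base)
                for c in counts if c > 0)

even   = shannon([25, 25, 25, 25])  # 4 equally abundant features -> H = 2 bits
uneven = shannon([97, 1, 1, 1])     # one dominant feature -> much lower H
```

Both richness (number of features) and evenness (how equally they are distributed) raise the index, which is why Shannon diversity is described as a combined richness/evenness metric.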
From the boxplots and Kruskal-Wallis test results (Fig. 8), it appears that there are no differences between the child samples in terms of Shannon diversity when mode of delivery is considered (p value = 0.63). However, exposure to antibiotics appears to be associated with higher diversity (p value = 0.026). What are the biological implications?
One important confounding factor here is that we are simultaneously analyzing our samples across all time points, and in doing so potentially losing meaningful signals at particular time points. Importantly, having more than one time point per subject also violates the assumption of the Kruskal-Wallis test that all samples are independent. More appropriate methods that take into account repeated measurements from the same subjects are demonstrated in the longitudinal data analysis section below. It is important to note that QIIME 2 is not able to detect such violations: you must always be knowledgeable about the assumptions of the statistical tests that you are applying, and whether they are applicable to your data. These types of questions are common on the QIIME 2 Forum, so if you are unsure, start by searching for your question on the forum and posting your own question if you do not find a pre-existing answer.
After filtering our table to retain only the month-24 samples and repeating the diversity analyses (producing the norep-C24-core-metrics-results folder used below), we can see that at month 24, vaginal birth appears to be associated with a higher Shannon value than cesarean birth (p value = 0.02, Fig. 9), while antibiotic exposure is no longer associated with differences in Shannon diversity (p value = 0.87).

Beta diversity
Next, we will compare the structure of the microbiome communities using beta diversity. We start by making a visual inspection of the principal coordinates (PCoA) plots that were generated in the previous step. Load the unweighted_unifrac_emperor.qzv Visualization from the norep-C24-core-metrics-results folder.
Each dot in the PCoA plot (Fig. 10) represents a sample, and users can color them according to their metadata category of interest and rotate the 3D figure to see whether there is a clear separation in beta diversity driven by these covariates. Moreover, users can customize their figures using existing drop-down menus by hiding certain samples in Visibility, changing the brightness of dots in Opacity, controlling their size in Scale, choosing different shapes for samples in Shape, modifying the color of axes and background in Axes, and creating a moving picture under the Animations tab.

Alternative Pipeline
For longitudinal studies, we have made great use of visualizing temporal variability using animated traces in Emperor. By doing this, you can follow the longitudinal dynamics sample by sample and subject by subject. To do so, you need two metadata categories: one to order the samples (Gradient category) and one to group the samples (Trajectory category). For this dataset, we can use animations_gradient as the category that orders the samples, and animations_subject as the category that groups our samples.

The values in animations_gradient represent the age in months. In this category, samples with no longitudinal data are set to 0; note that all values have to be numeric in order for the animation to be displayed. The animations_subject category includes unique identifiers for each subject. Put together, these two categories will result in animated traces on a per-individual basis.
In Emperor's user interface, go to the Animations tab, select animations_gradient under the Gradient menu, and select animations_subject under the Trajectory menu. Then click play, and you will see animated traces moving on the plot. You can adjust the speed and the radius of the trajectories. To start over, click on the back button. Using the ECAM dataset, we have generated an animation visualizing the temporal trajectories of one vaginally born and one cesarean-born baby in the 3D PCoA plot. This animation is available at https://tinyurl.com/y7sbzpnd. For more information about animated ordinations, visit Emperor's online tutorial at https://biocore.github.io/emperor/build/html/tutorials/animations.html.
When we color the samples by delivery mode and change the shape of male infants to squares, no obvious clusters are observed. There may be a general trend towards vaginal-birth children separating from cesarean-birth samples along Axis 1, which would suggest that the microbial communities of cesarean-born children are phylogenetically more similar to each other than to those of the vaginal-birth group. However, given the low sample size in the cesarean group, we are likely underpowered to detect these changes statistically. Nevertheless, we can test our hypothesis using a PERMANOVA, which tests the hypothesis that distances between samples within one group (within-group distances) differ from the distances to samples in another group (across-group distances). Other relevant tests exist in QIIME 2, such as ANOSIM, PERMDISP, and the Mantel test; the choice of test should be carefully considered with regard to the biological question at hand. See Anderson and Walsh (2013) for an overview of these tests. It is also important to note that these tests are useful when testing pre-existing hypotheses about your data, but cannot be used for testing new hypotheses that were generated by looking at PCoA results. New hypotheses must unfortunately be tested with new, independent data. Here, we perform the PERMANOVA test with the following command:

qiime diversity beta-group-significance \
  --i-distance-matrix norep-C24-core-metrics-results/unweighted_unifrac_distance_matrix.qza \
  --m-metadata-file metadata.tsv \
  --m-metadata-column delivery \
  --p-pairwise \
  --o-visualization norep-C24-core-metrics-results/uw_unifrac-delivery-significance.qzv

Load the Visualization.
The overview statistics (Fig. 11) provide the parameters used in the PERMANOVA test and the resulting test statistic and p value. The boxplots (Fig. 9) show the pairwise distances between the cesarean and vaginal birth groups. Lastly, the table (Fig. 9) summarizes the results from PERMANOVA and gives an additional q value (p value adjusted for multiple testing). The PERMANOVA test confirms our initial assessment that vaginally born microbial communities are not statistically different from cesarean-born communities in beta diversity (as represented by unweighted UniFrac distances) at month 24 (p value = 0.6). These results, however, should be interpreted cautiously given the limited sample size in this dataset. We would conclude that further experiments would be needed to confirm our findings.
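The permutation logic behind PERMANOVA can be sketched with a simplified statistic. This is a toy illustration: hypothetical 1-D coordinates stand in for a UniFrac distance matrix, and we use "mean between-group distance minus mean within-group distance" rather than the pseudo-F statistic the real test computes; only the label-shuffling machinery matches.

```python
import random
from itertools import combinations
from statistics import mean

# Hypothetical coordinates for 3 "vaginal" (v) and 3 "cesarean" (c) samples.
coords = {"v1": 0.0, "v2": 0.1, "v3": 0.2, "c1": 1.0, "c2": 1.1, "c3": 1.2}
samples = list(coords)
dist = {frozenset(p): abs(coords[p[0]] - coords[p[1]])
        for p in combinations(samples, 2)}
groups = {s: s[0] for s in samples}  # "v" = vaginal, "c" = cesarean

def stat(labels):
    """Mean between-group distance minus mean within-group distance."""
    between = [d for pair, d in dist.items()
               if len({labels[s] for s in pair}) == 2]
    within = [d for pair, d in dist.items()
              if len({labels[s] for s in pair}) == 1]
    return mean(between) - mean(within)

observed = stat(groups)
rng = random.Random(0)
shuffled = list(groups.values())
hits = 0
n_perm = 999
for _ in range(n_perm):
    rng.shuffle(shuffled)  # break any real group/distance association
    if stat(dict(zip(samples, shuffled))) >= observed:
        hits += 1
p_value = (hits + 1) / (n_perm + 1)
```

Note that with only 3 samples per group there are just 20 distinct label arrangements, so the smallest achievable p value is bounded, one concrete reason PERMANOVA results from very small groups (as in our cesarean group) warrant caution.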

Alternative Pipeline
The beta diversity analysis above was carried out on a rarefied subset of our data. An alternative method that does not require rarefying is offered through the external q2-deicode plugin (https://library.qiime2.org/plugins/deicode). DEICODE computes a form of the Aitchison distance that is robust to compositional data with high levels of sparsity (Martino et al., 2019). This plugin can be used to generate a beta diversity ordination artifact that can easily be used with the existing architecture in QIIME 2, such as visualization with q2-emperor and hypothesis testing with beta-group-significance as above.
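The underlying idea can be sketched with the plain Aitchison distance. Note this is an illustrative simplification with hypothetical counts: DEICODE itself avoids pseudocounts by using a robust centered log-ratio (clr) transform with matrix completion, whereas the sketch below uses the textbook pseudocount approach.

```python
import math

def clr(counts, pseudocount=1):
    """Centered log-ratio transform after adding a pseudocount to avoid log(0)."""
    logs = [math.log(c + pseudocount) for c in counts]
    gmean = sum(logs) / len(logs)
    return [l - gmean for l in logs]

def aitchison(a, b):
    """Aitchison distance: Euclidean distance between clr-transformed vectors."""
    return math.dist(clr(a), clr(b))

d_small = aitchison([10, 20, 70], [12, 18, 70])     # genuinely different compositions
d_scaled = aitchison([10, 20, 70], [20, 40, 140])   # same proportions, double the depth
```

Because the clr transform centers each sample on its own geometric mean, samples that differ only in sequencing depth end up nearly identical (d_scaled is close to zero), which is precisely the property that makes such compositional distances insensitive to unequal sampling depth.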

Longitudinal data analysis
When microbial data are collected at different timepoints, it is useful to examine dynamic changes in the microbial communities (longitudinal analysis). This section is devoted to longitudinal microbiome analysis using the q2-longitudinal plugin (Bokulich, Dillon, Zhang, et al., 2018). This plugin can perform a number of analyses, such as visualization using volatility plots, testing temporal trends in alpha and beta diversity, using linear mixed-effects models to test for changes in diversity metrics or individual features with regard to metadata categories of interest, and more. A comprehensive list of available methods and instructions on how to perform them is available in the online tutorial: https://docs.qiime2.org/2019.10/tutorials/longitudinal/. Here we will demonstrate some of these methods.

Linear mixed effects (LME) models
In a previous section, we determined that Shannon diversity was significantly lower in cesarean-born children at 24 months of age than in vaginally born children. But what about the change in Shannon diversity throughout the 24 months? LME models enable us to test the relationship between a single response variable (here, the Shannon metric) and one or more independent variables (e.g., delivery mode, diet) where observations are made across dependent samples, e.g., in repeated-measures sampling experiments. LME models can also account for random-effect variables (e.g., individual, sampling time). Here we will use the linear-mixed-effects action, which requires the following inputs: the diversity metric of choice calculated for all samples across 24 months (in the child-core-metrics-results folder), the metric name, our sample metadata file, a comma-separated list of covariates to include in the model, the random-effect variable (day_of_life), the column name from the metadata file containing the numeric state (i.e., day_of_life), and the column name from the metadata file containing the individual IDs to track through time. Unlike the group-significance tests in the previous steps, LME models can handle continuous variables; therefore, we will utilize our full dataset by calling on the day_of_life column instead of month.
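In simplified form, the model being fit here can be written as follows (a sketch of the fixed- and random-effects structure only, not the exact parameterization used by the underlying implementation):

```latex
y_{ij} = \beta_0
       + \beta_1\,\mathrm{delivery}_i
       + \beta_2\,\mathrm{diet}_i
       + \beta_3\,t_{ij}
       + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0,\sigma_u^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0,\sigma^2)
```

where y_ij is the Shannon diversity of subject i at day of life t_ij, and the per-subject random intercept u_i is what accounts for repeated measurements of the same child, the violation of independence that invalidated the Kruskal-Wallis approach above.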
We will need to calculate our diversity metrics again on the full dataset as it was before replicates were removed:

qiime diversity core-metrics-phylogenetic \
  --i-table child-table.qza \
  --i-phylogeny insertion-tree.qza \
  --p-sampling-depth 3400 \
  --m-metadata-file metadata.tsv \
  --p-n-jobs 1 \
  --output-dir child-core-metrics-results

To demonstrate how covariates can be included in an LME model, here we will test the effects of delivery method and diet (predominantly breast-fed versus predominantly formula-fed during the first 3 months of life) simultaneously using the following:

qiime longitudinal linear-mixed-effects \
  --m-metadata-file metadata.tsv \
  --m-metadata-file child-core-metrics-results/shannon_vector.qza \
  --p-metric shannon \
  --p-random-effects day_of_life \
  --p-group-columns delivery,diet \
  --p-state-column day_of_life \
  --p-individual-id-column host_subject_id \
  --o-visualization lme-shannon.qzv

In this Visualization (Fig. 12), the model results provide all the outputs from the LME model, where we see a significant birth-mode effect on Shannon diversity over time (p value = 0.02), while diet has no effect on Shannon diversity across time (p value = 0.55). The regression scatterplots (top) overlay the predicted group mean trajectories on the observed data (dots), and the projected residuals plot (bottom) can help users check the validity of an LME model. For more details, see https://docs.qiime2.org/2019.10/tutorials/longitudinal/.

Volatility visualization
The volatility visualizer generates interactive line plots that allow us to assess how volatile a dependent variable is over a continuous, independent variable (e.g., time) in one or more groups. Multiple metadata files (including alpha and beta diversity) and feature tables can be used as input, and in the interactive visualization we can select different dependent variables to plot on the y axis. Here we examine how variance in Shannon diversity changes across time in our cohort, both in groups of samples (interactively selected) and in individual subjects.
The volatility plot can be generated by running:

qiime longitudinal volatility \
  --m-metadata-file metadata.tsv \
  --m-metadata-file child-core-metrics-results/shannon_vector.qza \
  --p-default-metric shannon \
  --p-default-group-column delivery \
  --p-state-column month \
  --p-individual-id-column host_subject_id \
  --o-visualization shannon-volatility.qzv

The volatility plot (Fig. 13) displays the mean Shannon diversity of each group over time, together with the global mean and control limits at ±2× and ±3× the standard deviation from the global mean. Observations above those global control limits are suspected to be outliers. In this analysis, we see high variance at time zero, while the groups become more similar by month 8; by month 24, Shannon diversity in vaginally born children appears to be higher than in cesarean-born children (as expected).
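The control-limit logic used to flag suspect observations is straightforward, as this sketch with hypothetical Shannon values shows:

```python
from statistics import mean, stdev

# Hypothetical Shannon values across time points; the last one is suspicious.
values = [2.1, 2.3, 2.2, 2.4, 2.0, 2.3, 3.9]
m, sd = mean(values), stdev(values)
lower2, upper2 = m - 2 * sd, m + 2 * sd  # the plot also draws +/- 3 sd limits

# Points outside the global mean +/- 2 sd band are flagged as potential outliers.
outliers = [v for v in values if not (lower2 <= v <= upper2)]
```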

Differential abundance testing
So far, we have analyzed our data using a variety of approaches employing various diversity metrics and between-sample distances that are useful for broad comparisons of our communities. Now, we want to identify individual taxa whose relative abundances are significantly different across groups. Differential abundance testing in microbiome analysis is an active area of research (see the "compositional data analysis" section in the Support Protocol for more details). Two QIIME 2 plugins that can be used for this are q2-songbird and q2-composition. In this section, we will use the ANCOM test in the q2-composition plugin to identify differentially abundant features between vaginally and cesarean-born children. Moreover, we will use q2-songbird to perform a similar task, but with additional adjustment for potential confounders.

ANCOM
As with any bioinformatics method, you should be aware of the assumptions and limitations of ANCOM before using it. For example, ANCOM assumes that few (less than ∼25%) features differ between groups. If you expect that more features differ between your groups, you should not use ANCOM, because it will be more error-prone (an increase in both Type I and Type II errors is possible). We recommend reading the ANCOM paper (Mandal et al., 2015) before using this method. For simplicity, we will focus on identifying differentially abundant features between children born by different delivery modes at month 6 only. We have selected 6 months because this time point contains the highest number of samples (after baseline time 0), which greatly increases the power of our analysis. When performing differential abundance testing, it is generally a good idea to filter out features that have very low abundances across your dataset, as well as those that are present in only a few samples. These features tend to add noise to the results, so we will remove them. Here we use the filter-features action to filter out features appearing in fewer than ∼10% of our samples (min 5 of 43 samples) and those that have a total frequency of less than 20 counts across all samples.

Filter out features with the following command (we assume here that the table has already been filtered to the month-6 child samples and named child-table-norep-C6.qza; adjust the artifact names to match your own files):

qiime feature-table filter-features \
  --i-table child-table-norep-C6.qza \
  --p-min-samples 5 \
  --p-min-frequency 20 \
  --o-filtered-table child-table-norep-C6-filtered.qza

The ANCOM action requires a FeatureTable[Composition] artifact as input; it also cannot tolerate frequencies of zero. To satisfy both of these requirements, we will use the add-pseudocount action, which converts the table to the composition type while adding a pseudocount of 1 to all of our counts.
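The filtering and pseudocount steps amount to the following (a toy version with hypothetical counts and deliberately small thresholds; the qiime feature-table filter-features and qiime composition add-pseudocount actions do this on the real artifacts):

```python
# Feature table as feature -> per-sample counts (hypothetical values).
table = {
    "ASV1": [5, 12, 0, 30],   # abundant and prevalent -> kept
    "ASV2": [0, 0, 1, 0],     # rare and low total frequency -> removed
    "ASV3": [25, 0, 0, 0],    # high total but present in only 1 sample -> removed
}
MIN_SAMPLES, MIN_FREQUENCY = 2, 20  # the tutorial itself uses 5 and 20

filtered = {f: c for f, c in table.items()
            if sum(v > 0 for v in c) >= MIN_SAMPLES
            and sum(c) >= MIN_FREQUENCY}

# ANCOM cannot tolerate zeros, so add a pseudocount of 1 everywhere.
composition = {f: [v + 1 for v in c] for f, c in filtered.items()}
```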

The resulting ANCOM volcano plot shows on the x axis the effect size, i.e., the difference in mean clr (centered log-ratio)-transformed abundance between metadata categories (delivery modes in our case), and on the y axis the strength of the ANCOM test statistic W. As ANCOM is essentially running pairwise tests, the W value is a count of the number of sub-hypotheses that have passed for a given feature. Hence, the differentially abundant features will be those ASVs with high values on both the x and y axes; in other words, points that are close to the top right or top left corners (in the figures for this tutorial, the one identified feature has been highlighted with a red circle). The identified features are summarized underneath the ANCOM statistical results section. Lastly, the percentile abundance table shows, for each group, the number of sequences assigned to each identified feature at several sample percentiles.
For the identified feature in our analysis: in the cesarean group, the sample with the lowest count had one sequence ultimately assigned to this feature, and in 75% of the cesarean samples, 1 or fewer sequences were assigned to it (recall that adding the pseudocount ensures that every sample appears to have at least 1 count of every feature). In contrast, in 75% of the samples in the vaginal group, 884.75 or fewer sequences were assigned to this feature. The percentile abundance table thus suggests that the detected feature is more abundant in vaginally born than in cesarean-born babies.
The ANCOM test has identified one feature that differs significantly by birth mode. To identify which taxon this feature corresponds to, we can load our bespoke-taxonomy.qzv Visualization made in step 4 of the "Taxonomic classification" section and look up the feature ID in the search bar at the top. The identified feature and its corresponding taxonomic assignment are as follows:

Feature with higher abundance in vaginal-born children:
d75b7080930e7a77ef3de8c6154895b9 -> k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Bifidobacteriales; f__Bifidobacteriaceae; g__Bifidobacterium; s__

Perhaps not surprisingly, these results echo findings from the original ECAM paper (Bokulich, Chung, et al., 2016) encompassing the full dataset.

Songbird
Songbird can be used to identify differentially abundant features while accounting for confounding variables in the data. It implements a multinomial regression designed for compositional microbiome data (in technical terms, an L2-regularized multinomial regression that avoids overfitting by using the sum of squares of all feature weights as a penalty term in the loss function, as in Ridge regression). Here, we control for confounding variables such as antibiotic exposure and the infants' diet and sex when identifying features that differ significantly between babies born vaginally versus by C-section.
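In rough terms, songbird fits a coefficient matrix B by minimizing a regularized negative multinomial log-likelihood (a simplified sketch of the objective; the actual implementation also includes intercepts and framing of the penalty as a prior):

```latex
\mathcal{L}(B) = -\sum_{i=1}^{N} \sum_{j=1}^{D} n_{ij} \log p_{ij}
               \;+\; \lambda \lVert B \rVert_F^2,
\qquad p_{i\cdot} = \operatorname{softmax}(x_i B)
```

where n_ij is the count of feature j in sample i, x_i holds sample i's covariates (delivery, diet, sex, antibiotic exposure), and the Frobenius-norm term is the Ridge-style sum-of-squares penalty on all feature weights described above.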
3. Examine the estimated coefficients for each feature by running:

qiime tools export \
  --input-path songbird-results/differentials6monthControlled.qza \
  --output-path songbird-results/exported-differentials6monthControlled

Based on the estimated coefficients for delivery[T.Vaginal] in the output of regression stats, features with positive coefficients are considered differential in vaginally born infants relative to cesarean-born infants, and vice versa. There is no clear cutoff in songbird on the value of the coefficients to assist in choosing the number of features, but since there are few features with coefficients higher than 2.5 or lower than -2.5, we use this threshold as our cut-off for the regression coefficients and thus identify five vaginally-born-associated and four C-section-associated features, as shown below:

Features with higher differential ranking in vaginal-born children (listed from strongest to weakest):

Meta-analysis through the Qiita database using redbiom
After identifying differentially abundant features using ANCOM or Songbird, users can search through available samples in Qiita using redbiom to see the characteristics of samples containing those features. This type of analysis can be used to examine the environments in which a particular feature was previously observed. In addition, the FeatureTable[Frequency] data for the samples that contain a feature of interest can be extracted for further analysis. A detailed tutorial can be found on the QIIME 2 Forum (https://forum.qiime2.org/t/querying-for-public-microbiome-data-in-qiita-using-redbiom/4653). Here, we will search for an individual differentially abundant feature to see whether it appears enriched in infants by birth mode. Note that the exact numbers and results shown below may change over time as more samples are indexed by redbiom.
To use redbiom, we first need to install the package using conda:

conda install -c conda-forge redbiom

In redbiom, the data are partitioned by technical and processing parameters to help improve the comparability of the contained data. Before we search for features, we need to decide which context to search within. The redbiom summarize contexts command provides the names of the contexts and the number of samples and features indexed. The context names themselves describe the processing parameters used:

redbiom summarize contexts
This produces a long output, as several sequencing technologies, sequence trim lengths, variable regions, and feature assessment methods are represented. The first five lines of this output are shown below, providing the context name, the number of samples in the context, the number of unique features, and a succinct description of the bioinformatic processing performed. For the analysis here, we are going to use the Deblur-Illumina-16S-V4-150nt-780653 context; this context comprises samples in which the 16S V4 region was sequenced, with all reads trimmed to 150 nucleotides and processed with Deblur. The context contains 127,413 samples spanning over 7.2 million unique features, representing hundreds of publicly available studies in Qiita.
Next, we will take the DNA sequence corresponding to our feature of interest, d75b7080930e7a77ef3de8c6154895b9, identify samples within the context in which that sequence was observed, and save the output to a file called observed_samples.txt. Note that feature hashes cannot presently be used for searching; use the bespoke-taxonomy.qzv visualization to locate the corresponding DNA sequence:

If we examine the observed_samples.txt file, we will see that over 17,000 samples contain this particular feature. These samples are part of 137 different studies in Qiita. We can now begin to explore what is known about the samples. A major challenge for meta-analysis, though, is having common metadata categories across studies.
First, as a sanity check, we will search against only those samples that record the Earth Microbiome Project Ontology (EMPO; Thompson et al., 2017). The EMPO_3 level describes basic environmental information about a sample. Only samples with an empo_3 entry in their metadata will be returned:

redbiom summarize samples \
  --category empo_3 \
  --from observed_samples.txt

Estaki et al.

of 46
Current Protocols in Bioinformatics

What we can see from this output is that, as expected, the feature is primarily observed in samples associated with the animal distal gut. Now, let us search this feature against only those samples that correspond to infants. For that, we will need to select the set of samples meeting a particular criterion; in this case, we will filter to include only samples associated with individuals under the age of 3. The two metadata categories we will use are host_age and age, both of which are common labels in Qiita that correspond to an individual's recorded age. In addition, we will explicitly omit the ECAM study from our Qiita search, as our dataset was drawn from that study:

redbiom select samples-from-metadata \
  --context Deblur-Illumina-16S-V4-150nt-780653 \
  --from observed_samples.txt \
  "where (host_age < 3 or age < 3) and qiita_study_id != 10249" > infant_samples.txt

We can then summarize the metadata of these infant samples. To do so, we need to determine which metadata category to summarize over. So let us search Qiita for all metadata categories that contain the word 'birth' in the name (output not shown), pick a few that seem plausible, and summarize them:

redbiom search metadata \
  --categories birth

redbiom summarize metadata birth_method birth_mode

We can see that birth_mode is represented by thousands of samples, so let us use that metadata category:

redbiom summarize samples \
  --category birth_mode \
  --from infant_samples.txt

From this summary, it appears our feature of interest is present in many more samples associated with vaginal birth than with cesarean section. It is important to note, however, that these findings may be confounded if vaginal birth samples are simply over-represented in Qiita.
However, a summary of that metadata category across all of Qiita can be performed easily:

redbiom summarize metadata-category \
  --counter \
  --category birth_mode

This suggests the variable is not extremely unbalanced between C-section and vaginal births, and that, in fact, more of the samples are associated with C-sections. Further exploration of these samples can be performed, such as extracting the samples and integrating them directly into a meta-analysis (see redbiom fetch to obtain feature tables and sample metadata).

FURTHER MICROBIOME ANALYSES
The following sections are offered as stand-alone additional support for further microbiome analyses and do not rely on the ECAM dataset used in previous sections.

Exporting QIIME 2 data

Occasionally, the raw data within QIIME 2 artifacts may be required for use in other applications that cannot read these file types. QIIME 2 artifacts are simply zip files, so their contents can be extracted using any unzipping software. They can also be extracted readily using the qiime tools extract command, which extracts the raw data as well as QIIME 2's metadata about that artifact (including, for example, the artifact's provenance) into the output directory in plain-text format. The extracted files will be placed in a new directory whose name is the artifact's universally unique identifier (UUID). Alternatively, when only the raw data are desired without the metadata, qiime tools export can be used; when exporting an artifact, only the data files will be placed in the output directory. For example, a user may be interested in visualizing a phylogenetic tree using a package in R. To obtain the raw tree file (in Newick format), simply run:

qiime tools export \
  --input-path insertion-tree.qza \
  --output-path extracted-insertion-tree
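Because an artifact is an ordinary zip archive, the same export can be sketched with Python's standard zipfile module. This is a minimal illustration using a synthetic stand-in archive; export_artifact_data is our own helper, not part of QIIME 2:

```python
import os
import tempfile
import zipfile

def export_artifact_data(qza_path, out_dir):
    """Unzip a QIIME 2 artifact and return the path to its data/ directory.
    Data files live under <UUID>/data/; provenance and format metadata sit
    alongside them in the archive."""
    with zipfile.ZipFile(qza_path) as zf:
        zf.extractall(out_dir)
        # The single top-level directory is the artifact's UUID.
        (uuid,) = {name.split("/")[0] for name in zf.namelist()}
    return os.path.join(out_dir, uuid, "data")

# Build a tiny stand-in archive so the sketch is self-contained.
tmp = tempfile.mkdtemp()
fake_qza = os.path.join(tmp, "insertion-tree.qza")
with zipfile.ZipFile(fake_qza, "w") as zf:
    zf.writestr("1234-uuid/data/tree.nwk", "(a,b);")
    zf.writestr("1234-uuid/metadata.yaml", "uuid: 1234-uuid")

data_dir = export_artifact_data(fake_qza, os.path.join(tmp, "out"))
```

Running this on a real .qza would expose the Newick file (or other raw data) for tools that cannot read QIIME 2 artifacts directly.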

Analysis of shotgun metagenomic data
Whole-metagenome shotgun (WMS) sequencing explores the totality of genomes in a microbial community. Compared to amplicon-based analyses, it provides higher taxonomic resolution (typically beyond the genus level), direct observation of functional genes, and additional information on genome organization. Although assembly into draft genomes usually demands high sequencing depth, which is expensive, investigation of the microbial community can be as affordable as amplicon sequencing, enabling surveys of larger numbers of samples. It has been demonstrated that "shallow" shotgun sequencing (0.5 million sequences per sample) delivers nearly equal insight into a community's taxonomic composition compared to sequencing at 100 times that depth (Hillmann et al., 2018), although functional profiles are not nearly as accurate with shallow shotgun sequencing. Therefore, experimental design and budget decisions should be made based on the goals of the study.
Two plugins dedicated to shotgun metagenomics are currently available for QIIME 2: q2-shogun (Hillmann et al., 2018) and q2-metaphlan2 (Truong et al., 2015). They need to be installed separately. In the example below, we demonstrate the use of q2-shogun, a wrapper for the SHOGUN pipeline (Hillmann et al., 2018).
The output file, taxatable.qza, is a feature table in which columns are sample IDs and rows are taxonomic lineages. Starting from this table, we may perform various subsequent analyses in a similar manner to amplicon sequencing data, as detailed above, such as taxonomy plots, alpha and beta diversity analyses, and differential abundance testing.
If the user wants to prepare a custom reference sequence database from a multi-FASTA file (e.g., refseqs.fa), it can be done as follows. NOTE: The steps below are presented for demonstration purposes only and should not be executed unless the user first supplies the file refseqs.fa.
4. Import the sequences into QIIME 2:

qiime tools import \
  --input-path refseqs.fa \
  --type FeatureData[Sequence] \
  --output-path refseqs.qza

5. Build a Bowtie2 index based on the sequences:

QIIME 2 is flexible in the types of metagenomic analyses it supports. In addition to calling SHOGUN or MetaPhlAn2 from the QIIME 2 interface, one may perform taxonomic or functional profiling of shotgun metagenomic data separately using any external tool, then import the resulting profile into QIIME 2. BIOM-formatted files are supported as input. Questions about other supported formats should be directed to the QIIME 2 Forum, as support will expand over time.

Source tracking
Source tracking of microbial communities attempts to estimate the relative contributions of a set of host, environmental, and contamination sources to a novel community. QIIME 2 currently offers two methods for microbial source tracking through the external plugins q2-FEAST (https://github.com/cozygene/FEAST) (Shenhav et al., 2019) and q2-SourceTracker2 (https://github.com/biota/sourcetracker2). FEAST (Fast Expectation-mAximization microbial Source Tracking) and SourceTracker2 vary in their statistical approaches and assumptions for estimating source contributions. Therefore, we view method selection here as a choice that research teams should make themselves if they do not have a prior hypothesis that one tool addresses directly.
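To convey the underlying idea shared by these tools, without reproducing either tool's actual statistical model, the sketch below estimates the proportions in which two known source communities mix to form a sink community by minimizing squared error over a grid. All data and function names are invented for illustration:

```python
def estimate_mixture(sink, source_a, source_b, steps=1000):
    """Grid-search the proportion p of source_a (source_b gets 1 - p)
    that minimizes squared error between the mixed profile and the sink."""
    best_p, best_err = 0.0, float("inf")
    for i in range(steps + 1):
        p = i / steps
        err = sum((s - (p * a + (1 - p) * b)) ** 2
                  for s, a, b in zip(sink, source_a, source_b))
        if err < best_err:
            best_p, best_err = p, err
    return best_p

# Toy relative-abundance profiles over four taxa.
source_a = [0.7, 0.1, 0.1, 0.1]
source_b = [0.1, 0.1, 0.4, 0.4]
# A sink constructed as a 60/40 mixture of the two sources.
sink = [0.6 * a + 0.4 * b for a, b in zip(source_a, source_b)]
```

Real source-tracking methods estimate these proportions probabilistically over many sources (and an "unknown" source), but the goal, inferring mixing proportions from community profiles, is the same.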

Compositional data analysis
Feature tables contain magnitudes determined by random sequencing depths that vary dramatically between samples irrespective of the initial microbial load, making the data compositional in nature (Fernandes et al., 2014). Compositional data contain relative information, where the abundance of one feature can only be interpreted relative to another.
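One common way to work with such data is the centered log-ratio (CLR) transform, sketched below in plain Python. This is a minimal illustration; the pseudocount handling here is one simple choice among several, and dedicated compositional-analysis plugins should be preferred in practice:

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform: the log of each (count + pseudocount)
    minus the mean log, so each value is interpreted relative to the
    sample's geometric mean rather than as an absolute abundance."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]
```

CLR values for a sample sum to zero, which makes explicit that only the ratios between features, not their raw magnitudes, carry information.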

Supervised classification and regression methods for predicting sample metadata
Supervised learning (SL) methods predict sample data (e.g., metadata values) as a function of other sample data (e.g., microbiota composition) by training an SL model on training data. Various SL methods can predict either categorical data (a classification problem) or continuous values (a regression problem). SL methods have become increasingly common in microbiome studies to predict sample characteristics (e.g., disease state or location) or to identify features that are associated with particular characteristics or sample classes (Bokulich, Collins, et al., 2016; Pasolli, Truong, Malik, Waldron, & Segata, 2016). The ability of many SL methods to perform feature selection, i.e., the identification (and ranking) of features associated with particular sample classes or values, makes them particularly useful in microbiome experiments. The QIIME 2 plugin q2-sample-classifier (https://library.qiime2.org/plugins/q2-sample-classifier/) contains methods for performing supervised classification/regression and feature selection using microbiome data and metadata.
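To illustrate the classification idea on microbiome-like data, the sketch below trains a toy nearest-centroid classifier on invented relative-abundance vectors. This is for exposition only; it is not the random-forest and other estimators that q2-sample-classifier actually provides:

```python
def train_centroids(samples, labels):
    """Average the feature vectors per class label (a minimal 'model')."""
    sums, counts = {}, {}
    for vec, lab in zip(samples, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def predict(centroids, vec):
    """Assign the class whose centroid is nearest (squared Euclidean)."""
    def dist(lab):
        return sum((a - b) ** 2 for a, b in zip(centroids[lab], vec))
    return min(centroids, key=dist)

# Invented training data: two taxa, with vaginal-birth samples enriched
# in the first taxon and cesarean samples enriched in the second.
X = [[0.8, 0.1], [0.7, 0.2], [0.1, 0.8], [0.2, 0.7]]
y = ["vaginal", "vaginal", "cesarean", "cesarean"]
model = train_centroids(X, y)
```

Real SL workflows add the pieces this sketch omits: train/test splits, cross-validation, and accuracy estimates, all of which q2-sample-classifier handles for you.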

Metadata preparation
Metadata are a critical component of a successful study and, unlike other elements such as sequencing quality or completeness of the reference database, metadata are largely under the control of the investigator. Unfortunately, they are often treated as an afterthought, leading to uninterpretable results due to missing information. To ensure a successful data analysis, begin metadata generation at the time of sample collection. Be sure to record all sample attributes that are relevant to your hypotheses, as these attributes are the basis of QIIME 2's visualizations and statistical tests.
Spreadsheets are the most commonly used vehicle for metadata storage and management due to their ubiquity and convenience, but they have well-known drawbacks. For example, by default Microsoft Excel performs irreversible modification of certain kinds of inputs into dates or floating-point numbers (Zeeberg et al., 2004) and auto-completes values based on earlier entries (https://support.office.com/en-ie/article/turn-automatic-completion-of-cell-entries-on-or-off-0f4aa749-b927-4ea7-adaa-86f8d4f9fe20). Because these modifications are performed silently, without warning to the user, they frequently lead to mangled metadata. Although other spreadsheet programs (such as Google Sheets and LibreOffice) have slightly different defaults, all have "convenience" features that can cause data corruption, so it is critical to learn the default behaviors of your preferred spreadsheet program, follow spreadsheet best practices (Broman & Woo, 2018), and actively monitor the validity of your records. Alternatively, generate your metadata file in a dedicated software tool such as ISAcreator (Rocca-Serra et al., 2010), which provides a structured interface designed to prevent common errors.
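Some of this monitoring can be automated. The sketch below validates a small metadata table with Python's standard csv module; the table contents are invented, and this minimal check is no substitute for a full metadata validator:

```python
import csv
import io

# Invented QIIME 2-style metadata table (tab-separated).
METADATA_TSV = """\
sample-id\tdelivery\tmonth
S1\tVaginal\t1
S2\tCesarean\t1
S3\tVaginal\t6
"""

def validate_metadata(tsv_text):
    """Run basic sanity checks on a metadata TSV: unique sample IDs and
    no empty cells. Returns a list of problem descriptions (empty if OK)."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    problems = []
    ids = [row["sample-id"] for row in rows]
    if len(ids) != len(set(ids)):
        problems.append("duplicate sample IDs")
    for row in rows:
        if any(v is None or v.strip() == "" for v in row.values()):
            problems.append("empty cell in sample " + row["sample-id"])
    return problems
```

Running such a check each time the spreadsheet is saved catches duplicated IDs and accidental blank cells long before they derail an analysis.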
Consistency is the key to high-quality metadata. Much effort has already been put into identifying and standardizing the crucial pieces of metadata for various types of studies, so investigate these guidelines before beginning your metadata collection. The Genomic Standards Consortium (GSC) has created the "Minimum Information about any (x) Sequence" (MIxS) and "Minimum Information about a MARKer gene Sequence" (MIMARKS) specifications (Yilmaz et al., 2011), as well as 15 "environmental packages" that extend and refine these standards for samples from environments ranging from air to human skin to wastewater. To ease compliance with these standards, the GSC provides checklists outlining the expected inputs, syntax, preferred units, and more for the fields in each standard and package (https://press3.mcs.anl.gov/gensc/mixs/). Many of these fields take values specified by subsets of controlled vocabularies such as the Experimental Factor Ontology (Malone et al., 2010) and the Environment Ontology (Buttigieg et al., 2016). Consider employing a tool such as the stand-alone ISAconfigurator (Rocca-Serra et al., 2010) or the Excel-based QIIMP (the Quick and Intuitive Interactive Metadata Portal; https://qiita.ucsd.edu/iframe/?iframe=qiimp) to identify all the fields necessary for your study type and to enforce the validity of their content.
While creating and maintaining consistent and compliant metadata is not trivial, it is well worth the effort. Not only are standards-compliant metadata required for submission to a growing number of public databases and journals [e.g., the European Nucleotide Archive (https://www.ebi.ac.uk/training/online/course/ebi-metagenomics-portal-submitting-metagenomics-da/what-are-metadata-and-why-are-they-so-impo), Qiita (https://qiita.ucsd.edu/static/doc/html/tutorials/prepare-information-files.html#required-fields-for-centralized-qiita), and Microbiome Journal (https://microbiomejournal.biomedcentral.com/submission-guidelines/preparing-your-manuscript/microbiome-announcement)], but they are also critical to enable future meta-analyses, both between your own data and others' and between your own data today and your new data tomorrow! It is much easier to record required information up front than to track it down retroactively when you are working toward a tight paper submission deadline.

Background Information
Advances in the ease of microbiome data acquisition, due in large part to improved DNA sequencing instruments and standardized protocols, have led to a dramatic increase in the need to analyze these data. However, because so many studies have already been performed, analyzing each dataset in isolation is no longer the best approach; combining new data with what is already known from other studies can have considerable advantages. Therefore, in this article, we show both the analysis of an individual dataset and its combination with data from other studies. Although we show an example relevant to the human microbiome, we stress that QIIME 2 is useful for analyzing microbiomes from any environment, including soil, seawater, animals, plants, industrial systems, built environments, foods, and a wide range of others that we never anticipated. QIIME 2 allows rapid meta-analyses across studies, but these can often be limited by protocol variation and incomplete or incompatible metadata. Comparing multiple studies directly relies on complete, standardized sample metadata, experimental protocols, and bioinformatics analyses. Incomplete metadata with mislabeled or non-standardized entries severely limits the ability to compare studies and can lead to spurious conclusions. Experimental protocols that amplify non-standard variable regions of the 16S rRNA gene, or that use custom sequencing barcodes/adapters, can also prevent studies from being combined into one analysis. Making the raw data from each study available in public databases, as required by many journals and funding agencies, and using standard methods can greatly facilitate re-use and citation of your dataset.

Critical Parameters
Several points are important to consider before beginning your analysis, and have been reviewed recently to provide best practices for microbiome studies (Allaband et al., 2019; Knight et al., 2018). QIIME 2 tries its best to provide reasonable defaults, but some plugins require considerable biological or subject-matter knowledge for choosing parameters. Rarefaction provides a good example: choosing the number of sequences per sample to retain requires knowing which samples cannot be left out of the analysis if biologically meaningful results are to be obtained. Similarly, processing raw sequencing data into a FeatureTable[Frequency] artifact absolutely requires knowing how the samples were processed, including comprehensive preparation metadata for processing and troubleshooting, as well as the type of barcodes used for multiplexed samples. Metadata must be carefully considered, because the analysis cannot use information that you did not provide. For example, if you want to check whether a parameter such as immune function is correlated with the microbiome, that parameter must actually be measured and included in the metadata table.
Because capabilities in QIIME 2 are rapidly expanding, and because topics such as compositional data analysis are receiving intense focus from the statistics community at present, this document will be continuously updated after publication on the QIIME 2 website (https://qiime2.org/). For questions beyond the scope of this document, we encourage you to use the QIIME 2 Forum (https://forum.qiime2.org/), which provides rapid answers to a wide range of questions. We look forward to reading about your applications of QIIME 2 to answer a wide range of compelling questions that touch on the microbiome, whether in the sea, the soil, or the human body!

Troubleshooting
All actions in QIIME 2 have built-in help, accessible by appending the --help parameter to any action name, e.g., qiime feature-table filter-samples --help. Additionally, technical support is available through https://forum.qiime2.org, as discussed in the Introduction to this article.