Microbiome and Metabolomics Core
Vincent B. Young, M.D., Ph.D.
Charles Burant, M.D., Ph.D.
Core Co-Director, Metabolomics Core Director
Merritt G. Gillilland, Ph.D.
Core Technical Director
Kathryn A. Eaton, D.V.M., Ph.D.
Gnotobiotic Program Manager
Patrick D. Schloss, Ph.D.
A key function of the UMCGR MMC for center members is providing consultative services. While most researchers are aware of microbiome research, those wishing to integrate microbiome or metabolomics studies into their research program often need guidance to optimize their chances of success. Important insights into the roles of microbial communities in human health and disease have historically resulted from collaborations between environmental microbial ecologists and clinical researchers. However, the range of experimental approaches available for studying the complex microbial communities of humans can be overwhelming and is complicated by a relative lack of familiarity across the disciplines. The MMC personnel are well positioned to provide advice and expertise for these approaches.
With this challenge in mind, consultative services will be provided by the Center Director, Vincent Young, and Co-director, Charles Burant. Merritt Gillilland, Ph.D., will serve as the Core Technical Director for the UMCGR. Dr. Gillilland has extensive training and experience in the use of microbiome techniques to address complex biomedical questions. He will assist members with experimental design, consultation on specimen collection and handling, interfacing with the core facilities, data analysis, and manuscript and grant proposal writing. By making his expertise available to all center members who are interested in utilizing the Microbiome Core, we anticipate that this will not only increase usage of the core, but will also make the experience much more rewarding and useful to the center researchers. Dr. Gillilland has already been serving in this role for the past year, and the feedback from center members who have used his services has been uniformly positive. (Service Request Form)
Nucleic Acid Isolation
The Microbial System Lab of the HMI has a high-speed disrupter that is compatible with deep-well plates, five Eppendorf EpMotion 5075 liquid handling workstations, and centrifuges that are able to spin 96-well plates. This infrastructure is used to isolate DNA and RNA from specimens using the PowerSoil nucleic acid extraction kits from MoBio (Carlsbad, CA). Using this approach, we are able to reproducibly extract sufficient DNA and RNA for multiple PCRs and metagenomic libraries. The DNA is sufficiently pure to amplify by PCR or to use in Illumina library generation. (Service Request Form)
Microbiome (16S) Data Analysis
All PCRs will be set up using an Eppendorf EpMotion 5075 liquid handling workstation and run on Eppendorf MasterCycler thermal cyclers. To reduce the rate of chimera formation, we will use a 5-minute extension time (~10-fold excess), and to reduce PCR errors we will use high-fidelity polymerases. In addition to the samples being processed, we will include two controls. The first, a defined community of 21 genomic DNAs (i.e. "mock community"), will be included to allow us to quantify the amount of sequencing error. We have used the Human Microbiome Project's mock community, which is available through BEI. The second control, a constant DNA sample extracted from a single fecal pellet, will allow us to detect run-to-run drift and protect against possible batch effects. After amplification, PCR products will be cleaned using the AMPure (Agencourt) magnetic bead clean-up kit and quantified using a fluorometric DNA quantification kit (Invitrogen). Cleaned PCR products will then be pooled in equimolar quantities using the EpMotion. Pooled PCR products will then be sequenced on the sequencing platform according to the manufacturer's protocols. (Service Request Form)
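The equimolar pooling step can be illustrated with a short calculation. The sketch below is not the core's EpMotion program; it assumes that all amplicons have roughly the same length, so equal mass approximates equal molarity. The target mass, the volume cap, and the sample names are hypothetical values chosen for illustration.

```python
def pooling_volumes(conc_ng_per_ul, target_ng=20.0, max_volume_ul=20.0):
    """Return the volume (uL) of each cleaned PCR product to add to the pool.

    Assumption: amplicons are about the same length, so contributing equal
    mass from each sample approximates equimolar pooling.
    target_ng and max_volume_ul are hypothetical defaults.
    """
    volumes = {}
    for sample, conc in conc_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"non-positive concentration for {sample}")
        # Volume needed to reach the target mass, capped so that very
        # dilute samples do not dominate the pool volume.
        volumes[sample] = min(target_ng / conc, max_volume_ul)
    return volumes


if __name__ == "__main__":
    # Hypothetical fluorometric readings in ng/uL.
    readings = {"sample_A": 10.0, "sample_B": 4.0, "mock_community": 0.5}
    for name, volume in pooling_volumes(readings).items():
        print(f"{name}: {volume:.1f} uL")
```

Samples that hit the volume cap will be under-represented in the pool, which is one reason to flag them before sequencing.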
MiSeq-based 16S rRNA Gene Sequencing
The Schloss lab has continuously developed the most widely cited software package for analyzing 16S rRNA gene sequences: mothur (3966 citations according to Web of Knowledge as of 7/4/2016). Over the past 7 years the software has continued to evolve alongside the sequencing technology. This evolution has focused on maintaining a high level of data quality and reducing the sequencing error rates from ~1% to below 0.02%. Methods proposed by others on a variety of platforms have error rates more than 10-fold higher than those that can be obtained using mothur. Based on this work, we have developed a sequencing protocol using the Illumina MiSeq platform in which the V4 region of the 16S rRNA gene is amplified and sequenced so that every base is sequenced twice, resulting in optimal error reduction. The wet-lab protocol is available on the lab's website (https://github.com/SchlossLab/MiSeq_WetLab_SOP) and the data curation protocols are available online as well (http://www.mothur.org/wiki/MiSeq_SOP). We will continue to refine our techniques over the course of the proposed research to optimize sequencing depth and length while maintaining stringent sequencing quality. (Service Request Form)
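The error rates quoted above are measured against a mock community whose true sequences are known. As a rough illustration only, the sketch below counts mismatches between reads and references that are already aligned to the same length. It is a simplified stand-in and not mothur's implementation; the function name and inputs are hypothetical.

```python
def error_rate(aligned_reads, aligned_references):
    """Fraction of aligned positions where a read differs from its reference.

    Assumption: each read has already been aligned to its mock-community
    reference, so the two strings have equal length and "-" marks a gap.
    Columns that are gaps in both sequences are ignored.
    """
    mismatches = 0
    total = 0
    for read, ref in zip(aligned_reads, aligned_references):
        if len(read) != len(ref):
            raise ValueError("read and reference must be aligned to equal length")
        for r, t in zip(read.upper(), ref.upper()):
            if r == "-" and t == "-":
                continue
            total += 1
            if r != t:
                mismatches += 1
    return mismatches / total if total else 0.0


if __name__ == "__main__":
    reads = ["ACGT", "ACGA"]
    refs = ["ACGT", "ACGT"]
    print(f"error rate: {error_rate(reads, refs):.3f}")
```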
Genomic/Metagenomic shotgun sequencing
Bacterial genome sequencing strategy. The decreasing cost of bacterial whole genome sequencing (WGS) has facilitated sequencing large collections of isolates to answer fundamental questions regarding the epidemiology and evolution of bacterial pathogens. In the past several years U of M investigators have effectively applied WGS to study many of the most significant drug-resistant hospital pathogens such as Klebsiella pneumoniae and Acinetobacter baumannii. To sequence a bacterial genome of 5 Mbp, it is only necessary to obtain 150-450 Mbp of sequence for that genome (30- to 90-fold coverage). On the MiSeq it would be possible to take a bacterial genome to the draft sequence stage for 50 genomes in a single run; multiplexing fewer genomes per run would improve the quality of the genome assembly. Using the MiSeq platform available, we are able to sequence bacterial genomes either in committed runs of only genomes or as samples spiked into 16S rRNA gene sequencing runs to increase genetic diversity and reduce the amount of PhiX control that is used.
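The depth figures in this paragraph follow from simple arithmetic: fold coverage is the number of sequenced bases divided by the genome size. The sketch below reproduces that calculation. The run yield of 7.5 Gbp is an illustrative assumption chosen to be consistent with the 50-genome figure at the lower coverage bound; it is not a stated instrument specification.

```python
def fold_coverage(sequenced_bp, genome_size_bp):
    """Average number of times each base of the genome is sequenced."""
    return sequenced_bp / genome_size_bp


def genomes_per_run(run_yield_bp, genome_size_bp, target_coverage):
    """Number of genomes that fit in one run at the requested coverage."""
    return run_yield_bp // (genome_size_bp * target_coverage)


if __name__ == "__main__":
    genome = 5_000_000  # 5 Mbp genome, as in the text
    print(fold_coverage(150_000_000, genome))  # lower bound of the stated range
    print(fold_coverage(450_000_000, genome))  # upper bound of the stated range

    assumed_run_yield = 7_500_000_000  # hypothetical run yield in bp
    print(genomes_per_run(assumed_run_yield, genome, 30))
    print(genomes_per_run(assumed_run_yield, genome, 90))
```

Raising the target coverage reduces the number of genomes per run, which is the trade-off between multiplexing and assembly quality described above.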
Metagenomic sequencing strategy. For metagenomic shotgun sequencing it is as yet unclear how much sequencing depth is needed to adequately describe a microbial community. For instance, using the HiSeq platform in the U of M Sequencing Core with paired 100 bp reads to sequence the microbial genomes in mouse feces, we recently obtained approximately 1.2 Gbp (i.e. 240 genome equivalents) from each of 24 fecal pellets (ca. 5,800 total genome equivalents). In that dataset, 50% of our contigs were longer than 500 bp and the ten longest contigs ranged between 41 and 97 Kbp. In such a design we were able to multiplex 12 mouse metagenomes per lane of HiSeq; a similar sequencing depth per sample would be possible for 6-7 metagenomes processed on a MiSeq run.
Library construction and sequencing. Genomic and metagenomic library construction will be performed using reagents and protocols provided by Illumina and their subsidiary Epicentre. Briefly, we will quantify genomic DNA using the Invitrogen Quant-iT reagents with DNA mass standards using a 96-well fluorometer. Next, we will use the Nextera DNA sample preparation kit, which is a transposome-based approach to generating barcoded ("indexed") genomic and metagenomic DNA libraries. The system generates barcoded random fragments ranging in length between 500 and 1,000 bp that can then be sequenced without additional size selection steps. Furthermore, only 50 ng of starting DNA is needed to build a library. This is ideal for DNA isolated from fecal samples and from low-biomass tissue samples (e.g. from the ileum or colon mucosa). Following individual library construction, the DNA concentration in the libraries will be quantified and normalized to 4 nM. We will also check the fragment length distribution to ensure that we have a range of lengths from 250 to 1,000 bp with a median length near 450 bp. Finally, the libraries will be pooled in equimolar concentrations and sequenced on the HiSeq or MiSeq sequencers according to the Illumina standard protocols. (Service Request Form)
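Normalizing a library to 4 nM requires converting a mass concentration into a molar one. A minimal sketch of that conversion follows, using the common approximation of 660 g/mol per base pair of double-stranded DNA. The function names, the 20 µL final volume, and the example reading are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (standard approximation)


def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a library concentration from ng/uL to nM."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def dilution_to_target(conc_nm, target_nm=4.0, final_volume_ul=20.0):
    """Return (library uL, diluent uL) needed to reach the target molarity."""
    if conc_nm < target_nm:
        raise ValueError("library is already below the target molarity")
    library_ul = target_nm * final_volume_ul / conc_nm
    return library_ul, final_volume_ul - library_ul


if __name__ == "__main__":
    # Hypothetical library: 2.97 ng/uL with a mean fragment length of 450 bp.
    molarity = library_molarity_nm(2.97, 450)
    library_ul, diluent_ul = dilution_to_target(molarity)
    print(f"{molarity:.2f} nM -> mix {library_ul:.1f} uL library "
          f"with {diluent_ul:.1f} uL diluent")
```

Because the conversion depends on fragment length, the length-distribution check described above feeds directly into the normalization.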
Bacterial transcriptomics and metatranscriptomics
Justification. Metagenomic shotgun sequencing can reveal the genetic content of a genome or community of bacteria; however, it does not indicate which genes are active in a community. To address questions about which genes are active, one must interrogate the transcriptome. Recent advances in transcriptomics and metatranscriptomics have made sequence-based approaches preferable to chip-based approaches in terms of cost, dynamic range, flexibility, and ability to detect polymorphisms. For metatranscriptomics a significant challenge is the availability of high-quality reference sequences. These can come from cultured isolates or metagenomes that have been sequenced in parallel to the metatranscriptome. Recent work with Greg Dick, Ph.D., at the University of Michigan has shown that for hydrothermal vent communities approximately 22% of transcripts could be mapped to a parallel reference metagenome sequence collection and 11% could be mapped to reference databases (e.g. GenBank). The remaining fragments can be assembled, de novo, into operons representing rare members of the community. We anticipate that for gut communities we will have greater success in mapping reads to reference sequences because of the large number of reference genomes available from the HMP culturing and sequencing initiative.
Approach. Much of the library generation and sequencing approach will be analogous to that of genomic and metagenomic shotgun sequencing. Samples will be barcoded so that we can pool 6 samples per MiSeq run or 12 samples per HiSeq lane. To sequence bacterial transcripts, several pre-processing steps are required. First, all samples that will be analyzed by transcriptomics will be preserved in RNAlater (Ambion) at -80°C. Bulk RNA will then be isolated from samples using the MoBio PowerSoil RNA extraction kit, which we have successfully used to isolate RNA from fecal pellets. Next, because rRNA can represent more than 90% of the RNA pool, we will use the RiboZero (Epicentre) bacterial rRNA depletion kit. After DNase-treating the RNA preparation, we will build barcoded libraries using the ScriptSeq RNA-Seq library preparation kit. This kit requires only 50 ng of RNA to build a library. After quantifying the cDNA in the libraries and inspecting their length profiles, we will pool the libraries in equimolar concentrations and sequence them according to the standard Illumina protocols. (Service Request Form)
Metabolomic sample preparation
The Michigan Regional Comprehensive Metabolomics Resource Core has multiple vetted pipelines available for the targeted and untargeted examination of metabolites in a variety of samples. Some of these will be briefly highlighted here.
Multiplatform Metabolomic Profiling.
Analyses in the core use a standard methodology, developed and refined in the Metabolomics Core, that relies on a semi-automated, high-throughput workflow combining both targeted and untargeted approaches. The workflow involves both unbiased feature recognition and targeted analysis of an authentic standard library containing ~1,000 individual compounds, which have been cataloged by analysis on reversed-phase LC-MS, HILIC LC-MS, and GC-MS methods. This methodology can accurately quantify thousands of features and assign an identity to approximately 400 metabolites in plasma, with similar numbers in tissue samples. For compounds for which a stable-isotope labeled internal standard is available, isotope dilution mass spectrometry can be used to enable absolute quantification of these metabolites using the same raw data.
Analysis methods. Samples are subjected to three analyses as a part of our multi-platform metabolomics analysis: RPLC-MS, HILIC-MS, and GC-MS. a) RPLC-MS: Extracted samples are dried under nitrogen gas and re-suspended in 90%/10% acetonitrile/water containing injection standards. Samples are subjected to chromatography on a Waters HSS T3 C18 column, using a water/acetonitrile gradient (modified with 0.1% formic acid), as described previously. The eluent is analyzed by positive and negative ion mode ESI. At present, this assay is performed using an Agilent 1200 LC coupled to a 6530 qTOF mass spectrometer. b) HILIC-MS: Supernatant from sample extracts can be injected directly onto a Millipore SeQuant ZIC-pHILIC column (for plasma samples) or a Phenomenex Luna NH2 column (for tissue samples) and analyzed as described previously using an Agilent 6520 qTOF MS. To achieve sample pre-concentration, the supernatant can be dried and reconstituted in a smaller volume of 80% methanol/20% water prior to injection. c) GC-MS: Extracted samples are thoroughly dried under nitrogen gas prior to two-step derivatization using methoxyamine hydrochloride in pyridine, followed by N-methyl-N-(trimethylsilyl)trifluoroacetamide with 1% trimethylchlorosilane (MSTFA+TMCS). Samples are analyzed using EI-MS on an Agilent 7890 GC/MS with a 5975 mass-selective detector. Samples are separated on a DB5-MS column according to the retention time locking "Fiehnlib" protocol, which enables the option of searching the data against a well-established external metabolite standard library and may prove useful for cross-laboratory data comparison within MoTrPAC.
For routine analysis of biological samples, all mass spectral data for these assays are acquired using full-scan MS1 mode to allow reliable quantitation and untargeted analysis of the data. We anticipate that we will carry out targeted MS/MS profiling (Q-TOF) of detectable features in a representative reference sample of each type (plasma, muscle, etc.) collected by MoTrPAC; this will help accelerate identification of unknown candidate transducers of interest. Following acquisition of our new Thermo Q-Exactive Orbitrap instrument as described in the Administrative Element, the library will be fully catalogued on this instrument.
Targeted Metabolomics Assays. The Metabolomics Core maintains a growing repertoire of targeted metabolomics assays which enable absolute quantitation of a variety of metabolite classes (Table 1). Many of these metabolites are covered by the untargeted profiling methods described above; thus these assays need not be run separately unless improved sensitivity or quantitation is required. The targeted analysis of short chain fatty acids, which are important in GI physiology, will be highlighted to provide an example of the services available.
Table 1. Targeted metabolite assays
Figure 2. Combined targeted + untargeted feature detection workflow
Short Chain Fatty Acid analysis. Mouse fecal pellets are homogenized in a bullet blender (no beads) in 400 µL of a solution of 30 mM hydrochloric acid plus isotopically-labeled acetate (0.125 mM), butyrate (0.125 mM), and hexanoate (0.0125 mM), followed by addition of 250 µL of methyl tert-butyl ether (MTBE). Samples are vortexed for 10 seconds to emulsify, held at 4 °C for 5 min, and vortexed again for 10 seconds. Samples are centrifuged for 1 minute to separate the solvent layers and the MTBE layer is then removed. A pool is formed by combining 10 µL of the MTBE extracts for quality control purposes. A series of SCFA calibration standards are prepared along with samples to quantify metabolites. GC-MS analysis is performed on an Agilent 6890N GC-5973 MS detector with the following parameters: a 1 µL sample is injected with a 1:10 split ratio on a ZB-WAXplus, 30 m × 0.25 mm × 0.25 µm (Phenomenex Cat#7HG-G013-11) GC column, with He as the carrier gas at a flow rate of 1.1 mL/min. The injector temperature is 240 °C, and the column temperature is isocratic at 310 °C. Data are processed using MassHunter Quantitative Analysis version B.07.00. SCFAs are normalized to the nearest isotope-labeled internal standard and quantitated using two replicate injections of five standards to create a linear calibration curve with accuracy better than 80% for each standard.
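The quantitation step described above, internal-standard normalization followed by a linear calibration curve, can be sketched as follows. This is an illustration under stated assumptions and not the MassHunter workflow: the standard concentrations, peak areas, and function names are hypothetical, and the fit is an unweighted least-squares line.

```python
def fit_line(x, y):
    """Unweighted least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(analyte_area, internal_standard_area, slope, intercept):
    """Convert an analyte:internal-standard area ratio to a concentration."""
    ratio = analyte_area / internal_standard_area
    return (ratio - intercept) / slope


def back_calculated_accuracy(nominal, ratio, slope, intercept):
    """Percent accuracy of a calibration standard read back off the curve."""
    return 100.0 * ((ratio - intercept) / slope) / nominal


if __name__ == "__main__":
    # Hypothetical data: five standards (mM), two replicate injections each.
    nominal = [0.1, 0.1, 0.5, 0.5, 1.0, 1.0, 5.0, 5.0, 10.0, 10.0]
    ratios = [0.021, 0.019, 0.10, 0.10, 0.20, 0.21, 1.00, 0.99, 2.01, 2.00]
    slope, intercept = fit_line(nominal, ratios)
    print(f"slope={slope:.4f}, intercept={intercept:.4f}")
    # Hypothetical sample: analyte peak area 5000, internal standard area 10000.
    print(f"sample: {quantify(5000, 10000, slope, intercept):.2f} mM")
```

Dividing by the internal-standard area before fitting is what lets the same curve absorb injection-to-injection variation.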
Metabolomics Data Analysis.
Targeted and untargeted peak detection and alignment. Initial data analysis (peak finding, alignment, and batch normalization) is performed within the Chemical Analysis Element, before data is passed on to the Bioinformatics Element. The workflow for this process is illustrated in Figure 2 and accomplishes both targeted and untargeted metabolomics analysis.
Data Normalization. The ability to normalize metabolomics data between multiple batches, whether several days or several years apart, is important for analysis of large sample sets. For metabolites with stable isotope internal standards added during extraction, absolute quantification is easily accomplished by measuring the metabolite:IS ratio; but for most metabolites lacking an internal standard, an alternate approach is needed. An abundance of different data normalization strategies exist in the field of metabolomics, each with distinct advantages and disadvantages; this topic has been the subject of a recent review.
Statistical analysis: Metabolomics experiments generate significant amounts of data for both named compounds and "Known Unknowns". The datasets produced pose challenges similar to microarray data (large number of variables vs small number of samples). An additional dimension of complexity is introduced in big experiments due to day-to-day variations in equipment performance. Users are provided with basic and advanced statistical techniques, such as the P-test (basic significance), Q-test (false discovery rate calculations), SVD (Singular Value Decomposition), NNMF (Non-Negative Matrix Factorization), PCA (Principal Component Analysis), DA (Discriminant Analysis), HC (Hierarchical Clustering), SVM (Support Vector Machines), RF (Random Forest), partitioning and similarity analyses. Further exploratory analysis and visualization can be done with our tools, including Metscape 3, available online as the next version of the popular Metscape, which can integrate metabolomics, transcriptomics, and genomic information for pathway enrichment. (Service Request Form)
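With many more metabolite features than samples, per-feature p-values need a false discovery rate correction. The text does not say which procedure the core uses; the sketch below shows the widely used Benjamini-Hochberg adjustment as one possible example. The function name and input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


if __name__ == "__main__":
    # Hypothetical p-values for four metabolite features.
    p = [0.01, 0.04, 0.03, 0.005]
    for p_value, q_value in zip(p, benjamini_hochberg(p)):
        print(f"p={p_value:.3f}  q={q_value:.3f}")
```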
Germ-free & Gnotobiotic Mouse Facilities
The germ-free mouse resource at the University of Michigan is directed by Dr. Kathryn Eaton. Mice are housed in soft-sided plastic isolators, in which they remain free of all bacteria, exogenous viruses, fungi, and parasites (determined by regular fecal monitoring and periodic control necropsies). Germ-free breeding colonies of mice are maintained in this facility, and experimental isolators are used to maintain the germ-free or gnotobiotic status of the mice during experiments. For colonization studies, germ-free mice can be inoculated with a suspension of a specific microbe (mono-colonization), a defined finite group of microbes, or a polymicrobial mixture. Microbes are introduced directly into the stomach with a 24-gauge ball-tipped gavage needle. We have successfully used this procedure in our gnotobiotic facility numerous times to conventionalize germ-free mice such that the conventionalized microbiota retains its similarity to the donor inoculum microbiota. (Service Request Form)
Bioinformatics Training, Information and Analysis
Training will largely be focused on the methodology required for data analysis (microbiome and metabolome) as desired by the investigators. While core/base analysis can be performed by Core Staff, some investigators may wish to be able to conduct these analyses and more detailed analysis on their own. Dr. Schloss runs a number of workshops on the use of mothur (see below). The Metabolomics Core has developed a number of educational tools, including on-line tutorials for the use of informatics tools, YouTube videos on various topics related to metabolomics analysis and a yearly week-long hands-on workshop. (Service Request Form)