CIS 6930, University of Florida: Bioinformatics Algorithms




The human microbiota has profound effects on health and disease, and the presence of microbes is essential for maintaining health. The human genome is now recognized as only part of our genetic composition, because the human host and its microbiota have coevolved. The microbiome is also intimately involved in the development and maintenance of the immune system. These factors, together with advances in sequencing technologies, motivated this microbiome study. Our project focuses on the gut microbiota, the community of microorganisms that live in the human digestive tract. We are interested in how the composition of the gut microbiota is associated with the host's diet and the host's state of health or disease. Additionally, we will examine changes in microbial activity using the PICRUSt algorithm.

Human Microbiome Data

Download the utility software, which facilitates joining paired-end reads, creating seq.fna files, and creating OTU tables using open-reference, closed-reference, and de novo OTU picking.


Requirements

  • QIIME (If you already have QIIME, proceed to the QIIME Users section.)

  • Java 1.8


Installations

  • QIIME

    1. Download and install the VirtualBox (VB) version for Windows here.
      Download and install the VirtualBox (VB) version for Mac here.

    2. Download the 64-bit QIIME VirtualBox image here. This file is large, so the download may take anywhere from a few minutes to a few hours depending on your Internet connection speed. You will need to unzip this file.

    3. Create a new virtual machine:
      • Launch VirtualBox, and create a new machine (press the New button). A new window will show up. Click ‘Next’.
      • In this screen, type QIIME as the name for the virtual machine. Then select Linux as the Operating System, and Ubuntu (64 bit) as the version. Click Next.
      • Select the amount of RAM (memory). You will need at least 3 GB, but the best amount depends on your machine. After selecting the amount of RAM, click Next.
      • Select “Use existing hard drive”, and click the folder icon next to the selector (it has a green 'UP' arrow). In the new window click ‘Add’, and locate the virtual hard drive that was downloaded in step 2. Click Select and then click Next.
      • In the new window click Finish.

    4. Double click on the new virtual machine created – it will be called QIIME – to boot it for the first time. Review any messages that are shown, and select whatever options are best for you.

    5. Open a terminal (Ctrl+Alt+T). Test your QIIME installation by running the following command:
      print_qiime_config.py -t
      An 'OK' will be displayed at the end of the test results.

    6. Check that the Java version is 1.8 by running the following command in the terminal:
      java -version
      If an update to version 1.8 is needed, run these commands:
      (Note: when prompted for the password, enter 'qiime'.)
      sudo add-apt-repository ppa:webupd8team/java
      sudo apt-get update
      sudo apt-get install oracle-java8-installer

    7. Open the browser in this environment, go to humanmicrobiome.github.io and download the software and user manual.

    8. Extract the tar.gz file, which contains a folder with the files automate_qiime.py and OTUGenerator.jar, by running these commands in the terminal:
      Go to the directory where you downloaded the tar.gz file (typically the 'Downloads' folder):
      cd Downloads
      Extract the file:
      tar -xvzf OTUGenerator.tar.gz

    9. Download the GreenGenes database, which is required for OTU picking. This can be downloaded here. Move the downloaded file to the OTUGenerator directory.
      In the terminal, change to that directory:
      cd OTUGenerator
      Then extract the file 'gg_13_5_otus.tar.gz' with the following command:
      tar -xvzf gg_13_5_otus.tar.gz

    10. The software is now ready to use. To run the software, execute the following command:
      java -jar OTUGenerator.jar

    11. To use the software, please make sure to read the user manual. You can download the user manual here.

    Alternatively, to install QIIME directly with pip instead of using the virtual machine:

    1. If you do not have pip, the easiest way to install it is by running:
      easy_install pip
      Or, if you are on Ubuntu, run:
      sudo apt-get install python-pip

    2. Now, run the following commands:
      pip install numpy
      pip install qiime

    3. Test your QIIME installation by running the following command in terminal:
      print_qiime_config.py -t
      An 'OK' will be displayed at the end of the test results.

    4. Make sure you have Java 1.8. You can check the version using the following command:
      java -version

    5. To update Java to version 1.8, run these commands:
      sudo add-apt-repository ppa:webupd8team/java
      sudo apt-get update
      sudo apt-get install oracle-java8-installer

    6. To install Java manually instead, perform the following steps:
      • Download the Oracle Java JRE for Linux here. Select the compressed binary (the file name ends in tar.gz) that matches your system architecture, 32-bit or 64-bit.
      • Copy the file to the location where you want Java to be installed.
      • Open a terminal in this location and unpack the tarball to install the JRE, replacing the file name with the one you downloaded (e.g. jre-8u131-linux-x64.tar.gz):
        tar -xvzf jre-8u131-linux-x64.tar.gz

    7. Download the software and user manual.

    8. Extract the tar.gz file, which contains a folder with the files automate_qiime.py and OTUGenerator.jar, by running these commands in the terminal:
      Go to the directory where you downloaded the tar.gz file (typically the 'Downloads' folder):
      cd Downloads
      Extract the file:
      tar -xvzf OTUGenerator.tar.gz

    9. Download the GreenGenes database, which is required for OTU picking. This can be downloaded here. Move the downloaded file to the OTUGenerator directory.
      In the terminal, change to that directory:
      cd OTUGenerator
      Then extract the file 'gg_13_5_otus.tar.gz' with the following command:
      tar -xvzf gg_13_5_otus.tar.gz

    10. The software is now ready to use. To run the software, execute the following command:
      java -jar OTUGenerator.jar

    11. To use the software, please make sure to read the user manual. You can download the user manual here.
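
The Java version check used in the steps above can also be scripted. The following is a minimal sketch, not part of the OTUGenerator software: java_major_minor is a hypothetical helper name, and the sketch assumes that the first line of the `java -version` output contains the quoted version string (for example, java version "1.8.0_131").

```shell
#!/bin/sh
# Sketch: extract the "major.minor" part of a Java version line.
# java_major_minor is a hypothetical helper, not part of OTUGenerator.
java_major_minor() {
    # Input: one line of text such as:  java version "1.8.0_131"
    # Output: the first two dot-separated numbers of the quoted version,
    # or nothing if the line does not contain such a version string.
    printf '%s\n' "$1" | sed -n 's/.*version "\([0-9]*\)\.\([0-9]*\).*/\1.\2/p'
}

# `java -version` writes to stderr, so redirect it before reading the first line.
if command -v java >/dev/null 2>&1; then
    first_line="$(java -version 2>&1 | head -n 1)"
    if [ "$(java_major_minor "$first_line")" = "1.8" ]; then
        echo "Java 1.8 found"
    else
        echo "Java 1.8 not found; reported: $first_line"
    fi
else
    echo "java is not on the PATH"
fi
```

If the script reports that Java 1.8 was not found, follow the update or manual installation steps above before running OTUGenerator.jar.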

QIIME Users

  • Make sure you have Java 1.8. You can check the version using the following command:
    java -version

  • To update Java to version 1.8, run these commands:
    sudo add-apt-repository ppa:webupd8team/java
    sudo apt-get update
    sudo apt-get install oracle-java8-installer

  • To install Java manually instead, perform the following steps:
    • Download the Oracle Java JRE for Linux here. Select the compressed binary (the file name ends in tar.gz) that matches your system architecture, 32-bit or 64-bit.
    • Copy the file to the location where you want Java to be installed.
    • Open a terminal in this location and unpack the tarball to install the JRE, replacing the file name with the one you downloaded (e.g. jre-8u131-linux-x64.tar.gz):
      tar -xvzf jre-8u131-linux-x64.tar.gz

  • Download the software and user manual.

  • Extract the tar.gz file, which contains a folder with the files automate_qiime.py and OTUGenerator.jar, by running these commands in the terminal:
    Go to the directory where you downloaded the tar.gz file (typically the 'Downloads' folder):
    cd Downloads
    Extract the file:
    tar -xvzf OTUGenerator.tar.gz

  • Download the GreenGenes database, which is required for OTU picking. This can be downloaded here. Move the downloaded file to the OTUGenerator directory.
    In the terminal, change to that directory:
    cd OTUGenerator
    Then extract the file 'gg_13_5_otus.tar.gz' with the following command:
    tar -xvzf gg_13_5_otus.tar.gz

  • The software is now ready to use. To run the software, execute the following command:
    java -jar OTUGenerator.jar

  • To use the software, please make sure to read the user manual.
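
Before launching the software, it can help to confirm that the files named in the steps above are in place. The following is a minimal sketch under stated assumptions: check_otugenerator_dir is a hypothetical helper (it is not shipped with the software), and the example call assumes the OTUGenerator folder was unpacked under ~/Downloads. It only checks for the three file names mentioned above, including the GreenGenes archive as downloaded.

```shell
#!/bin/sh
# Sketch: report which of the expected files are missing from a directory.
# check_otugenerator_dir is a hypothetical helper, not part of the software.
check_otugenerator_dir() {
    dir="$1"
    missing=0
    # File names taken from the installation steps above.
    for name in OTUGenerator.jar automate_qiime.py gg_13_5_otus.tar.gz; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $dir/$name"
            missing=1
        fi
    done
    # Exit status 0 if everything was found, 1 otherwise.
    return "$missing"
}

# Example call; the path is an assumption about where the folder was unpacked.
if check_otugenerator_dir "$HOME/Downloads/OTUGenerator"; then
    echo "All expected files are present."
fi
```

Any line starting with "missing:" points at the step that still needs to be completed.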


Explanation of the output

In the specified output folder, you will find:

  • The OTU tables in the BIOM format and some intermediate files like rep_set.fna and rep-set.tre.
  • A folder named taxa_summary containing txt files, one for each taxonomic rank or level from L2 (phylum) up to L6 (genus). These files specify how many sequences from each sample fall into each OTU. Within this folder is another folder, taxa_summary_plots, which provides a graphical representation of the OTU tables.
  • A folder named alpha_diversity. Alpha diversity is the mean species diversity within an environment at a local scale. The folder comparing_alpha_grouping contains a boxplot specifying alpha diversity for the provided input.
  • A folder named beta_diversity. Beta diversity can be defined as the extent of change in community composition, or degree of community differentiation, in relation to a complex-gradient of environment, or a pattern of environments. Locate the PCoA plots under the folders unweighted_unifrac_emperor_pcoa_plot and weighted_unifrac_emperor_pcoa_plot.
  • The PCoA plot for our data can be found here.
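
To confirm that a run produced the folders described above, a short check along the following lines may be useful. This is only a sketch: list_result_folders is a hypothetical helper, the folder names are taken from the list above, and the output folder is whichever one you specified when running the software.

```shell
#!/bin/sh
# Sketch: report which of the described result folders exist in an output folder.
# list_result_folders is a hypothetical helper, not part of the software.
list_result_folders() {
    out_dir="$1"
    # Folder names as described in the output explanation above.
    for sub in taxa_summary taxa_summary/taxa_summary_plots alpha_diversity beta_diversity; do
        if [ -d "$out_dir/$sub" ]; then
            echo "found   $sub"
        else
            echo "missing $sub"
        fi
    done
}

# Example: pass the output folder as the first argument (defaults to the current directory).
list_result_folders "${1:-.}"
```

A "missing" line suggests that the corresponding analysis step did not complete for that run.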