7 Module 7: Data Analysis and Experimental Design
7.1 Overview
Weeks 13 and 14 focus on genome assembly, annotation review, comparative analysis, and connecting genomic results back to phenotype.
7.2 Purpose
- Evaluate DNA sequencing results and microbial genome assemblies.
- Submit an assembly job to BV-BRC.
- Apply bioinformatics tools to assemble genomes and analyze predicted coding sequences.
7.3 Learning Outcomes
- Explain the purpose of the bioinformatics tools and workflows used.
- Discuss the significance of genome assembly.
- Practice using web-based bioinformatics tools.
- Explore genome assemblies and annotations.
- Describe the workflow for microbial genome assembly.
- Collect and interpret genomic data in the context of phenotypic analyses.
- Revise the draft for the individual and group projects.
- Explain how this work contributes to the overall experimental goal.
7.4 Skills and Knowledge
7.4.1 Skills
- Perform sequence-read quality review.
- Assemble microbial genomes.
- Annotate and explore genome content.
7.4.2 Knowledge
- Steps required to assemble a microbial genome.
- Tools for filtering, assembly, annotation, and metabolic modeling.
- Cloud-based bioinformatics workflow submission.
7.5 Task
Work in pairs to obtain, explore, and interpret genome data and then connect those results to the phenotypic evidence collected earlier in the course.
7.6 Criteria for Success
Successful completion requires use of BV-BRC and SeqHub, careful documentation of outputs, and a complete ELN entry.
7.7 Background
Students now use the sequence data generated earlier in the course to compare their isolates with Delftia acidovorans SPH-1 and other related genomes. The goal is to identify genetic features that could explain the observed growth and metabolic behavior.
Figure 7.1 is reused from the sequencing workflow to show where assembly, annotation, and comparative analysis fit into the broader pipeline.
7.8 Procedures
7.8.1 Lab Safety
This is a computer-based bioinformatics lab; no wet-lab work is performed.
7.8.2 Methods: Data Analysis
- Access the shared BV-BRC workspace.
- Obtain the concatenated read files for your isolate.
- Run the Comprehensive Genome Analysis workflow with long-read and paired short-read data.
- Create an output folder labeled with the isolate name.
- Submit the job and monitor the results.
- Upload the assembled FASTA file to SeqHub.
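Before submitting reads, it can help to confirm that each read file is intact and to note its basic statistics in the ELN. The following is a minimal sketch of such a check, not part of the BV-BRC workflow itself. It assumes standard four-line FASTQ records and Phred+33 quality encoding; the function name `summarize_fastq` is hypothetical.

```python
import io


def summarize_fastq(handle, offset=33):
    """Summarize a FASTQ stream: read count, mean read length, mean quality.

    Assumes four-line records and Phred+33 encoding (both are assumptions;
    check how your reads were produced).
    """
    n_reads = 0
    total_bases = 0
    total_quality = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"Malformed FASTQ record near read {n_reads + 1}")
        n_reads += 1
        total_bases += len(seq)
        # Convert each quality character to its Phred score.
        total_quality += sum(ord(ch) - offset for ch in qual)
    if total_bases == 0:
        return {"reads": n_reads, "mean_length": 0.0, "mean_quality": 0.0}
    return {
        "reads": n_reads,
        "mean_length": total_bases / n_reads,
        "mean_quality": total_quality / total_bases,
    }


# Tiny in-memory example; a real file could be opened with open(path)
# or, for compressed reads, gzip.open(path, "rt").
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nAC\n+\n!!\n")
print(summarize_fastq(example))
```

The mean quality here is a simple arithmetic average of Phred scores. Because the Phred scale is logarithmic, this average tends to look more favorable than the underlying error rate, so treat it as a rough indicator only.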
7.8.3 Methods: Genome Comparisons
- Use Similar Genome Finder in BV-BRC.
- Compare your isolate against representative and public genomes.
- Save the result tables and pie-chart images.
- Use the Genome Alignment workflow for follow-up comparisons.
7.8.4 Methods: Annotation Review and Genome Exploration
- Review coding-sequence counts and other reported features.
- Record the most abundant functional assignments.
- Note whether plasmids or antibiotic-resistance genes were identified.
- Explore low-confidence annotations, hypothetical genes, predicted protein interactions, and genes of interest in SeqHub.
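Recording the most abundant functional assignments can be done by hand, but a short script makes the tally reproducible. The sketch below assumes the annotation has been exported as a tab-delimited table with a header row, and that the functional assignment is in a column named `product`. That column name and the function name `top_functions` are assumptions for illustration; check the header of your own export and adjust.

```python
import csv
import io
from collections import Counter


def top_functions(handle, column="product", n=10):
    """Return the n most frequent values in one column of a tab-delimited table.

    The default column name "product" is an assumption, not a guaranteed
    feature of any particular export.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"Column {column!r} not found in table header")
    counts = Counter()
    for row in reader:
        # Treat empty cells as unassigned so they are still counted.
        label = (row[column] or "").strip() or "unassigned"
        counts[label] += 1
    return counts.most_common(n)


# Small in-memory example standing in for a downloaded feature table.
table = io.StringIO(
    "feature_id\tproduct\n"
    "f1\thypothetical protein\n"
    "f2\tABC transporter\n"
    "f3\thypothetical protein\n"
    "f4\t\n"
)
for label, count in top_functions(table, n=3):
    print(count, label)
```

A large share of "hypothetical protein" entries in the tally is a useful prompt for the follow-up exploration of low-confidence annotations.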
7.8.5 Protocol Notes
Record any mistakes, deviations, or isolate-specific observations.
7.9 Results
Include tables, figures, charts, and screenshots generated from your assembly and annotation workflows.
7.10 Result Analysis
Explain how your isolate differs from the SPH-1 reference, how well the genomic data matches the phenotype, and what the assembly metrics suggest about data quality.
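To make the assembly metrics concrete, the sketch below computes a few common ones directly from an assembled FASTA file: contig count, total length, longest contig, N50, and GC fraction. N50 is the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly length. The function names are hypothetical, and the values reported by BV-BRC may be calculated with different conventions (for example, minimum contig length filters).

```python
import io


def parse_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def assembly_stats(handle):
    """Compute contig count, total length, longest contig, N50, and GC fraction."""
    lengths = []
    gc_bases = 0
    for _, seq in parse_fasta(handle):
        seq = seq.upper()
        lengths.append(len(seq))
        gc_bases += seq.count("G") + seq.count("C")
    total = sum(lengths)
    if total == 0:
        return {"contigs": len(lengths), "total_length": 0,
                "longest": 0, "n50": 0, "gc_fraction": 0.0}
    # Walk contigs from longest to shortest until half the assembly is covered.
    running = 0
    n50 = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {"contigs": len(lengths), "total_length": total,
            "longest": max(lengths), "n50": n50,
            "gc_fraction": gc_bases / total}


# Toy assembly with contig lengths 6, 4, 3, 2, and 1.
toy = io.StringIO(">c1\nGGG\nCCC\n>c2\nATAT\n>c3\nGCA\n>c4\nAT\n>c5\nG\n")
print(assembly_stats(toy))
```

In general, fewer contigs and a larger N50 relative to the expected genome size suggest a more contiguous assembly, although contiguity alone does not show that the assembly is correct.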
7.11 Discussion Questions
- How did BV-BRC assemble your genome, and how good was the assembly?
- What factors affect genome assembly quality?
- What applications beyond research rely on gene prediction and annotation?
- What can predicted protein interaction networks contribute to your project?
