- Introduction
- Team Information
- Task 0: Team Formation & Data Representation
- Task 1: Microbial Growth Curve Analysis
- Task 2: Advanced Bioinformatics Analyses
- Upcoming Tasks
- How to Contribute
- Contact & Socials
This repository documents the HackBio BioCoding Internship, where we engage in bioinformatics problem-solving using Python and R. Our goal is to enhance our coding proficiency while applying computational techniques to biological datasets.
| Name | Slack Username | Email | Hobby | Country | Discipline | Preferred Language |
|---|---|---|---|---|---|---|
| Musa Al Hassan Kromah | Musa | kromahmusa86@gmail.com | Hiking | Liberia | Biotechnology | Python, R |
| Fowowe Toyin | Toyin | toyintoyo05@gmail.com | Reading | Nigeria | Biochemistry | Python |
- Organize team information in a structured data format using Python or R.
- Ensure no loops, conditionals, or functions are used.
# Build the team table as a data frame (base R; no extra packages required)
data <- data.frame(
  Name = c("Musa Al Hassan Kromah", "Nina Julian", "Fowowe Toyin"),
  Slack_Username = c("Musa", "Julian", "Toyin"),
  Email = c("kromahmusa86@gmail.com", "anyangonina39@gmail.com", "toyintoyo05@gmail.com"),
  Hobby = c("Hiking", "Listening to Music", "Reading"),
  Country = c("Liberia", "Kenya", "Nigeria"),
  Discipline = c("Biotechnology", "Biotechnology", "Biochemistry"),
  Preferred_Language = c("Python, R", "R", "Python")
)
print(data)

Outcome: A structured data representation was printed successfully.
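
Since the task allows Python or R, the same table can be sketched in Python as a dictionary of equal-length lists. This is a minimal illustration using only the standard library; `zip` and `map` stand in for explicit loops, and the variable names `team` and `rows` are illustrative rather than part of the original submission.

```python
# Column-oriented team table: each key is a column, each list holds one value per member.
team = {
    "Name": ["Musa Al Hassan Kromah", "Nina Julian", "Fowowe Toyin"],
    "Slack_Username": ["Musa", "Julian", "Toyin"],
    "Email": ["kromahmusa86@gmail.com", "anyangonina39@gmail.com", "toyintoyo05@gmail.com"],
    "Hobby": ["Hiking", "Listening to Music", "Reading"],
    "Country": ["Liberia", "Kenya", "Nigeria"],
    "Discipline": ["Biotechnology", "Biotechnology", "Biochemistry"],
    "Preferred_Language": ["Python, R", "R", "Python"],
}

# Transpose the columns into one tuple per member (no explicit loop or conditional).
rows = list(zip(*team.values()))

# Print a header line followed by one line per member.
print(" | ".join(team))
print("\n".join(map(" | ".join, rows)))
```
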
- Analyze microbial growth curves for knockout (-) and knock-in (+) strains.
- Compute time to carrying capacity.
- Visualize data using scatter and box plots.
- Perform statistical analysis.
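
The steps above can be sketched as follows. The source does not define carrying capacity or name the statistical test, so this sketch assumes that the maximum OD600 of a curve approximates its carrying capacity, reports the first time point at which a chosen fraction of that maximum is reached, and compares two strain groups with Welch's t statistic. The function names and the toy readings are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def time_to_carrying_capacity(times, od_values, fraction=0.95):
    """Return the first time at which OD reaches `fraction` of the curve's maximum.

    Assumption: the maximum observed OD approximates the carrying capacity.
    Returns None if the threshold is never reached (only possible when fraction > 1).
    """
    threshold = fraction * max(od_values)
    for t, od in zip(times, od_values):
        if od >= threshold:
            return t
    return None


def welch_t_statistic(sample_a, sample_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    se = sqrt(variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se


# Toy readings (illustrative only, not taken from the internship dataset).
times = [0, 1, 2, 3, 4, 5]
od = [0.05, 0.10, 0.40, 0.90, 1.00, 1.00]
print(time_to_carrying_capacity(times, od))
```

To obtain a p-value as well, a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` computes the same statistic together with its significance.
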
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load dataset
data = pd.read_csv("microbial_growth.csv")

- Perform computational analyses in various biological disciplines.
- Apply data science, visualization, and statistical modeling techniques.
🔹 Objective: Analyze microbial growth under different conditions.
🔹 Approach: Used Python to process growth curve data, visualize trends, and determine significant differences.
# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load growth curve data
data = pd.read_csv("growth_curve_data.csv")
# Plot microbial growth
g = sns.lineplot(data=data, x="Time", y="OD600", hue="Condition")
g.set(title="Microbial Growth Curve", xlabel="Time (hours)", ylabel="Optical Density (OD600)")
plt.show()

🔹 Objective: Evaluate metabolic shifts in response to environmental changes.
🔹 Approach: Used R for data normalization and visualization.
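
The source does not say which normalization was applied, and the original work was done in R. As a language-neutral illustration, one common choice is z-score scaling, sketched here in Python with made-up values; the function name `z_score` is hypothetical.

```python
from statistics import mean, stdev


def z_score(values):
    """Scale values to zero mean and unit sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]


# Made-up metabolite levels, for illustration only.
levels = [1.0, 2.0, 3.0]
print(z_score(levels))
```
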
# Load required library
library(ggplot2)
# Read dataset
data <- read.csv("metabolic_data.csv")
# Generate boxplot
p <- ggplot(data, aes(x = Condition, y = Metabolite_Level, fill = Condition)) +
  geom_boxplot() +
  ggtitle("Metabolic Response Analysis")
print(p)

🔹 Objective: Assess functional impact of protein mutations.
🔹 Approach: Python-based structural modeling and variant impact prediction.
from Bio.PDB import PDBParser
# Load PDB file
parser = PDBParser()
structure = parser.get_structure("Protein", "protein_structure.pdb")
# Extract chain A from the first model
chain_A = structure[0]["A"]
# Print residue names
for residue in chain_A:
    print(residue.resname)

🔹 Objective: Perform differential expression analysis on RNA-seq data.
🔹 Approach: Used Python and R to preprocess and analyze RNA-seq datasets.
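
A central quantity in a differential expression comparison is the log2 fold change between conditions. The sketch below computes it per gene from replicate counts using only the standard library. It is not a full differential expression workflow (there is no normalization or dispersion modelling), and the gene names, counts, and the pseudocount of 1 are assumptions for illustration.

```python
from math import log2
from statistics import mean


def log2_fold_change(control_counts, treated_counts, pseudocount=1.0):
    """log2 ratio of mean treated to mean control counts.

    The pseudocount (an assumption here) avoids division by zero for unexpressed genes.
    """
    return log2((mean(treated_counts) + pseudocount) / (mean(control_counts) + pseudocount))


# Hypothetical replicate counts per gene: (control, treated).
counts = {
    "geneA": ([10, 12, 14], [48, 50, 52]),
    "geneB": ([30, 30, 30], [30, 30, 30]),
}

fold_changes = {gene: log2_fold_change(ctrl, trt) for gene, (ctrl, trt) in counts.items()}
print(fold_changes)
```
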
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load RNA-seq data
data = pd.read_csv("rna_seq_data.csv")
# Generate a correlation heatmap over the numeric columns
sns.heatmap(data.corr(numeric_only=True), cmap="coolwarm", annot=True)
plt.title("Gene Expression Correlation")
plt.show()

🔹 Objective: Investigate health trends using the NHANES dataset.
🔹 Approach: Statistical analysis in Python to uncover population health insights.
import pandas as pd
# Load NHANES dataset
data = pd.read_csv("nhanes_data.csv")
# Summary statistics
print(data.describe())
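
Beyond overall summary statistics, health-trend questions usually compare a measurement across groups. A minimal standard-library sketch follows; the column names `age_group` and `bmi` and the records themselves are hypothetical and do not come from the actual NHANES files. Missing values are skipped, which is an assumption about how they should be handled.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; None marks a missing measurement.
records = [
    {"age_group": "18-39", "bmi": 24.0},
    {"age_group": "18-39", "bmi": 26.0},
    {"age_group": "40-59", "bmi": 28.0},
    {"age_group": "40-59", "bmi": None},
    {"age_group": "40-59", "bmi": 30.0},
]

# Collect the non-missing values for each group.
by_group = defaultdict(list)
for record in records:
    if record["bmi"] is not None:
        by_group[record["age_group"]].append(record["bmi"])

# Average each group's values.
group_means = {group: mean(values) for group, values in by_group.items()}
print(group_means)
```

With pandas, the equivalent is `data.groupby("age_group")["bmi"].mean()`, which skips missing values by default.
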