mkromah/HackBio-BioCoding-Intership

HackBio Internship – BioCoding

📌 Table of Contents

  1. Introduction
  2. Team Information
  3. Task 0: Team Formation & Data Representation
  4. Task 1: Microbial Growth Curve Analysis
  5. Task 2: Advanced Bioinformatics Analyses
  6. Upcoming Tasks
  7. How to Contribute
  8. Contact & Socials

Introduction

This repository documents our work in the HackBio BioCoding Internship, where we solve bioinformatics problems using Python and R. Our goal is to improve our coding proficiency while applying computational techniques to biological datasets.


Team Information

Name | Slack Username | Email | Hobby | Country | Discipline | Preferred Language
Musa Al Hassan Kromah | Musa | kromahmusa86@gmail.com | Hiking | Liberia | Biotechnology | Python, R
Fowowe Toyin | Toyin | toyintoyo05@gmail.com | Reading | Nigeria | Biochemistry | Python

Task 0: Team Formation & Data Representation

Objective

  • Organize team information in a structured data format using Python or R.
  • Ensure no loops, conditionals, or functions are used.

Approach & Implementation

R Implementation:

# Build the team data frame (base R only, no extra libraries needed)
data <- data.frame(
  Name = c("Musa Al Hassan Kromah", "Nina Julian", "Fowowe Toyin"),
  Slack_Username = c("Musa", "Julian", "Toyin"),
  Email = c("kromahmusa86@gmail.com", "anyangonina39@gmail.com", "toyintoyo05@gmail.com"),
  Hobby = c("Hiking", "Listening to Music", "Reading"),
  Country = c("Liberia", "Kenya", "Nigeria"),
  Discipline = c("Biotechnology", "Biotechnology", "Biochemistry"),
  Preferred_Language = c("Python, R", "R", "Python")
)
print(data)

👉 Outcome: A structured data representation successfully printed.


Task 1: Microbial Growth Curve Analysis

Objective

  • Analyze microbial growth curves for knockout (-) and knock-in (+) strains.
  • Compute time to carrying capacity.
  • Visualize data using scatter and box plots.
  • Perform statistical analysis.

Approach & Implementation

Python Implementation:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset
data = pd.read_csv("microbial_growth.csv")

Task 2: Advanced Bioinformatics Analyses

Objective

  • Perform computational analyses in various biological disciplines.
  • Apply data science, visualization, and statistical modeling techniques.

2.1 Microbiology: Growth Curve Analysis

🔹 Objective: Analyze microbial growth under different conditions.

🔹 Approach: Used Python to process growth curve data, visualize trends, and determine significant differences.

Python Implementation:

# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load growth curve data
data = pd.read_csv("growth_curve_data.csv")

# Plot microbial growth
g = sns.lineplot(data=data, x="Time", y="OD600", hue="Condition")
g.set(title="Microbial Growth Curve", xlabel="Time (hours)", ylabel="Optical Density (OD600)")
plt.show()
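
The plot above shows the trends, but the approach also mentions determining significant differences. The source does not say which test was used, so the sketch below swaps in an exact two-sided permutation test on the difference of group means, written with the standard library only. The function name `permutation_p_value` and the replicate values are hypothetical; in practice the two groups would be OD600 readings taken from the `Condition` column at a chosen time point.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference of group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way to split the pooled values into groups of the same sizes
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point round-off
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical final OD600 readings for two conditions (4 replicates each)
final_od_control = [1.00, 1.10, 0.90, 1.05]
final_od_treated = [0.50, 0.55, 0.45, 0.60]

p_value = permutation_p_value(final_od_control, final_od_treated)
print("Permutation p-value:", round(p_value, 4))
```

Exhaustive enumeration is only practical for small replicate counts; with larger groups, a random sample of permutations or a standard parametric test would be the usual choice.
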

2.3 Botany & Plant Science: Metabolic Response Analysis

🔹 Objective: Evaluate metabolic shifts in response to environmental changes.

🔹 Approach: Used R for data normalization and visualization.

R Implementation:

# Load required library
library(ggplot2)

# Read dataset
data <- read.csv("metabolic_data.csv")

# Generate boxplot
p <- ggplot(data, aes(x=Condition, y=Metabolite_Level, fill=Condition)) +
     geom_boxplot() +
     ggtitle("Metabolic Response Analysis")
print(p)

2.4 Biochemistry & Oncology: Protein Mutation Impact

🔹 Objective: Assess functional impact of protein mutations.

🔹 Approach: Python-based structural modeling and variant impact prediction.

Python Implementation:

# Import only the parser that is used (avoids a wildcard import)
from Bio.PDB import PDBParser

# Load PDB file
parser = PDBParser()
structure = parser.get_structure("Protein", "protein_structure.pdb")

# Extract chain A
chain_A = structure[0]["A"]

# Print residue names
for residue in chain_A:
    print(residue.resname)
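
Printing residue names is a first step toward comparing a wild-type chain with a mutant. As a rough sketch of that comparison, the code below converts three-letter residue names (the kind of values printed by the loop above) to a one-letter sequence and lists point substitutions between two equal-length sequences. The helper names `to_one_letter` and `list_substitutions` and the four-residue example are hypothetical, and positions are simple 1-based indices rather than PDB residue numbers. This only identifies substitutions; it does not predict their functional impact.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(resnames):
    """Convert three-letter residue names to a one-letter sequence ('X' if unknown)."""
    return "".join(THREE_TO_ONE.get(name.strip().upper(), "X") for name in resnames)


def list_substitutions(wild_type, mutant):
    """Return substitutions such as 'G3D' for two equal-length sequences."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [
        f"{wt}{pos}{mt}"
        for pos, (wt, mt) in enumerate(zip(wild_type, mutant), start=1)
        if wt != mt
    ]


# Hypothetical residue names for a wild-type and a mutant chain
wild_type_seq = to_one_letter(["MET", "ALA", "GLY", "LYS"])
mutant_seq = to_one_letter(["MET", "ALA", "ASP", "LYS"])

substitutions = list_substitutions(wild_type_seq, mutant_seq)
print(wild_type_seq, mutant_seq, substitutions)
```
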

2.6 Transcriptomics: RNA-seq Data Analysis

🔹 Objective: Perform differential expression analysis on RNA-seq data.

🔹 Approach: Used Python and R to preprocess and analyze RNA-seq datasets.

Python Implementation:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load RNA-seq data
data = pd.read_csv("rna_seq_data.csv")

# Generate heatmap
# numeric_only=True skips non-numeric columns such as gene identifiers
sns.heatmap(data.corr(numeric_only=True), cmap="coolwarm", annot=True)
plt.title("Gene Expression Correlation")
plt.show()
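
The heatmap above shows correlation, while the stated objective is differential expression. As a minimal illustration of the effect-size part of that analysis, the sketch below computes a log2 fold change per gene between two conditions, using only the standard library. The gene names, count values, the pseudocount of 1, and the function name `log2_fold_change` are all assumptions for illustration. A full differential expression workflow would also need library-size normalization and a statistical test, which this sketch does not attempt.

```python
from math import log2
from statistics import mean


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 ratio of mean treated to mean control counts.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))


# Hypothetical read counts: three replicates per condition
counts = {
    "geneA": {"control": [6, 7, 8], "treated": [30, 31, 32]},
    "geneB": {"control": [15, 15, 15], "treated": [15, 14, 16]},
    "geneC": {"control": [62, 63, 64], "treated": [14, 15, 16]},
}

fold_changes = {
    gene: log2_fold_change(values["control"], values["treated"])
    for gene, values in counts.items()
}

for gene, lfc in fold_changes.items():
    print(gene, round(lfc, 2))
```

Positive values indicate higher expression in the treated condition and negative values indicate lower expression.
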

2.7 Public Health: NHANES Data Analysis

🔹 Objective: Investigate health trends using NHANES dataset.

🔹 Approach: Statistical analysis in Python to uncover population health insights.

Python Implementation:

import pandas as pd

# Load NHANES dataset
data = pd.read_csv("nhanes_data.csv")

# Summary statistics
print(data.describe())
