Overview:
Technical Area:
- Performed the following on GEO dataset GSE4107:
  - Statistical analysis
  - Quality control report analysis
  - Differential gene expression (DGE) analysis
Tools:
- R/RStudio
- GitHub
- R packages (CRAN and Bioconductor): affy packages, ggplot2, pheatmap, limma, EnhancedVolcano, clusterProfiler, enrichplot, msigdbr
- GSEA database
- GEO/GEOquery
- DAVID, STRING-DB, GEPIA
Soft Skills:
- Project Management/Task Management - During our meetings, I recorded the tasks we had to finish and assigned them to team members.
- Time Management - We had a little over a week to analyze the GSE4107 samples, so I divided my work into specific tasks (e.g., literature review, plot analysis, code review) and scheduled time for each.
- Teamwork - I worked with my team members @ivanlam27 @veyssi @Ananya_Kaushik @Roman_Ramirez @Leila to identify genes in GSE4107 that are significant to colorectal cancer.
- Virtual Collaboration - My teammates and I met over Zoom calls, shared ideas and papers over Slack, and worked together on a Google Document report and our Google Slides final presentation. We shared/collaborated on our code via GitHub.
- Literature review - I reviewed multiple papers to research any correlations between the FOS gene and colorectal cancer.
- Presentation - My Capstone project team and I presented a full report of our analysis of GSE4107, including the genes of interest we identified in relation to colorectal cancer, to our mentors Anya and Ali.
Achievement Highlights (3):
- I am very proud of my teammates and me for completing a review/report on the significance of the GSE4107 samples to colorectal cancer within a week. We worked together to create data visualization plots, analyze our output, and draw meaningful conclusions.
- During my literature review of FOS, I found an interesting paper describing two polymorphisms that enhance expression of the FOS gene, leading to cell differentiation/tumor formation and a higher risk of colorectal cancer (Chen et al. 2019).
- I have a strong understanding of how to read data analysis plots: normalization boxplots, PCA plots, heatmaps, and volcano plots.
Difficulties Completing Tasks:
- Working around time zones was difficult, as this was a highly collaborative project and our team was spread across four time zones.
- Without the structure of the modules, I had difficulty figuring out what to do for my final project/presentation. Luckily, I had a great team working with me, and they helped me find the motivation and urgency to complete my tasks for our capstone project.