Bioinformatics - Level 1 - Module 7 - Maryam

Module 7: Capstone Project

Technical Area

• Reading datasets and meta-data in R

• Quality control

• Normalization and background correction

• Batch effect removal

• Annotation and gene filtering

• KEGG pathway and GO analysis


• Gene concept network analysis

• Transcription factor (TF) analysis

• Protein-protein interaction (PPI) network analysis with Cytoscape

• Survival analysis

Tools

• R packages: affy, arrayQualityMetrics, sva, ggplot2, pheatmap, WGCNA, limma, EnhancedVolcano, hgu133plus2.db, enrichplot, msigdbr, magrittr, clusterProfiler, tidyr, Rcpp

• Cytoscape (STRING and cytoHubba plugins)


Soft Skills

• I prepared a presentation of my capstone project, which helped me develop my presentation skills

• Preparing PowerPoint slides for the presentation

Tasks completed

I merged two datasets containing 70 lung cancer samples, removed the batch effect between them, and performed differential expression analysis. After obtaining the differentially expressed genes (DEGs), I identified their enriched KEGG pathways and GO terms. I then plotted a gene-concept network and a TF network for them and performed gene set enrichment analysis (GSEA). To find hub genes, I built a PPI network and identified the 10 key genes in it. Finally, I performed survival analysis on those 10 genes and found 5 whose expression levels were associated with the survival of patients with lung cancer.
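
The hub-gene step can be illustrated with a minimal Python sketch. It assumes hub genes are ranked by degree (the number of distinct interaction partners in the PPI network), which is only one of the scoring methods cytoHubba offers; this report does not state which method was used. The function name `top_hub_genes`, the edge list, and the gene names are hypothetical placeholders, not results from the project.

```python
from collections import defaultdict


def top_hub_genes(edges, k=10):
    """Return the k genes with the most distinct interaction partners.

    `edges` is an iterable of (gene_a, gene_b) pairs describing an
    undirected PPI network. Degree ranking is an assumption made for
    this sketch; other hub-scoring methods may rank genes differently.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Sort by descending degree, then by name so ties have a stable order.
    ranked = sorted(neighbors, key=lambda g: (-len(neighbors[g]), g))
    return [(g, len(neighbors[g])) for g in ranked[:k]]


# Hypothetical toy network; the gene names are placeholders.
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
]

hubs = top_hub_genes(toy_edges, k=3)
print(hubs)
```

In the project itself, the edge list would come from the STRING network built for the DEGs, and `k=10` would correspond to the 10 key genes carried forward to survival analysis.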