Anca - Bioinformatics Pathway

WEEK 2 (first self assessment)

  • A concise overview of things learned:

Technical Area: I learned how to code in R and refreshed my coding skills in Python (especially using Pandas data frames). I also learned the main steps of a DNA microarray experiment and how to interpret certain figures and graphs from the research paper that we are working with.

Tools: I learned how to look for data in the Gene Ontology Resource, as well as what the KEGG database is. I learned a little bit about the Asana project management tool by watching the webinar on YouTube.

Soft Skills: I learned how to use the StemAway platform better (how to make a post, watch categories, send private messages and so on). I was already comfortable with Slack, as I had used it for classes before. I improved my communication skills and became more comfortable asking questions in webinars and other meetings. Being in breakout rooms during the technical webinar gave me the chance to discuss the paper with people I had never met before, which improved my teamwork skills.

  • Achievement highlights:
  1. Finished the R and Python exercises successfully, which helped me gain experience with RStudio.

  2. Discussed the first four figures in the main paper with the team and analyzed the remaining three on my own.

  3. Sent the first message to my group of three people about this week’s tasks.

  • List of meetings/trainings attended:

6/1 Technical Training Webinar, 6/1 Team 4 first meeting, 6/2 R training Workshop, 6/3 Technical training Webinar, 6/5 R Training Workshop 2, 6/5 Team 4 Week 1 Happy Hour, 6/8 Team 4/The Gene Team Meeting, 6/9 Python Training Session 2 (Beg), 6/10 Logistical Webinar, 6/10 Technical Training Webinar, 6/11 Gene Team Meeting, 6/12 Welcome Session by Debaleena, 6/12 The Gene Team Happy Hour, 6/15 Gene Team meeting

  • Goals for the upcoming week:

To better understand the data that we will be working with, as well as all the figures in the paper. To improve my Python skills (regarding object-oriented programming and using different databases). To complete all the tasks with my group by the specified deadlines.

  • Detailed Statement of Tasks done:

This was my first time coding in R, and I had some trouble because I could not save the code I was writing in the R script as an R file. However, one of my team leads, Annie, helped me fix the problem. The technical webinars also answered a lot of my questions about the paper, which was really helpful.

  • Change of role:

I would like to request to move from Observer to Participant.

WEEK 3 (second self assessment)

  • A concise overview of things learned:

Technical area: I learned how to use the R Bioconductor packages better (such as simpleaffy, affyPLM, gcrma), how to analyze QC plots and NUSE and RLE histograms, and how to make a PCA plot with different labels.
Tools: how to get the right data from the GEO database; how to sign up for Asana, view my tasks and mark them as complete; how to create a private Slack channel for my smaller group.
Soft skills: how to send connection requests on LinkedIn (with a message that is neither too long nor too short and that shows my interest, so that it is more personal); how important cultural diversity in the workplace is, and that we should always speak up and take responsibility if we see someone doing something wrong; how to work on the same thing with teammates I have never really met before.

  • Achievement highlights:
  1. Finished all the tasks with my smaller group before the team meeting on Monday.
  2. Completed my LinkedIn profile (with suggestions from one of my leads) so that I can make those new connections with a good profile.
  3. Understood everything from the technical webinar, which really helped in analyzing the results from these first tasks.
  • List of meetings/trainings attended:

6/15 Gene Team Meeting, 6/16 Asana Training, 6/16 Python and Pandas Webinar, 6/17 Technical Training Webinar, 6/18 Gene Team Meeting, 6/19 Gene Team Happy Hour, 6/20 Group 7 mini-meeting, 6/22 Group 7 meeting, 6/22 Gene Team Meeting

  • Goals for the upcoming week:
    To make a successful presentation with the other group that we were paired with. To complete the week 4 deliverables on time. To successfully connect with 10 people whom I am truly interested in working with or whose careers I want to know more about. To get to know my teammates better. To attend the R training (I had to miss one last week, but I watched the recording and it was really useful!). To get more comfortable using R.

  • Detailed Statement of Tasks done:
    I had some issues with certain R packages because the data I was trying to use was not of the right type. My group, the leads, the questions on the forum and online information about the Bioconductor packages helped me solve them. R also often took a long time to normalize the data or to create some plots. That was hard because, as this is still my first time using R, I wasn’t sure whether my code was right, and sometimes I waited a long time for nothing.

WEEK 4 (third self assessment)

  • A concise overview of things learned:

Technical area: Analyzing gene expression data in R using the limma package. Better understood normalization with the gcrma method and its importance. How to annotate gene expression data and remove the NaN samples. How to create a heatmap in R. Using matplotlib in Python.
Tools: R packages (limma, pheatmap, affyPLM). GitHub.
Soft skills: How to communicate better on the StemAway forum. How to network via LinkedIn without being too insistent. How to start your first day at a new job in order to get the most out of it.
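The annotation and NaN-removal step mentioned above can be sketched roughly as follows. This is only an illustration in plain Python, not the R code used in the internship. It assumes that "NaN" refers to probes with no gene symbol, and that probes mapping to the same gene are averaged (one common choice among several). The function name `annotate_and_filter`, the probe IDs and the gene symbols are all made up for the example.

```python
import math
from statistics import mean


def annotate_and_filter(expression, probe_to_symbol):
    """Map probe IDs to gene symbols and drop probes without an annotation.

    expression: dict of probe ID -> list of expression values (one per sample)
    probe_to_symbol: dict of probe ID -> gene symbol (may be missing or NaN)
    Probes that map to the same symbol are averaged per sample (an assumption).
    """
    rows_by_symbol = {}
    for probe_id, values in expression.items():
        symbol = probe_to_symbol.get(probe_id)
        # Skip probes whose annotation is missing or NaN
        if symbol is None or (isinstance(symbol, float) and math.isnan(symbol)):
            continue
        rows_by_symbol.setdefault(symbol, []).append(values)
    # Collapse duplicate symbols by averaging each sample column
    return {
        symbol: [mean(column) for column in zip(*rows)]
        for symbol, rows in rows_by_symbol.items()
    }


# Made-up toy data: two samples, four probes
expression = {
    "probe_1": [7.0, 8.0],
    "probe_2": [5.0, 6.0],
    "probe_3": [9.0, 9.5],
    "probe_4": [3.0, 4.0],
}
# probe_3 has a NaN annotation and probe_4 has none at all
probe_to_symbol = {
    "probe_1": "GENE_A",
    "probe_2": "GENE_A",
    "probe_3": float("nan"),
}

annotated = annotate_and_filter(expression, probe_to_symbol)
print(annotated)
```

Only GENE_A survives in this toy case, because the other two probes have no usable annotation.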

  • Achievement highlights:
  1. Finished all the Python exercises from the second set successfully.
  2. Completed all Week 4 deliverables on time and successfully presented our results last Thursday.
  3. Gained a better understanding of the importance of annotation, as well as of having different normalization methods when trying to find differentially expressed genes.
  • List of meetings/trainings attended:
    6/23 Python Training, 6/24 BI leads office hours, 6/24 Group 6 & Group 7 meeting, 6/24 Industry Fireside Chat, 6/25 GitHub Webinar, 6/25 Bioinformatics Webinar (1st hour), 6/25 Gene Team Meeting, 6/26 Group 7 Meeting, 6/26 R Training, 6/29 Group 6 and Group 7 meeting, 6/29 Gene Team Meeting, 6/30 Fireside Chat 2, 6/30 Python office hours

  • Goals for the upcoming week:
    To complete the phenotypic analysis by July 1st. To finish the week 5 deliverables on time. To meet with Group 2 and prepare our presentation for this Thursday.

  • Detailed Statement of Tasks done:
    I communicated on Slack with my smaller group (Group 7), and together we troubleshot each other’s code and worked out what the errors meant. We initially got different differentially expressed genes, but then we compared our code and figured out which approach was the right one. I also wrote a summary of all of our meetings (like an overall agenda) and, after my teammates checked it, submitted it to the drive.


  • A concise overview of things learned:
    Technical area: Creating a gene vector with only one numeric column and symbols/gene IDs as row names, and using clusterProfiler to see what the genes are involved in. Using DAVID to visualize which pathways the differentially expressed genes are involved in.
    Tools: clusterProfiler, topGO, groupGO, magrittr, tidyr, DAVID, WikiPathways
    Soft Skills: Communicating better via Slack and Zoom. The importance of thinking creatively in any job.
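As a rough illustration of the gene vector described above (one numeric value per gene, with the gene ID as its name), here is a minimal plain-Python sketch. It is not the R code from the project. The column names `gene_id` and `logFC`, the function `build_gene_vector`, the rule for duplicate genes and the descending sort are all assumptions for the example; ranked-list tools commonly expect a sorted vector, but check what the function you call actually needs.

```python
def build_gene_vector(de_rows, id_key="gene_id", value_key="logFC"):
    """Build a named numeric vector (gene ID -> value) from differential expression rows.

    If a gene appears more than once, the value with the largest absolute
    size is kept (an assumption). The result is sorted from highest to lowest.
    """
    vector = {}
    for row in de_rows:
        gene = row[id_key]
        value = float(row[value_key])
        if gene not in vector or abs(value) > abs(vector[gene]):
            vector[gene] = value
    # Python dicts keep insertion order, so this preserves the ranking
    return dict(sorted(vector.items(), key=lambda item: item[1], reverse=True))


# Made-up toy rows standing in for a differential expression table
de_rows = [
    {"gene_id": "GENE_A", "logFC": 1.5},
    {"gene_id": "GENE_B", "logFC": -2.0},
    {"gene_id": "GENE_C", "logFC": 0.3},
    {"gene_id": "GENE_A", "logFC": -0.2},
]

gene_vector = build_gene_vector(de_rows)
for gene, value in gene_vector.items():
    print(gene, value)
```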

  • Achievement highlights:

  1. Successfully submitted the week 4 results and the Python exercises on GitHub.
  2. Attended office hours to confirm that we were supposed to make separate plots for up- and down-regulated genes, and told my group.
  3. Signed up to host an office hours session for July participants.
  • List of meetings/trainings attended:
    7/1 GitHub Webinar 2, 7/1 Group 2 & Group 7 Meeting, 7/1 Group 7 Meeting, 7/2 Group 2 & Group 7 Meeting, 7/2 Gene Team Meeting, 7/2 Gene Team Happy Hour, 7/6 Gene Team Results Discussion with Mentors, 7/7 BI Office Hours

  • Goals for the upcoming week:
    To finalize the week 5 deliverables with my group and interpret our results. To prepare for office hours and make it very helpful for incoming participants.

  • Detailed Statement of Tasks done:
    I had a lot of problems obtaining any plots in R. After reading through the documentation and other helpful pages I still couldn’t figure it out, so I asked my group. One teammate offered to try my code and, although everything seemed OK, the plots and CSV files were still empty. With a small change in the code she managed to get some results, but the same code still does not work for me (all of my CSV files and plots are still empty, except the plotGOgraph). I will restart R, possibly update it, and try again. In the meantime, in order to contribute directly to my group’s deliverables document, I wrote the answers to questions 6 and 8, and I am still working on question 7.


  • A concise overview of things learned:
    Technical area: Understanding how to interpret DAVID results and extract biologically relevant information. Interpreting STRING networks. Understanding the importance of the Gene Ontology database.
    Tools: groupGO, enrichGO, enrichKEGG, STRING, DAVID, GitHub
    Soft skills: resume building guidelines, elevator pitch, divergent thinking

  • Achievement highlights:

  1. Led one office hours session with a step-by-step presentation on how to read a scientific paper, as well as other tips.
  2. Interpreted DAVID results and successfully talked about them and STRING PPI networks in Monday’s presentation.
  3. Was selected as a lead for the July session.
  • List of meetings/trainings attended:
    7/8 Gene Team Meeting, 7/8 Group 7 Meeting, 7/8 Old and New Leads Meeting, 7/10 Groups 1 & 7 Presentation Meeting, 7/10 Gene Team Happy Hour, 7/13 Gene Team Meeting

  • Goals for the upcoming week:

  1. To start the individual project as early as possible because I am participating in another program between July 15 and July 24, so I won’t have a lot of time next week.
  2. To communicate with the mentors and the other leads and be prepared for the first weeks of the July session.
  3. To start a journal in order to capture ideas (as Annie suggested).
  • Detailed Statement of Tasks done:
    I read about Gene Ontology and contributed to the Importance and Limitations slides of our Monday presentation. I selected the most important results from DAVID and tried to find ways to better analyze them. I attended meetings with my group where we troubleshot each other’s code and discussed our results, and then we met with Group 1 to put together the presentation. In Group 7 meetings, I helped my group understand the differences between the three ontologies, and together we realized it might be best to make barplots and dotplots separately for each ontology.
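The idea of plotting each ontology separately comes down to splitting the enrichment results by ontology before plotting. A minimal plain-Python sketch of that split is below. It is not the R code we used. The field names `term`, `ontology` and `count`, the function `split_by_ontology` and the toy values are hypothetical, and the ontology codes BP, MF and CC stand for biological process, molecular function and cellular component.

```python
from collections import defaultdict


def split_by_ontology(results, top_n=10):
    """Group enrichment results by ontology and keep the top terms of each.

    results: list of dicts with hypothetical keys "term", "ontology", "count"
    Returns a dict of ontology code -> terms sorted by gene count, highest first.
    """
    groups = defaultdict(list)
    for term in results:
        groups[term["ontology"]].append(term)
    return {
        ontology: sorted(terms, key=lambda t: t["count"], reverse=True)[:top_n]
        for ontology, terms in groups.items()
    }


# Made-up toy enrichment results
results = [
    {"term": "term_1", "ontology": "BP", "count": 12},
    {"term": "term_2", "ontology": "MF", "count": 5},
    {"term": "term_3", "ontology": "BP", "count": 20},
    {"term": "term_4", "ontology": "CC", "count": 7},
]

per_ontology = split_by_ontology(results, top_n=1)
for ontology in sorted(per_ontology):
    names = [t["term"] for t in per_ontology[ontology]]
    print(ontology, names)
```

Each group can then be passed to its own barplot or dotplot, so terms from one ontology do not crowd out the others.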


  • A concise overview of things learned:
    Technical area: using the ComBat function, a little bit about topological analysis of PPI networks
    Tools: creating a Google Calendar event, sva package
    Soft skills: elevator pitch, networking, tips for giving a great presentation and pitching ideas

  • Achievement highlights:

  1. Chose a different paper for the final project and learned about tools other than the ones used in the internship.
  2. Recorded a video for the July participants on downloading the two datasets and merging them.
  3. Hosted a webinar with two other leads on the biological aspects of the paper.
  • List of meetings/trainings attended:
    7/14 July leads meeting, 7/15 Gene Team meeting, 7/20 July BI Team 1 Meeting, 7/20 Biological webinar prep, 7/21 Biological Webinar, 7/21 BI Leads Technical Training, 7/21 Gene Team Meeting

  • Goals for the upcoming week:

  1. To be up to date with everything regarding the July session and attend the technical trainings, as I had to miss most meetings last week due to another program that had meetings at the same time.
  2. To prepare a good final presentation for Thursday.
  3. To complete my final self-assessment by Sunday.
  • Detailed Statement of Tasks done:
    I decided to choose a different paper because I am very interested in treating neurodegenerative diseases and I was curious about what bioinformatics studies had been done on such diseases. I chose a paper looking at healthy and Parkinsonian brains, which made things a bit harder: the samples came from different parts of the brain, and initially I was trying to use the brain regions as groups too. Then I realized that instead of trying to do exactly what the authors did in the paper, I could follow a procedure similar to what we did, with the packages that I had already learned how to use. It was still interesting to learn a little bit about the other packages. Overall, I was not able to spend as much time on StemAway this week as I wished, and I did all of my final project over the weekend, which might mean that I rushed through it a little. I am hoping to have time to make sure everything is OK before the presentation and to prepare a good presentation.

Things I learned:
Technical Skills:

  • How to perform and interpret microarray analysis

  • Quality Control (simpleaffy and affyPLM)

  • Gene expression: Annotation and Gene Filtering

  • Gcrma normalization

  • Differential Gene Expression Analysis

  • Functional analysis (using DAVID)

  • PPI network and KEGG maps interpretation

  • Visualization tools: PCA plots, heatmaps

Tools skills:

  • GEO database

  • Slack, StemAway, Asana, GitHub

  • External Tools: DAVID, STRING

  • Multiple R packages (limma, clusterProfiler, ggplot, etc.)

Soft skills:

  • importance of diversity in the workplace

  • importance of networking

  • Elevator pitches

  • Divergent thinking

  • Working in a team with different time zones

Achievement highlights:
I was able to use all the packages in R and figure out how to use multiple functions, with the help of my leads and subgroup. I learned very useful presentation skills and successfully held office hours for the July participants. I also completed both sets of Python exercises successfully, which helped me brush up on my object-oriented programming skills.

Future Goals:
This internship made me more curious about how many things you can do with bioinformatics, and I am excited to apply everything I learned to my future career.

Anca_presentation.pdf (356.8 KB)