Yumin - Machine Learning (Level 3) Pathway

Things learned from Module 1:

  • Technical Area

Web scraping of data stored in XML files.

Learned more about what distributional semantics is and why it can be useful.
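A small sketch of the distributional idea, for reference: words that appear in similar contexts get similar count vectors. The toy corpus, the window size, and the function names below are made up for illustration and are not taken from the paper or the module.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
corpus = [
    "aspirin inhibits cox1",
    "ibuprofen inhibits cox1",
    "insulin lowers glucose",
]


def context_vectors(sentences, window=2):
    """Count, for each word, the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = context_vectors(corpus)
same_context = cosine(vectors["aspirin"], vectors["ibuprofen"])
different_context = cosine(vectors["aspirin"], vectors["insulin"])
```

Here "aspirin" and "ibuprofen" share all of their context words, so their similarity is close to 1, while "aspirin" and "insulin" share none, so their similarity is 0.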

  • Tools

BeautifulSoup

pandas

Jupyter Notebook

Spacetime - lets us see each other’s schedules, even across very different time zones

  • Soft Skills

Learned what a journal club is and what to pay attention to when reading a research paper.

Learned to communicate and work with team members from around the world.

  • Achievements

Read through the paper and understood the overall workflow and the significance of what we will be doing.

Used BeautifulSoup to access the Medline abstract data we need for future tasks.

Learned about the kinds of information that dependency parsing can reveal and tried the Stanford Parser on simple sentences.

  • Goals for upcoming week

Get the list of all drug/target names, then combine it with the Stanford Parser output to build the dependency matrix.
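One possible reading of this goal, sketched under assumptions: each row is a drug/target name and each column is a (relation, head word) pair it was seen with. The triples below are hand-written stand-ins for parser output (the Stanford Parser is not called here), the entity names are hypothetical, and the exact matrix layout the project needs may differ.

```python
from collections import Counter

# Hand-written (head, relation, dependent) triples standing in for
# dependency-parser output; not produced by a real parser run.
triples = [
    ("inhibits", "nsubj", "aspirin"),
    ("inhibits", "dobj", "cox1"),
    ("inhibits", "nsubj", "ibuprofen"),
    ("inhibits", "dobj", "cox1"),
]

# Hypothetical drug/target names.
entities = ["aspirin", "ibuprofen", "cox1"]


def dependency_matrix(triples, entities):
    """Count how often each entity appears as the dependent of (relation, head)."""
    wanted = set(entities)
    counts = {name: Counter() for name in entities}
    for head, relation, dependent in triples:
        if dependent in wanted:
            counts[dependent][(relation, head)] += 1
    # Columns: every (relation, head) pair seen with any entity, in sorted order.
    features = sorted({feature for c in counts.values() for feature in c})
    matrix = [[counts[name][feature] for feature in features] for name in entities]
    return features, matrix


features, matrix = dependency_matrix(triples, entities)
```

With this toy input the columns are ("dobj", "inhibits") and ("nsubj", "inhibits"), so the two drugs end up with identical rows and the target gets a different one.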

  • Task done

Scraped one file from Medline, then parsed and saved the abstracts in it. I plan to focus on a single file for now to simplify the data preprocessing.
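A rough sketch of the abstract extraction step. The work above used BeautifulSoup; this version swaps in the standard library's xml.etree.ElementTree so it runs without extra installs. The sample XML is made up, shaped like a Medline/PubMed record set (PubmedArticle, PMID, AbstractText), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of a Medline/PubMed XML file (not real data).
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article>
        <Abstract>
          <AbstractText Label="BACKGROUND">Drug A binds target B.</AbstractText>
          <AbstractText Label="RESULTS">Binding was dose dependent.</AbstractText>
        </Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>2</PMID>
      <Article><ArticleTitle>No abstract here</ArticleTitle></Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def extract_abstracts(xml_text):
    """Return one {'pmid', 'abstract'} record per article that has an abstract."""
    root = ET.fromstring(xml_text)
    records = []
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID")
        # An abstract can be split into several labelled AbstractText sections.
        parts = ["".join(node.itertext()).strip() for node in article.iter("AbstractText")]
        if parts:
            records.append({"pmid": pmid, "abstract": " ".join(parts)})
    return records


abstracts = extract_abstracts(SAMPLE_XML)
```

Articles without an abstract are skipped, and the resulting list of records can be passed to pandas.DataFrame(abstracts) and saved from there.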