Sihongy1 - Machine Learning (Level 1) Pathway

Technical Area: Basic machine learning knowledge, basic natural language processing knowledge, web scraping and web crawling

Tools: Beautiful Soup, Selenium, PyTorch, Git, GitHub, Trello, BERT

Soft Skills: Project management, coding skills

Achievements: Practiced with Beautiful Soup and Selenium; learned the basics of ML and NLP; tried web scraping

Goal for the next week: Create a Trello board for the team and use it to manage the team's tasks

Tasks: Went over the instructions; watched the webinars in the resources; got familiar with what ML is; learned what NLP is and how it is used nowadays

Here is the link for my self-assessment 2: Machine Learning - Level 1 Module 2 - Sihong Yuan

Machine Learning Level 1 Module 2 (part 2)

Technical Area:

  • Learned how to use Beautiful Soup and Selenium

  • Learned how to use webdriver

  • Learned how to scrape web pages and save the data to a CSV file

  • Learned how to do exploratory data analysis
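
A minimal sketch of the scrape-and-save step, under stated assumptions. I used Beautiful Soup and Selenium for the real work; this version swaps in the standard library's `html.parser` and `csv` so it runs without extra installs. The `SAMPLE_HTML` markup, the `title` class, and the column names are made up for illustration and are not the forum's actual structure.

```python
import csv
import io
from html.parser import HTMLParser


class PostTitleParser(HTMLParser):
    """Collect the text of every <a class="title"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "a" and ("class", "title") in attrs:
            self._in_title = True
            self.titles.append("")

    def handle_data(self, data):
        if self._in_title:
            self.titles[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_title = False


def titles_to_csv(html):
    """Parse post titles out of an HTML string and return them as CSV text."""
    parser = PostTitleParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["post_id", "title"])  # assumed column names
    for post_id, title in enumerate(parser.titles, start=1):
        writer.writerow([post_id, title.strip()])
    return buffer.getvalue()


# Made-up page fragment standing in for a downloaded forum page
SAMPLE_HTML = """
<ul>
  <li><a class="title" href="/t/1">How do I share a workflow?</a></li>
  <li><a class="title" href="/t/2">Billing question</a></li>
</ul>
"""

csv_text = titles_to_csv(SAMPLE_HTML)
print(csv_text)
```

To write a real file instead of a string, the same `csv.writer` calls work on a file opened with `open(path, "w", newline="")`.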

Tools:

Beautiful Soup, Selenium, Jupyter Notebook, pandas, Tableau, chromedriver

Soft Skills:

  • Web scraping

  • Exploratory Data Analysis

Achievements:

  • Successfully scraped data from Flowster Forum

  • Cleaned the data by removing the HTML tags

  • Converted the data to lowercase

  • Removed stopwords from my data
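
The three cleaning steps above can be sketched roughly as follows. This is an illustration, not my exact notebook code: the regex tag stripper is a crude stand-in for Beautiful Soup's `get_text()`, and the `STOPWORDS` set is a tiny hypothetical list rather than a full stopword corpus.

```python
import re
from html import unescape

# Tiny illustrative stopword list (assumption); a real run would use a fuller one
STOPWORDS = {"a", "an", "the", "is", "to", "of", "and", "in", "my", "i", "do", "how"}

TAG_RE = re.compile(r"<[^>]+>")       # matches anything that looks like an HTML tag
TOKEN_RE = re.compile(r"[a-z0-9']+")  # lowercase words and numbers


def clean_post(raw_html):
    """Strip HTML tags, lowercase the text, and drop stopwords."""
    text = TAG_RE.sub(" ", raw_html)  # 1. remove tags
    text = unescape(text)             #    decode entities such as &amp;
    text = text.lower()               # 2. lowercase
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in STOPWORDS]  # 3. remove stopwords


cleaned = clean_post("<p>How do I <b>share</b> a Workflow &amp; the checklist?</p>")
print(cleaned)
```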

Tasks Completed:

  • Scraped data from Flowster Forum

  • Explored my data after gathering it and analyzed it

  • Performed basic cleaning to remove HTML tags from the data

  • Visualized my data

Here is the link for my self-assessment 3: Machine Learning - Level 1 Module 3 - Sihong Yuan

Machine Learning Level 1 Module 3 (part 2)

Technical Area

  • Vectorized my data using TF-IDF
  • Calculated a distance metric using cosine similarity
  • Recommended the 10 most similar posts for a given post
  • Identified simple machine learning models and trained them on our data
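
The TF-IDF, cosine similarity, and recommendation steps listed above fit together roughly like this. I used sklearn for the actual work; the sketch below is a from-scratch version in plain Python so the arithmetic is visible. The idf formula is the smoothed variant, which I believe matches sklearn's `TfidfVectorizer` default. The `POSTS` data and the function names are made up for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = raw term count * smoothed idf, where
    idf = ln((1 + N) / (1 + df)) + 1.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_index, vectors, k=10):
    """Indices of the k posts most similar to the query post, best first."""
    scores = [
        (cosine_similarity(vectors[query_index], v), i)
        for i, v in enumerate(vectors)
        if i != query_index  # never recommend the post itself
    ]
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scores[:k]]


# Made-up, already-cleaned posts (lists of tokens)
POSTS = [
    ["share", "workflow", "team"],
    ["share", "workflow", "template"],
    ["billing", "invoice", "refund"],
    ["workflow", "checklist"],
]

vectors = tfidf_vectors(POSTS)
ranking = most_similar(0, vectors, k=2)
print(ranking)
```

With only four toy posts, `k=2` keeps the output readable; on the real dataset the default `k=10` gives the ten recommendations.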

Tools

  • sklearn
  • matplotlib
  • numpy

Soft Skills

  • Vectorized words using TF-IDF
  • Measured cosine similarity
  • Word embeddings
  • Simple classification model

Three achievement highlights

  • Transformed my data into word vectors using TF-IDF
  • Calculated cosine similarity
  • Recommended the 10 most similar posts for a given post

Goals for the upcoming week

  • Start module 4
  • Prepare for the next presentation

Hurdles

Had trouble training the classification model

Machine Learning Level 1 Module 4

Technical Area:

  • BERT
  • Simple Transformers
  • Logistic Regression

Tools:

  • Google Colab
  • Simple Transformers
  • sklearn

Soft Skills:

  • Simple Transformers library
  • BERT model
  • Logistic regression model

Achievements:

  • Applied a pre-trained monolingual model to my dataset
  • Used DistilBERT to embed the Flowster dataset
  • Trained a logistic regression model
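
A rough sketch of the last step, training logistic regression on top of embeddings. The toy 2-dimensional `EMBEDDINGS` and `LABELS` are made up and only stand in for the DistilBERT vectors and the real categories. I used sklearn for the actual model; the plain gradient-descent loop here is a swapped-in version that shows what the model is fitting.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, learning_rate=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    n = len(features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            # error = predicted probability minus true label
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return class 1 if the predicted probability is at least 0.5, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Made-up stand-ins for sentence embeddings and their binary labels
EMBEDDINGS = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.8]]
LABELS = [1, 1, 0, 0]

weights, bias = train_logistic_regression(EMBEDDINGS, LABELS)
predictions = [predict(weights, bias, x) for x in EMBEDDINGS]
print(predictions)
```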

Goals for the upcoming week:

  • Continue to finish module 4

Machine Learning Level 1 Module 4 (part 2)

Technical Area:

  • Learned how to use BERT and RoBERTa to update our word embeddings
  • Learned how to test different recommenders to decide which one is best
  • Learned how to build a classifier with my group
  • Learned how to put all the work on a website

Tools:

  • Simple Transformers
  • tokenizers
  • Google Colab
  • Flask
  • Docker

Soft Skills:

  • How to measure the effectiveness of the recommenders
  • How to build a website using Flask and Docker

Achievements:

  • Tested different kinds of recommenders and decided which one is the best
  • Put all our work on a website
  • Built a classifier and measured the effectiveness
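
One way to compare recommenders is precision at k, sketched below. This log does not record which metric we actually used, so treat the metric choice as an assumption. The post ids, the relevance sets, and both recommendation lists are invented for illustration and are not our results.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended ids that are in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for post_id in top_k if post_id in relevant)
    return hits / k


def mean_precision_at_k(recommendations, relevance, k):
    """Average precision@k over every query post."""
    scores = [
        precision_at_k(recommendations[query], relevance[query], k)
        for query in recommendations
    ]
    return sum(scores) / len(scores)


# Invented ground truth: which posts a human would call related to each query
RELEVANT = {"post_a": {1, 2, 3}, "post_b": {7, 8}}

# Invented outputs of two hypothetical recommenders
TFIDF_RECS = {"post_a": [1, 9, 2], "post_b": [7, 4, 5]}
BERT_RECS = {"post_a": [1, 2, 3], "post_b": [7, 8, 5]}

tfidf_score = mean_precision_at_k(TFIDF_RECS, RELEVANT, k=3)
bert_score = mean_precision_at_k(BERT_RECS, RELEVANT, k=3)
best = "bert" if bert_score > tfidf_score else "tfidf"
print(tfidf_score, bert_score, best)
```

The same function works for any number of recommenders as long as each one returns a ranked list of post ids per query.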

Goals for the upcoming week:

  • Prepare for the final presentation