Urvi - Machine Learning Pathway

As part of Task 1, we were required to explore multiple forums similar to STEM-Away that contained STEM-related topics, categories, posts, and tags. The forum I worked on was Discourse Meta, and we presented our observations in a report. We then scraped the StackOverflow forum, as it was found to be more relevant. After generating the CSV, we went through the data preprocessing stage. We’re now exploring the BERT model.

Overview of things learned -

  • Technical -
  1. Web scraping using BeautifulSoup
  2. Data mining
  3. Data cleaning
  • Tools used -
  1. Asana (for project management)
  2. Jupyter (for Python coding)
  • Soft skills -
    Interacted with people from diverse backgrounds and levels of expertise. Collaborated with other teams and learnt new things from their work.

  • Achievement highlights -

  1. Got familiar with Asana
  2. Honed web scraping and data preprocessing skills
  3. Connected with the leads and team-mates and made new friends
  • Meetings attended -
  1. Attended a group meeting
  2. Attended all team meetings
  • Tasks done -
  1. Prepared a report along with my group, stating the pros and cons of using Discourse Meta for web scraping.
  2. Scraped the StackOverflow forum for one tag, data science, which had around 5.5k posts.
  3. Presented a data analysis report on the CSV obtained from scraping 13 categories of StackOverflow. The report listed the anomalies that needed to be addressed in the data preprocessing stage.
  4. Cleaned the dataset.
  • Goals for the upcoming week -
    Implementing the BERT model
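
The data cleaning step mentioned above can be illustrated with a minimal sketch. The post does not say which anomalies were found or how they were fixed, so everything here is an assumption: the field names (`title`, `body`, `tag`), the helper names `clean_text` and `clean_posts`, and the choice to strip HTML remnants, collapse whitespace, and drop empty or duplicate posts. It uses only the Python standard library rather than the tooling actually used in the project.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")  # crude HTML tag matcher, good enough for a sketch
WS_RE = re.compile(r"\s+")       # any run of whitespace, including newlines


def clean_text(raw):
    """Strip HTML tags, decode HTML entities, and collapse whitespace."""
    no_tags = TAG_RE.sub(" ", raw)
    decoded = html.unescape(no_tags)
    return WS_RE.sub(" ", decoded).strip()


def clean_posts(rows):
    """Clean each scraped row and drop empty or duplicate posts.

    `rows` is a list of dicts with hypothetical keys: title, body, tag.
    """
    seen = set()
    cleaned = []
    for row in rows:
        title = clean_text(row.get("title", ""))
        body = clean_text(row.get("body", ""))
        if not title or not body:
            continue  # skip posts with a missing title or body
        key = (title.lower(), body.lower())
        if key in seen:
            continue  # skip duplicates (compared case-insensitively)
        seen.add(key)
        cleaned.append({"title": title, "body": body, "tag": row.get("tag", "")})
    return cleaned
```

Rows read from the scraped CSV with `csv.DictReader` could be passed straight to `clean_posts`, provided the column names match the assumed keys.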

Overview of things learned during final phase:

  • Technical -
    Implemented the DistilBERT Model.

  • Tools used -

  1. Google Colab
  2. Jupyter
  • Soft Skills -
  1. Collaborated with other team members to prepare the final presentation.
  2. Delivered my part of the presentation.
  3. Learned about giving a professional edge to the presentation.
  • Achievement highlights -
  1. Implemented a recommender system.
  2. Delivered the final presentation.
  • Meetings attended -
    Attended all meetings

  • Tasks done -

  1. Created a machine learning recommendation model using DistilBERT that achieved an accuracy of 93%.
  2. Delivered the final presentation. Presented an overview of the DistilBERT model, how it works, and its results.
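
The post does not say how the 93% accuracy was measured. One common reading, assumed here, is top-1 accuracy: the model assigns a score to each candidate category, the highest-scoring category is recommended, and accuracy is the fraction of posts where that recommendation matches the true label. The sketch below shows that calculation; the function names and the shape of the scores are hypothetical and are not taken from the project or from DistilBERT output.

```python
def recommend(scores, k=1):
    """Return the k highest-scoring labels, best first.

    `scores` maps a label to a model score (hypothetical format).
    """
    return sorted(scores, key=scores.get, reverse=True)[:k]


def top1_accuracy(all_scores, true_labels):
    """Fraction of posts whose top recommendation equals the true label."""
    if len(all_scores) != len(true_labels):
        raise ValueError("all_scores and true_labels must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    hits = sum(recommend(s)[0] == t for s, t in zip(all_scores, true_labels))
    return hits / len(true_labels)
```

Passing a larger `k` to `recommend` returns several suggestions per post, which is closer to how a recommender would be presented to forum users.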