Abhishek_Prasad - Machine Learning Pathway

Concise overview of things learned. Break it up into Technical Area, Tools, Soft Skills:
During the first two weeks I learned web scraping with Selenium WebDriver, neither of which I had any prior experience with.
Three achievement highlights:
Web scraping
Teamwork
List of meetings/ training attended including social team event:
Meetings held over Zoom and Google Meet
Goals for the upcoming week. Next self-assessment will be due on the following Tuesday 06/23
Data cleaning and model training.
Detailed statement of tasks done. State each task, hurdles faced if any, and how you solved the hurdle. You need to mark whether the hurdles were solved with the help of training webinars, some help from project leads or significant help from project leads:
Scraped Title, Category, Sub-Category, Post-Contents, URL, and Post-Responses from the Amazon forum.
I ran into issues in the web scraping part when extracting the responses to each article. These were solved through an internet search and help from teammates.

Overview of things learned
Technical: I learned the concepts of Natural Language Processing, Transformers, and sequence models.
Tools: Pandas, PyTorch, Transformers.
Soft Skills: I worked on a team project, which helped me learn teamwork, discussing ideas with other members, and effective communication.

Achievement highlights

  1. Successfully merged the datasets of three teams.
  2. Applied DistilBERT to the dataset.
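
Merging the per-team datasets can be sketched with pandas as follows. This is only a minimal illustration: it assumes each team produced a table with the same columns as the scraped fields (Title, Category, Sub-Category, Post-Contents, URL, Post-Responses). The sample rows, the function name merge_team_datasets, and the choice to de-duplicate on URL are hypothetical and not taken from the project.

```python
import pandas as pd

# Column layout assumed from the fields scraped from the forum.
COLUMNS = ["Title", "Category", "Sub-Category", "Post-Contents", "URL", "Post-Responses"]


def merge_team_datasets(frames):
    """Stack per-team DataFrames and drop posts that were scraped more than once."""
    merged = pd.concat(frames, ignore_index=True)
    # Assumption: a post is uniquely identified by its URL.
    merged = merged.drop_duplicates(subset="URL", keep="first")
    return merged.reset_index(drop=True)


# Hypothetical sample rows standing in for the three teams' files.
team_a = pd.DataFrame(
    [["T1", "C1", "S1", "text 1", "https://example.com/1", "reply 1"]], columns=COLUMNS
)
team_b = pd.DataFrame(
    [["T2", "C1", "S2", "text 2", "https://example.com/2", "reply 2"]], columns=COLUMNS
)
team_c = pd.DataFrame(
    [
        ["T2", "C1", "S2", "text 2", "https://example.com/2", "reply 2"],  # duplicate of team_b's post
        ["T3", "C2", "S3", "text 3", "https://example.com/3", "reply 3"],
    ],
    columns=COLUMNS,
)

merged = merge_team_datasets([team_a, team_b, team_c])
print(merged[["Title", "URL"]])
```

Keeping the first occurrence of each URL is one possible policy; if teams scraped the same post at different times, keeping the latest copy may be preferable.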

List of meetings/training attended
Meetings:

  1. Discussion of BERT
  2. DistilBERT modeling

Goals for upcoming weeks
Apply DistilBERT to the dataset and build the classification model.

Tasks Done

  1. Task: Understand the concepts behind BERT and try an implementation.
    Hurdles: Tokenization of the dataset.
    I was able to understand it after reading some articles online.
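
For reference, the tokenization step can be illustrated with a small sketch. BERT and DistilBERT use WordPiece tokenization, which splits each word into the longest subword pieces found in a fixed vocabulary and marks word-internal pieces with "##". The code below is a simplified pure-Python version of that greedy matching, assuming a tiny made-up vocabulary (TOY_VOCAB); in practice the tokenizer that ships with the pre-trained model would be used instead.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one lowercase word into subword pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is found in the vocabulary.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # marks a piece that continues a word
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


def tokenize(text, vocab):
    """Lowercase, split on whitespace, and wrap the pieces in [CLS] ... [SEP]."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece_tokenize(word, vocab))
    tokens.append("[SEP]")
    return tokens


# Hypothetical toy vocabulary; the real vocabulary has about 30,000 entries.
TOY_VOCAB = {"web", "scrap", "##ing", "is", "fun", "##ny"}

print(tokenize("Web scraping is fun", TOY_VOCAB))
# ['[CLS]', 'web', 'scrap', '##ing', 'is', 'fun', '[SEP]']
```

The sketch omits punctuation handling, padding, and the conversion of tokens to integer ids, all of which the real tokenizer performs.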

Summary of the full session:
Technical Overview:
I learned and used new techniques: web scraping with Selenium WebDriver, followed by data processing with Pandas DataFrames.
After assembling the final dataset, we ran the pre-trained DistilBERT model on the raw text before applying any classification model.
We then created the tensors and applied the classification model, reaching an accuracy of about 85% with logistic regression.
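
The workflow described above can be sketched roughly as follows. This is not the project's actual code: the checkpoint name "distilbert-base-uncased", the use of the first token's hidden state as the feature vector, the 75/25 split, and the function names embed_texts and train_and_score are all assumptions. The demo at the bottom uses synthetic 768-dimensional vectors as stand-ins for DistilBERT features so that it runs without downloading a model, and it does not reproduce the reported accuracy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def embed_texts(texts, model_name="distilbert-base-uncased"):
    """Return one DistilBERT feature vector per text (first-token hidden state)."""
    # Imported lazily so the classifier part below runs without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()
    encoded = tokenizer(list(texts), padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # feature extraction only, no fine-tuning
        hidden = model(**encoded).last_hidden_state  # shape: (batch, tokens, 768)
    return hidden[:, 0, :].numpy()


def train_and_score(features, labels, seed=0):
    """Fit logistic regression on the features and return (model, held-out accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.25, random_state=seed, stratify=labels
    )
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, y_train)
    return clf, clf.score(X_test, y_test)


# Synthetic stand-in for DistilBERT features: two well-separated classes.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
features = rng.normal(size=(200, 768)) + labels[:, None] * 2.0

clf, accuracy = train_and_score(features, labels)
print(f"held-out accuracy on synthetic features: {accuracy:.2f}")
```

With real data, the synthetic block would be replaced by something like features = embed_texts(df["Post-Contents"]) with labels taken from the category column; those column choices are likewise assumptions.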

Technologies & Tools learned and applied

  • Web scraping – Selenium WebDriver
  • Data Processing – Pandas
  • ML Algorithm – Logistic Regression
  • DistilBERT
  • Google Colab

Soft Skills learned and applied

  • Collaboration
  • Team player
  • Willingness to learn new technologies
  • Communication

Challenges Faced:
As I was new to web scraping, I struggled a bit at first but eventually resolved all issues with help from other team members.