Name: Yutong Wang
Team: ML Team 8 (Bertinator)
Overview of Things Learned:
Technical Area: I built a web crawler for the first time, using Scrapy. I also learned to use the browser's inspect tool and its network feature to track the requests a webpage makes. In addition, I learned data cleaning with pandas and the BERT language model for NLP.
Tools Used: Python, Pandas, JSON, Scrapy, Git, Colab, Visual Studio Code
Soft Skills: Communicating with the team leader, tech leader, and teammates to resolve questions and confusion.
Achievement Highlights
- Successfully scraped the Talk Folksy forum with Scrapy (42,000+ entries and 13 features).
- Fixed the infinite scrolling problem.
- Joined 7 dataset files and cleaned the hyperlinks and null values in the dataframe.
- Learned and tried the BERT model and the idea behind its training process.
List of Meetings attended
- All team meetings except one
- Scraping with BeautifulSoup and Scrapy Webinar
- Q&A meeting with team leader Maleeha
- Watched the other webinar recordings
Goals for the Upcoming Week
Run the BERT model and implement the classification model.
Tasks Done
- Successfully scraped the Talk Folksy forum with Scrapy (42,000+ entries and 13 features).
- Fixed the infinite scrolling problem.
- Joined 7 dataset files and cleaned the hyperlinks and null values in the dataframe.
- Learned and tried the BERT model and the idea behind its training process.
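
The data-cleaning task above (joining the dataset files, then removing hyperlinks and null values) could look roughly like the sketch below. This is only an illustration under assumptions, not the exact code used: it treats "joining" as row-wise concatenation, and the function name `combine_and_clean`, the column name `text`, and the URL pattern are all hypothetical.

```python
import pandas as pd

# Assumed pattern for hyperlinks; the real cleaning rule may differ.
URL_PATTERN = r"https?://\S+|www\.\S+"


def combine_and_clean(frames, text_column="text"):
    """Stack several dataframes, drop null posts, and strip hyperlinks.

    `text_column` is a hypothetical column name for the post body.
    """
    # Assumption: the files share columns, so they are stacked row-wise.
    combined = pd.concat(frames, ignore_index=True)

    # Drop rows whose text is null.
    combined = combined.dropna(subset=[text_column]).copy()

    # Remove hyperlinks, collapse leftover whitespace, trim the ends.
    combined[text_column] = (
        combined[text_column]
        .astype(str)
        .str.replace(URL_PATTERN, "", regex=True)
        .str.replace(r"\s+", " ", regex=True)
        .str.strip()
    )

    # Drop rows that became empty once the links were removed.
    combined = combined[combined[text_column] != ""]
    return combined.reset_index(drop=True)


# Small in-memory frames standing in for the scraped files.
frames = [
    pd.DataFrame({"text": ["See https://example.com for details", None]}),
    pd.DataFrame({"text": ["  plain post  ", "http://example.org"]}),
]
cleaned = combine_and_clean(frames)
```

With the real data, each frame would come from one of the 7 scraped files (for example via `pd.read_json` or `pd.read_csv`, depending on the export format) before being passed in as the list.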