Homework for the Applied Data Analysis (ADA) class at EPFL, by the team composed of: D. Rivollet, L. Veyssière, M. Mordig.
Applied Data Analysis is a course in which students learn the basic techniques and practical skills required to make sense of a variety of data, with the help of some of the most acclaimed software tools in the data science world, such as Pandas, Scikit-learn, and Spark.
This repository gathers 5 ADA homework assignments covering different subjects and datasets:
In this exercise, we learned how to use Pandas to handle large amounts of data.
In this exercise, we scraped data from a web page that gives only limited access to the information.
This homework helped us discover different options for interactive visualization, in order to conveniently extract the condensed information contained in large datasets.
This exercise was designed to explore Machine Learning techniques such as linear and non-linear regression, which allowed us to predict the behavior of new samples.
In this last homework, we concentrated on Natural Language Processing (NLP), where we first had to clean the data in order to perform sentiment and topic analysis.