How to Think Like a Data Scientist

Course Syllabus

In this course, you will be introduced to the importance of gathering, cleaning, normalizing, visualizing, and analyzing data to drive informed decision-making, no matter the field of study. You will learn to use a combination of tools and techniques, including spreadsheets, SQL, and Python, to work on real-world datasets with both procedural and basic machine learning algorithms. You will also learn to ask good exploratory questions and develop metrics that lead to well-thought-out analyses. Presenting and discussing analyses of datasets you have chosen will be an important part of the course.

Our textbooks for the class are:

How to Think Like a Data Scientist (Runestone Academy Course Name: httlads-bennsp21) by Miller, Boggs, and Pearce

Python Data Science Handbook by Jake VanderPlas

We will be using the Bennington Physics Slack Workspace to communicate this term. Please make sure you join the Spring 2021 How to Think Like a Data Scientist channel: #spring2021-httlads.

We will meet as a class every Monday and Thursday from 10am to 11:50am.

The schedule for the course is below. Note that links to readings, data sets, assignments, and projects will become active as the term goes on.

| Week of | Topic | Reading | Datasets Used |
|---|---|---|---|
| Feb 15 | Introduction & Overview | Ch. 1 | Happiness Survey |
| Feb 22 | Working in Spreadsheets | Ch. 2 | |
| March 1 | Spreadsheet Data Analysis | Ch. 2 | |
| March 8 | Python Review + Pandas Introduction | Ch. 4 | Movie Data |
| March 15 | Filtering & Indexing Data Frames | Ch. 5 | |
| March 22 | Data Cleaning & Exploration; Altair and Plotting | Ch. 5 | CIA World Factbook |
| March 29 | Altair and Plotting | Ch. 6 | |
| April 5 | Midterm Project Presentations (No Class Friday) | | |
| April 12 | Text Analysis | Ch. 8 | UN Speeches |
| April 19 | Text Analysis (cont.) | Ch. 8 | |
| April 26 | Text Analysis (cont.) | Ch. 8 | |
| May 3 | SQL Introduction (No Class Tuesday) | Ch. 9 | Bike Rentals |
| May 10 | SQL (cont.) | Ch. 9 | |
| May 17 | Final Projects | TBA | Your Own Data |
| May 24 | Final Projects | TBA | |