**This is an inactive course webpage**
Labs: TUE 2:30-4pm (Section 1) and 4-5:30pm (Section 2) in Eads 016
Lectures: THU 4-5:30pm in Crow 201
Instructor: Marion Neumann
Office: Jolley Hall Room 222
Contact: Please use Piazza!
Office Hours: THU 3-4pm during lecture weeks; for all other times, check the announcements above or on Piazza. Individual appointments are possible (request via email – allow 2-3 days for a reply and scheduling).
Please avoid unannounced drop-ins outside my office hours.
Head TA:
Jonathan – manages all Gradescope/Canvas grades –> use Piazza tag grades
TAs:
Alexis, Amanda, Erik, Harrison, Luxiao, Michael, Steven, Yushu, Zac
TA Office Hours (during lecture weeks):
- Monday 4:30-6:30pm in Jolley 431 (Michael and Amanda)
- Wednesday 11:30am-1pm in Jolley 431 (Zac)
- Friday 4-6pm in Jolley 431 (Steven)
- Sunday 4-6pm in Lopata Hall 302 (Erik)
Since this is a pilot offering of a brand new course, we appreciate any feedback you have for us! Tell us what you like, what you don’t like, and what could be improved (and how, if you have any ideas). Use this Anonymous Feedback Form.
Homework assignments
Homework assignments will be assigned concurrently with the lecture/lab sessions covering the respective materials. Due dates and submission instructions will be indicated on the course webpage under homework assignments. It is every student’s responsibility to meet the submission requirements and deadlines. We cannot accept late submissions or submissions that do not follow the submission instructions, for any reason (see also Late Policy below).
We will drop the lowest-scoring homework, and each remaining homework assignment will be weighted equally. The total grade achieved for the homework assignments (one drop, no make-ups) will contribute 40% towards your total course performance.
Regrade Requests
Any regrade requests, claims of missing scores, or grade discrepancies will have to be made within one week of the grade announcement. We will not take any regrade requests after this one-week period, for any reason. Grade announcements will be made on Piazza and grading comments will be provided in your SVN repository or via Gradescope. All grades will be maintained on Canvas. It is the student’s responsibility to verify that all grades on Canvas are accurate. Regrade requests should be submitted exclusively via Gradescope, and grade discrepancies should be reported via Piazza (using the grades tag).
Lab Participation
You are expected to actively participate in the labs. Lab participation contributes 20% to your total course performance. Successful (full score) lab participation goes beyond attending the lab. It will be determined based on the following components:
- active participation in small group discussions
- lab progress/completion (assessed via end-of-lab demos to instructor/TA)
- lab quizzes
There are no make-ups for any missed labs or quizzes. The lowest lab score will be dropped (one drop).
Exams
There will be 2 exams contributing 20% each towards your total course performance. The dates are
- Midterm: March 7 2019 (in-class)
- Final: May 8 2019 6-8pm (scheduled by university)
Grading Summary
40% hw assignments
20% lab participation
20% midterm
20% final exam
It is not possible to achieve a higher percentage on any individual grade component than listed above through bonus or extra credit problems.
Final course grades will be assigned using the following straight scale:
Letter Grade | Cutoff Percentage |
---|---|
A | 93% |
A- | 90% |
B+ | 87% |
B | 83% |
B- | 80% |
C+ | 77% |
C | 73% |
C- | 70% |
D+ | 67% |
D | 63% |
D- | 60% |
F | < 60% |
The passing grade is C- or better (70%).
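To see how the weights, the one-drop rules, and the straight scale fit together, here is a minimal sketch in Python. It is not the official grade calculation: the function names are made up for illustration, all scores are assumed to be percentages between 0 and 100, no rounding is applied, and capping each component at 100% is only one reading of the rule about bonus and extra credit.

```python
from statistics import mean

# Component weights from the grading summary.
WEIGHTS = {"homework": 0.40, "labs": 0.20, "midterm": 0.20, "final": 0.20}

# Lower cutoffs of the straight scale, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (63, "D"), (60, "D-"),
]


def average_with_one_drop(scores):
    """Equal-weight average after dropping the single lowest score."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    kept = sorted(scores)[1:]  # discard the lowest score
    return mean(kept)


def course_percentage(homework, labs, midterm, final):
    """Weighted course total in percent (assumption: inputs are 0-100)."""
    components = {
        # Assumption: bonus points cannot push a component above 100%.
        "homework": min(average_with_one_drop(homework), 100),
        "labs": min(average_with_one_drop(labs), 100),
        "midterm": min(midterm, 100),
        "final": min(final, 100),
    }
    return sum(WEIGHTS[name] * value for name, value in components.items())


def letter_grade(percentage):
    """Map a course percentage to a letter using the cutoff table."""
    for cutoff, letter in CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"
```

For example, `letter_grade(course_percentage([100, 80, 90, 0], [100, 100, 50], 80, 85))` drops the homework score of 0 and the lab score of 50 before applying the weights.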
Late Policy
Your homework assignments must be turned in on time. There are absolutely no make-up quizzes or assignments, for any reason, and missed deadlines cannot be made up.
Collaboration Policy
You are encouraged to discuss the course material with other students. Discussing the material and the general form of solutions to the labs is a key part of the class. Since, for many of the assignments, there is no single “right” answer, talking to other students and to the TAs is a good thing. However, everything that you turn in should be your own work, unless we tell you otherwise. If you talk about assignments with another student, then you need to explicitly tell us on the hand-in by providing their name(s) and student ID(s). You are not allowed to copy answers or parts of answers from anyone else, or from material you find on the Internet. This will be considered willful cheating, and will be dealt with according to the official collaboration policy:
Academic Integrity
Unless explicitly instructed otherwise, everything that you turn in for this course must be your own work. If you willfully misrepresent someone else’s work as your own, you are guilty of cheating. Cheating, in any form, will not be tolerated in this class.
Check out these questions and answers in the CSE FAQ.
There is zero tolerance of Academic Dishonesty. I will be actively searching for academic dishonesty on all homework assignments, quizzes, and exams. If you are guilty of cheating on any assignment or exam, you will receive an F in the course and be referred to the School of Engineering Discipline Committee. In severe cases, this can lead to expulsion from the University, as well as possible deportation for international students. If you copy from anyone in the class, both parties will be penalized, regardless of which direction the information flowed. This is your only warning.
Please refer to the University Undergraduate Academic Integrity Policy for more information. If you suspect that you may be entering an ambiguous situation, it is your responsibility to clarify it before the professor or TAs detect it. If in doubt, please ask.
Providing/Posting Solutions
Providing your course work (written or code) in any form to others is a violation of the academic integrity policy. If you provide your solutions to someone else in the course or post them publicly online, you are guilty of violating our academic integrity policy. Such a case will be treated the same way as described above, and prosecution will take place even after you have finished the course or graduated from Wash U.
Mental Health
Mental Health Services professional staff members work with students to resolve personal and interpersonal difficulties, many of which can affect the academic experience. These include conflicts with or worry about friends or family, concerns about eating or drinking patterns, and feelings of anxiety and depression. See: http://shs.wustl.edu/MentalHealth
Accommodations based upon sexual assault
The University is committed to offering reasonable academic accommodations to students who are victims of sexual assault. Students are eligible for accommodation regardless of whether they seek criminal or disciplinary action. Depending on the specific nature of the allegation, such measures may include but are not limited to: implementation of a no-contact order, course/classroom assignment changes, and other academic support services and accommodations. If you need to request such accommodations, please direct your request to Kim Webb (kim_webb@wustl.edu), Director of the Relationship and Sexual Violence Prevention Center. Ms. Webb is a confidential resource; however, requests for accommodations will be shared with the appropriate University administration and faculty. The University will maintain as confidential any accommodations or protective measures provided to an individual student so long as it does not impair the ability to provide such measures.
If a student comes to me to discuss or disclose an instance of sexual assault, sex discrimination, sexual harassment, dating violence, domestic violence or stalking, or if I otherwise observe or become aware of such an allegation, I will keep the information as private as I can, but as a faculty member of Washington University, I am required to immediately report it to my Department Chair or Dean or directly to Ms. Jessica Kennedy, the University’s Title IX Coordinator. If you would like to speak with the Title IX Coordinator directly, Ms. Kennedy can be reached at (314) 935-3118, jwkennedy@wustl.edu, or by visiting the Title IX office in Umrath Hall. Additionally, you can report incidents or complaints to Tamara King, Associate Dean for Students and Director of Student Conduct, or by contacting WUPD at (314) 935-5555 or your local law enforcement agency. See: Title IX
You can also speak confidentially and learn more about available resources at the Relationship and Sexual Violence Prevention Center by calling (314) 935-8761 or visiting the 4th floor of Seigle Hall. See: RSVP Center
Bias Reporting
The University has a process through which students, faculty, staff and community members who have experienced or witnessed incidents of bias, prejudice or discrimination against a student can report their experiences to the University’s Bias Report and Support System (BRSS) team. See: http://brss.wustl.edu
Center for Diversity and Inclusion (CDI)
The Center for Diversity and Inclusion (CDI) supports and advocates for undergraduate, graduate, and professional school students from underrepresented and/or marginalized populations, creates collaborative partnerships with campus and community partners, and promotes dialogue and social change. One of the CDI’s strategic priorities is to cultivate and foster a supportive campus climate for students of all backgrounds, cultures and identities.
See: diversityinclusion.wustl.edu/
Date | Topic | Materials |
---|---|---|
15 Jan | Syllabus; Group Activity: What is data science? | |
17 Jan | Lecture 1 – Data Science | |
22 Jan | Lab 1 – Plant Species Classification | |
24 Jan | Lecture 2 – Exploratory Data Analysis | |
29 Jan | Lab 2 – Analyzing the MoMA Data | |
31 Jan | Lecture 3 – Sentiment Analysis | |
5 Feb | Lab 3 – Analyzing Movie Reviews | |
7 Feb | Lecture 4 – Regression | |
12 Feb | Lab 4 – Predicting Housing Prices | |
14 Feb | Lecture 5 – Logistic Regression | |
19 Feb | Lab 5 – Detecting Breast Cancer | |
21 Feb | Lecture 5 – Logistic Regression Revisited; Lecture 6 – Evaluation and Learning Principles | |
26 Feb | Lab 5 – Detecting Breast Cancer | |
28 Feb | no class | |
5 Mar | Study for Midterm Exam; What to study? | How to study? |
7 Mar | Midterm EXAM in Crow 204 (closed book – no notes – no crib/cheat sheet) | CAUTION: room change! |
12 Mar, 14 Mar | Spring Break | |
19 Mar | Lab 6 – Ethical Thinking for Data Science | |
21 Mar | Lecture 7 – Clustering | |
26 Mar | Lab 7 – Clustering | |
28 Mar | Lecture 8 – Similarity-based Learning | |
2 Apr | Lab 8 – k-NN | |
4 Apr | Lecture 9 – Feature Engineering | |
9 Apr | Lab 9 – Feature Learning | |
11 Apr | Lecture 10 – Data Engineering | |
16 Apr | Lab 10 – Gesture Recognition | |
18 Apr | Lab 10 – Wrap-up; Lecture 11 – Topic Models | |
23 Apr | Lab 11 – Organizing Text Data; Course evaluation: HERE | |
25 Apr | Semester Review; Towards Data Science: What to Study Next? | |
8 May | Final EXAM | |
SUBMISSION INSTRUCTIONS
- Code submission
  - use the following filename: hwX_<your wustlkey>.ipynb
    - for example: hw1_mneumann.ipynb
  - submit the Python notebook (.ipynb file) via file upload
    - do not create a zip file
  - do not add or delete cells in the notebook unless instructed otherwise
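If you want to double-check a filename before uploading, a small sketch like the following may help. It is not part of the official submission tooling: the function name is made up, and the set of characters allowed in a WUSTL key (letters, digits, dots) is an assumption.

```python
import re

# Assumed pattern: "hw" + assignment number + "_" + WUSTL key + ".ipynb".
# Which characters a WUSTL key may contain is a guess (letters, digits, dots).
FILENAME_PATTERN = re.compile(r"hw(\d+)_([A-Za-z0-9.]+)\.ipynb")


def check_submission_name(filename):
    """Return (homework_number, wustl_key) or raise ValueError."""
    match = FILENAME_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    return int(match.group(1)), match.group(2)
```

Calling `check_submission_name("hw1_mneumann.ipynb")` accepts the example name above, while a zipped file such as `hw1_mneumann.ipynb.zip` is rejected with a `ValueError`.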
ASSIGNMENTS AND DEADLINES
- 04/19 hw10
  - due: TUE 04/30 at 2:30pm
  - submit via Gradescope
- 04/09 hw9
  - due: TUE 04/16 at 2:30pm
  - submit via Gradescope
- 04/02 hw8
  - due: TUE 04/09 at 2:30pm
  - submit via Gradescope
- 03/26 hw7
  - to get started, finish parts 2 and 3 in Lab 7 (those are relevant for/part of hw7)
  - due: TUE 04/02 at 2:30pm
  - extended: WED 04/03 at 2:30pm
  - submit via Gradescope
- 03/19 hw6
  - due: TUE 03/26 at 2:30pm
  - submit via Gradescope
- 02/26 hw5
  - due: TUE 03/05 at 2:30pm
  - [PDSH] Ch5 Machine Learning
    - Introducing Scikit-Learn: Ch5 (p343-359)
  - [DSFS] Ch17: Decision Trees
    - What is a Decision Tree? (p201-203)
  - [PDSH] Ch5 Machine Learning
    - Motivating Random Forests: Decision Trees (p421-426)
  - submit via Gradescope
- 02/12 hw4
  - due: TUE 02/19 at 2:30pm
  - [PDSH] Ch5 Machine Learning
    - Introducing Scikit-Learn: Ch5 (p343-359)
  - submit via Gradescope
- 02/05 hw3
  - due: TUE 02/12 at 2:30pm
  - [DSFS] Ch9 Getting Data
    - Reading Files (p105-108)
    - Using APIs (p114-117)
    - Example: Using the Twitter APIs (p117-120)
  - submit via Gradescope
- 01/24 hw2
  - due: TUE 02/05 at 2:30pm
  - [PDSH] Ch2 NumPy (p33-63, p78-85)
    - Data Types in Python
    - The Basics of NumPy Arrays
    - Computation on NumPy Arrays
    - Aggregations
    - Fancy Indexing
  - submit via Gradescope
- 01/15 hw1
  - due: TUE 01/22 at 2:30pm
  - [DSFS] Ch2 Python Crash Course
    - The Basics (p15-26)
  - [PDSH] Ch1 IPython and Jupyter Notebooks
    - all about notebooks, skip stuff about the shell
  - [PDSH] Ch2 NumPy (p33-63, p78-85)
    - Data Types in Python
    - The Basics of NumPy Arrays
    - Computation on NumPy Arrays
    - Aggregations
    - Fancy Indexing
  - submit via Gradescope
Books
There isn’t really a course book for this class. But the following books will be useful. Check them out whenever you see references in the slides or on the course calendar.
- [PDSH] Python Data Science Handbook by VanderPlas, O’Reilly Media, 2016.
- electronic copy through the Wash U library for viewing online
- [DSFS] Data Science from Scratch by Joel Grus, O’Reilly Media, 2015. (2nd edition from 2019)
- electronic copy through the Wash U library for viewing online
Python
We will be using Python with NumPy, SciPy, scikit-learn, pandas, and Matplotlib for the course. All of these packages are included in the Anaconda distribution.
DOWNLOAD ANACONDA PACKAGE
Downloading the Anaconda package will give you access to all packages and toolboxes that will be used in this class. It’s recommended that you go with the newest version.
- Download the Anaconda Distribution with the latest version of Python: https://www.anaconda.com/download/#macos
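After installing, you may want to confirm that the course packages can be imported. The following is a minimal sketch, not an official setup check; the function name is made up, and it simply tries to import each package by its import name and reports the version it finds.

```python
from importlib import import_module

# Import names of the course packages (scikit-learn imports as "sklearn").
PACKAGES = ["numpy", "scipy", "sklearn", "pandas", "matplotlib"]


def installed_versions(packages=PACKAGES):
    """Map each import name to its version string, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            module = import_module(name)
        except ImportError:
            versions[name] = None  # package is not installed
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:12s} {version or 'NOT INSTALLED'}")
```

Run it in a notebook cell or as a script; any package reported as NOT INSTALLED suggests that the Anaconda environment is not the one Python is currently using.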
Getting Started with Jupyter Notebooks
You can get the notebook server running in either of the following ways:
- Use the Anaconda user interface to open the notebook
- Or open the notebook from a terminal by running the following command: jupyter notebook
NAVIGATING THROUGH JUPYTER NOTEBOOKS
PYTHON TUTORIALS AND RESOURCES
- Python tutorials for beginners
- Learn Python course on Codecademy
- Intro to Python for Data Science from DataCamp
- The official Python tutorial is quite comprehensive. There is also a useful glossary.
- The Wash U library has electronic copies of these useful O’Reilly books available for viewing online:
- Learning Python – Find it here.
- Python Data Science Handbook – Find it here.
Please ask any questions related to the course materials and homework problems on Piazza. Other students might have the same questions or be able to provide a quick answer.
Any public postings of (partial or full) solutions to homework problems (written or in form of source or pseudo code) will result in a grade of zero for that particular problem for ALL students in the course.