Course Information for Spring 2017
Prerequisites (no substitutes will be accepted in the pilot offering)
- Data 8 (Stat/CS/Info C8). The basics of probability are in Sections 8.3 through 9.3 as well as Chapters 12 and 17 of the Data 8 textbook. All of the inference in Data 8 relies on probability; we’ll touch on all of it in Prob140 one way or another. And of course you should be familiar with Python and the datascience library.
- A year of calculus, at the level of Math 1A-1B or higher. Math 53 is ideal; you can take it simultaneously with Prob140. Here are some examples to work on. As students in Stat 134 have noticed, what’s required is a kind of mathematical maturity rather than knowing lots of computational formulas. You will rarely have to work out complicated integrals or derivatives by hand, but you will constantly work with abstraction – functions, domains, ranges, inverses, limits, and so on – as well as bounds, approximations, orders of magnitude, etc.
If you love logic, math, and Data 8, this course is for you.
Texts
Required:
- Probability for Data Science by Ani Adhikari and Jim Pitman. This will be available on the course website, like the Data 8 text.
- Probability by Jim Pitman, published by Springer NY. Available to Berkeley students on SpringerLink at no cost, or at low cost for a printed version.
- Theory Meets Data by Ani Adhikari (Dibya Ghosh, Editor), with contributions from students in the pilot offering of the Data 8 connector Stat 88: Probability and Mathematical Statistics in Data Science. The PDF will be available on the course website.
Excellent references
The Required Components of Your Work
- Weekly homework, which you will do in Jupyter notebooks and turn in on Gradescope. We hope to make this process less annoying than it was in Data 8 this past Fall. You will learn some basic LaTeX so that you can “write” math in your notebooks. It’s easy and fun. Homework will be posted on Wednesday evenings and will be due the following Tuesday by 11:59 pm.
- One weekly lab, which will be designed so that you can complete it (or almost complete it) during lab. Each lab is due on the same Friday by 11:59 pm.
- Quizzes, four times during the term, in discussion section. No computers involved. Quiz dates are:
- Wednesday 2/1 (3rd week)
- Wednesday 2/15 (5th week)
- Wednesday 3/22 (10th week)
- Wednesday 4/12 (12th week)
- Midterm in class on Thursday, March 2. No substitutes except as required by university rules. No computers involved.
- Final Exam Friday May 12, 7 pm to 10 pm, Exam Group 20. The very last slot in finals week – I don’t like it any more than you do but we have no choice. No substitutes except as required by university rules. No computers involved. The final is required for getting a passing grade.
Data science is not a solitary activity; please expect to attend lectures, discussion section, and lab. Lectures will not be webcast, but the online text will contain what is covered apart from discussions generated by questions asked in class.
Grading
We will drop
- Your two lowest homeworks
- Your two lowest labs
- Your lowest quiz
Course grades will be assigned using the following weighted components:
- Homework 20%
- Labs 20%
- Quizzes 15%
- Midterm 15%
- Final 30%
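To see how the drops and weights might combine, here is a minimal sketch in Python. It rests on assumptions that this page does not state: every score is a percentage from 0 to 100, items within a category count equally, and the lowest scores are dropped before averaging. The names mean_after_drops and course_score are hypothetical, and the result is only a weighted score, not a letter grade.

```python
# Weights and drop counts as listed in the Grading section.
WEIGHTS = {"homework": 0.20, "labs": 0.20, "quizzes": 0.15,
           "midterm": 0.15, "final": 0.30}
DROPS = {"homework": 2, "labs": 2, "quizzes": 1}


def mean_after_drops(scores, n_drop):
    """Average the scores after discarding the n_drop lowest ones."""
    kept = sorted(scores)[n_drop:]
    if not kept:
        raise ValueError("need more scores than the number dropped")
    return sum(kept) / len(kept)


def course_score(homework, labs, quizzes, midterm, final):
    """Weighted course score on a 0-100 scale (assumed scale, see lead-in)."""
    components = {
        "homework": mean_after_drops(homework, DROPS["homework"]),
        "labs": mean_after_drops(labs, DROPS["labs"]),
        "quizzes": mean_after_drops(quizzes, DROPS["quizzes"]),
        "midterm": midterm,
        "final": final,
    }
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Made-up scores for illustration only.
example = course_score(
    homework=[100, 90, 80, 0, 50],
    labs=[100, 100, 60, 70],
    quizzes=[80, 60, 90, 70],
    midterm=70,
    final=85,
)
print(round(example, 1))
```

In this made-up example the missed homework (the 0) and the 50 are dropped, so the homework component is the average of 80, 90, and 100.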
Collaboration and Honesty
You are encouraged to discuss practice problems, homework, and labs with your fellow students and with course staff. Arguing with friends about exercises is an excellent and time-honored way to learn. However, you must write up your own assignments and your own code. Copying assignments from others is not only dishonest, it also doesn’t help anyone. Students must work independently on quizzes and exams – no collaboration allowed – and that independence is developed by working on assignments. I am extremely tough with dishonest students but I don’t expect to be in that situation in Prob140. I expect that you will work with integrity and with respect for other members of the class, just as the course staff will work with integrity and with respect for you.