Data 140 (previously Stat 140, a.k.a. Prob 140) is a probability course for undergraduates who have taken Data 8, have a math background, and wish to go deeper into the theory of data science.
It is a course in probability theory, not data analysis. Python labs are used to better understand the theory, but the work in the course is primarily mathematical.
Data 140 aims to give students a good theoretical background for modern data analysis. Contents have been selected after consultation with faculty who teach Stat and CS courses in advanced statistical topics including machine learning. The main topics are univariate and multivariate distributions, conditioning, and some stochastic processes. Primary examples include Bayesian estimation, Markov Chain Monte Carlo, multiple regression and the geometry of the multivariate normal.
The course satisfies requirements for several undergraduate majors and minors.
Data Science: Data 140 satisfies the probability requirement for the Data Science major. It is one of the two probability classes that are preferred as preparation for Data 102 (Data, Inference, and Decisions), which satisfies the Modeling, Learning, and Decision Making requirement for the major. Data 140 can also be used to satisfy the probability requirement for the Data Science minor, although that requirement can also be satisfied by a lower division probability course.
Statistics: Data 140 is cross-listed as Data/Stat C140. For the major and minor, and for Statistics courses numbered 135 and above, Data 140 satisfies the same requirements as Stat 134 does. If a Statistics course currently requires Stat 134, then Data 140 will fulfill that requirement too. A letter grade of B- or better in Data 140 will satisfy the corresponding Stat 134 grade requirement for entry into the Statistics major.
Other Majors: Data 140 satisfies the statistics requirement for entry into the Economics major, and elective requirements for other majors including L&S CS. Students can petition other majors to accept the course. Please direct your inquiries to the major in question and include the link to the course website if needed.
Data 140 is restricted to undergraduates who:
The course has been designed for students who have the specific background above in math, programming, and statistical inference. All prerequisites and corequisites are enforced by CalCentral.
The campus offers four upper division probability courses including Data 140. Here is Prof. Adhikari’s take on the other three.
If you are interested in data science and have taken CS 61A, CS 70, multivariable calculus, and linear algebra, but not Data 8/100, I recommend EECS 126. It builds on the probability content of CS 70 to cover properties of discrete and continuous distributions, both univariate and multivariate; the theory underlying fundamental methods of statistical inference; and stochastic (that is, random) processes. Python labs cover a variety of applications to data science. It will prepare you well for classes in machine learning and other ways of making decisions based on data.
Stat 134 has the fewest prerequisites (just one year of calculus) and is aimed at a more general audience than the other three classes. The content overlaps extensively with Data 140 and INDENG 172, and the general approach to distributions and conditioning is quite similar to that in Data 140. Unlike Data 140 and EECS 126, it places no particular emphasis on inference and involves no computing. Stat 134 prepares students well for upper division statistics courses and provides a good foundation for classes in machine learning and decision making.
INDENG 172 requires calculus as well as programming experience and is aimed primarily at INDENG and ORMS majors. Of the three other classes, it resembles Stat 134 most closely, and the programming component varies by semester.