About Data 140

Ani Adhikari

Data 140 (previously Stat 140, a.k.a. Prob 140) is a probability course for undergraduates who have taken Data 8, have a math background, and wish to go deeper into the theory of data science.

It is a course in probability theory, not data analysis. Python labs are used to help students understand the theory, but the work in the course is primarily mathematical.

Data 140 aims to give students a good theoretical background for modern data analysis. Contents have been selected after consultation with faculty who teach Stat and CS courses in advanced statistical topics including machine learning. The main topics are univariate and multivariate distributions, conditioning, and some stochastic processes. Primary examples include Bayesian estimation, Markov Chain Monte Carlo, multiple regression and the geometry of the multivariate normal.

Requirement for Majors and Minors

The course satisfies requirements for several undergraduate majors and minors.

  • Data Science: Data 140 satisfies the probability requirement for the Data Science major. It is one of the two probability classes that are preferred as preparation for Data 102 (Data, Inference, and Decisions), which satisfies the Modeling, Learning, and Decision Making requirement for the major. Data 140 can also be used to satisfy the probability requirement for the Data Science minor, although that requirement can also be met by taking lower division probability.

  • Statistics: Data 140 is cross-listed as Data/Stat C140. For the Statistics major and minor, and for Statistics courses numbered 135 and above, Data 140 satisfies the same requirements as Stat 134 does. If a Statistics course currently requires Stat 134, then Data 140 will fulfill that requirement too.

  • Other Majors: Data 140 satisfies the statistics requirement for entry into the Economics and Business majors, and elective requirements for other majors including Applied Math and L&S CS. Students can petition other majors to accept the course. Please direct such inquiries to the major in question, and include the link to the course website if needed.

Prerequisites and Enrollment

Data 140 is restricted to undergraduates who satisfy all the following requirements:

  • Have not taken Stat 134 or EECS 126; students cannot get credit for Data 140 after taking Stat 134 or EECS 126
  • Have taken a year of calculus at the level of Math 1A-1B and preferably higher; Data 140 involves some double integration and partial derivatives
  • Have taken or are concurrently taking linear algebra in Math 54 or Math 56 or EE 16B or Math 110, or have taken an equivalent linear algebra course at another college (please note the change: EE 16A is no longer accepted)
  • Have taken Data 8 or both Stat 20 and CS 61A or both Stat 20 and Data 88C (formerly CS 88)

The course has been designed for students who have the specific background above in math, programming, and statistical inference. All prerequisites and corequisites are enforced by CalCentral.

Other Upper Division Probability Courses

The campus offers four upper division probability courses including Data 140. Here is Prof. Adhikari’s take on the other three.

  • If you are interested in data science and have taken CS 61A, CS 70, multivariable calculus, and linear algebra, but not Data 8/100, I recommend EECS 126. It builds on the probability content of CS 70 to cover properties of discrete and continuous distributions, both univariate and multivariate; the theory underlying fundamental methods of statistical inference; and stochastic (that is, random) processes. Python labs cover a variety of applications to data science. It will prepare you well for classes in machine learning and other ways of making decisions based on data.

  • Stat 134 has the fewest prerequisites (just one year of calculus) and is aimed at a more general audience than the other three classes. The content overlaps extensively with Data 140 and INDENG 172, and the general approach to distributions and conditioning is quite similar to that in Data 140. Unlike Data 140 and EECS 126, it has almost no inference and no computing. Stat 134 prepares students well for Stat 135 and several 150-level statistics courses, and provides a good foundation for classes in machine learning and decision making.

  • INDENG 172 requires calculus as well as programming experience and is aimed primarily at INDENG and ORMS majors. Of the three other classes, it resembles Stat 134 most closely. The programming component varies by semester; often there is none.