See the schedule for a list of topics by week, with links to slides.

Course and Instructor Information

  • Math 345 CRN 11689, Stats for Data Science
  • Dsci 345 CRN 13329, Stats for Data Science

Instructor: Benjamin Young, Fenton 211, bjy@uoregon.edu. Pronouns: He/They.

This is a slightly summarized version of the syllabus; I’ve taken out any links that I didn’t want to post on the public internet. The official version is on the Canvas site.

Office Hours

  • 10am-11am Tuesdays and Fridays, or by appointment; please see Canvas for my appointment booking link.

Course Description

This course covers a theoretical basis in probability and statistics that is foundational for work in data science. Randomness and uncertainty are fundamental at all stages of data science work, from understanding strengths and limitations of data sources, through modeling and inference, to algorithmic aspects of learning and prediction (e.g., random forests; stochastic gradient descent). This course is aimed at students who have done some work looking at real data and want to have a deeper (and more quantitative) understanding of what we are doing when we make predictions using data. The course covers both tools for modeling randomness and calculating properties of those models, and the process of estimating quantities from data. An important thread throughout the course is simulating data: being able to construct and simulate from sophisticated models for random data generation.

Course Objectives

  • Students will become fluent in basic topics of probability: randomness, uncertainty, estimation, and prediction; probability and expectation, conditional probabilities, and random variables; mean and variance.
  • Students should be able to mathematically model real-world situations, using randomness to model uncertainty. Students will do this through simulation, application of common probability distributions, and stochastic gradient descent.
  • Students will learn how to pick realistic simulation parameters by fitting models to data. This includes: the central limit theorem, method-of-moments fitting, minimum-variance estimators, and discussions of outliers, overdispersion, scale mixtures, and goodness of fit.
  • Students will become familiar with hypothesis testing, including: likelihood, P-values, power, false positives, and false discovery rates.
  • Students will be able to apply standard multivariate models, understand and model correlation, and understand principal component analysis, both with linear models and several generalizations.
  • Students will learn to detect and correct for overfitting using regularization and cross-validation.
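
To give a flavor of the "simulate, then fit" thread in these objectives, here is a minimal sketch of method-of-moments fitting. It is only an illustration, not course material: the choice of a gamma distribution, the parameter values, the seed, and the function name `fit_gamma_moments` are all assumptions made for this example, and it uses only the Python standard library.

```python
import random
import statistics

def fit_gamma_moments(data):
    """Method-of-moments estimates (shape, scale) for a gamma distribution.

    Uses mean = shape * scale and variance = shape * scale**2.
    """
    mean = statistics.fmean(data)
    var = statistics.variance(data)  # sample variance
    return mean**2 / var, var / mean

# Simulate data from a model with known (hypothetical) parameters...
rng = random.Random(345)  # fixed seed so the run is reproducible
true_shape, true_scale = 2.0, 3.0
data = [rng.gammavariate(true_shape, true_scale) for _ in range(20_000)]

# ...then check that fitting recovers values close to them.
shape_hat, scale_hat = fit_gamma_moments(data)
print(f"estimated shape: {shape_hat:.2f}, estimated scale: {scale_hat:.2f}")
```

With this many samples the estimates should land near the true values, which is the basic sanity check for any fitting procedure: simulate from known parameters, fit, and compare.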

Prerequisites

Students should be familiar with the basics of calculus and linear algebra, have experience with Python, and have worked with describing and visualizing data. UO course prerequisites: Math 342 (linear algebra), CS 211 (computer science II), and either DSCI 101/102 or some other exposure to data science/statistics.

Assignments

There will be weekly assignments, each a Jupyter notebook; download each one, complete it on your own, and submit your completed notebook on Canvas by Fridays at 11:59pm. Please do not ask me for extensions; the grader has to do the grading over the weekend, period.

You are encouraged to work with other students (please say who you worked with) and use what internet sources you like (please cite your sources); please don’t work with people outside the class and please do the technical writing on your own, avoiding use of generative AI on material you submit.

Quizzes

There will be 9 short quizzes, at the start of class on Monday. Do these during class time, on your own; after class, photograph the quiz and upload it to Canvas that day. I’ll typically grade them on Tuesday. The purpose of the quizzes is to:

  • provide a grade-based motivation for students to keep up with the readings;
  • assess whether students can do modelling without assistance (the difference between a B and C grade);
  • help the instructor understand the state of knowledge of students in the class.

Take-home final

There is no in-class final exam; instead there is a take-home final exam, which is due Friday Dec 13, 11:59pm (i.e. end of day, Friday of exam week). The only differences between the assignments and the take-home final exam are:

  • I (the instructor) grade the take-home final exam, not a grader;
  • the take-home final exam is slightly longer, and covers the whole class;
  • students are asked not to collaborate with each other (use of outside sources is still OK with proper citation).

Grade breakdown:

  • 20% Quizzes (lowest score dropped)
  • 50% Assignments (lowest score dropped)
  • 30% Take-Home Final
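
As a rough illustration of how this breakdown combines into a course percentage, here is a minimal sketch. It assumes that every quiz and every assignment counts equally within its category and that all scores are expressed as percentages; the function names and the sample scores are hypothetical, and the Canvas gradebook remains the authoritative calculation.

```python
def mean_drop_lowest(scores):
    """Average of the scores after dropping the single lowest one."""
    kept = sorted(scores)[1:]
    return sum(kept) / len(kept)

def course_percentage(quizzes, assignments, final):
    """Weighted course score: 20% quizzes, 50% assignments, 30% final.

    All inputs are percentages (0-100). Assumes equal weight per item
    within a category, which is an assumption of this sketch.
    """
    return (0.20 * mean_drop_lowest(quizzes)
            + 0.50 * mean_drop_lowest(assignments)
            + 0.30 * final)

# Hypothetical example: a missed quiz and a weak assignment are dropped.
example = course_percentage(
    quizzes=[100, 80, 0],
    assignments=[70, 90, 50],
    final=60,
)
print(round(example, 1))
```

In this made-up example the quiz average after the drop is 90 and the assignment average is 80, so the total is 0.2 × 90 + 0.5 × 80 + 0.3 × 60 = 76.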

Textbooks

We will be assigning reading from these (freely available) books:

There is a list of other useful reading materials on the page of references.

Software:

Class demonstrations, exercises, and homeworks will be done using jupyter notebooks, so you should have on your computer a working python installation with

Course Modality

This class is offered in person. I’d say “face to face” but I’ll probably wear a KN95 mask most of the time - I, and others in my immediate family, take extra care to avoid even minor colds and the like.

I aim to make class available remotely by broadcasting lectures on Zoom and making the recordings available afterwards on Canvas. However, in the past there have been technical issues making this impossible for more than a few lectures. Moreover, for most students in-class participation is more engaging and results in better learning outcomes, so I strongly encourage in-person attendance. That said, I know that things happen in life that might make you unable to come; for instance, please do not come to class if you are sick.

Course Policies

How will I communicate with you?

Our class will communicate through our Canvas site. Announcements and emails are archived there, automatically forwarded to your UO email, and can even reach you by text. Check and adjust your settings under Account > Notifications.  

I get in touch with individual students when needed through email. When giving feedback on assignments, I do so in Canvas.

How and why can you communicate with me?

Please reach out to me by email or by attending office hours!  I can also take a few questions immediately after class.

I enjoy talking with students about our course material! Are you confused or excited about something? Wondering how what we’re learning relates to current events, career choices, or other classes you can take at UO? Please be in touch! Please also be in touch to tell me how you are doing in the course. If you are having trouble with some aspect of it, I would like to strategize with you. I believe every student can succeed in this course, and I care about your success.

Classroom Community Expectations

All members of the class (both students and instructor) can expect to:

  • Participate and Contribute:  All students are expected to participate by sharing ideas and contributing to the learning environment. This entails preparing, following instructions, and engaging respectfully and thoughtfully with others. 

  • Expect and Respect Diversity: All classes at the University of Oregon welcome and respect diverse experiences, perspectives, and approaches. What is not welcome are behaviors or contributions that undermine, demean, or marginalize others based on race, ethnicity, gender, sex, age, sexual orientation, religion, ability, or socioeconomic status. We will value differences and communicate disagreements with respect. We may establish more specific guidelines and protocols to ensure inclusion and equity for all members of our learning community.    

  • Help Everyone Learn: Part of how we learn together is by learning from one another. To do this effectively, we need to be patient with each other, identify ways we can assist others, and be open-minded to receiving help and feedback from others. Don’t hesitate to contact me to ask for assistance or offer suggestions that might help us learn better.    

Absences

Attendance is important because we will develop our knowledge through in-class activities that require your active engagement. We’ll have discussions, small-group activities, and do other work during class that will be richer for your presence, and that you won’t be able to benefit from if you are not there. Excessive absences make it impossible to learn well and succeed in the course. While there is not an automatic grade deduction for missing classes, it is unlikely that students who miss significant numbers of classes will pass the course.

That said, if you are feeling ill, please stay home to heal and avoid infecting your classmates. Please take absences only when necessary, so when they are necessary, your prior attendance will have positioned you for success.

Student Workload and Time Use

This is a 4-credit hour course, so you should expect to complete 120 hours of work for the course: an average of about 12 hours each week (this includes time in class). My estimate for time usage for activities and assignments in an average week is below; some weeks may have shorter or longer time commitments, either by design or due to course scheduling.

  • In-class meetings: 3 hours
  • Pre-class reading: 2 hours
  • Assignments: 7 hours

I encourage working ahead of due dates.

Course Deadlines and Late Work

Assignments in this course are always due on Fridays at 11:59pm. Although deadlines are firm, your lowest assignment score will be dropped.

Work submitted late is penalized 10% per day late, where a “day” is each 24-hour period (or part of one) past the deadline. If work is more than 3 days late, it is not accepted. Work submitted even a few minutes late is, nonetheless, late, and will be penalized, so please submit your work well in advance of the deadline. We’re going to stick to this policy quite rigidly, so that we can focus on returning the graded assignments to you promptly.
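
To make the late-work arithmetic concrete, here is a minimal sketch of one reading of this policy. It assumes that any started 24-hour period counts as a full day and that the 10% per day is deducted from full credit (so one day late means at most 90% credit); the function name is hypothetical, and the grader's actual bookkeeping may differ.

```python
import math

MINUTES_PER_DAY = 24 * 60

def late_credit_percent(minutes_late):
    """Maximum credit (as a percentage) available for late work.

    Assumptions of this sketch: any started 24-hour period counts as a
    full day, and work more than 3 days late receives no credit.
    """
    if minutes_late <= 0:
        return 100  # on time
    days_late = math.ceil(minutes_late / MINUTES_PER_DAY)
    if days_late > 3:
        return 0  # not accepted
    return 100 - 10 * days_late

for minutes in (0, 5, MINUTES_PER_DAY + 1, 3 * MINUTES_PER_DAY + 1):
    print(minutes, "minutes late:", late_credit_percent(minutes), "% credit")
```

Under these assumptions, a submission five minutes past the deadline is already capped at 90%, which is why submitting well ahead of 11:59pm is the safe choice.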

Grading Policies

Grading scheme: A = 85-100%, B = 75-84%, C = 65-74%, D = 55-64%, F = 0-54%.

Note that these grades translate directly and unambiguously to grades out of 10.

The math department’s undergraduate grading standards will tell you what the various letter grades should mean -  please see the rubric for “applied” classes. The relevant language is:

  • A: Consistently chooses appropriate models, uses correct techniques, and carries calculations through to a correct answer. Able to estimate error when appropriate, and able to recognize conditions needed to apply models as appropriate.
  • B: Usually chooses appropriate models and uses correct techniques, and makes few calculational errors. Able to estimate error when prompted, and able to recognize conditions needed to apply models when prompted.
  • C: Makes calculations correctly or substantially correctly, but requires guidance on choosing models and technique. Able to estimate error when prompted and able to recognize conditions needed to apply models when prompted.
  • D: Makes calculations correctly or substantially correctly, but unable to do modeling (though please note: this class is about modeling, so I probably will not be giving out a lot of grades in this range).
  • F: Can neither choose appropriate models or techniques nor carry through calculations.

Pluses and minuses are given out at the instructor’s discretion, generally only to scores at the very top or bottom of these ranges (so that an A+ really means truly exceptional work).

I anticipate not curving the scores at all. If you get 8 out of 10 on a problem, that means you did a B’s worth of work on it.

Generative Artificial Intelligence Use

Students can use GenAI tools in this class to help with certain aspects of course work and assignments. This includes brainstorming ideas, creating a paper outline, or summarizing research findings of articles. However, you cannot use content such as text or graphics created by GenAI tools in your work; rather, you must be the author/creator of your work submissions. For example, you can use a GenAI tool to suggest a paper outline based on a draft you provide it, but you cannot submit an assignment with text generated by GenAI as if the text is your own writing. 

For me, there are two essential reasons for this:

  • Currently, GenAI tools cannot correctly attribute their sources. If you consult a large language model to figure something out, you need to find out where it got its info, read that yourself, and cite it; otherwise, you’re plagiarizing from the original author. Plagiarism is a form of academic misconduct; see the section on Academic Integrity below. Similarly, code assistants and the like may help you program, but you have to figure out where they learned their tricks from, and cite that; otherwise, you’re once again plagiarizing from the author.
  • To assign you a letter grade, I need to assess whether you can do mathematical modelling and whether you can assess the errors in your models - not whether your GenAI tool can do it.

Be advised that, in accordance with UO policy, if I believe you’ve handed in work created in whole or in part by GenAI tools, I may submit a report of suspected academic misconduct to the Office of Student Conduct and Community Standards for that office to make a determination of responsibility and, if warranted, assess a grade penalty. So, if you are in doubt or have questions about a particular GenAI tool and whether its use is okay, check in with me and let’s discuss.

University Policies

Access and Accommodations

The University of Oregon and I are dedicated to fostering inclusive learning environments for all students and welcome students with disabilities into all of the University’s educational programs. The Accessible Education Center (AEC) assists students with disabilities in reducing campus-wide and classroom-related barriers. If you have or think you have a disability (https://aec.uoregon.edu/content/what-disability) and experience academic barriers, please contact the AEC to discuss appropriate accommodations or support. Visit 360 Oregon Hall or aec.uoregon.edu for more information. You can contact AEC at 541-346-1155 or via email at uoaec@uoregon.edu.

Accommodations for Religious Observances

The University of Oregon respects the right of all students to observe their religious holidays, and will make reasonable accommodations, upon request, for these observances. If you need to be absent from a class period this term because of a religious obligation or observance, please fill out the Student Religious Accommodation Request fillable PDF form and send it to me within the first weeks of the course so we can make arrangements in advance.

Your Wellbeing

Life at college can be very complicated. Students often feel overwhelmed or stressed, experience anxiety or depression, struggle with relationships, or just need help navigating challenges in their lives. If you’re facing such challenges, you don’t need to handle them on your own; there’s help and support on campus.

As your instructor, if I believe you may need additional support, I will express my concerns and the reasons for them, and refer you to resources that might be helpful. It is not my intention to know the details of what might be bothering you, but simply to let you know I care and that help is available. Getting help is a courageous thing to do, for yourself and those you care about.

University Health Services helps students cope with difficult emotions and life stressors. If you need general resources on coping with stress or want to talk with another student who has been in the same place as you, visit the Duck Nest (located in the EMU on the ground floor) and get help from one of the specially trained Peer Wellness Advocates.  

University Counseling Services (UCS) has a team of dedicated staff members to support you with your concerns, many of whom can provide identity-based support. All clinical services are free and confidential. Find out more at counseling.uoregon.edu or by calling 541-346-3227 (anytime UCS is closed, the After-Hours Support and Crisis Line is available by calling this same number). 

Basic Needs

Being able to meet your basic needs is foundational to your success as a student at the University of Oregon. If you are having difficulty affording food, don’t have a stable, safe place to live, or are struggling to meet another need, visit the UO Basic Needs Resource page for information on how to get support. They have information on food, housing, healthcare, childcare, transportation, technology, finances (including emergency funds), and legal support.

If your need is urgent, please contact the Care and Advocacy Program by calling 541-346-3216, filling out the Community Care and Support form, or by scheduling an appointment with an advocate.

Respect for Diversity

You can expect to be treated with respect in this course. Both students and your instructor(s) enter with many identities, backgrounds, and beliefs. Students of all racial identities, ethnicities, genders, gender identities, gender expressions, national origins, religious affiliations, sexual orientations, citizenship statuses, ability and other visible and non-visible differences belong in and contribute to this class and this discipline. All students are expected to contribute to a respectful, welcoming and inclusive environment for every other member of the class.  

Class rosters are provided to instructors with students’ legal names. Please let me know if the name or pronouns I have for you are not accurate. It is important to me to address you properly. 

Please let me know if aspects of the instruction, course design, or class activities undermine these principles in any way. You may also notify the Department of Data Science. For additional assistance and resources, you may also consider contacting the Division of Equity and Inclusion through their website or by phone (at 541-346-3175), or the Center for Multicultural Academic Excellence through their website or by phone (at 541-346-3479).

Academic Integrity

Collaboration between students is typically allowed on assignments and the take-home final, but you must state which students you worked with. Consultation with existing outside sources is likewise typically allowed on assignments and the take-home final, but you must cite your sources.

Collaboration with non-students is typically not allowed (so, don’t post the homework in a forum and get someone to do it for you).

The University Student Conduct Code defines academic misconduct, which includes using unauthorized help on assignments and examinations, the use of sources without acknowledgment, and recording class without “the express written permission of the instructor(s).” Academic misconduct is prohibited at UO. I will report all suspected misconduct to the Office of Student Conduct and Community Standards. If the Office finds a student has committed misconduct, consequences can include failure of the relevant assignment or exam, or of the course.

While unauthorized help and use of sources without citation are prohibited, learning together and citing sources is crucial! For help with citation, see the UO Libraries’ Citation Guides research guide.

If at any point in the term you are unsure about whether a behavior aligns with academic integrity in our course, please contact me. I view student questions about academic integrity as a desire to act with integrity, so I welcome your questions.

Reporting Obligations

I am a designated reporter. For information about my reporting obligations as an employee, please see Employee Reporting Obligations on the Office of Investigations and Civil Rights Compliance (OICRC) website. Students experiencing sex- or gender-based discrimination, harassment or violence should call the 24-7 hotline 541-346-SAFE (7244) or visit safe.uoregon.edu for help. Students experiencing all forms of prohibited discrimination or harassment may contact the Dean of Students Office at 541-346-3216 or the non-confidential Title IX Coordinator/OICRC at 541-346-3123 to request information and resources. Students are not required to participate in an investigation to receive support, including requesting academic supportive measures. Additional resources are available at investigations.uoregon.edu/how-get-support.

I am also a mandatory reporter of child abuse. Please find more information at Mandatory Reporting of Child Abuse and Neglect.

Pregnancy Modifications. Pregnant and parenting students are eligible for academic and work modifications related to pregnancy, childbirth, loss of pregnancy, termination of pregnancy, lactation, and related medical conditions. To request pregnancy-related modifications, students should complete the Request for Pregnancy Modifications form on the OICRC website. OICRC coordinates academic and other modifications for pregnant and parenting students to ensure students can continue to access their education and university programs and activities.

Academic Disruption due to Campus Emergency

In the event of a campus emergency that disrupts academic activities, course requirements, deadlines, and grading percentages are subject to change. Information about changes in this course will be communicated as soon as possible by email, and on Canvas. If we are not able to meet face-to-face, students should immediately log onto Canvas and read any announcements and/or access alternative assignments. Students are also expected to continue coursework as outlined in this syllabus or other instructions on Canvas.

Inclement Weather

It is generally expected that class will meet unless the University is officially closed for inclement weather. If it becomes necessary to cancel class while the University remains open, this will be announced on Canvas and by email. Updates on inclement weather and closure are also communicated as described on the Inclement Weather webpage.