STAT 301 Introduction
These materials accompany the teaching of STAT 301, Elementary Statistical Methods, at Purdue University. STAT 301 is a university-wide undergraduate course that serves a large and diverse student body. It is offered every semester throughout the year, in both online and in-person versions, and is taught by multiple instructors.
This short book provides comprehensive materials to meet the needs of different kinds of students. For some of you, this may be the only statistics course you take, and your goal may simply be to pass it. Passing the course is the minimum expectation I will assume, so I will explicitly lay out what needs to be done to pass. You need to spend enough time to complete the minimum assignments and tasks required to pass the course. Otherwise, you will fail unless you have already mastered these materials and developed the corresponding statistical skills.[1]
I hope that for most of you, the goal is to build a solid foundation in statistics, so that you develop an interest in the subject and might go on to take more advanced statistics courses. This can be considered the second level of achievement. You will find relevant materials here to help you overcome the obstacles to understanding the content of this course and build a strong foundation. However, this will also require you to put in extra effort to read and digest the materials provided.
Of course, learning does not stop at building a solid foundation. If your goal (the third level of achievement) is to fully master the materials in this course, for example by understanding the connections between topics and the sequence of the content so that you gain a clear picture of the overall roadmap, you will find some teaching materials related to this goal. However, there won’t be many, because of the organization and structure of this course. Additionally, the assignments and exams are not designed to test this level of mastery, which means the tests will not be tricky, a relief for most students.
The three goals mentioned above can also be interpreted as three progressive levels of understanding the course materials. I will elaborate on this point in the later sections of this introductory chapter.
STAT 301 is an introductory-level statistics course that introduces some of the basic statistical concepts and procedures commonly encountered in many fields of study. The aim is to help students become familiar with these concepts and procedures. As a result, the assignments and exams are designed in a way that I would describe with an old Chinese idiom: “draw a ladle by copying a gourd.” The idiom means that you simply need to follow the instructions and examples step by step to complete most of the assignments in this course.
In terms of the learning experience, I hope everyone aims to achieve the second goal: building a solid foundation. This will allow you to complete the assignments and exams more efficiently and achieve better grades.
One more thing I want to mention is that introductory-level statistics courses usually have two major components. One is the probability part (probabilistic reasoning), and the other is the statistics part (statistical reasoning). The probability part is fascinating; perhaps God created this world by randomly throwing dice! However, STAT 301 will primarily focus on the latter.
Let’s use my favorite dice-rolling example to explain these two parts. Suppose we have a standard six-faced die labeled with numbers 1 to 6. We can ask the following questions:
What is the probability of getting a 1 if we roll the die once?
What is the probability of getting a 1 on the first roll and a 2 on the second roll if we roll the die twice?
These questions are related to the probability part. We start with a real-world situation or application (a model) and then ask questions about how this world or application generates our data (observations or outcomes). In this case, the data is “1” for the first question and “1, 2” for the second question.
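The two probability questions above can be checked with a short calculation. The following is only a minimal sketch, assuming the die is fair (each face equally likely) and the rolls are independent; the helper name `probability` is made up for illustration and is not part of the course materials. It lists every possible sequence of rolls and counts the ones that match the question.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # assumption: a fair die with faces 1 to 6


def probability(event, n_rolls):
    """Exact probability of `event` when all sequences of n_rolls are equally likely."""
    outcomes = list(product(FACES, repeat=n_rolls))
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


# Question 1: a 1 on a single roll.
p_one = probability(lambda outcome: outcome[0] == 1, n_rolls=1)

# Question 2: a 1 on the first roll and a 2 on the second roll.
p_one_then_two = probability(lambda outcome: outcome == (1, 2), n_rolls=2)

print(p_one)           # 1/6
print(p_one_then_two)  # 1/36
```

Here the model (a fair die) is fixed in advance, and the code works forward from the model to the chance of seeing particular data.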
We can also ask questions like the following after observing the results of dice rolls:
After three rolls, we see the results 1, 1, 1. Do we still believe this is a fair die, or could it be a die that favors 1?[2]
After one hundred rolls, we have not seen the result 6. Do we still believe this is a standard six-faced die?
These questions are related to the statistical reasoning part, which involves using data to ask questions about our real-world application or model. Once we have a dataset, we aim to make claims about the model or the real world. Specifically, we want to understand the statistical procedures for interpreting the data and determining how confident we can be about the accuracy of our claims. This course will primarily focus on this part, which is why we will begin with the data side.
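To get a first feel for the two statistical questions above, one can ask how likely the observed results would be if the die really were a fair, standard six-faced die. The sketch below is only an illustration under that assumption (fair die, independent rolls); it is not the formal procedure taught later in the course, and the variable names are made up for this example.

```python
from fractions import Fraction

# Assumption: fair die, independent rolls.

# Three rolls that all show 1.
p_three_ones = Fraction(1, 6) ** 3

# One hundred rolls with no 6 at all: each roll must land on one of the other five faces.
p_no_six_in_100 = Fraction(5, 6) ** 100

print(p_three_ones)            # 1/216
print(float(p_three_ones))     # a bit less than half a percent
print(float(p_no_six_in_100))  # extremely small
```

Three 1s in a row is unusual for a fair die but not impossible, so three rolls give only weak grounds for doubt. One hundred rolls without a single 6 would be so unlikely under the fair-die assumption that it is reasonable to question whether the die has a 6 at all. Turning this kind of intuition into a careful statement of confidence is what statistical reasoning is about.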
Table of Contents
Below is the table of contents for this book.
At the beginning
Working with data
Making statistical inference
Modeling linear regression
Additional Topics