Incorporating Laboratory Activities into Statistics Courses

Beth L. Chance, University of the Pacific, bchance@uop.edu

Abstract

Computer lab exercises can greatly enrich learning experiences in statistics courses. Computers allow us to use larger, richer data sets relevant to students' experiences, and to provide visual explanations of key concepts, such as sampling distributions. In this paper, I present motivation for these lab exercises, examples of student experiences, and suggestions for implementation and assessment of these computer activities.

1. Introduction

Statistics should be taught as a laboratory science, along the lines of physics and chemistry rather than traditional mathematics. Students must get their hands dirty with data. The laboratory must be a requirement and must contain more than a few computers ... having students discuss and write about their understandings and interpretations of the problems. - R. Scheaffer, in Cobb (1992)

Recent calls for change in statistics education (e.g., Singer & Willet, 1990; Tanner & Wardrop, 1992; Cobb, 1992) have emphasized that statistics students should be working with real data in a hands-on environment, learning first hand how and why data are collected. In this way, students are doing statistics instead of just reading and hearing about statistics. I have found that incorporating computer lab activities into my statistics courses accomplishes just this. Computers allow us to provide students with access to larger, more relevant data sets, and to visual, interactive demonstrations of key concepts. Furthermore, I require students to incorporate the computer results into written explanations of statistical concepts, providing them with valuable technical writing experience and practice using the language of statistics. Students are required to explain their understanding of the concepts and to discuss issues in data collection and model assumptions, giving me a more insightful assessment of their understanding. I further assess these skills by including a take-home, open-ended component on the final exam, requiring individual exploration and development of solutions using the computer. Such assessment techniques directly measure student ability to do statistics. Through these exercises students achieve not only a high level of computer literacy, but also an appreciation and a deeper understanding of statistics. Still, these exercises need to be carefully integrated into the course to be successful.

2. Use of Real Data and Data Collection Experiences

As Cobb (1992) states, students should view statistics as a process of scientific inquiry, discovering information from real life data. If the students are able to connect to the data, they will better retain the information from the example. Also, by collecting data themselves, students actively participate in the process and are better able to understand the issues involved. Thus, I wanted to create laboratory exercises that would utilize actual data sets of immediate relevance to the students and exercises requiring students to personally collect data.

To incorporate the first type of data set, I have adapted several existing data sets for use by the students. For example, a set of baseball statistics compiled by Chris Albright (albright@big.bus.indiana.edu) is used to allow the students to explore the different behavior of means and medians in the presence of skewness or outliers. Students are asked to identify individual players that appear as outliers on such variables as home runs and batting average. Outliers are thus connected with actual people they have heard about. Another data set was compiled by Robin Lock from AAUW and US News and World Report data on US colleges and universities (available from the Journal of Statistics Education data archive). Due to memory limitations with our computer program (the student version of Minitab), I reduced this list to a set of 100 western colleges and universities. Still, students are interested in these data and are able to see how their university compares to others. Thus, students want to know the answers and see the computer as a means to obtain those answers. This increases student involvement and appreciation, as they are eager to explore the relationships among the variables. Students also have definite preconceptions they can bring to the data, and hypotheses they want to test.
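The contrast between means and medians described above can be illustrated with a minimal Python sketch. The home-run counts below are made-up values chosen only for illustration; they are not taken from the Albright data set, and the labs themselves use Minitab rather than Python.

```python
import statistics

# Hypothetical home-run counts for ten players (illustrative values only,
# not drawn from the baseball data set used in the lab).
home_runs = [2, 5, 7, 8, 9, 10, 11, 12, 14, 15]

# The same list with one extreme player added as an outlier.
with_outlier = home_runs + [61]


def summarize(values):
    """Return (mean, median) of a list of numbers."""
    return statistics.mean(values), statistics.median(values)


mean_before, median_before = summarize(home_runs)
mean_after, median_after = summarize(with_outlier)

print(f"without outlier: mean={mean_before:.2f}, median={median_before}")
print(f"with outlier:    mean={mean_after:.2f}, median={median_after}")
print(f"shift in mean:   {mean_after - mean_before:.2f}")
print(f"shift in median: {median_after - median_before:.2f}")
```

A single extreme value pulls the mean much further than the median, which is the behavior students are asked to discover by locating individual outlying players.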

Secondly, in several of the labs students gather the data themselves. These activities range from measuring the diameter of a tennis ball to understand variability, to a taste test between Coke and Pepsi, to recording prices on a set of items at two different grocery stores, to observing the proportion of each M&M color. Since they have collected the data themselves, they feel ownership of the data, which makes them even more curious about the results. Students also need to make certain decisions along the way. For example, one lab asks students to time the wait of cars at two corners, one with a stop light and one with a stop sign. Students must precisely define when a car has begun and ended its wait. They learn that many of these decisions are not automatic and can have an effect on the final results. They are also directly involved with issues about independence and randomization. These shorter experiences are good models for their longer term project. Thus, by using realistic and self-collected data, students gain a better understanding of the use of statistics and its relevance to their own experiences.

3. Use of Visual Explorations

Since many students are visual learners, computers also play an essential role in allowing students to efficiently investigate concepts visually. To provide the students with visual explanations, I use a series of programs developed by Robert delMas, University of Minnesota. These programs currently run on Macintosh computers. Several similar programs are also being developed, such as ExplorStat (Wackerly, University of Florida), HyperStat (Lane, Rice University), and Teaching Statistics Visually (Tracy, Doane, & Mathieson, Oakland University). These programs allow students to explore concepts and discover properties on their own. For example, the Standard Normal Distribution program by delMas allows students to adjust the population mean and standard deviation to see how the z-score and probabilities respond. They can adjust the X values, Z value, and probabilities independently, seeing for themselves how each affects the others. Another delMas program allows them to simulate sampling distributions while changing the population curve, sample size, and number of samples. Students directly experience the Central Limit Theorem, providing them with a visual image to refer back to. Another delMas program, a simulation of the Monty Hall problem, allows students to experience long-run probabilities in a fun setting where the answer is not initially intuitive for them. Many statistical properties are most apparent "in the long run" and computers allow students to quickly execute these simulations on a larger scale in order to properly discover the properties. I see these simulations as a complement to the smaller scale simulations done in class, since small in-class examples on their own can be misleading.
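For readers without access to these programs, the sampling-distribution exercise can be approximated in a few lines of Python. This is only a rough sketch, not the delMas program: the exponential population, the sample sizes, the number of samples, and the function names are all assumptions chosen for illustration, and the output is numeric rather than graphical.

```python
import random
import statistics


def sample_means(draw, sample_size, num_samples, rng):
    """Draw num_samples samples of size sample_size; return the sample means."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


def draw_exponential(rng):
    """One observation from a right-skewed population with mean 1 and SD 1."""
    return rng.expovariate(1.0)


rng = random.Random(0)  # fixed seed so that reruns give the same numbers
results = {}
for n in (1, 5, 30):  # assumed sample sizes, chosen for illustration
    means = sample_means(draw_exponential, n, 2000, rng)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(
        f"n={n:2d}: mean of sample means={results[n][0]:.3f}, "
        f"SD of sample means={results[n][1]:.3f}"
    )
```

As the sample size grows, the sample means stay centered near the population mean while their spread shrinks roughly like one over the square root of the sample size. Plotting a histogram of `means` for each sample size would show the shape moving from skewed toward bell-shaped, which is the visual image the lab aims for.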

These are just a few examples of how these programs can enhance student understanding of concepts by giving them a visual representation to accompany the verbal representation and to discover properties of these ideas on their own. By constructing their own knowledge in this way students achieve a deeper understanding and longer retention of the ideas.
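The long-run behavior in the Monty Hall problem can likewise be sketched in Python. Again, this is an illustrative stand-in rather than the delMas simulation: the function names and the numbers of games are assumptions, and the host is assumed always to open a door hiding no prize.

```python
import random


def play_monty_hall(switch, rng):
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    first_pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != first_pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        final_pick = next(d for d in doors if d != first_pick and d != opened)
    else:
        final_pick = first_pick
    return final_pick == car


def win_rate(switch, num_games, rng):
    """Proportion of wins over num_games independent games."""
    wins = sum(play_monty_hall(switch, rng) for _ in range(num_games))
    return wins / num_games


rng = random.Random(1)  # fixed seed so that reruns give the same numbers
rates = {}
for num_games in (10, 100, 10000):
    rates[num_games] = (
        win_rate(False, num_games, rng),
        win_rate(True, num_games, rng),
    )
    stay, swap = rates[num_games]
    print(f"{num_games:5d} games: stay wins {stay:.3f}, switch wins {swap:.3f}")
```

With only a handful of games the two strategies can look similar or even reversed, which is why small in-class demonstrations can mislead; over many games the proportions settle near one third for staying and two thirds for switching.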

4. Student Writeups

It is crucial for students not only to see these activities, but also to be asked to explain what they learned, further internalizing the experience and making them responsible for the knowledge. After each activity, students have one week to complete a writeup about the activity. These writeups are either answers to a series of questions I pose, or a "Full Writeup" technical report. In the questions, students are asked to explain the concepts in their own words. This provides me with the most accurate picture of what they understand and allows me to provide feedback on their interpretations. They also learn that there can be multiple interpretations for the same experience. In the Full Writeups, students are to provide an introduction, explanation of data collection procedures, presentation of results, discussion of the results, and their final conclusions. Students are also asked to verify any assumptions required by statistical procedures. In this way, students must develop a logical argument based solely on the data they observed. Guidelines for these lab writeups (similar to those given by Egge et al., 1995 and Spurrier, Edwards, and Thombs, 1995) are also given. I want students to learn how to effectively summarize their results and develop their own interpretations. Students are also asked to provide recommendations for future studies so they may learn to critically evaluate the procedures that were used. Students also see that gathering information from data is a continuous process. These writeups allow us to clearly see what the student learned from the activity, and they give the students valuable practice using the language of statistics, which is often quite foreign to them initially.

5. Implementation

To ensure successful implementation of lab activities, care has to be taken in their construction and use, especially with a computer-phobic audience. For our labs, I have used the Minitab statistical software in conjunction with Microsoft Word for the reports. Students find the Minitab software easy to learn and I feel it provides them with sufficient background to transfer to other computer packages. Students are also able to easily incorporate Minitab output and graphics into a word processing program. The student version of Minitab has some memory limitations, but I feel its statistical procedures are sufficient for my introductory statistics courses and that its ease of use helps students overcome many of their anxieties.

Furthermore, the instructions need to be well laid out for the students. I require students to purchase a laboratory manual I put together that includes detailed instructions for using Minitab and Word, as well as instructions for the labs for the semester. This gives students a permanent reference manual, has facilitated their learning of the software, and has given them a clear idea of what is expected over the course of the semester. The manual also includes pictures of the menus they will need to use as they execute commands. Instructions are quite detailed initially, but then become less directed, as I try to encourage the students to become more independent analysts. Students often exhibit poor memory of earlier labs, and thus an index, a glossary of commands, and a frequently asked questions section have been added. In short, I want the instructions to be clear enough that students can follow them independently, but I also want to ensure that the students are learning to solve problems for themselves.

Our lab currently has 25 machines. To ensure each student has individual access to a machine, I break the class into two sections of 20-25 students each to meet in the lab with the instructor once per week. The immediate access to the instructor has also eased computer anxiety as students work through the lab instructions. The lab is also open in the afternoons and evenings to provide students with the opportunity to finish their lab writeups. Ideally the lab assistant employed at this time will have an understanding of the programs involved. It is important that the students don't become overwhelmed by the computer exercises. My goal is for them to finish the software instructions during the hour we meet together so that this is not their stumbling block.

After each lab has been graded, an example lab, by one of the students, is posted for students to refer to. The goal is for students to review these papers to better understand what is expected of them, feel pride if their paper is selected, see the work of their peers (and thus what they are also capable of), and see that there can be multiple interpretations/conclusions for the same activity. This type of guidance and feedback is important. I chose to do it after the lab is turned in so that students generate their own reports and develop their own style of technical exposition instead of following a template.

Student participation in development of these activities is also important. I encourage student feedback and constantly update the list of instructions. I have also employed a student assistant to help in the development of the activities and to ensure clarity of the instructions. It is important to understand that development, implementation, and assessment of these lab activities add a significant time commitment on the part of the instructor and that these activities need to be well-integrated with the rest of the course to be successful.

6. Assessment

When grading these writeups, less than 50% of the grade stems from computer output, allowing me to demonstrate to the students that the computer output is a small part of the analysis. Instead more of the grade is composed of their explanations, their ability to interpret the computer output, and their effective communication of their knowledge. Students are encouraged to work in pairs so they will discuss and debate the concepts with each other. I feel that fostering this discussion greatly enhances the students' learning experience. Still, since part of the final involves computer work, students know that each of them is individually responsible for knowing how to use the computer programs.

This component of the final is a take-home exam. Students are given a series of questions, each requiring different statistical analyses. They must identify and carry out the appropriate analyses, graphs, and diagnostic checks, and interpret the results. They are expected to identify the appropriate procedures based on the question asked and types of variables involved. This part of the exam is given out one week before a traditional in-class final exam, giving students sufficient time to access the computer lab and think about several approaches to the problem. They are encouraged to ask the instructor questions about the computer commands, especially if they need a technique that was not covered in the lab manual. However, if they ask for statistical knowledge they are "charged" by losing points if I tell them the answer. I indicate that this is similar to having to pay an external consultant when they need further statistical expertise. Students are graded on their ability to identify and justify an appropriate analysis technique, perform this technique and necessary diagnostic checks, and interpret their results to formulate a conclusion about the problem, providing me with an authentic performance assessment. I find this type of complete, detailed analysis by the student is not feasible on an in-class exam. To ensure individual assessment, I am developing a list of suitable data sets (rich with multiple procedures) and hope that a bank of such test questions will continue to be developed. Current sources of test questions include the Journal of Statistics Education and Statlib data archives and the casebook by Chatterjee, Handcock, and Simonoff (1995).

7. Conclusion

I feel these lab exercises are invaluable experiences for statistics students at all levels, enhancing their understanding and increasing their retention of statistics. Students learn the material mathematically and concretely simultaneously, linking the two representations together. These activities also enhance their writing and critical evaluation skills. Students use the computer to answer questions in which they have a vested interest and they also experience the messiness of real-life data. Thus they feel ownership of the data and see the relevance of statistics to their world, while learning to appreciate the power of computers as a useful tool.

8. References

Chatterjee, S., Handcock, M., and Simonoff, J. (1995), A Casebook for a First Course in Statistics and Data Analysis, New York: John Wiley and Sons.

Cobb, G. (1992), ``Teaching Statistics,'' in Heeding the Call for Change: Suggestions for Curricular Action, ed. L. Steen. MAA Notes, No. 22.

Egge, E., Foley, S., Haskins, L., and Johnson, R. (1995), ``Statistics Lab Manual,'' Carleton University, Mathematics and Computer Science Department, 3rd edition.

Mackisack, M. (1994), ``What is the Use of Experiments Conducted by Statistics Students?,'' Journal of Statistics Education, 2(1).

Singer, J. and Willet, J. (1990), ``Improving the Teaching of Statistics: Putting the Data Back into Data Analysis,'' The American Statistician, 44(3), 223-230.

Spurrier, J.D., Edwards, D., and Thombs, L.A. (1995), Elementary Statistics Lab Manual, Belmont: Wadsworth Publishing Co.
