Tutorial 4: Modeling Probability

In the tutorial Simulating Data, you learned to use a sampler to model data. In this tutorial, you will learn to:

  • Model a probability experiment
  • Use a formula to calculate an attribute
  • Use color to see the different simple outcomes that make up an event
  • Use a counter device to determine a sample space

You may also wish to watch the movies "Probability Simulation" and "Creating Sample Spaces."

Modeling a Probability Experiment

  1. Choose File | New to open a new TinkerPlots document.
  2. Drag a sampler into the document.
  3. Use what you learned in the tutorial Simulating Data to model rolling two dice. Label the first attribute DieA, and the second attribute DieB.
  4. Click RUN. Does the sampler produce the expected results? If not, modify your sampler as needed. One possibility is a sampler with two devices, such as two mixers or two spinners, each holding the values 1 through 6.
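If it helps to see the same experiment outside TinkerPlots, the two-dice sampler can be sketched in Python. This is only an illustration, not part of TinkerPlots: the function name roll_two_dice, the seed, and the choice of 10 repeats are assumptions made for the example.

```python
import random


def roll_two_dice(rng):
    """Return one simulated draw as a (DieA, DieB) pair."""
    die_a = rng.randint(1, 6)  # first device: values 1 through 6
    die_b = rng.randint(1, 6)  # second device: values 1 through 6
    return die_a, die_b


rng = random.Random(4)  # fixed seed so the run is repeatable (assumed value)
results = [roll_two_dice(rng) for _ in range(10)]  # Repeat = 10 (assumed)

print("DieA DieB")
for die_a, die_b in results:
    print(f"{die_a:>4} {die_b:>4}")
```

Each printed row plays the role of one case in the results table. If every value falls between 1 and 6 and both columns vary, the model is behaving as expected.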

Exploring a Calculated Attribute

Because you want to look at the difference between the values of the two dice, you'll need to add an attribute. Then you will use a formula to calculate the difference.

  1. Expand your results table to show all of the columns. Click in the results table and name a new attribute Difference.
  2. Click the results table's Options menu and choose Show Formulas.
  3. For this experiment, you will explore the size of the difference between the two dice values, regardless of which die is larger, so you'll use absolute value. Double-click the formula cell under the Difference column header to open the formula editor. Enter |DieA - DieB|. The absolute-value key is in the bottom-left corner of the formula editor.
  4. Drag a plot into the document and plot Difference on the horizontal axis. Fully separate and stack the values. (If you're not sure how to do this, review the tutorial TinkerPlots Basics.)
  5. Change the Repeat value to 25 and click RUN. (You can speed up the process by dragging the Run Speed slider in the upper left of the sampler.) Does any outcome occur more often than others? Click RUN several times to generate more samples.
  6. You can add data each time you click RUN, rather than replacing data. To do this, click the results table's Options menu and choose Sampler Options. On the Sampler Options panel, uncheck "Replace Results Cases."
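The steps above can be mirrored in a short Python sketch: compute the Difference for each case with an absolute value, keep earlier cases when the sampler runs again, and stack the values in a rough text plot. The names run_sampler and collected, the seed, and the three runs are assumptions chosen for illustration.

```python
import random
from collections import Counter


def run_sampler(repeat, rng):
    """Simulate `repeat` draws and return the Difference value of each one."""
    differences = []
    for _ in range(repeat):
        die_a = rng.randint(1, 6)
        die_b = rng.randint(1, 6)
        differences.append(abs(die_a - die_b))  # same idea as |DieA - DieB|
    return differences


rng = random.Random(25)  # fixed seed so the run is repeatable (assumed value)
collected = []           # stands in for the results table

for run in range(3):     # three clicks of RUN, keeping the earlier cases
    collected.extend(run_sampler(25, rng))
    tally = Counter(collected)
    print(f"After run {run + 1}: {len(collected)} cases")
    for difference in range(6):
        # One "o" per case, like icons stacked in a plot.
        print(f"  {difference}: {'o' * tally[difference]}")
```

Using extend rather than reassigning collected is what corresponds to unchecking "Replace Results Cases": each run adds 25 cases to the ones already gathered.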

Now you'll look at a larger sample.

  1. Change Repeat to 500 and click RUN. If you collect several samples, the results should be more consistent from one sample to the next than they were with 25 repeats, although they will still vary.
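One way to see the effect of sample size is to collect many samples at each Repeat value and compare how much a single result moves around. The Python sketch below does this for the share of cases with a Difference of 1. The helper name share_of_ones, the seed, and the choice of 20 samples per Repeat value are assumptions for illustration.

```python
import random


def share_of_ones(repeat, rng):
    """Fraction of `repeat` simulated rolls whose Difference equals 1."""
    hits = sum(
        abs(rng.randint(1, 6) - rng.randint(1, 6)) == 1 for _ in range(repeat)
    )
    return hits / repeat


rng = random.Random(32)  # fixed seed so the run is repeatable (assumed value)
spreads = {}

for repeat in (25, 500):
    # Collect 20 separate samples at this Repeat value.
    shares = [share_of_ones(repeat, rng) for _ in range(20)]
    spreads[repeat] = max(shares) - min(shares)
    print(
        f"Repeat {repeat}: share of 1s ranged from "
        f"{min(shares):.2f} to {max(shares):.2f}"
    )
```

The range of results at Repeat 500 should be noticeably narrower than at Repeat 25, which is the "more consistent, although still variable" behavior you see in the plot.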

It may surprise you that 1 is the most common difference. To see why, you'll color the data and look at the simple outcomes that can produce each result.

  1. Click the Join attribute in the results table to color the values.
  2. Click the Order cases vertically button to group colors together.
  3. Click on a few different colors and observe which cases are highlighted in the results table. What Difference is associated with green? With yellow? With pink?
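The idea behind coloring by Join can be sketched in Python by grouping the distinct simple outcomes seen in a sample under the Difference they produce. The label format "3,4" used for the joined values is an assumption for this example and may not match what TinkerPlots displays; the seed and variable names are also illustrative.

```python
import random
from collections import defaultdict

rng = random.Random(38)  # fixed seed so the run is repeatable (assumed value)
outcomes_by_difference = defaultdict(set)

for _ in range(500):
    die_a = rng.randint(1, 6)
    die_b = rng.randint(1, 6)
    join = f"{die_a},{die_b}"  # assumed label format for the joined values
    outcomes_by_difference[abs(die_a - die_b)].add(join)

print("Difference  distinct outcomes seen")
for difference in sorted(outcomes_by_difference):
    joins = sorted(outcomes_by_difference[difference])
    print(f"{difference:>10}  {len(joins):>2}: {' '.join(joins)}")
```

Each distinct label corresponds to one color in the plot, so a Difference with a longer list of labels is one that more simple outcomes can produce.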

Building a Sample Space

You may notice that a difference of 1 is associated with more colors than the other differences, because there are more simple outcomes that produce a difference of 1.

You can verify this by using a sampler to systematically generate the sample space (the set of all possible outcomes) for rolling two dice. Spinners and mixers won't do this because they generate values randomly. Instead, you'll use a counter.

  1. Replace each device in your sampler with a counter. The counter draws out each value in order. Drag the right side of the counter up or down to verify that each counter shows values from 1 to 6.
  2. Change Repeat to 15.
  3. Set the Run Speed slider to Medium, and run the sampler. As the sampler runs, observe how the data is collected. The DieA counter stays on its first value while the DieB counter cycles through all six values. The DieA counter then moves to its second value and stays there while DieB cycles through all six values again, and so on. Depending on how you originally set up your spinners or mixers, the order of the values may differ.
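The counting pattern described above, in which the first counter holds still while the second cycles, is the same order that Python's itertools.product generates. The sketch below lists the first 15 cases under the assumption that both counters hold the values 1 through 6 in ascending order; your own counters may be ordered differently.

```python
from itertools import islice, product

die_values = range(1, 7)  # assumes each counter holds 1..6 in ascending order

# product() varies the last position fastest: DieA stays put while DieB cycles.
counter_order = product(die_values, die_values)

first_cases = list(islice(counter_order, 15))  # Repeat = 15

print("DieA DieB")
for die_a, die_b in first_cases:
    print(f"{die_a:>4} {die_b:>4}")
```

With Repeat set to 15, the listing stops partway through the third value of DieA, so these cases cover only part of the sample space.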

You can also show the sample space in a plot.

  1. Drag a new plot into the document. Drag DieA from the results table to the horizontal axis, and DieB to the vertical axis. Fully separate the values and stack them vertically and horizontally.
  2. Double-click each axis end value. Enter values in the dialog box so that each axis starts at 1 and ends at 6, and set the bin width to 1.

  3. To find all the possibilities and fill the sample space, you'll need exactly one case icon in each space in the plot. Change the Repeat value until you determine the number of elements in the sample space. You might increase the sampler speed while you experiment.
  4. Once you have determined the sample space, you might shrink the sampler to free up room in the document. Click the Difference attribute in the results table, and click the Key button to show which color corresponds to each Difference value in the plot. (You can click in the Key and drag it to move it out of the way, or resize it by dragging its edges.)
  5. How many outcomes in the sample space produce each difference? What is the probability of a difference of 1? Of 6? Of 0? Do you see now why a difference of 1 is most likely?
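After you have worked out your own answers, you can check them with a short Python sketch that enumerates the sample space, counts the outcomes for each Difference, and expresses each probability as an exact fraction. It assumes two fair six-sided dice, so every simple outcome is equally likely; the helper name probability is chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every (DieA, DieB) pair, assuming two fair six-sided dice.
sample_space = list(product(range(1, 7), repeat=2))

# Number of simple outcomes that produce each Difference.
counts = Counter(abs(die_a - die_b) for die_a, die_b in sample_space)


def probability(difference):
    """Exact probability of a Difference, assuming equally likely outcomes."""
    return Fraction(counts[difference], len(sample_space))


print(f"Sample space size: {len(sample_space)}")
print("Difference  outcomes  probability")
for difference in range(7):
    print(f"{difference:>10}  {counts[difference]:>8}  {probability(difference)}")

# Text version of the plot: DieA across, DieB up, each cell shows Difference.
print()
for die_b in range(6, 0, -1):
    row = " ".join(str(abs(die_a - die_b)) for die_a in range(1, 7))
    print(f"DieB {die_b} | {row}")
print("         " + " ".join(str(v) for v in range(1, 7)) + "   (DieA)")
```

In the grid, the cells showing 1 form the two diagonals next to the main diagonal of 0s. Those two diagonals together hold more cells than any other Difference value.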