Random Variables
In the material you have seen so far, you've dealt with actual data. In general, statistics is about the entire process of data generation and analysis.
Consequently, we define a random variable to be a type of measurement on a random process. For example, the height of a person is a random variable. We don't imply that your height changes in some sort of random fashion, but rather that, in general, we can't predict the value of a person's height until we actually measure it. In other words, a random variable is "a numerical measure of a random outcome".
Random variables are 'data' before we actually measure the value. So random variables have many of the same characteristics as data, i.e. they have a scale (nominal, ordinal, interval, or ratio); they can be discrete, continuous, or discretized continuous; etc.
Probability distribution table.
A probability distribution table is simply a listing of each potential value of the random variable along with its probability. For example, the probability distribution of the outcomes of flipping a fair coin is:
| Outcome | Probability |
| H | .5 |
| T | .5 |
The probability distribution of the number of girls in 3-child families is (don't worry where these numbers come from):
| Girls | Probability |
| 0 | .125 |
| 1 | .375 |
| 2 | .375 |
| 3 | .125 |
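For readers curious about the numbers in this table, here is a minimal sketch that reproduces them by listing every possible birth order. It assumes (this is not stated in the text) that each child is a girl or a boy with probability 1/2, independently of the other children; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption (not from the text): each child is a girl ("G") or a boy ("B")
# with probability 1/2, independently of the other children.
outcomes = list(product("GB", repeat=3))  # the 8 equally likely birth orders

# Count how many birth orders contain 0, 1, 2, or 3 girls.
counts = Counter(seq.count("G") for seq in outcomes)

# Every birth order has probability 1/8, so divide each count by 8.
girls_table = {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}

for k, p in girls_table.items():
    print(k, float(p))

# The probabilities in a distribution table must add up to 1.
print("total:", sum(girls_table.values()))
```

Under that assumption the counts are 1, 3, 3, and 1 out of 8 birth orders, which matches the table.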
Notice that:
Sometimes the probability table can be expressed as a mathematical function. Examples of this will appear later in the course.
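As a small preview, the table for the number of girls can be written as a function. This is only a sketch: `girls_probability` is a hypothetical name, and the formula assumes each child is independently a girl with probability 1/2, which the text does not state.

```python
from math import comb

def girls_probability(k):
    """Hypothetical helper: probability of k girls in a 3-child family.

    Assumes each child is independently a girl with probability 1/2.
    """
    if k not in (0, 1, 2, 3):
        return 0.0  # values outside the table have probability zero
    # comb(3, k) birth orders contain exactly k girls, out of 2**3 = 8 in total.
    return comb(3, k) / 2**3

# Compare the function with the table given in the text.
table = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
for k, p in table.items():
    print(k, girls_probability(k), p)
```

The function and the table carry the same information; the function is simply more compact when the random variable has many possible values.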
Probability functions can be drawn as a graph by simply placing a bar or spike above each value to represent the probability of the event.
For example, suppose that one had the following distribution table:
| x | prob |
| 0 | 0.1681 |
| 1 | 0.3601 |
| 2 | 0.3087 |
| 3 | 0.1323 |
| 4 | 0.0283 |
| 5 | 0.0024 |
This can be put into a "graph" as shown below:
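A rough text version of such a graph can be produced with a few lines of Python. This is a sketch only: `bar_chart` and its `width` parameter are hypothetical names, and the bars are scaled so that the most likely value gets the longest bar.

```python
def bar_chart(table, width=50):
    """Return one text line per value; bar length is proportional to probability."""
    largest = max(table.values())
    lines = []
    for x, p in table.items():
        # Scale so the largest probability maps to `width` characters.
        bar = "#" * round(p / largest * width)
        lines.append(f"{x} | {bar} {p:.4f}")
    return lines

# The distribution table from the text.
dist = {0: 0.1681, 1: 0.3601, 2: 0.3087, 3: 0.1323, 4: 0.0283, 5: 0.0024}

for line in bar_chart(dist):
    print(line)
```

Each row plays the role of a spike above one value of x. Very small probabilities, such as the one for x = 5, may round down to a bar of length zero at this width.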