3 Discrete Random Variables and Probability Distributions
Copyright Cengage Learning. All rights reserved.

3.2 Probability Distributions for Discrete Random Variables

Probabilities assigned to various outcomes in the sample space S in turn determine probabilities associated with the values of any particular rv X.

The probability distribution of X says how the total probability of 1 is distributed among (allocated to) the various possible X values. Suppose, for example, that a business has just purchased four laser printers, and let X be the number among these that require service during the warranty period. Possible X values are then 0, 1, 2, 3, and 4. The probability distribution will tell us how the probability of 1 is subdivided among these five possible values: how much probability is associated with the X value 0, how much is apportioned to the X value 1, and so on.

We will use the following notation for the probabilities in the distribution:

p(0) = the probability of the X value 0 = P(X = 0)
p(1) = the probability of the X value 1 = P(X = 1)

and so on. In general, p(x) will denote the probability assigned to the value x.

Example 3.7

The Cal Poly Department of Statistics has a lab with six computers reserved for statistics majors. Let X denote the number of these computers that are in use at a particular time of day. Suppose that the probability distribution of X is as given in the following table; the first row of the table lists the possible X values and the second row gives the probability of each such value.

x       0     1     2     3     4     5     6
p(x)   .05   .10   .15   .25   .20   .15   .10

We can now use elementary probability properties to calculate other probabilities of interest. For example, the probability that at most 2 computers are in use is

P(X ≤ 2) = P(X = 0 or 1 or 2) = p(0) + p(1) + p(2) = .05 + .10 + .15 = .30

Since the event "at least 3 computers are in use" is complementary to "at most 2 computers are in use,"

P(X ≥ 3) = 1 − P(X ≤ 2) = 1 − .30 = .70

which can, of course, also be obtained by adding together the probabilities for the values 3, 4, 5, and 6. The probability that between 2 and 5 computers inclusive are in use is

P(2 ≤ X ≤ 5) = P(X = 2, 3, 4, or 5) = .15 + .25 + .20 + .15 = .75

whereas the probability that the number of computers in use is strictly between 2 and 5 is

P(2 < X < 5) = P(X = 3 or 4) = .25 + .20 = .45
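
The sums in Example 3.7 can be reproduced with a short Python sketch. The dictionary `pmf` and the helper `prob` are illustrative names, not part of the text. The value p(6) = .10 is an inference: it is the only value that makes the seven probabilities total 1 and that gives P(X ≥ 3) = .70.

```python
# pmf of X = number of lab computers in use (Example 3.7).
# p(6) = 0.10 is inferred from the requirement that the probabilities sum to 1.
pmf = {0: 0.05, 1: 0.10, 2: 0.15, 3: 0.25, 4: 0.20, 5: 0.15, 6: 0.10}


def prob(pmf, event):
    """Return P(X in event) by summing p(x) over the values x that satisfy event."""
    return sum(p for x, p in pmf.items() if event(x))


at_most_2 = prob(pmf, lambda x: x <= 2)               # P(X <= 2)
at_least_3 = 1 - at_most_2                            # complement rule
between_inclusive = prob(pmf, lambda x: 2 <= x <= 5)  # P(2 <= X <= 5)
strictly_between = prob(pmf, lambda x: 2 < x < 5)     # P(2 < X < 5)

# Round for display, since floating-point sums are only approximately exact.
print(round(at_most_2, 2), round(at_least_3, 2),
      round(between_inclusive, 2), round(strictly_between, 2))
# prints: 0.3 0.7 0.75 0.45
```

Passing the event as a predicate keeps the inclusive and strict versions of "between 2 and 5" visibly different: only the comparison operators change.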

Definition

The probability distribution or probability mass function (pmf) of a discrete rv X is defined for every number x by p(x) = P(X = x).

In words, for every possible value x of the random variable, the pmf specifies the probability of observing that value when the experiment is performed. The conditions p(x) ≥ 0 and Σ p(x) = 1, with the sum taken over all possible x, are required of any pmf.

The pmf of X in the previous example was simply given in the problem description. We now consider several examples in which various probability properties are exploited to obtain the desired distribution.
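
The two pmf conditions can be checked mechanically. The following is a minimal sketch, assuming a pmf stored as a Python dictionary from values to probabilities. The function name `is_valid_pmf`, the tolerance, and the three sample dictionaries are illustrative choices rather than anything from the text.

```python
import math


def is_valid_pmf(pmf, tol=1e-9):
    """Check the two pmf conditions: p(x) >= 0 for every x, and the p(x) sum to 1."""
    nonnegative = all(p >= 0 for p in pmf.values())
    # Compare the total with 1 up to a small tolerance to allow for rounding error.
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return nonnegative and sums_to_one


print(is_valid_pmf({0: 0.8, 1: 0.2}))   # True
print(is_valid_pmf({0: 0.5, 1: 0.6}))   # False: the total is 1.1
print(is_valid_pmf({0: 1.2, 1: -0.2}))  # False: one probability is negative
```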

A Parameter of a Probability Distribution

The pmf of the Bernoulli rv X in Example 3.9 was p(0) = .8 and p(1) = .2 because 20% of all purchasers selected a desktop computer. At another store, it may be the case that p(0) = .9 and p(1) = .1. More generally, the pmf of any Bernoulli rv can be expressed in the form p(1) = α and p(0) = 1 − α, where 0 < α < 1. Because the pmf depends on the particular value of α, we often write p(x; α) rather than just p(x):

p(x; α) = 1 − α   if x = 0
        = α       if x = 1
        = 0       otherwise          (3.1)

Then each choice of α in Expression (3.1) yields a different pmf.

Definition

A quantity on which p(x) depends, and whose different possible values each determine a different probability distribution, is called a parameter of the distribution. The collection of all distributions obtained for the different values of the parameter is called a family of probability distributions.

The quantity α in Expression (3.1) is a parameter. Each different number α between 0 and 1 determines a different member of the Bernoulli family of distributions.

Example 3.12

Starting at a fixed time, we observe the gender of each newborn child at a certain hospital until a boy (B) is born.

Let p = P(B), assume that successive births are independent, and define the rv X by X = number of births observed. Then

p(1) = P(X = 1) = P(B) = p
p(2) = P(X = 2) = P(GB) = P(G) · P(B) = (1 − p)p

and

p(3) = P(X = 3) = P(GGB) = P(G) · P(G) · P(B) = (1 − p)^2 p

Continuing in this way, a general formula emerges:

p(x) = (1 − p)^(x − 1) p   if x = 1, 2, 3, . . .
     = 0                   otherwise          (3.2)

The parameter p can assume any value between 0 and 1. Expression (3.2) describes the family of geometric distributions. In the gender example, p = .51 might be appropriate, but if we were looking for the first child with Rh-positive blood, then we might have p = .85.
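
The two parametric families discussed above can be sketched as Python functions. This is one reading of Expressions (3.1) and (3.2), not a definitive implementation. It assumes the Bernoulli parameter alpha equals p(1), and that the geometric pmf is (1 − p)^(x − 1) p for x = 1, 2, 3, . . ., as the pattern p, (1 − p)p, (1 − p)^2 p suggests. The function names are illustrative.

```python
def bernoulli_pmf(x, alpha):
    """p(x; alpha): alpha at x = 1, 1 - alpha at x = 0, and 0 for any other x."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if x == 1:
        return alpha
    if x == 0:
        return 1 - alpha
    return 0.0


def geometric_pmf(x, p):
    """p(x): (1 - p)**(x - 1) * p for x = 1, 2, 3, ..., and 0 for any other x."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if isinstance(x, int) and x >= 1:
        return (1 - p) ** (x - 1) * p
    return 0.0


# Each parameter value picks out one member of the family.
for alpha in (0.2, 0.1):
    print("Bernoulli", alpha, bernoulli_pmf(0, alpha), bernoulli_pmf(1, alpha))

for p in (0.51, 0.85):
    print("geometric", p, [round(geometric_pmf(x, p), 4) for x in (1, 2, 3)])

# The geometric probabilities over x = 1, 2, 3, ... sum to 1; the first 50
# terms already account for nearly all of the probability when p = .51.
partial_sum = sum(geometric_pmf(x, 0.51) for x in range(1, 51))
print(partial_sum)
```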

The Cumulative Distribution Function

For some fixed value x, we often wish to compute the probability that the observed value of X will be at most x. For example, let X be the number of beds occupied in a hospital's emergency room at a certain time of day; suppose the pmf of X is given by a table of the values p(0), p(1), . . . , p(4). Then the probability that at most two beds are occupied is

P(X ≤ 2) = p(0) + p(1) + p(2) = .75

Furthermore, since X ≤ 2.7 if and only if X ≤ 2, we also have P(X ≤ 2.7) = .75, and similarly P(X ≤ 2.999) = .75. Since 0 is the smallest possible X value, P(X ≤ −1.5) = 0, P(X ≤ −10) = 0, and in fact for any negative number x, P(X ≤ x) = 0. And because 4 is the largest possible value of X, P(X ≤ 4) = 1, P(X ≤ 9.8) = 1, and so on.

Very importantly, P(X < 2) < P(X ≤ 2), because the latter probability includes the probability mass at the x value 2 whereas the former probability does not. More generally, P(X < x) < P(X ≤ x) whenever x is a possible value of X. Furthermore, P(X ≤ x) is a well-defined and computable probability for any number x.

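Definition

The cumulative distribution function (cdf) F(x) of a discrete rv X with pmf p(x) is defined for every number x by

F(x) = P(X ≤ x) = Σ p(y), with the sum taken over all possible values y ≤ x.

For any number x, F(x) is the probability that the observed value of X will be at most x.

Example 3.13

A store carries flash drives with either 1 GB, 2 GB, 4 GB, 8 GB, or 16 GB of memory. The accompanying table gives the distribution of Y = the amount of memory in a purchased drive:

y       1     2     4     8     16
p(y)   .05   .10   .35   .40   .10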

Let's first determine F(y) for each of the five possible values of Y:

F(1) = P(Y ≤ 1) = P(Y = 1) = p(1) = .05
F(2) = P(Y ≤ 2) = P(Y = 1 or 2) = p(1) + p(2) = .15
F(4) = P(Y ≤ 4) = P(Y = 1 or 2 or 4) = p(1) + p(2) + p(4) = .50
F(8) = P(Y ≤ 8) = p(1) + p(2) + p(4) + p(8) = .90
F(16) = P(Y ≤ 16) = 1

Now for any other number y, F(y) will equal the value of F at the closest possible value of Y to the left of y. For example,

F(2.7) = P(Y ≤ 2.7) = P(Y ≤ 2) = F(2) = .15
F(7.999) = P(Y ≤ 7.999) = P(Y ≤ 4) = F(4) = .50

If y is less than 1, F(y) = 0 [e.g., F(.58) = 0], and if y is at least 16, F(y) = 1 [e.g., F(25) = 1]. The cdf is thus

F(y) = 0     if y < 1
     = .05   if 1 ≤ y < 2
     = .15   if 2 ≤ y < 4
     = .50   if 4 ≤ y < 8
     = .90   if 8 ≤ y < 16
     = 1     if 16 ≤ y

A graph of this cdf is shown in Figure 3.5.

Figure 3.5  A graph of the cdf of Example 3.13

For X a discrete rv, the graph of F(x) will have a jump at every possible value of X and will be flat between possible values. Such a graph is called a step function.
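
The step-function behavior can be made concrete with a small Python sketch of the cdf from Example 3.13. The individual pmf values are recovered here from the worked F values (for instance p(2) = F(2) − F(1) = .10), which is an inference rather than a quotation. The names `values`, `probs`, `cum`, and `cdf` are illustrative.

```python
from bisect import bisect_right
from itertools import accumulate

# pmf of Y = memory (GB) of a purchased flash drive, recovered from the
# cumulative values F(1) = .05, F(2) = .15, F(4) = .50, F(8) = .90, F(16) = 1.
values = [1, 2, 4, 8, 16]
probs = [0.05, 0.10, 0.35, 0.40, 0.10]
cum = list(accumulate(probs))  # running totals F(1), F(2), F(4), F(8), F(16)


def cdf(y):
    """F(y) = P(Y <= y): the running total at the largest possible value <= y."""
    k = bisect_right(values, y)  # how many possible values are <= y
    return 0.0 if k == 0 else cum[k - 1]


# F is flat between possible values and jumps at each possible value.
for y in (0.58, 1, 2.7, 7.999, 8, 25):
    print(y, round(cdf(y), 2))
```

Using `bisect_right` rather than `bisect_left` is what makes F include the probability mass at y itself, so that F(8) is .90 while F(7.999) is still .50.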

Proposition

For any two numbers a and b with a ≤ b,

P(a ≤ X ≤ b) = F(b) − F(a−)

where a− represents the largest possible X value that is strictly less than a. In particular, if the only possible values are integers and a and b are integers, then P(a ≤ X ≤ b) = F(b) − F(a − 1), and taking a = b gives P(X = a) = F(a) − F(a − 1).

The reason for subtracting F(a−) rather than F(a) is that we want to include P(X = a); F(b) − F(a) gives P(a < X ≤ b). This proposition will be used extensively when computing binomial and Poisson probabilities in Sections 3.4 and 3.6.

Example 3.15

Let X = the number of days of sick leave taken by a randomly selected employee of a large company during a particular year. If the maximum number of allowable sick days per year is 14, possible values of X are 0, 1, . . . , 14. With F(0) = .58, F(1) = .72, F(2) = .76, F(3) = .81, F(4) = .88, and F(5) = .94,

P(2 ≤ X ≤ 5) = P(X = 2, 3, 4, or 5) = F(5) − F(1) = .22

and

P(X = 3) = F(3) − F(2) = .05
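
The two calculations in Example 3.15 can be checked with a short Python sketch. It assumes an integer-valued X with smallest possible value 0, so that the cdf just below 0 is 0. The dictionary `F` holds only the cdf values quoted in the example, and the helper names are illustrative.

```python
# Selected cdf values for X = days of sick leave (Example 3.15).
F = {0: 0.58, 1: 0.72, 2: 0.76, 3: 0.81, 4: 0.88, 5: 0.94}


def prob_between(F, a, b):
    """P(a <= X <= b) = F(b) - F(a - 1) for integer-valued X and integers a <= b."""
    if a > b:
        raise ValueError("need a <= b")
    # Assumption: X takes values 0, 1, 2, ..., so F(a - 1) = 0 whenever a <= 0.
    lower = 0.0 if a <= 0 else F[a - 1]
    return F[b] - lower


def prob_equal(F, a):
    """P(X = a) = F(a) - F(a - 1), the special case a = b."""
    return prob_between(F, a, a)


print(round(prob_between(F, 2, 5), 2))  # 0.22
print(round(prob_equal(F, 3), 2))       # 0.05
```

Subtracting F(a − 1) rather than F(a) is the step that keeps the probability at a itself inside the interval; replacing it with F(a) would give P(a < X ≤ b) instead.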