Solution Manual for Introduction to Probability and Statistics, 15th Edition

Preview Extract
Complete Solutions Manual to Accompany
Introduction to Probability and Statistics, 15th Edition

William Mendenhall, III (1925-2009)
Robert J. Beaver, University of California, Riverside, Emeritus
Barbara M. Beaver, University of California, Riverside, Emerita

Prepared by Barbara M. Beaver

© Cengage Learning. All rights reserved. No distribution allowed without express authorization.

Australia • Brazil • Mexico • Singapore • United Kingdom • United States

ISBN-13: 978-1-337-55829-7
ISBN-10: 1-337-55829-X

© 2020 Cengage Learning

ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored, or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the publisher except as may be permitted by the license terms below.

For product information and technology assistance, contact us at Cengage Learning Customer & Sales Support, 1-800-354-9706. For permission to use material from this text or product, submit all requests online at www.cengage.com/permissions. Further permissions questions can be emailed to [email protected].

Cengage Learning, 20 Channel Center Street, Boston, MA 02210, USA

Cengage Learning is a leading provider of customized learning solutions with office locations around the globe, including Singapore, the United Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office at: www.cengage.com/global. Cengage Learning products are represented in Canada by Nelson Education, Ltd. To learn more about Cengage Learning Solutions or to purchase any of our products at our preferred online store, visit www.cengage.com.
Contents

Chapter 1: Describing Data with Graphs
Chapter 2: Describing Data with Numerical Measures
Chapter 3: Describing Bivariate Data
Chapter 4: Probability
Chapter 5: Discrete Probability Distribution
Chapter 6: The Normal Probability Distribution
Chapter 7: Sampling Distributions
Chapter 8: Large-Sample Estimation
Chapter 9: Large-Sample Test of Hypotheses
Chapter 10: Inference from Small Samples
Chapter 11: The Analysis of Variance
Chapter 12: Linear Regression and Correlation
Chapter 13: Multiple Regression Analysis
Chapter 14: The Analysis of Categorical Data
Chapter 15: Nonparametric Statistics

1: Describing Data with Graphs

Section 1.1

1.1.1 The experimental unit, the individual or object on which a variable is measured, is the student.
1.1.2 The experimental unit on which the number of errors is measured is the exam.
1.1.3 The experimental unit is the patient.
1.1.4 The experimental unit is the azalea plant.
1.1.5 The experimental unit is the car.
1.1.6 “Time to assemble” is a quantitative variable because a numerical quantity (1 hour, 1.5 hours, etc.) is measured.
1.1.7 “Number of students” is a quantitative variable because a numerical quantity (1, 2, etc.) is measured.
1.1.8 “Rating of a politician” is a qualitative variable since a quality (excellent, good, fair, poor) is measured.
1.1.9 “State of residence” is a qualitative variable since a quality (CA, MT, AL, etc.) is measured.
1.1.10 “Population” is a discrete variable because it can take on only integer values.
1.1.11 “Weight” is a continuous variable, taking on any values associated with an interval on the real line.
1.1.12 “Number of claims” is a discrete variable because it can take on only integer values.
1.1.13 “Number of consumers” is integer-valued and hence discrete.
1.1.14 “Number of boating accidents” is integer-valued and hence discrete.
1.1.15 “Time” is a continuous variable.
1.1.16 “Cost of a head of lettuce” is a discrete variable since money can be measured only in dollars and cents.
1.1.17 “Number of brothers and sisters” is integer-valued and hence discrete.
1.1.18 “Yield in bushels” is a continuous variable, taking on any values associated with an interval on the real line.
1.1.19 The statewide database contains a record of all drivers in the state of Michigan. The data collected represents the population of interest to the researcher.
1.1.20 The researcher is interested in the opinions of all citizens, not just the 1000 citizens who have been interviewed. The responses of these 1000 citizens represent a sample.
1.1.21 The researcher is interested in the weight gain of all animals that might be put on this diet, not just the twenty animals that have been observed. The responses of these twenty animals are a sample.
1.1.22 The data from the Internal Revenue Service contains the records of all wage earners in the United States. The data collected represents the population of interest to the researcher.
1.1.23 a The experimental unit, the item or object on which variables are measured, is the vehicle.
b Type (qualitative); make (qualitative); carpool or not? (qualitative); one-way commute distance (quantitative continuous); age of vehicle (quantitative continuous)
c Since five variables have been measured, this is multivariate data.
1.1.24 a The set of ages at death represents a population, because there have only been 38 different presidents in United States history.
b The variable being measured is the continuous variable “age”.
c “Age” is a quantitative variable.
1.1.25 a The population of interest consists of voter opinions (for or against the candidate) at the time of the election for all persons voting in the election.
b Note that when a sample is taken (at some time prior to the election), we are not actually sampling from the population of interest. As time passes, voter opinions change. Hence, the population of voter opinions changes with time, and the sample may not be representative of the population of interest.
1.1.26 a-b The variable “survival time” is a quantitative continuous variable.
c The population of interest is the population of survival times for all patients having a particular type of cancer and having undergone a particular type of radiotherapy.
d-e Note that there is a problem with sampling in this situation. If we sample from all patients having cancer and radiotherapy, some may still be living and their survival time will not be measurable. Hence, we cannot sample directly from the population of interest, but must arrive at some reasonable alternate population from which to sample.
1.1.27 a The variable “reading score” is a quantitative variable, which is probably integer-valued and hence discrete.
b The individual on which the variable is measured is the student.
c The population is hypothetical (it does not exist in fact) but consists of the reading scores for all students who could possibly be taught by this method.

Section 1.2

1.2.1 The pie chart is constructed by partitioning the circle into five parts, according to the total contributed by each part. Since the total number of students is 100, the total number receiving a final grade of A represents 31/100 = 0.31 or 31% of the total. Thus, this category will be represented by a sector angle of 0.31(360) = 111.6°. The other sector angles are shown next, along with the pie chart.

Final Grade   Frequency   Fraction of Total   Sector Angle
A             31          .31                 111.6
B             36          .36                 129.6
C             21          .21                 75.6
D             9           .09                 32.4
F             3           .03                 10.8

[Figure: pie chart of final grades. A 31.0%, B 36.0%, C 21.0%, D 9.0%, F 3.0%]

The bar chart represents each category as a bar with height equal to the frequency of occurrence of that category and is shown in the figure that follows.

[Figure: bar chart of frequency by final grade (A, B, C, D, F)]

1.2.2 Construct a statistical table to summarize the data. The pie and bar charts are shown in the figures that follow.
Status         Frequency   Fraction of Total   Sector Angle
Freshman       32          .32                 115.2
Sophomore      34          .34                 122.4
Junior         17          .17                 61.2
Senior         9           .09                 32.4
Grad Student   8           .08                 28.8

[Figure: pie chart of status. Freshman 32.0%, Sophomore 34.0%, Junior 17.0%, Senior 9.0%, Grad Student 8.0%]
[Figure: bar chart of frequency by status]

1.2.3 Construct a statistical table to summarize the data. The pie and bar charts are shown in the figures that follow.

College                        Frequency   Fraction of Total   Sector Angle
Humanities, Arts & Sciences    43          .43                 154.8
Natural/Agricultural Sciences  32          .32                 115.2
Business                       17          .17                 61.2
Other                          8           .08                 28.8

[Figure: pie chart of college. Humanities, Arts & Sciences 43.0%, Natural/Agricultural Sciences 32.0%, Business 17.0%, Other 8.0%]
[Figure: bar chart of frequency by college]

1.2.4 a The pie chart is constructed by partitioning the circle into four parts, according to the total contributed by each part. Since the total number of people is 50, the total number in category A represents 11/50 = 0.22 or 22% of the total. Thus, this category will be represented by a sector angle of 0.22(360) = 79.2°. The other sector angles are shown below. The pie chart is shown in the figure that follows.

Category   Frequency   Fraction of Total   Sector Angle
A          11          .22                 79.2
B          14          .28                 100.8
C          20          .40                 144.0
D          5           .10                 36.0

[Figure: pie chart of category. A 22.0%, B 28.0%, C 40.0%, D 10.0%]
[Figure: bar chart of frequency by category (A, B, C, D)]

b The bar chart represents each category as a bar with height equal to the frequency of occurrence of that category and is shown in the figure above.
c Yes, the shape will change depending on the order of presentation. The order is unimportant.
d The proportion of people in categories B, C, or D is found by summing the frequencies in those three categories, and dividing by n = 50. That is, (14 + 20 + 5)/50 = 0.78.
e Since there are 14 people in category B, there are 50 − 14 = 36 who are not, and the percentage is calculated as (36/50)100 = 72%.
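The fraction and sector-angle arithmetic in Exercise 1.2.4 can be checked with a few lines of code. What follows is a minimal sketch in Python and is not part of the original solutions; the helper name `pie_table` is an assumption introduced only for illustration, and the frequencies are the ones given for Exercise 1.2.4.

```python
def pie_table(freqs):
    """Map each category to (fraction of total, sector angle in degrees)."""
    n = sum(freqs.values())
    return {cat: (f / n, 360 * f / n) for cat, f in freqs.items()}

# Frequencies from Exercise 1.2.4 (n = 50); pie_table is a hypothetical helper.
freqs = {"A": 11, "B": 14, "C": 20, "D": 5}
table = pie_table(freqs)

for cat, (fraction, angle) in table.items():
    print(f"{cat}: fraction = {fraction:.2f}, sector angle = {angle:.1f}")

n = sum(freqs.values())
prop_bcd = (freqs["B"] + freqs["C"] + freqs["D"]) / n  # part d
pct_not_b = 100 * (n - freqs["B"]) / n                 # part e
print(f"B, C, or D: {prop_bcd:.2f}; not B: {pct_not_b:.0f}%")
```

The same helper applies to the other tables in this section, since each sector angle is simply the fraction of the total multiplied by 360.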
1.2.5 a-b Construct a statistical table to summarize the data. The pie and bar charts are shown in the figures that follow.

State   Frequency   Fraction of Total   Sector Angle
CA      9           .36                 129.6
AZ      8           .32                 115.2
TX      8           .32                 115.2

[Figure: pie chart of state. CA 36.0%, AZ 32.0%, TX 32.0%]
[Figure: bar chart of frequency by state (CA, AZ, TX)]

c From the table or the chart, Texas produced 8/25 = 0.32 of the jeans.
d The highest bar represents California, which produced the most pairs of jeans.
e Since the bars and the sectors are almost equal in size, the three states produced roughly the same number of pairs of jeans.
1.2.6-9 The bar charts represent each category as a bar with height equal to the frequency of occurrence of that category.

[Figures, Exercises 6 to 9: bar charts of percent by party ID (Republicans, Independents, Democrats) and percent by age (18 to 34, 35 to 54, 55+)]

1.2.10 Answers will vary.
1.2.11 a The percentages given in the exercise only add to 94%. We should add another category called “Other”, which will account for the other 6% of the responses.
b Either type of chart is appropriate. Since the data is already presented as percentages of the whole group, we choose to use a pie chart, shown in the figure that follows.

[Figure: pie chart. Other plans 40.0%, Too much pressure 20.0%, Too much work 15.0%, Not good at it 14.0%, Other 6.0%, Too much arguing 5.0%]

c-d Answers will vary.
1.2.12-14 The percentages falling in each of the four categories in 2017 are shown next (in parentheses), and the pie chart for 2017 and bar charts for 2010 and 2017 follow.
Region                 2010   2017
United States/Canada   99     183 (13.8%)
Europe                 107    271 (20.4%)
Asia                   64     453 (34.2%)
Rest of the World      58     419 (31.6%)
Total                  328    1326 (100%)

[Figure, Exercise 12 (2017): pie chart. U.S./Canada 13.8%, Europe 20.4%, Asia 34.2%, Rest of the World 31.6%]
[Figures, Exercise 13 (2010) and Exercise 14 (2017): bar charts of average daily users (millions) by region]

1.2.15 Users in Asia and the rest of the world have increased more rapidly than those in the U.S., Canada or Europe over the seven-year period.
1.2.16 a The total percentage of responses given in the table is only (40 + 34 + 19)% = 93%. Hence there are 7% of the opinions not recorded, which should go into a category called “Other” or “More than a few days”.
b Yes. The bars are very close to the correct proportions.
c Similar to previous exercises. The pie chart is shown next. The bar chart is probably more interesting to look at.

[Figure: pie chart. One Day 40.0%, A Few Days 34.0%, No Time 19.0%, More than a Few Days 7.0%]

1.2.17-18 Answers will vary from student to student. Since the graph gives a range of values for Zimbabwe’s share, we have chosen to use the 13% figure, and have used 3% in the “Other” category. The pie chart and bar charts are shown next.

[Figure: pie chart of percent share. Botswana 26.0%, Russia 20.0%, Canada 18.0%, Zimbabwe 13.0%, Angola 10.0%, South Africa 10.0%, Other 3.0%]
[Figure: bar chart of percent share by country (Botswana, Zimbabwe, Angola, South Africa, Canada, Russia, Other)]

1.2.19-20 The Pareto chart is shown below. The Pareto chart is more effective than the bar chart or the pie chart.

[Figure: Pareto chart of percent share by country (Botswana, Russia, Canada, Zimbabwe, Angola, South Africa, Other)]

1.2.21 The data should be displayed with either a bar chart or a pie chart. The pie chart is shown next.
[Figure: pie chart of car colors. White/White pearl 20.8%, Black/Black effect 20.8%, Gray 16.8%, Silver 13.9%, Red 10.9%, Blue 8.9%, Beige/Brown 4.0%, Yellow/Gold 2.0%, Green 1.0%, Other 1.0%]

Section 1.3

1.3.1 The dotplot is shown next; the data is skewed right, with one outlier, x = 2.0.

[Figure: dotplot for Exercise 1, scale 1.0 to 2.0]

1.3.2 The dotplot is shown next; the data is relatively mound-shaped, with no outliers.

[Figure: dotplot for Exercise 2, scale 54 to 62]

1.3.3-5 The most obvious choice of a stem is to use the ones digit. The portion of the observation to the right of the ones digit constitutes the leaf. Observations are classified by row according to stem and also within each stem according to relative magnitude. The stem and leaf display is shown next.

1 | 6 8
2 | 1 2 5 5 5 7 8 8 9 9
3 | 1 1 4 5 5 6 6 6 7 7 7 7 8 9 9 9
4 | 0 0 0 1 2 2 3 4 5 6 7 8 9 9 9
5 | 1 1 6 6 7
6 | 1 2
leaf digit = 0.1
1 | 2 represents 1.2

3. The stem and leaf display has a mound-shaped distribution, with no outliers.
4. From the stem and leaf display, the smallest observation is 1.6 (1 | 6).
5. The eighth and ninth largest observations are both 4.9 (4 | 9).
1.3.6 The stem is chosen as the ones digit, and the portion of the observation to the right of the ones digit is the leaf.

3 | 2 3 4 5 5 5 6 6 7 9 9 9 9
4 | 0 0 2 2 3 3 3 4 4 5 8
leaf digit = 0.1
1 | 2 represents 1.2

1.3.7-8 The stems are split, with the leaf digits 0 to 4 belonging to the first part of the stem and the leaf digits 5 to 9 belonging to the second. The stem and leaf display shown below improves the presentation of the data.

3 | 2 3 4
3 | 5 5 5 6 6 7 9 9 9 9
4 | 0 0 2 2 3 3 3 4 4
4 | 5 8
leaf digit = 0.1
1 | 2 represents 1.2

1.3.9 The scale is drawn on the horizontal axis and the measurements are represented by dots.

[Figure: dotplot for Exercise 9, scale 0 to 2]

1.3.10 Since there is only one digit in each measurement, the ones digit must be the stem, and the leaf will be a zero digit for each measurement.

0 | 0 0 0 0 0
1 | 0 0 0 0 0 0 0 0 0
2 | 0 0 0 0 0 0

1.3.11 The distribution is relatively mound-shaped, with no outliers.
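The stem and leaf constructions in Exercises 1.3.6 to 1.3.8 (ones digit as the stem, tenths digit as the leaf, with stems optionally split into leaves 0 to 4 and 5 to 9) can be sketched in code. The following Python sketch is illustrative only: the function name `stem_and_leaf` is an assumption, and the data values are read back from the display in Exercise 1.3.6 rather than taken from the original data set.

```python
def stem_and_leaf(values, split=False):
    """Build stem and leaf rows with the ones digit as stem and tenths digit as leaf.

    With split=True each stem is divided in two: leaves 0-4, then leaves 5-9.
    """
    rows = {}
    for v in sorted(values):
        tenths = round(v * 10)            # e.g. 3.5 -> 35
        stem, leaf = divmod(tenths, 10)   # 35 -> stem 3, leaf 5
        half = leaf // 5 if split else 0  # 0 for leaves 0-4, 1 for leaves 5-9
        rows.setdefault((stem, half), []).append(leaf)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for (stem, _), leaves in sorted(rows.items())
    ]

# Values recovered from the display in Exercise 1.3.6 (an assumption about the raw data).
data = [3.2, 3.3, 3.4, 3.5, 3.5, 3.5, 3.6, 3.6, 3.7, 3.9, 3.9, 3.9, 3.9,
        4.0, 4.0, 4.2, 4.2, 4.3, 4.3, 4.3, 4.4, 4.4, 4.5, 4.8]

for row in stem_and_leaf(data, split=True):
    print(row)
```

Calling `stem_and_leaf(data)` without `split=True` gives the two-row display of Exercise 1.3.6; splitting the stems spreads the same leaves over four rows, which is what makes the shape easier to see.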
1.3.12 The two plots convey the same information if the stem and leaf plot is turned 90° and stretched to resemble the dotplot.
1.3.13 The line chart plots “day” on the horizontal axis and “time” on the vertical axis. The line chart shown next reveals that learning is taking place, since the time decreases each successive day.

[Figure: line chart of time (seconds) versus day, days 1 to 5]

1.3.14 The line graph is shown next. Notice the change in y as x increases. The measurements are decreasing over time.

[Figure: line graph of measurement versus year, years 0 to 10]

1.3.15 The dotplot is shown next.

[Figure: dotplot of number of cheeseburgers, scale 1 to 7]

1.3.16 a The distribution is somewhat mound-shaped (as much as a small set can be); there are no outliers.
b 2/10 = 0.2
a The test scores are graphed using a stem and leaf plot generated by Minitab.
b-c The distribution is not mound-shaped, but rather has two peaks centered around the scores 65 and 85. This might indicate that the students are divided into two groups: those who understand the material and do well on exams, and those who do not have a thorough command of the material.
1.3.17 a We choose a stem and leaf plot, using the ones and tenths place as the stem, and a zero digit as the leaf. The Minitab printout is shown next.

[Figure: Minitab stem and leaf printout]

b The data set is relatively mound-shaped, centered at 5.2.
c The value x = 5.7 does not fall within the range of the other cell counts, and would be considered somewhat unusual.
1.3.18 a-b The dotplot and the stem and leaf plot are drawn using Minitab.

[Figure: dotplot of calcium, scale 2.68 to 2.82]

c The measurements all seem to be within the same range of variability. There do not appear to be any outliers.
1.3.19 a Stem and leaf displays may vary from student to student. The most obvious choice is to use the tens digit as the stem and the ones digit as the leaf.
 7 | 8 9
 8 | 0 1 7
 9 | 0 1 2 4 4 5 6 6 6 8 8
10 | 1 7 9
11 | 2

b The display is fairly mound-shaped, with a large peak in the middle.
1.3.20 a The sizes and volumes of the food items do increase as the number of calories increases, but not in the correct proportion to the actual calories. The differences in calorie content are not accurately portrayed in the graph.
b The bar graph which accurately portrays the number of calories in the six food items is shown next.

[Figure: bar chart of number of calories by food item (Hershey’s Kiss, Oreo, 12 oz Coke, 12 oz Beer, Pizza, Whopper)]

1.3.21 a-b The bar charts for the median weekly earnings and unemployment rates for eight different levels of education are shown next.

[Figures: bar charts of median weekly earnings and of unemployment rate by educational attainment (Doctoral degree; Professional degree; Master’s degree; Bachelor’s degree; Associate’s degree; Some college, no degree; High school diploma; Less than high school diploma)]

c The unemployment rate drops and the median weekly earnings rise as the level of educational attainment increases.
1.3.22 a Similar to previous exercises. The pie chart is shown next.

[Figure: pie chart of religion. Christianity 36.4%, Islam 26.0%, Hinduism 15.6%, Primal Indigenous & African Traditional 6.9%, Chinese Traditional 6.8%, Buddhism 6.5%, Other 1.1%, Sikhism 0.4%, Judaism 0.2%]

b The bar chart is shown next.

[Figure: bar chart of members (millions) by religion]

The Pareto chart is a bar chart with the heights of the bars ordered from large to small.
This display is more effective than the pie chart.

[Figure: Pareto chart of members (millions) by religion (Christianity, Islam, Hinduism, Primal Indigenous & African Traditional, Chinese Traditional, Buddhism, Other, Sikhism, Judaism)]

1.3.23 a The distribution is skewed to the right, with several unusually large measurements. The four states marked as HI are California, New Jersey, New York and Pennsylvania.
b Three of the four states are quite large in area, which might explain the large number of hazardous waste sites. However, New Jersey is relatively small, and other large states do not have unusually large numbers of waste sites. The pattern is not clear.
1.3.24 a The distribution is skewed to the right, with two outliers.
b The dotplot is shown next. It conveys nearly the same information, but the stem-and-leaf plot may be more informative.

[Figure: dotplot of weekend gross, scale 1.4 to 9.8]

1.3.25 a Answers will vary.
b The stem and leaf plot is constructed using the tens place as the stem and the ones place as the leaf. Notice that the distribution is roughly mound-shaped.
c-d Three of the five youngest presidents (Kennedy, Lincoln and Garfield) were assassinated while in office. This would explain the fact that their ages at death were in the lower tail of the distribution.

Section 1.4

1.4.1 The relative frequency histogram displays the relative frequency as the height of the bar over the appropriate class interval and is shown next. The distribution is relatively mound-shaped.

[Figure: relative frequency histogram of x, classes from 100 to 200]

1.4.2 Since the variable of interest can only take integer values, the classes can be chosen as the values 0, 1, 2, 3, 4, 5 and 6. The table containing the classes, their corresponding frequencies and their relative frequencies and the relative frequency histogram are shown next. The distribution is skewed to the right.
Number of Household Pets   Frequency   Relative Frequency
0                          13          13/50 = .26
1                          19          19/50 = .38
2                          12          12/50 = .24
3                          4           4/50 = .08
4                          1           1/50 = .02
5                          0           0/50 = .00
6                          1           1/50 = .02
Total                      50          50/50 = 1.00

[Figure: relative frequency histogram of number of pets, 0 to 6]

1.4.3-8 The proportion of measurements falling in each interval is equal to the sum of the heights of the bars over that interval. Remember that the lower class boundary is included, but not the upper class boundary.
3. .20 + .40 + .15 = .75
4. .05 + .15 + .20 = .40
5. .05
6. .40 + .15 = .55
7. .15
8. .05 + .15 + .20 = .40
1.4.9 Answers will vary. The range of the data is 110 − 10 = 100 and we need to use seven classes. Calculate 100/7 = 14.29 which we choose to round up to 15. Convenient class boundaries are created, starting at 10: 10 to < 25, 25 to < 40, …, 100 to < 115.
1.4.10 Answers will vary. The range of the data is 76.8 − 25.5 = 51.3 and we need to use six classes. Calculate 51.3/6 = 8.55 which we choose to round up to 9. Convenient class boundaries are created, starting at 25: 25 to < 34, 34 to < 43, …, 70 to < 79.
1.4.11 Answers will vary. The range of the data is 1.73 − .31 = 1.42 and we need to use ten classes. Calculate 1.42/10 = .142 which we choose to round up to .15. Convenient class boundaries are created, starting at .30: .30 to < .45, .45 to < .60, …, 1.65 to < 1.80.
1.4.12 Answers will vary. The range of the data is 192 − 0 = 192 and we need to use eight classes. Calculate 192/8 = 24 which we choose to round up to 25. Convenient class boundaries are created, starting at 0: 0 to < 25, 25 to < 50, …, 175 to < 200.
1.4.13-16 The table containing the classes, their corresponding frequencies and their relative frequencies and the relative frequency histogram are shown next.
Class i   Class Boundaries   Tally               fi   Relative frequency, fi/n
1         1.6 to < 2.1       11                  2    .04
2         2.1 to < 2.6       11111               5    .10
3         2.6 to < 3.1       11111               5    .10
4         3.1 to < 3.6       11111               5    .10
5         3.6 to < 4.1       11111 11111 1111    14   .28
6         4.1 to < 4.6       11111 11            7    .14
7         4.6 to < 5.1       11111               5    .10
8         5.1 to < 5.6       11                  2    .04
9         5.6 to < 6.1       111                 3    .06
10        6.1 to < 6.6       11                  2    .04

[Figure: relative frequency histogram of the data, class boundaries from 1.6 to 6.6]

13. The distribution is roughly mound-shaped.
14. The fraction less than 5.1 is that fraction lying in classes 1-7, or (2 + 5 + … + 7 + 5)/50 = 43/50 = 0.86.
15. The fraction larger than 3.6 lies in classes 5-10, or (14 + 7 + … + 3 + 2)/50 = 33/50 = 0.66.
16. The fraction from 2.6 up to but not including 4.6 lies in classes 3-6, or (5 + 5 + 14 + 7)/50 = 31/50 = 0.62.
1.4.17-20 Since the variable of interest can only take the values 0, 1, or 2, the classes can be chosen as the integer values 0, 1, and 2. The table shows the classes, their corresponding frequencies and their relative frequencies. The relative frequency histogram follows the table.

Value   Frequency   Relative Frequency
0       5           .25
1       9           .45
2       6           .30

[Figure: relative frequency histogram for the values 0, 1, 2]

17. Using the table above, the proportion of measurements greater than 1 is the same as the proportion of “2”s, or 0.30.
18. The proportion of measurements less than 2 is the same as the proportion of “0”s and “1”s, or 0.25 + 0.45 = .70.
19. The probability of selecting a “2” in a random selection from these twenty measurements is 6/20 = .30.
20. There are no outliers in this relatively symmetric, mound-shaped distribution.
1.4.21-23 Answers will vary. The range of the data is 94 − 55 = 39 and we choose to use 5 classes. Calculate 39/5 = 7.8 which we choose to round up to 10. Convenient class boundaries are created, starting at 50 and the table and relative frequency histogram are created.
Class Boundaries   Frequency   Relative Frequency
50 to < 60         2           .10
60 to < 70         6           .30
70 to < 80         3           .15
80 to < 90         6           .30
90 to < 100        3           .15

[Figure: relative frequency histogram of scores, class boundaries from 50 to 100]

21. The distribution has two peaks at about 65 and 85. Depending on the way in which the student constructs the histogram, these peaks may or may not be clearly seen.
22. The shape is unusual. It might indicate that the students are divided into two groups: those who understand the material and do well on exams, and those who do not have a thorough command of the material.
23. The shapes are roughly the same, but this may not be the case if the student constructs the histogram using different class boundaries.
1.4.24 a There are a few extremely small numbers, indicating that the distribution is probably skewed to the left.
b The range of the data is 165 − 8 = 157. We choose to use seven class intervals of length 25, with subintervals 0 to < 25, 25 to < 50, 50 to < 75, and so on. The tally and relative frequency histogram are shown next.

Class i   Class Boundaries   Tally      fi   Relative frequency, fi/n
1         0 to < 25          11         2    2/20
2         25 to < 50                    0    0/20
3         50 to < 75         111        3    3/20
4         75 to < 100        111        3    3/20
5         100 to < 125       11         2    2/20
6         125 to < 150       11111 11   7    7/20
7         150 to < 175       111        3    3/20

[Figure: relative frequency histogram of times, class boundaries from 0 to 175]

c The distribution is indeed skewed left with two possible outliers: x = 8 and x = 11.
1.4.25 a The range of the data is 32.3 − 0.2 = 32.1. We choose to use eleven class intervals of length 3 (32.1/11 = 2.9, which when rounded to the next largest integer is 3). The subintervals 0.1 to < 3.1, 3.1 to < 6.1, 6.1 to < 9.1, and so on, are convenient and the tally and relative frequency histogram are shown next.
Class i   Class Boundaries   Tally               fi   Relative frequency, fi/n
1         0.1 to < 3.1       11111 11111 11111   15   15/50
2         3.1 to < 6.1       11111 1111          9    9/50
3         6.1 to < 9.1       11111 11111         10   10/50
4         9.1 to < 12.1      111                 3    3/50
5         12.1 to < 15.1     1111                4    4/50
6         15.1 to < 18.1     111                 3    3/50
7         18.1 to < 21.1     11                  2    2/50
8         21.1 to < 24.1     11                  2    2/50
9         24.1 to < 27.1     1                   1    1/50
10        27.1 to < 30.1                         0    0/50
11        30.1 to < 33.1     1                   1    1/50

[Figure: relative frequency histogram of time, class boundaries from 0.1 to 33.1]

b The data is skewed to the right, with a few unusually large measurements.
c Looking at the data, we see that 36 patients had a disease recurrence within 10 months. Therefore, the fraction of recurrence times less than or equal to 10 is 36/50 = 0.72.
1.4.26 a We use class intervals of length 5, beginning with the subinterval 30 to < 35. The tally and the relative frequency histogram are shown next.

Class i   Class Boundaries   Tally               fi   Relative frequency, fi/n
1         30 to < 35         11111 11111 11      12   12/50
2         35 to < 40         11111 11111 11111   15   15/50
3         40 to < 45         11111 11111 11      12   12/50
4         45 to < 50         11111 111           8    8/50
5         50 to < 55         11                  2    2/50
6         55 to < 60         1                   1    1/50

[Figure: relative frequency histogram of ages, class boundaries from 30 to 60]

b Use the table or the relative frequency histogram. The proportion of children in the interval 35 to < 45 is (15 + 12)/50 = .54.
c The proportion of children aged less than 50 months is (12 + 15 + 12 + 8)/50 = .94.
1.4.27 a The data ranges from .2 to 5.2, or 5.0 units. Since the number of class intervals should be between five and twelve, we choose to use eleven class intervals, with each class interval having length 0.50 (5.0/11 = .45, which, rounded to the nearest convenient fraction, is .50). We must now select interval boundaries such that no measurement can fall on a boundary point. The subintervals .1 to < .6, .6 to < 1.1, and so on, are convenient and a tally is constructed.
Class i   Class Boundaries   Tally               fi   Relative frequency, fi/n
1         0.1 to < 0.6       11111 11111         10   .167
2         0.6 to < 1.1       11111 11111 11111   15   .250
3         1.1 to < 1.6       11111 11111 11111   15   .250
4         1.6 to < 2.1       11111 11111         10   .167
5         2.1 to < 2.6       1111                4    .067
6         2.6 to < 3.1       1                   1    .017
7         3.1 to < 3.6       11                  2    .033
8         3.6 to < 4.1       1                   1    .017
9         4.1 to < 4.6       1                   1    .017
10        4.6 to < 5.1                           0    .000
11        5.1 to < 5.6       1                   1    .017

The relative frequency histogram is shown next.

[Figure: relative frequency histogram; x-axis: Times, y-axis: Relative Frequency]

b The distribution is skewed to the right, with several unusually large observations.

c For some reason, one person had to wait 5.2 minutes. Perhaps the supermarket was understaffed that day, or there may have been an unusually large number of customers in the store.

1.4.28 a Histograms will vary from student to student. A typical histogram generated by Minitab is shown next.

[Figure: relative frequency histogram; x-axis: Batting Avg, y-axis: Relative Frequency]

b Since 1 of the 20 players has an average above 0.400, the chance is 1 out of 20 or 1/20 = 0.05.

1.4.29 a-b Answers will vary from student to student. The students should notice that the distribution is skewed to the right with a few pennies being unusually old. A typical histogram is shown next.

[Figure: relative frequency histogram; x-axis: Age (Years), y-axis: Relative Frequency]

1.4.30 a Answers will vary from student to student. A typical histogram is shown next. It looks very similar to the histogram from Exercise 1.4.29.

[Figure: relative frequency histogram; x-axis: Age (Years), y-axis: Relative Frequency]

b There is one outlier, x = 41.

1.4.31 a Answers will vary from student to student. The relative frequency histogram below was constructed using classes of length 1.0 starting at x = 4. The value x = 35.1 is not shown in the table but appears on the graph shown next.
Class i   Class Boundaries   Tally               fi   Relative frequency, fi/n
1         4.0 to < 5.0       1                   1    1/54
2         5.0 to < 6.0                           0    0/54
3         6.0 to < 7.0       11111 1             6    6/54
4         7.0 to < 8.0       11111 11111 11111   15   15/54
5         8.0 to < 9.0       11111 111           8    8/54
6         9.0 to < 10.0      11111 11111 111     13   13/54
7         10.0 to < 11.0     11111 11            7    7/54
8         11.0 to < 12.0     111                 3    3/54

[Figure: relative frequency histogram; x-axis: Wind speed, y-axis: Relative Frequency]

b Since Mt. Washington is a very mountainous area, it is not unusual that the average wind speed would be very high.

c The value x = 9.9 does not lie far from the center of the distribution (excluding x = 35.1). It would not be considered unusually high.

1.4.32 a-b The data is somewhat mound-shaped, but it appears to have two local peaks, that is, high points from which the frequencies drop off on either side.

c Since these are student heights, the data can be divided into two groups: heights of males and heights of females. Both groups will have an approximate mound shape, but the average female height will be lower than the average male height. When the two groups are combined into one data set, the result is a "mixture" of two mound-shaped distributions, which produces the two peaks seen in the histogram.

1.4.33 a The relative frequency histogram below was constructed using classes of length 1.0 starting at x = 0.0.

Class i   Class Boundaries   Tally       fi   Relative frequency, fi/n
1         0.0 to < 1.0       11          2    2/39
2         1.0 to < 2.0       11          2    2/39
3         2.0 to < 3.0       1           1    1/39
4         3.0 to < 4.0       111         3    3/39
5         4.0 to < 5.0       1111        4    4/39
6         5.0 to < 6.0       11111       5    5/39
7         6.0 to < 7.0       111         3    3/39
8         7.0 to < 8.0       11111       5    5/39
9         8.0 to < 9.0       11111 111   8    8/39
10        9.0 to < 10.0      11111 1     6    6/39

[Figure: relative frequency histogram; x-axis: Distance, y-axis: Relative Frequency]

b The distribution is skewed to the left, with slightly higher frequency in the first two classes (within two miles of UCR).

c As the distance from UCR increases, each successive area increases in size, thus allowing for more Starbucks stores in that region.
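The bookkeeping used in the histogram exercises above (choosing a class width, assigning a measurement to a class of the form "a to < b", and turning frequencies into relative frequencies) can be sketched in Python. This is only an illustrative sketch, not part of the text's solutions: the helper names class_width, class_index and relative_frequencies are hypothetical, and the integer rounding rule is the one described in Exercise 1.4.25, which other exercises replace with a "convenient" width chosen by eye.

```python
import math

def class_width(low, high, k):
    """Range divided by k classes, rounded up to the next integer
    (the rule described in Exercise 1.4.25; an assumption elsewhere)."""
    return math.ceil((high - low) / k)

def class_index(x, start, width):
    """0-based index of the class 'start + i*width to < start + (i+1)*width'
    that contains x (left endpoint included, right endpoint excluded)."""
    return math.floor((x - start) / width)

def relative_frequencies(freqs):
    """Divide each class frequency fi by the total count n."""
    n = sum(freqs)
    return [f / n for f in freqs]

# Frequencies copied from the Exercise 1.4.33 table.
freqs = [2, 2, 1, 3, 4, 5, 3, 5, 8, 6]
rel = relative_frequencies(freqs)

print(sum(freqs))                   # total count n = 39
print(class_width(0.2, 32.3, 11))   # 32.1/11 = 2.9, rounded up to 3
print(class_index(9.9, 4.0, 1.0))   # index 5, i.e. the sixth class, 9.0 to < 10.0
print([round(r, 3) for r in rel])   # relative frequencies fi/n
```

Because the classes include their left endpoint only, a measurement that falls exactly on a boundary is assigned to the class that starts there.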
Reviewing What You've Learned

1.R.1 a "Ethnic origin" is a qualitative variable since a quality (ethnic origin) is measured.

b "Score" is a quantitative variable since a numerical quantity (0-100) is measured.

c "Type of establishment" is a qualitative variable since a category (Carl's Jr., McDonald's or Burger King) is measured.

d "Mercury concentration" is a quantitative variable since a numerical quantity is measured.

1.R.2 To determine whether a distribution is likely to be skewed, look for the likelihood of observing extremely large or extremely small values of the variable of interest.

a The price of an 8-oz can of peas is not likely to contain unusually large or small values.

b Not likely to be skewed.

c If a package is dropped, it is likely that all the shells will be broken. Hence, a few large numbers of broken shells are possible. The distribution will be skewed.

d If an animal has one tick, it is likely to have more than one. There will be some "0"s for uninfected rabbits, and then a larger number of large values. The distribution will not be symmetric.

1.R.3 a The length of time between arrivals at an outpatient clinic is a continuous random variable, since it can be any of the infinite number of positive real values.

b The time required to finish an examination is a continuous random variable as was the random variable described in part a.

c Weight is continuous, taking any positive real value.

d Body temperature is continuous, taking any real value.

e Number of people is discrete, taking the values 0, 1, 2, …

1.R.4 a Number of properties is discrete, taking the values 0, 1, 2, …

b Depth is continuous, taking any non-negative real value.

c Length of time is continuous, taking any non-negative real value.

d Number of aircraft is discrete.

1.R.5 a b [Figure: relative frequency histogram; x-axis: Length, y-axis: Relative Frequency]

c These data are skewed right.

1.R.6
a The five quantitative variables are measured over time two months after the oil spill. Some sort of comparative bar chart (side-by-side or stacked) or a line chart should be used.

b As the time after the spill increases, the values of all five variables increase.

c-d The line chart for number of personnel and the bar chart for fishing areas closed are shown next.

[Figure: line chart of Number of Personnel (thousands) versus Day; bar chart of Areas closed (%) versus Day]

e The line chart for amount of dispersants is shown next. There appears to be a straight-line trend.

[Figure: line chart of Dispersants Used (1000 gallons) versus Day]

1.R.7 a The popular vote within each state should vary depending on the size of the state. Since there are several very large states (in population) in the United States, the distribution should be skewed to the right.

b-c Histograms will vary from student to student but should resemble the histogram generated by Minitab in the next figure. The distribution is indeed skewed to the right, with three "outliers": California, Florida and Texas.

[Figure: relative frequency histogram; x-axis: Popular vote, y-axis: Relative Frequency]

1.R.8 a-b Once the size of the state is removed by calculating the percentage of the popular vote, the unusually large values in the Exercise 7 data set will disappear, and each state will be measured on an equal basis. Student histograms should resemble the histogram shown next. Notice the relatively mound-shaped distribution and the lack of any outliers.

[Figure: relative frequency histogram; x-axis: Percentage, y-axis: Relative Frequency]

1.R.9 a-b Popular vote is skewed to the right while the percentage of popular vote is roughly mound-shaped. While the distribution of popular vote has outliers (California, Florida and Texas), there are no outliers in the distribution of percentage of popular vote. When the stem and leaf plots are turned 90°, the shapes are very similar to the histograms.
c Once the size of the state is removed by calculating the percentage of the popular vote, the unusually large values in the set of "popular votes" will disappear, and each state will be measured on an equal basis. The data then distribute themselves in a mound shape around the average percentage of the popular vote.

1.R.10 a The measurements are obtained by counting the number of beats for 30 seconds, and then multiplying by 2. Thus, the measurements should all be even numbers.

b The stem and leaf plot is shown next.

c Answers will vary. A typical histogram, generated by Minitab, is shown next.

[Figure: relative frequency histogram; x-axis: Pulse, y-axis: Relative Frequency]

d The distribution of pulse rates is mound-shaped and relatively symmetric around a central location of 75 beats per minute. There are no outliers.

1.R.11 a-b Answers will vary from student to student. A typical histogram is shown next. The distribution is skewed to the right, with an extreme outlier (Texas).

[Figure: relative frequency histogram; x-axis: Capacity, y-axis: Relative Frequency]

c Answers will vary.

1.R.12 a-b Answers will vary. A typical histogram is shown next. Notice the gaps and the bimodal nature of the histogram, probably because the samples were collected at different locations.

[Figure: relative frequency histogram; x-axis: AL, y-axis: Relative Frequency]

c The dotplot is shown as follows. The locations are indeed responsible for the unusual gaps and peaks in the relative frequency histogram.

[Figure: dotplot of AL by Site (L, I, C, A)]

1.R.13 a-b The Minitab stem and leaf plot is shown next. The distribution is slightly skewed to the right.

c Pennsylvania (58.20) has an unusually high gas tax.

1.R.14 a-b Answers will vary. The Minitab stem and leaf plot is shown next. The distribution is skewed to the right.

1.R.15 a-b The distribution is approximately mound-shaped, with one unusual measurement, in the class with midpoint at 100.8°.
Perhaps the person whose temperature was 100.8° has some sort of illness coming on?

c The value 98.6° is slightly to the right of center.

On Your Own

1.R.16 Answers will vary from student to student. The students should notice that the distribution is skewed to the right with a few presidents (Truman, Cleveland, and F.D. Roosevelt) casting an unusually large number of vetoes.
