Applied Statistics English Courseware: BusinessStatisticsCh03NumericalDescriptiveMeasures.ppt

73 pages
  • Seller [uploader]: cn****1
  • Document ID: 586418271
  • Upload date: 2024-09-04
  • Document format: PPT
  • Document size: 1.10 MB
Business Statistics: A First Course, Fifth Edition
Chapter 3: Numerical Descriptive Measures
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.

Choice is yours, part 2

Learning Objectives
In this chapter, you learn:
• To describe the properties of central tendency, variation, and shape in numerical data
• To calculate descriptive summary measures for a population
• To construct and interpret a boxplot
• To calculate the covariance and the coefficient of correlation

Summary Definitions
• The central tendency is the extent to which all the data values group around a typical or central value.
• The variation is the amount of dispersion, or scattering, of values.
• The shape is the pattern of the distribution of values from the lowest value to the highest value.

Measures of Central Tendency: The Mean
• The arithmetic mean (often just called the "mean") is the most common measure of central tendency.
• For a sample of size n:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
  where X̄ is pronounced "x-bar", n is the sample size, and X_i is the ith observed value.

Measures of Central Tendency: The Mean
Example: volume of Coke
Listed below are the volumes (in ounces) of the Coke in five different cans. Find the mean for this sample.
12.3  12.1  12.2  12.3  12.2

Measures of Central Tendency: The Mean (continued)
• The most common measure of central tendency
• Mean = sum of values divided by the number of values
• Affected by extreme values (outliers)
[Figure: two dot plots on an axis from 0 to 10, one with Mean = 3 and one with Mean = 4.]

Measures of Central Tendency: Locating the Median
• The location of the median when the values are in numerical order (smallest to largest):
  Median position = (n + 1)/2 position in the ordered data
• If the number of values is odd, the median is the middle number.
• If the number of values is even, the median is the average of the two middle numbers.
• Note that (n + 1)/2 is not the value of the median, only the position of the median in the ranked data.

Measures of Central Tendency: The Median
• In an ordered array, the median is the "middle" number (50% above, 50% below).
• Not affected by extreme values
[Figure: two dot plots on an axis from 0 to 10, both with Median = 3.]

Measures of Central Tendency: The Mode
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical data
• There may be no mode
• There may be several modes
[Figure: a dot plot on an axis from 0 to 14 with Mode = 9, and a dot plot on an axis from 0 to 6 with no mode.]

Measures of Central Tendency: The Mode
[Figure: labels Mean, Mode, Mode.]

Measures of Central Tendency: Review Example
House Prices: $2,000,000  $500,000  $300,000  $100,000  $100,000
Sum: $3,000,000
• Mean: $3,000,000/5 = $600,000
• Median: middle value of ranked data = $300,000
• Mode: most frequent value = $100,000
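The review example can be checked with a few lines of Python. This is a minimal sketch using only the standard library's `statistics` module (`multimode` assumes Python 3.8 or later); the variable names are illustrative and not part of the slides.

```python
from statistics import mean, median, multimode

# House prices from the review example
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

avg = mean(house_prices)          # sum of values divided by the number of values
mid = median(house_prices)        # middle value of the ranked data
modes = multimode(house_prices)   # list of the most frequent value(s)

print(f"Mean:   ${avg:,.0f}")
print(f"Median: ${mid:,.0f}")
print(f"Mode:   {modes}")
```

The single $2,000,000 house pulls the mean up to twice the median, which is the sensitivity to extreme values described above.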
Measures of Central Tendency: Which Measure to Choose?
• The mean is generally used, unless extreme values (outliers) exist.
• The median is often used, since the median is not sensitive to extreme values. For example, median home prices may be reported for a region because the median is less sensitive to outliers.
• In some situations it makes sense to report both the mean and the median.

Measures of Central Tendency: Summary
Central tendency:
• Arithmetic mean: sum of the values divided by the number of values
• Median: middle value in the ordered array
• Mode: most frequently observed value

Measures of Variation
• Measures of variation give information on the spread, variability, or dispersion of the data values.
• Measures covered: range, variance, standard deviation, coefficient of variation
[Figure: two distributions with the same center and different variation.]

Measures of Variation: The Range
• Simplest measure of variation
• Difference between the largest and the smallest values:
  Range = X_largest - X_smallest
• Example (data plotted on an axis from 0 to 14): Range = 13 - 1 = 12

Measures of Variation: Why the Range Can Be Misleading
• Ignores the way in which data are distributed
  [Figure: two dot plots on an axis from 7 to 12 with different distributions; in both, Range = 12 - 7 = 5.]
• Sensitive to outliers
  1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5: Range = 5 - 1 = 4
  1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120: Range = 120 - 1 = 119

Measures of Variation: The Variance
• Average (approximately) of squared deviations of values from the mean
• Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
  where X̄ = arithmetic mean, n = sample size, and X_i = ith value of the variable X.

Measures of Variation: The Standard Deviation
• Most commonly used measure of variation
• Shows variation about the mean
• Is the square root of the variance
• Has the same units as the original data
• Sample standard deviation:
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

Measures of Variation: The Standard Deviation
Steps for Computing Standard Deviation
1. Compute the difference between each value and the mean.
2. Square each difference.
3. Add the squared differences.
4. Divide this total by n - 1 to get the sample variance.
5. Take the square root of the sample variance to get the sample standard deviation.

Measures of Variation: Sample Standard Deviation
Sample Data (X_i): 10 12 14 15 17 18 18 24
n = 8, Mean = X̄ = 16
$$S = \sqrt{\frac{(10-16)^2 + (12-16)^2 + \cdots + (24-16)^2}{8-1}} = \sqrt{\frac{130}{7}} \approx 4.3095$$
A measure of the "average" scatter around the mean.

Variance of the Getting-Ready Time
[Table: contents not preserved in the text preview.]

Measures of Variation: Comparing Standard Deviations
[Figure: three dot plots on an axis from 11 to 21, each with Mean = 15.5.]
• Data A: Mean = 15.5, S = 3.338
• Data B: Mean = 15.5, S = 0.926
• Data C: Mean = 15.5, S = 4.570

Measures of Variation: Comparing Standard Deviations
[Figure: two distributions, one labeled "Smaller standard deviation" and one labeled "Larger standard deviation".]

Measures of Variation: Summary Characteristics
• The more the data are spread out, the greater the range, variance, and standard deviation.
• The more the data are concentrated, the smaller the range, variance, and standard deviation.
• If the values are all the same (no variation), all these measures will be zero.
• None of these measures are ever negative.
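The five steps for computing the sample standard deviation can be sketched directly in Python. The function name `sample_std_dev` is illustrative; the data are the eight sample values from the slides, whose mean is 16.

```python
import math

def sample_std_dev(values):
    """Sample standard deviation, following the five steps from the slides."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to divide by n - 1")
    x_bar = sum(values) / n
    diffs = [x - x_bar for x in values]   # step 1: difference from the mean
    squared = [d * d for d in diffs]      # step 2: square each difference
    total = sum(squared)                  # step 3: add the squared differences
    variance = total / (n - 1)            # step 4: divide by n - 1
    return math.sqrt(variance)            # step 5: take the square root

data = [10, 12, 14, 15, 17, 18, 18, 24]
data_range = max(data) - min(data)        # Range = X_largest - X_smallest
s = sample_std_dev(data)
print(f"range = {data_range}, S = {s:.4f}")
```

Dividing by n - 1 rather than n is what makes this the sample statistic; the population formulas later in the chapter divide by N.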
Measures of Variation: The Coefficient of Variation
• Measures relative variation
• Always in percentage (%)
• Shows variation relative to the mean
• Can be used to compare the variability of two or more sets of data measured in different units
$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$

Measures of Variation: Comparing Coefficients of Variation
• Stock A: average price last year = $50, standard deviation = $5, so CV = (5/50) × 100% = 10%
• Stock B: average price last year = $100, standard deviation = $5, so CV = (5/100) × 100% = 5%
• Both stocks have the same standard deviation, but stock B is less variable relative to its price.

Locating Extreme Outliers: Z-Score
• To compute the Z-score of a data value, subtract the mean and divide by the standard deviation.
• The Z-score is the number of standard deviations a data value is from the mean.
• A data value is considered an extreme outlier if its Z-score is less than -3.0 or greater than +3.0.
• The larger the absolute value of the Z-score, the farther the data value is from the mean.
$$Z = \frac{X - \bar{X}}{S}$$
  where X represents the data value, X̄ is the sample mean, and S is the sample standard deviation.

Locating Extreme Outliers: Z-Score
• Suppose the mean math SAT score is 490, with a standard deviation of 100.
• Compute the Z-score for a test score of 620:
$$Z = \frac{620 - 490}{100} = \frac{130}{100} = 1.3$$
• A score of 620 is 1.3 standard deviations above the mean and would not be considered an outlier.
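A short sketch of the coefficient of variation and the Z-score, applied to the stock and SAT examples from the slides. The function names and the `cutoff` parameter are illustrative assumptions; the ±3.0 threshold is the one stated in the slides.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative variation as a percentage: (S / mean) * 100."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100 * std_dev / mean

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

def is_extreme_outlier(x, mean, std_dev, cutoff=3.0):
    """True if the Z-score is below -cutoff or above +cutoff."""
    return abs(z_score(x, mean, std_dev)) > cutoff

cv_a = coefficient_of_variation(5, 50)     # Stock A
cv_b = coefficient_of_variation(5, 100)    # Stock B
z = z_score(620, 490, 100)                 # SAT example

print(f"CV A = {cv_a:.0f}%, CV B = {cv_b:.0f}%")
print(f"Z = {z:.1f}, extreme outlier: {is_extreme_outlier(620, 490, 100)}")
```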
Z-Scores for the 10 Getting-Ready Times
[Table: contents not preserved in the text preview.]

Shape of a Distribution
• Describes how data are distributed
• Measures of shape: symmetric or skewed
    • Left-skewed: Mean < Median
    • Symmetric: Mean = Median
    • Right-skewed: Median < Mean

General Descriptive Stats Using Microsoft Excel
1. Select Tools.
2. Select Data Analysis.
3. Select Descriptive Statistics and click OK.
4. Enter the cell range.
5. Check the Summary Statistics box.
6. Click OK.

Excel Output
Microsoft Excel descriptive statistics output, using the house price data:
House Prices: $2,000,000  $500,000  $300,000  $100,000  $100,000
[Figure: Excel output; not preserved in the text preview.]

Minitab Output
Descriptive Statistics: House Price

Variable     Total Count  Mean    SE Mean  StDev   Variance     Sum      Minimum
House Price  5            600000  357771   800000  6.40000E+11  3000000  100000

Variable     Median  Maximum  Range    Mode    Skewness  Kurtosis
House Price  300000  2000000  1900000  100000  2.01      4.13

(The preview also carries an "N for" header fragment, presumably Minitab's "N for Mode" column, with no corresponding value.)

Numerical Descriptive Measures for a Population
• Descriptive statistics discussed previously described a sample, not the population.
• Summary measures describing a population, called parameters, are denoted with Greek letters.
• Important population parameters are the population mean, variance, and standard deviation.
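Several of the Minitab figures for the house price data shown earlier can be cross-checked with Python's standard library. This is a sketch, not a reproduction of Minitab's procedure: it treats the five prices as a sample (divisor n - 1) and assumes the standard error of the mean is S divided by the square root of n. Skewness and kurtosis are left out.

```python
import math
from statistics import mean, stdev, variance

house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]
n = len(house_prices)
s = stdev(house_prices)                   # sample standard deviation (n - 1)

summary = {
    "Mean": mean(house_prices),
    "SE Mean": s / math.sqrt(n),          # assumed definition: S / sqrt(n)
    "StDev": s,
    "Variance": variance(house_prices),   # sample variance (n - 1)
    "Range": max(house_prices) - min(house_prices),
}

for name, value in summary.items():
    print(f"{name:>8}: {value:,.0f}")
```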
Numerical Descriptive Measures for a Population: The Mean μ
• The population mean is the sum of the values in the population divided by the population size, N:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
  where μ = population mean, N = population size, and X_i = ith value of the variable X.

Numerical Descriptive Measures for a Population: The Variance σ²
• Average of squared deviations of values from the mean
• Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
  where μ = population mean, N = population size, and X_i = ith value of the variable X.

Numerical Descriptive Measures for a Population: The Standard Deviation σ
• Most commonly used measure of variation
• Shows variation about the mean
• Is the square root of the population variance
• Has the same units as the original data
• Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

Sample Statistics Versus Population Parameters

Measure             Population Parameter  Sample Statistic
Mean                μ                     X̄
Variance            σ²                    S²
Standard Deviation  σ                     S

The Empirical Rule
• The empirical rule approximates the variation of data in a bell-shaped distribution.
• Approximately 68% of the data in a bell-shaped distribution lies within one standard deviation of the mean, or μ ± 1σ.
• Approximately 95% of the data in a bell-shaped distribution lies within two standard deviations of the mean, or μ ± 2σ.
• Approximately 99.7% of the data in a bell-shaped distribution lies within three standard deviations of the mean, or μ ± 3σ.

Using the Empirical Rule
• Suppose that the variable Math SAT scores is bell-shaped with a mean of 500 and a standard deviation of 90. Then:
    • Approximately 68% of all test takers scored between 410 and 590 (500 ± 90).
    • Approximately 95% of all test takers scored between 320 and 680 (500 ± 180).
    • Approximately 99.7% of all test takers scored between 230 and 770 (500 ± 270).

Chebyshev Rule
• Regardless of how the data are distributed, at least (1 - 1/k²) × 100% of the values will fall within k standard deviations of the mean (for k > 1).
• Examples:
    • k = 2: at least (1 - 1/2²) × 100% = 75% within μ ± 2σ
    • k = 3: at least (1 - 1/3²) × 100% ≈ 89% within μ ± 3σ

How Data Vary Around the Mean
[Table: contents not preserved in the text preview.]

Quartile Measures
• Quartiles split the ranked data into 4 segments with an equal number of values per segment (25% each).
• The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger.
• Q2 is the same as the median (50% of the observations are smaller and 50% are larger).
• Only 25% of the observations are greater than the third quartile, Q3.

Quartile Measures: Locating Quartiles
Find a quartile by determining the value in the appropriate position in the ranked data, where
• First quartile position: Q1 = (n+1)/4 ranked value
• Second quartile position: Q2 = (n+1)/2 ranked value
• Third quartile position: Q3 = 3(n+1)/4 ranked value
where n is the number of observed values.

Quartile Measures: Calculation Rules
When calculating the ranked position, use the following rules:
• If the result is a whole number, then it is the ranked position to use.
• If the result is a fractional half (e.g. 2.5, 7.5, 8.5, etc.), then average the two corresponding data values.
• If the result is not a whole number or a fractional half, then round the result to the nearest integer to find the ranked position.
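The position formulas and calculation rules above can be sketched as a small Python function. The helper names are illustrative, and the nine-value sample is the ordered array used in this chapter's quartile example. Note that statistical packages use several different quartile conventions, so library functions may return slightly different values than this rule.

```python
def value_at_position(ranked, position):
    """Value at a 1-based ranked position, using the three calculation rules."""
    whole = int(position)
    frac = position - whole
    if frac == 0:                      # whole number: use that ranked value
        return ranked[whole - 1]
    if frac == 0.5:                    # fractional half: average the two neighbors
        return (ranked[whole - 1] + ranked[whole]) / 2
    return ranked[round(position) - 1] # otherwise: round to the nearest position

def quartiles(values):
    """Return (Q1, Q2, Q3) at positions (n+1)/4, (n+1)/2, and 3(n+1)/4."""
    ranked = sorted(values)
    n = len(ranked)
    if n < 2:
        raise ValueError("need at least two values")
    return tuple(value_at_position(ranked, k * (n + 1) / 4) for k in (1, 2, 3))

sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1, q2, q3 = quartiles(sample)
print(q1, q2, q3)   # 12.5 16 19.5
```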
Quartile Measures: Locating Quartiles
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22 (n = 9)
• Q1 is in the (9+1)/4 = 2.5 position of the ranked data, so use the value halfway between the 2nd and 3rd values: Q1 = 12.5.
• Q1 and Q3 are measures of non-central location; Q2, the median, is a measure of central tendency.

Quartile Measures: Calculating the Quartiles (Example)
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22 (n = 9)
• Q1 is in the (9+1)/4 = 2.5 position of the ranked data, so Q1 = (12+13)/2 = 12.5.
• Q2 is in the (9+1)/2 = 5th position of the ranked data, so Q2 = median = 16.
• Q3 is in the 3(9+1)/4 = 7.5 position of the ranked data, so Q3 = (18+21)/2 = 19.5.
• Q1 and Q3 are measures of non-central location; Q2, the median, is a measure of central tendency.

Quartile Measures: The Interquartile Range (IQR)
• The IQR is Q3 - Q1 and measures the spread in the middle 50% of the data.
• The IQR is also called the midspread because it covers the middle 50% of the data.
• The IQR is a measure of variability that is not influenced by outliers or extreme values.
• Measures like Q1, Q3, and IQR that are not influenced by outliers are called resistant measures.

The Five Number Summary
The five numbers that help describe the center, spread, and shape of data are:
• X_smallest
• First Quartile (Q1)
• Median (Q2)
• Third Quartile (Q3)
• X_largest

Calculating the Interquartile Range
Example: X_minimum = 11, Q1 = 12.5, Median (Q2) = 16, Q3 = 19.5, X_maximum = 22, with 25% of the data in each of the four segments.
Interquartile range = 19.5 - 12.5 = 7

Five Number Summary and the Boxplot
• The boxplot: a graphical display of the data based on the five-number summary:
  X_smallest -- Q1 -- Median -- Q3 -- X_largest
[Figure: example boxplot marking X_smallest, Q1, Median, Q3, and X_largest, with 25% of the data in each segment.]

Five Number Summary: Shape of Boxplots
• If data are symmetric around the median, then the box and central line are centered between the endpoints.
• A boxplot can be shown in either a vertical or horizontal orientation.
[Figure: boxplot marking X_smallest, Q1, Median, Q3, and X_largest.]

Boxplots for Funds 2019 Return
[Figure: not preserved in the text preview.]

Distribution Shape and the Boxplot
[Figure: three boxplots labeled Left-Skewed, Symmetric, and Right-Skewed, each marking Q1, Q2, and Q3.]

Boxplot Example
• Below is a boxplot for the following data: 0 2 2 2 3 3 4 5 5 9 27
  X_smallest = 0, Q1 = 2, Q2 = 3, Q3 = 5, X_largest = 27
• The data are right-skewed, as the plot depicts.

Boxplot Example Showing an Outlier
• The boxplot of the same data shows the outlier value of 27 plotted separately.
• A value is considered an outlier if it is more than 1.5 times the interquartile range below Q1 or above Q3.

The Covariance
• The covariance measures the strength of the linear relationship between two numerical variables (X and Y).
• The sample covariance:
$$\mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
• Only concerned with the strength of the relationship
• No causal effect is implied

Interpreting Covariance
• Covariance between two variables:
    • cov(X,Y) > 0: X and Y tend to move in the same direction
    • cov(X,Y) < 0: X and Y tend to move in opposite directions
    • cov(X,Y) = 0: X and Y have no linear relationship (this alone does not establish that they are independent)
• The covariance has a major flaw: it is not possible to determine the relative strength of the relationship from the size of the covariance.

Coefficient of Correlation
• Measures the relative strength of the linear relationship between two numerical variables
• Sample coefficient of correlation:
$$r = \frac{\mathrm{cov}(X, Y)}{S_X S_Y}$$
  where
$$\mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}, \quad S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}, \quad S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$

Features of the Coefficient of Correlation
• The population coefficient of correlation is referred to as ρ.
• The sample coefficient of correlation is referred to as r.
• Both ρ and r have the following features:
    • Unit free
    • Range between -1 and 1
    • The closer to -1, the stronger the negative linear relationship
    • The closer to 1, the stronger the positive linear relationship
    • The closer to 0, the weaker the linear relationship

Scatter Plots of Sample Data with Various Coefficients of Correlation
[Figure: five scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = +.3, and r = +1.]

The Coefficient of Correlation Using Microsoft Excel
1. Select Tools/Data Analysis.
2. Choose Correlation from the selection menu.
3. Click OK.
4. Input the data range and select appropriate options.
5. Click OK to get output.

Interpreting the Coefficient of Correlation Using Microsoft Excel
• r = .733
• There is a relatively strong positive linear relationship between test score #1 and test score #2.
• Students who scored high on the first test tended to score high on the second test.
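The sample covariance and the coefficient of correlation can be sketched in plain Python as follows. The test scores in the slides are not included in the text preview, so the two score lists below are hypothetical values made up for illustration; they are not expected to reproduce r = .733.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance: sum of (X - X_bar)(Y - Y_bar), divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least two values")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_correlation(xs, ys):
    """r = cov(X, Y) / (S_X * S_Y); undefined if either sample has no variation."""
    s_x = math.sqrt(sample_covariance(xs, xs))   # cov(X, X) is the sample variance
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)

# Hypothetical scores for five students (illustration only)
test_1 = [60, 70, 80, 90, 100]
test_2 = [65, 72, 78, 95, 94]

cov = sample_covariance(test_1, test_2)
r = sample_correlation(test_1, test_2)
print(f"cov = {cov:.2f}, r = {r:.3f}")
```

Rescaling one variable (for example, scoring a test out of 1000 instead of 100) changes the covariance but leaves r unchanged, which is why r is the unit-free measure of relative strength.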
Pitfalls in Numerical Descriptive Measures
• Data analysis is objective
    • Should report the summary measures that best describe and communicate the important aspects of the data set
• Data interpretation is subjective
    • Should be done in a fair, neutral, and clear manner

Ethical Considerations
Numerical descriptive measures:
• Should document both good and bad results
• Should be presented in a fair, objective, and neutral manner
• Should not use inappropriate summary measures to distort facts

Chapter Summary
• Described measures of central tendency
    • Mean, median, mode
• Described measures of variation
    • Range, interquartile range, variance and standard deviation, coefficient of variation, Z-scores
• Illustrated shape of distribution
    • Symmetric, skewed
• Described data using the 5-number summary
    • Boxplots
• Discussed covariance and correlation coefficient
• Addressed pitfalls in numerical descriptive measures and ethical considerations
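As a recap of the five-number summary and the 1.5 × IQR outlier rule covered in this chapter, here is a minimal sketch applied to the boxplot example data (0 2 2 2 3 3 4 5 5 9 27). The function names are illustrative, and the quartiles follow the chapter's (n+1)/4 position rule rather than any particular library's convention.

```python
def value_at_position(ranked, position):
    """Value at a 1-based ranked position, per the chapter's calculation rules."""
    whole = int(position)
    frac = position - whole
    if frac == 0:
        return ranked[whole - 1]
    if frac == 0.5:
        return (ranked[whole - 1] + ranked[whole]) / 2
    return ranked[round(position) - 1]

def five_number_summary(values):
    """Return (X_smallest, Q1, median, Q3, X_largest)."""
    ranked = sorted(values)
    n = len(ranked)
    if n < 2:
        raise ValueError("need at least two values")
    q1, q2, q3 = (value_at_position(ranked, k * (n + 1) / 4) for k in (1, 2, 3))
    return ranked[0], q1, q2, q3, ranked[-1]

def iqr_outliers(values):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

data = [0, 2, 2, 2, 3, 3, 4, 5, 5, 9, 27]
summary = five_number_summary(data)
flagged = iqr_outliers(data)
print(summary, flagged)
```

With Q1 = 2 and Q3 = 5, the fences sit at -2.5 and 9.5, so only the value 27 is flagged, matching the boxplot that plots 27 separately.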
