July 26th, 2016

What it is:

A distribution is a graphic representation of many measurements of a single characteristic of something. The measurements are placed on a continuous scale from small to large. Grouping similar values together turns the measurements, generally 50 or more, into the graphic called the distribution. The number of measurements that fall into each group of similar values is the count, often referred to as the frequency. Plotting the count for each group forms the graphic representation, or distribution, of the values.

What it does:

The distribution graphic provides an overview of how the characteristic describes something. We see the range of values for the characteristic, from the smallest to the largest. We see the central tendency of the characteristic and its most frequent values. The shape points to the type of distribution, such as normal or another type. The graphic also yields a visual estimate of the average value for the characteristic and of the variation about that average. In the example above, the measurement was volume in ounces; 100 samples were taken, and each was measured and recorded. By observation from the distribution graphic, the shape appears normally distributed (the bell curve), with an average value of 16 ounces and a range of values from 15.75 to 16.25 ounces.

How it’s made:

Select a characteristic of something that can be measured on a continuous scale, and collect 50 or more samples for measurement.

Rules based on the number of measurements and on the smallest and largest values measured determine the range of each group of similar values. Let’s refer to each group as a bucket, where similar values are collected and counted to determine the frequency of occurrence within the bucket walls, that is, between the smallest and largest values the bucket holds.
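The text does not say which rules are meant, so here is a minimal sketch using two common rules of thumb as assumptions: the square-root rule (number of buckets is about the square root of the number of measurements) and Sturges' rule (log base 2 of the number of measurements, plus one). The function name `bucket_edges` is made up for this illustration.

```python
import math


def bucket_edges(values, rule="sqrt"):
    """Return the bucket walls, evenly spaced from the smallest to the largest value.

    The choice of rule is an assumption for illustration, not the author's method.
    """
    n = len(values)
    smallest, largest = min(values), max(values)
    if n < 2 or smallest == largest:
        raise ValueError("need at least two different measurements")

    if rule == "sqrt":
        # Square-root rule: about sqrt(n) buckets.
        num_buckets = math.ceil(math.sqrt(n))
    elif rule == "sturges":
        # Sturges' rule: ceil(log2(n)) + 1 buckets.
        num_buckets = math.ceil(math.log2(n)) + 1
    else:
        raise ValueError(f"unknown rule: {rule}")

    width = (largest - smallest) / num_buckets
    # num_buckets buckets need num_buckets + 1 walls.
    return [smallest + i * width for i in range(num_buckets + 1)]
```

With 100 measurements, the square-root rule gives 10 buckets and Sturges' rule gives 8. Either is a reasonable starting point; if the picture looks too ragged or too blocky, try a different bucket count.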

The distribution graphic has an X axis and a Y axis. The X axis, or horizontal, is the continuous scale of the measurement of the characteristic. The Y axis, or vertical, is the frequency, or count, of the values that fell into each bucket of similar values. Along the X axis the buckets are arranged on the continuous scale from the smallest to the largest values of the measured characteristic. The height of each bucket equals the count of the measurements that fell within its walls.
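The counting step can be sketched in a few lines. This is an illustration under stated assumptions: equal-width buckets, a bucket count chosen by the caller, and hypothetical function names `count_into_buckets` and `render`. The text rendering turns the graphic on its side, so the buckets run down the page and each bar's length stands in for its height.

```python
def count_into_buckets(values, num_buckets):
    """Count how many measurements fall within the walls of each bucket."""
    smallest, largest = min(values), max(values)
    if smallest == largest:
        raise ValueError("all measurements are identical; nothing to spread out")
    width = (largest - smallest) / num_buckets
    counts = [0] * num_buckets
    for v in values:
        # The largest value sits exactly on the last wall, so keep it in the last bucket.
        index = min(int((v - smallest) / width), num_buckets - 1)
        counts[index] += 1
    return smallest, width, counts


def render(smallest, width, counts):
    """Draw one row per bucket; the bar length is the frequency."""
    rows = []
    for i, count in enumerate(counts):
        left = smallest + i * width
        rows.append(f"{left:7.2f} to {left + width:7.2f} | {'#' * count} ({count})")
    return "\n".join(rows)


if __name__ == "__main__":
    # Made-up numbers purely to show the output format.
    sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]
    print(render(*count_into_buckets(sample, 4)))
```

A value that lands exactly on an interior wall is counted in the bucket to its right, which is one common convention; what matters is that every measurement is counted exactly once.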

Why it’s important:

It is the first statistical view of a data set for a particular measurement, and it graphically provides significant insight with minimal calculation. It provides an easy way to compare before and after results when changes are made to a process. The graphic shows us the type of distribution, so we know which calculations are appropriate for the mean and standard deviation. Knowing the type of distribution also provides insight into the values to expect over time if no changes are made to the process.