# Back to basics: Common Statistical Tools

Statistics help you identify the right problem, quantify its magnitude, measure the impact of changes, and make fact-based decisions. The tools covered in this post are described below.

### Pareto

Pareto charts organize errors, problems, or defects by frequency to help focus problem-solving efforts. Juran popularized Pareto’s work when he suggested that 80% of problems result from only 20% of causes. Because the bars are arranged from longest on the left to shortest on the right, the chart shows at a glance which categories are the most significant.
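The ordering behind a Pareto chart can be sketched in a few lines of Python. This is a minimal illustration, not a charting recipe: the function name `pareto_table` and the defect log are hypothetical, made up only for this example.

```python
from collections import Counter


def pareto_table(defects):
    """Return (category, count, cumulative %) rows, largest count first."""
    counts = Counter(defects)
    total = sum(counts.values())
    rows = []
    running = 0
    for category, count in counts.most_common():
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows


# Hypothetical defect log: one entry per observed defect.
defects = ["scratch"] * 50 + ["dent"] * 30 + ["misprint"] * 15 + ["other"] * 5

for category, count, cumulative in pareto_table(defects):
    print(f"{category:10s} {count:3d} {cumulative:6.1f}%")
```

The cumulative percentage column is what lets you read off how few categories account for most of the defects.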

### Histogram

Histograms show the range of values of a measurement and the frequency with which each value occurs. They show the most frequently occurring readings as well as the variation in the measurements. They are useful when you want to see how often specific values, or ranges of values, occur.
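As a rough sketch of what a histogram computes, the snippet below counts how many readings fall into each fixed-width bin and prints a text bar for each. The function name `histogram`, the bin width, and the cycle-time readings are assumptions chosen for illustration.

```python
def histogram(values, bin_width, start):
    """Count how many values fall in each bin [lo, lo + bin_width)."""
    counts = {}
    for v in values:
        index = (v - start) // bin_width
        lo = start + index * bin_width
        counts[lo] = counts.get(lo, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical cycle times in minutes.
readings = [12, 15, 17, 18, 18, 21, 22, 22, 23, 24, 27, 31]

bins = histogram(readings, bin_width=5, start=10)
for lo, n in bins.items():
    print(f"{lo:2d}-{lo + 4:2d} | {'#' * n}")
```

The tallest bar marks the most frequent range of readings, and the spread of the bars shows the variation.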

### Box Plot

Box plots graphically depict how groups of numeric data are distributed around the median (and, in some variants, the mean). Rather than showing just frequencies (as a histogram does), a box plot groups the data into percentiles of the population. Unlike many other graphical tools, box plots also show outliers explicitly.
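The numbers behind a box plot can be sketched as follows. Two assumptions are worth flagging: the quartiles use the default method of Python's `statistics.quantiles`, and outliers are flagged with the common 1.5 × IQR convention, which the text above does not specify. The function name and data are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Quartiles plus outliers flagged by the 1.5 * IQR convention (an assumption)."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


# Hypothetical measurements with one unusually large value.
summary = box_plot_summary([2, 3, 4, 4, 5, 5, 6, 7, 8, 30])
print(summary)
```

Other quartile methods or fence multipliers give slightly different boxes, so treat the exact cut-offs as a convention rather than a rule.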

### Scatter Plots

Scatter plots show the relationship between two variables. If the two variables are closely related, the data points will form a tight band. If the points form a random pattern, the variables are probably unrelated.
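One way to put a number on how tight the band is uses the Pearson correlation coefficient, sketched below. This is an addition for illustration: the coefficient only measures linear association, and the function name `pearson_r` and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: near +1 or -1 for a tight linear band, near 0 for none."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired measurements that rise together.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]
r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

A value near zero does not rule out a curved relationship, so it is still worth looking at the plot itself.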

### Control Charts

• Are special causes of variation present? Special causes exist if any points lie outside the control limits or if the points display a systematic pattern.
• If special causes are detected, take action to eliminate them.
• Once special causes are removed, measure process capability.
• Measure variation. If variation is excessive, analyze the process and improve its capability.

The objective of a control chart is to discriminate between common and special causes of variation. To ensure the chart is effective, rational sampling should be employed. Subgroup samples for control charts should be selected so that the observations within a subgroup are all produced under very similar process conditions over a very short interval of time. This minimizes the variation within subgroups and makes it likely that any variation within a subgroup is due mainly to common causes. This is different from random sampling, and it results in control charts with the greatest sensitivity to special causes of variation.
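To make rational subgrouping concrete, here is a minimal sketch of X-bar chart limits computed from subgroups of five consecutive measurements. It assumes the conventional Shewhart constant A2 = 0.577 for subgroups of size 5, which turns the average within-subgroup range into an estimate of three standard deviations of the subgroup mean. The function name and data are hypothetical.

```python
A2_FOR_N5 = 0.577  # conventional Shewhart constant for subgroups of size 5


def xbar_limits(subgroups, a2=A2_FOR_N5):
    """Return (LCL, center line, UCL) for subgroup means."""
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    center = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)  # average within-subgroup range
    return center - a2 * r_bar, center, center + a2 * r_bar


# Hypothetical data: each row is five parts measured back to back.
subgroups = [
    [10, 11, 9, 10, 10],
    [11, 10, 10, 9, 10],
    [9, 10, 11, 10, 10],
    [10, 10, 12, 9, 9],
]
lcl, center, ucl = xbar_limits(subgroups)
print(f"LCL={lcl:.3f}  center={center:.3f}  UCL={ucl:.3f}")
```

Because the limits are built only from within-subgroup ranges, a shift between subgroups stands out against them, which is the point of sampling this way.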

It is important to understand how control limits are set. Usually, the control limits are set three standard deviations above and below the center line of the chart. The control limits define the amount of natural, or common cause, variation inherent to the process. Points within the limits represent random variability, while any point falling outside the limits signals that the process is unstable and that the variability exceeds the usual, predictable amount. For a normal distribution, the mean plus or minus 3 standard deviations covers 99.7% of the distribution, so a stable process will rarely produce a value outside that range. The chance of observing a value that differs from the mean by more than 3 standard deviations is so small that, if we do observe such an extreme value, we attribute it to a special cause.
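The three-sigma rule can be sketched directly. The example below is a simplification: it estimates the standard deviation with the ordinary sample standard deviation of a baseline period, whereas control charts in practice often estimate it from ranges. The function names and data are hypothetical. The last lines check the 99.7% figure for a normal distribution.

```python
import math
import statistics


def control_limits(baseline, sigmas=3):
    """Return (LCL, center line, UCL) from a baseline of in-control data."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread


def points_outside(values, lcl, ucl):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, v) for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical baseline readings and newly observed points.
baseline = [10.2, 9.8, 10.1, 10.0, 9.9, 10.3, 9.7, 10.0, 10.1, 9.9]
lcl, center, ucl = control_limits(baseline)
new_points = [10.1, 9.8, 11.5, 10.0]
signals = points_outside(new_points, lcl, ucl)
print(f"LCL={lcl:.3f}  center={center:.3f}  UCL={ucl:.3f}  signals={signals}")

# Share of a normal distribution within +/- 3 standard deviations.
coverage = math.erf(3 / math.sqrt(2))
print(f"coverage within 3 sigma: {coverage:.4f}")
```

Only the reading of 11.5 falls outside the limits here, so it is the one point that would prompt a search for a special cause.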

### Run Charts

Run charts display process results observed over time. Each data point represents the output of the process at a given moment. Run charts help detect shifts, trends, and patterns.
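One common way to look for shifts on a run chart is to count runs of consecutive points on the same side of the median. The sketch below assumes that approach, which the text above does not prescribe. The function name `runs_about_median` and the data are hypothetical.

```python
import statistics


def runs_about_median(values):
    """Return the lengths of consecutive runs above or below the median.

    Points exactly on the median are skipped, a common convention.
    """
    median = statistics.median(values)
    sides = [v > median for v in values if v != median]
    runs = []
    for side in sides:
        if runs and runs[-1][0] == side:
            runs[-1][1] += 1
        else:
            runs.append([side, 1])
    return [length for _, length in runs]


# Hypothetical weekly results in time order.
weekly = [5, 6, 7, 5, 4, 3, 4, 8, 9, 8]
run_lengths = runs_about_median(weekly)
print(f"runs: {len(run_lengths)}, longest: {max(run_lengths)}")
```

Unusually few runs, or one very long run, suggests the process level has shifted rather than varied randomly.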