Scatter Diagram

A scatter diagram is one of the seven basic quality tools that many professionals struggle with.

A scatter diagram is also known as a scatter plot, scatter graph, or correlation chart.

Other diagrams use lines or bars to display data; a scatter diagram uses dots. This may seem confusing at first, but it is often easier to understand than the others.

An English scientist, John Frederick W. Herschel, presented the scatter diagram in 1833 in his study of Orbits of Double Stars.

In 1886, the scatter diagram was popularized by an English Victorian-era polymath named Francis Galton. He is also known as the creator of the statistical concept of correlation.

In this blog post, I will explain the scatter diagram.

Scatter Diagram (Scatter Plot)

A scatter diagram is drawn using two variables. The first variable is independent, and the second variable depends on the first.

[Figure: Scatter diagram]

The scatter diagram is considered the simplest way to study the correlation between these two variables. After determining how they are related, you can predict the behavior of the dependent variable based on the independent variable.

A scatter diagram is useful when one variable can be measured and the other cannot.

Definition: According to the PMBOK Guide, a scatter diagram is “a graph that shows the relationship between two variables. Scatter diagrams can show a relationship between elements of a process, environment, or activity on one axis and a quality defect on the other axis.”

Example of Using a Scatter Diagram

You are analyzing accident patterns on a highway. You select two variables, vehicle speed and the number of accidents, and draw the diagram.

Once the diagram is complete, you notice that the number of accidents increases as the speed of vehicles increases. This reveals the correlation between the two.

In most cases, the independent variable is plotted along the horizontal axis (x-axis), and the dependent variable is plotted along the vertical axis (y-axis). The independent variable operates as the control parameter because it influences the behavior of the dependent variable.
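To make the axis convention concrete, here is a minimal sketch that draws a text-only scatter plot without any plotting library. The helper name `ascii_scatter` is hypothetical, and the speed and accident figures are invented purely for illustration; they are not real highway data.

```python
def ascii_scatter(xs, ys, width=40, height=10):
    """Return a text scatter plot: x runs left to right, y runs bottom to top."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Guard against a zero range so a constant variable does not divide by zero.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        # Row 0 is printed first, so flip the index to put large y values on top.
        grid[height - 1 - row][col] = "*"
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


# Hypothetical data: independent variable (speed) and dependent variable (accidents).
speed_kmh = [40, 50, 60, 70, 80, 90, 100, 110]
accidents = [2, 3, 3, 5, 6, 8, 9, 12]

print(ascii_scatter(speed_kmh, accidents))
```

With data like this, the dots climb from the lower left to the upper right, which is the pattern described in the highway example.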

It is not necessary to have a controlling parameter to draw a scatter diagram. There can also be two independent variables. In that case, you can use any axis for any variable.

Many professionals believe that a scatter diagram is like a fishbone diagram because the latter also involves two parameters: cause and effect.

Note that these two diagrams are different. The fishbone diagram shows you the effect of a cause; however, it does not show the relationship between these two. The scatter plot helps you analyze the correlation between the two variables.

However, the fishbone or Ishikawa diagram can help you draw a scatter diagram. For example, you can use the fishbone diagram to find the two variables (cause and effect) and then use the scatter diagram to analyze their relationship.

Types of Scatter Diagrams

You can classify scatter diagrams in many ways. I will discuss the two most popular classifications, which are based on correlation and on the slope of the trend. These are the most common in project management.

Based on correlation, you can divide scatter diagrams into the following categories:

  • Scatter Diagram with No Correlation
  • Scatter Diagram with Moderate Correlation
  • Scatter Diagram with Strong Correlation

Scatter Diagram with No Correlation

This diagram is known as the “Scatter Diagram with Zero Degree of Correlation.”

[Figure: Scatter diagram with no correlation]

Here, the data points are spread so randomly that you cannot draw a line through them.

Therefore, you can conclude that these variables are not correlated.

Scatter Diagram with Moderate Correlation

This plot is known as a “Scatter Diagram with a Low Degree of Correlation.”

[Figure: Scatter diagram with moderate correlation]

Here, the data points are a little closer, and you can see a relationship between these variables.

Scatter Diagram with Strong Correlation

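This diagram is known as a “Scatter Diagram with a High Degree of Correlation.”

In this diagram, the data points are close together, and you can draw a line by following their pattern.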

[Figure: Scatter diagram with strong correlation]

In this case, you conclude that these variables are closely related.

As discussed earlier, you can categorize the scatter diagram according to the slope, or trend, of the data points:

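  • Scatter Diagram with Strong Positive Correlation
  • Scatter Diagram with Weak Positive Correlation
  • Scatter Diagram with Strong Negative Correlation
  • Scatter Diagram with Weak Negative Correlation
  • Scatter Diagram with Weakest (or No) Correlation

A strong positive correlation means a visible upward trend from left to right; a strong negative correlation means a visible downward trend from left to right. A weak correlation means the trend is less clear. A flat line from left to right is the weakest correlation because it is neither positive nor negative. A scatter diagram with no correlation shows that the independent variable does not affect the dependent variable.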
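These categories can be approximated numerically. The sketch below computes Pearson's correlation coefficient and maps it to one of the five labels. The function names are hypothetical, and the cut-offs of 0.7 and 0.3 are illustrative assumptions rather than standard values; choose thresholds that suit your own data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired samples, or None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A flat row (or column) of points has no spread, so r is undefined.
        return None
    return sxy / math.sqrt(sxx * syy)


def classify(xs, ys, strong=0.7, weak=0.3):
    """Label a data set; the default cut-offs are assumptions for illustration."""
    r = pearson_r(xs, ys)
    if r is None or abs(r) < weak:
        return "no correlation"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


print(classify([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # strong positive correlation
print(classify([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]))  # strong negative correlation
print(classify([1, 2, 3, 4, 5], [3, 3, 3, 3, 3]))   # no correlation
```

A flat row of points is treated as "no correlation" here because the coefficient is undefined when one variable never changes.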

Scatter Diagram with Strong Positive Correlation

[Figure: Scatter diagram with strong positive correlation]

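This diagram is known as a “Scatter Diagram with Positive Slant.”

In a positive slant, the correlation is positive, i.e., as the value of X increases, the value of Y also increases. You can say that the slope of a straight line drawn along the data points goes up. The pattern resembles a straight line.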

For example, cold drink sales will increase if the weather gets hotter.
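The "straight line drawn along the data points" can be estimated with an ordinary least-squares fit. The following is a minimal sketch under stated assumptions: `fit_line` is a hypothetical helper, and the temperature and sales figures are invented for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the x values are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: daily temperature (degrees C) and cold drinks sold.
temperature_c = [18, 21, 24, 27, 30, 33]
drinks_sold = [120, 150, 170, 210, 240, 275]

slope, intercept = fit_line(temperature_c, drinks_sold)
print(f"slope = {slope:.2f} drinks per degree")
```

A positive slope corresponds to a positive slant and a negative slope to a negative slant. The slope alone says nothing about how tightly the points cluster around the line.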

Scatter Diagram with Weak Positive Correlation

[Figure: Scatter diagram with weak positive correlation]

As the value of X increases, the value of Y also increases, but the pattern does not resemble a straight line.

Scatter Diagram with Strong Negative Correlation

[Figure: Scatter diagram with strong negative correlation]

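This diagram is known as a “Scatter Diagram with Negative Slant.”

In a negative slant, the correlation is negative, i.e., as the value of X increases, the value of Y decreases. The slope of a straight line drawn along the data points goes down.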

For example, if the temperature increases, the sale of winter coats goes down.

Scatter Diagram with Weak Negative Correlation

[Figure: Scatter diagram with weak negative correlation]

As the value of X increases, the value of Y decreases, but the pattern is not clear.

Scatter Diagram with No Correlation

No relationship between the two variables can be seen. The plot might be a series of points with no visible trend or a straight, flat row of points. In either case, the first variable does not affect the second, so the second is not dependent on it.

Limitations of a Scatter Diagram

  • Scatter diagrams cannot give you the exact extent of potential correlation.
  • A scatter diagram does not show a quantitative measurement of the relationship between the variables. It only gives a qualitative picture of how one quantity changes with the other.
  • This chart does not show you the relationship for more than two variables.
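A correlation coefficient is one common way to supply the number that the diagram itself does not give. As a sketch, Pearson's coefficient for n paired observations (x_i, y_i), with sample means x-bar and y-bar, is usually written as:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Values of r near +1 or -1 indicate a nearly straight-line pattern, and values near 0 indicate little linear relationship. The coefficient measures only linear association, so it complements the diagram rather than replacing it.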

Benefits of a Scatter Diagram

  • It shows the relationship between two variables.
  • It is a good way to reveal a non-linear pattern.
  • The range of the data, such as the maximum and minimum values, can be determined.
  • Patterns are easy to observe.
  • Plotting the diagram is simple.

When You Should Use a Scatter Diagram

You should use the scatter diagram in the following cases:

  • When you have paired data for two variables, you can draw a scatter plot to see how they are related. For example, working hours versus earnings.
  • To figure out whether two variables are related. For example, whether a rise in temperature is related to equipment malfunctions.

Points to Remember While Plotting a Scatter Diagram

  • A correlation on the chart does not guarantee that one variable causes the other. The pattern can be a coincidence or the result of a third variable.
  • You can plot the scatter diagram when you have a large amount of data.
  • The more the data resemble a straight line, the stronger the correlation.
  • Data coverage should be wide for plotting a scatter chart.

Summary

Scatter diagrams are useful in determining the relationship between two variables. This relationship can be between two causes or a cause and an effect. It can be positive, negative, or not correlated at all.

The first variable is independent, and the second variable depends on the first. To analyze the pattern of the relationship, you change the independent variable and monitor the changes in the dependent one. A scatter diagram can also have two independent variables.

A scatter diagram is an important concept from a PMP exam point of view. Please understand it well.