A Dot Plot Is an Easy Way to Represent the Distribution of a Continuous Variable
Use dot plots to display the distribution of your sample data when you have continuous variables. These graphs stack dots along the horizontal X-axis to represent the frequencies of different values. More dots indicate greater frequency. Each dot represents a set number of observations.
Dot plots help you visualize the shape and spread of sample data and are especially useful for comparing frequency distributions. A frequency distribution indicates how often values in a dataset occur. Dot plots present the same types of information as histograms.
Use dot plots to do the following:
- Locate the central tendency of your data.
- Highlight the variability of your data.
- Determine whether the distribution of values is symmetrical or skewed.
- Compare distributions.
- Find outliers.
When drawing a dot plot, your statistical software divides the values in your dataset into many intervals called bins. The stacked dots represent the number of observations falling within each bin. When possible, these graphs use one dot for each observation. However, that isn't possible for larger sample sizes.
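To make the binning step concrete, here is a minimal sketch in Python that bins values and stacks one dot per observation as text. The function name `text_dot_plot`, the bin width, and the sample values are all assumptions made for illustration. Real statistical software chooses bins and draws the graph for you, and its binning rules may differ.

```python
from collections import Counter

def text_dot_plot(values, bin_width):
    """Return (rows, counts): text rows from top to bottom, and dots per bin."""
    # Assign each value to a bin index by flooring value / bin_width.
    counts = Counter(int(v // bin_width) for v in values)
    first_bin, last_bin = min(counts), max(counts)
    tallest = max(counts.values())
    rows = []
    # Build the plot from the top level down; a bin gets a dot at a level
    # only if it holds at least that many observations.
    for level in range(tallest, 0, -1):
        row = "".join(
            "o" if counts.get(b, 0) >= level else " "
            for b in range(first_bin, last_bin + 1)
        )
        rows.append(row.rstrip())
    return rows, counts

# Made-up sample, for illustration only.
sample = [41, 44, 47, 48, 49, 50, 50, 51, 52, 53, 56, 59]
rows, counts = text_dot_plot(sample, bin_width=5)
print("\n".join(rows))
```

Each column is one bin of width 5, and the tallest column is the bin covering 50 to 54, which holds five of the twelve observations.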
At a minimum, dot plots require one continuous variable. To learn about other graphs, read my Guide to Data Types and How to Graph Them.
Example Dot Plot
Imagine that a government agency is testing an education program that aims to increase calcium intake in children. To assess the effectiveness of their program, the researchers graph the calcium intake of subjects that they randomly assigned to either the control group (no education) or the education group.
Dot plots typically contain the following elements:
- X-axis divided into ranges of values (bins) for the variable.
- Dot height representing the frequency of observed values falling within each bin.
- Dots representing observations.
- Optionally, dot plots can display multiple distributions, allowing you to compare them.
For the calcium intake data, the dot plot shows that the distributions for the two groups appear to be different. The control group centers on approximately 800 milligrams of average daily calcium intake. There also seem to be several outliers with extremely low values that the analysts should investigate. The education group centers on a higher value and has a tighter distribution than the control group. Hypothesis testing is required to determine the statistical significance of these differences.
Interpreting Dot Plots and Assessing the Distribution of Your Data
Dot plots display the distribution of your data. Look at the central tendency, variation, and overall shape of the distribution. You might create a dot plot before or in conjunction with an analysis to help confirm assumptions and guide further study.
Center and Variability
The tallest stacks of dots represent the most common values in your dataset. This region is where most values tend to fall. It's the central tendency of your dataset. The width of the distribution indicates the amount of variability. Broader distributions signify greater variability.
In the dot plot below, the center is near 50. Most values are close to 50, and values further away are rarer.
Develop an idea of the data's variability by looking at the distance between the minimum and maximum bins. In the dot plot below, the two distributions both center on 50. However, the spread of values is notably different. The values for group A mostly fall between 40 and 60, while those for group B range from about 20 to 90.
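The visual comparison of center and spread can be backed up with a few summary numbers. The sketch below uses Python's standard `statistics` module. The two groups are made-up values chosen to share a center of 50 with different spreads. They are not the data behind the graph, and the helper name `summarize` is hypothetical.

```python
import statistics

# Made-up values: same center, different spread.
group_a = [42, 45, 47, 49, 50, 50, 51, 53, 55, 58]
group_b = [22, 31, 40, 46, 50, 50, 54, 60, 72, 88]

def summarize(values):
    """Return the median, range, and sample standard deviation."""
    return {
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),
    }

for name, group in (("A", group_a), ("B", group_b)):
    print(name, summarize(group))
```

Both groups have a median of 50, but group B's range (66) is far wider than group A's (16), which matches what a broader stack of dots would show.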
Related posts: Measures of Central Tendency and Measures of Variability
Skewed Distributions
Determine whether your data tapers off symmetrically from the center or if it is skewed. The height data below follow a roughly symmetric distribution.
In right-skewed distributions, most values fall on the left side of the distribution, and a long tail stretches to the right, as shown below. Most of the body fat percentages are relatively low, with a few that are unusually high.
Conversely, for left-skewed distributions, most values fall on the right side of the distribution, and a long tail reaches to the left.
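If you want a number to go with the visual impression, one common choice is the moment-based skewness coefficient: positive for a right tail, negative for a left tail, and near zero for a symmetric shape. The sketch below computes it by hand. The function name and all three datasets are invented for illustration, and statistical packages may apply a small-sample adjustment that gives slightly different values.

```python
import statistics

def skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up data, for illustration only.
right_skewed = [18, 19, 20, 20, 21, 22, 23, 25, 30, 42]  # long tail to the right
left_skewed = [100 - v for v in right_skewed]             # mirror image: tail to the left
symmetric = [46, 48, 49, 50, 50, 50, 51, 52, 54]

for name, data in (("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)):
    print(name, round(skewness(data), 2))
```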
Outliers
Dot plots are a simple but effective way to identify outliers. Just look for values that stand out from the others! After you find them, you'll need to decide what to do with them.
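A dot plot identifies outliers by eye. If you also want a numeric screen, one widely used rule of thumb (not something the dot plot itself applies) flags values more than 1.5 times the interquartile range beyond the quartiles. The sketch below assumes that rule. The function name `iqr_outliers` and the calcium-like values are made up for illustration.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Made-up daily calcium intakes in milligrams, with two unusually low values.
intake = [310, 350, 720, 760, 780, 800, 810, 830, 850, 880, 900, 940]
outliers = iqr_outliers(intake)
print(outliers)
```

A flagged value is only a candidate for investigation. Whether to keep, correct, or remove it depends on why it is unusual.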
Related post: Guidelines for Handling Outliers
Multimodal Distributions
Multimodal distributions have more than one peak. This type of distribution stands out in a dot plot, but it can be easy to miss if you focus on summary statistics, such as the mean and standard deviation.
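The following sketch illustrates why summary statistics can hide multiple peaks. It uses made-up data with two clusters: the mean lands in the gap between them, where there are no observations at all, while a simple count per bin reveals both peaks. The helper `peak_bins` is a hypothetical, deliberately crude peak finder, and its result depends on the bin width you choose.

```python
from collections import Counter
import statistics

def peak_bins(values, bin_width):
    """Return the lower edge of each bin whose count exceeds both neighbors."""
    counts = Counter(int(v // bin_width) for v in values)
    peaks = []
    for b in range(min(counts), max(counts) + 1):
        c = counts.get(b, 0)
        # Bins outside the data are treated as empty.
        if c > counts.get(b - 1, 0) and c > counts.get(b + 1, 0):
            peaks.append(b * bin_width)
    return peaks

# Made-up bimodal data: one cluster in the 20s, another in the 60s.
values = [18, 21, 22, 24, 25, 27, 33, 58, 61, 63, 64, 66, 68, 72]

mean = statistics.fmean(values)
print(round(mean, 1), peak_bins(values, bin_width=10))
```

The mean is roughly 44, yet no observation falls between 40 and 49. The peaks sit in the bins starting at 20 and 60.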
Use Dot Plots with the Appropriate Hypothesis Tests
You can use dot plots to assess a distribution's central tendency and variability. When you have multiple distributions, you can compare the differences between these properties. However, if you want to use your sample to infer the properties of a larger population, be sure to perform the appropriate hypothesis tests to determine statistical significance.
Related post: Descriptive versus Inferential Statistics
Graphs are somewhat subjective because statistical software allows you to edit the number of bins, the size of their intervals, and the axes' scaling. Changing these settings can alter a dot plot's appearance and the conclusions you draw from it. Conversely, hypothesis tests provide an objective assessment regarding statistical significance. These tests also account for the possibility that random error explains the patterns and differences you observe.
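A small sketch of how the bin setting changes the picture: the same made-up values are counted with three different bin widths. The function name `bin_counts` and the data are assumptions for illustration only.

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Return dots per bin, from the lowest occupied bin to the highest."""
    counts = Counter(int(v // bin_width) for v in values)
    return [counts.get(b, 0) for b in range(min(counts), max(counts) + 1)]

# Made-up data with two tight clusters.
data = [11, 12, 13, 14, 26, 27, 28, 29]

for width in (5, 20, 30):
    print(width, bin_counts(data, width))
```

With a width of 5, the plot shows two separate stacks with empty bins between them. With a width of 20, it shows two adjacent stacks and no gap. With a width of 30, it shows a single stack. The data never changed, only the binning.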
The primary hypothesis tests that you can use with dot plots are the following:
- Distribution tests to identify the distribution of your data
- Tests that compare group means:
  - t-Tests for one or two groups
  - ANOVA for at least three groups
- Variances tests that assess differences in variability between groups
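As one illustration of a two-group comparison of means, the sketch below computes the test statistic and degrees of freedom for Welch's t-test, a two-sample t-test variant that does not assume equal variances. It is written by hand with the standard library, uses made-up control and education values, and stops short of a p-value, which your statistical software would obtain from the t distribution. It is a sketch of the arithmetic, not a substitute for that software.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test."""
    # Squared standard error of each group's mean.
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df

# Made-up daily calcium intakes in milligrams, for illustration only.
control = [640, 700, 760, 790, 800, 820, 850, 910]
education = [880, 900, 930, 950, 960, 980, 1000, 1020]

t, df = welch_t(control, education)
print(round(t, 2), round(df, 1))
```

A negative t here means the control group's mean is lower than the education group's. The larger its magnitude relative to the degrees of freedom, the less plausible it is that random error alone explains the difference.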
Source: https://statisticsbyjim.com/graphs/dot-plots/