Module 4 - Data Classification - Data Classification Lab

This module focused on the different data classification methods used in map making. I was provided with census tract data for Miami-Dade County, FL from the FGDL and prompted to make maps of the county using some of these classification methods. The methods I used were Equal Interval, Quantile, Natural Breaks, and Standard Deviation. I was asked to make a map using each method to show the distribution of the percentage of senior citizens (people above the age of 65) per census tract, as well as the total count of senior citizens. I used ArcGIS Pro to achieve this, and I will discuss my final maps below:

Above is the first of two compilations. It contains four maps built from FGDL census tract information. A north arrow and scale bar are provided in the top left map and apply to all four maps, as orientation and scale are consistent. The top left map shows the distribution of the senior citizen population using the Equal Interval classification method, the top right uses Natural Breaks, the bottom left uses Quantile, and the bottom right uses Standard Deviation. Each map uses color schemes and design layouts based on Gestalt principles of cartographic design. No labels are used, as the map and legend titles are sufficient to orient the audience and the data is the key focus.

Above is the second compilation. It is identical to the first in design and purpose; however, it symbolizes the total count of senior citizens. Each map has been symbolized using its respective method, and the data has been normalized by the area of each census tract.


Here is a bit more regarding the different types of data classification methods used in this exercise: 

  • The Equal Interval classification method divides the data into classes of equal range. This is achieved by dividing the total range of the data by the number of classes desired. This type of classification is easy for the reader to interpret and will never contain gaps or missing values in the legend. However, some classes may have no observations and so never appear on the map itself. Such is the case in this exercise, where one or more classes are absent from the maps in both the first and second compilations. The missing classes fell toward the upper end of the range for both the percentage and the total count.

  • The Quantile classification method places an equal number of observations in each class (five classes in this exercise). This method will never have empty classes or classes with only a few values; however, nearly identical values can end up in different classes, and widely different values can end up in the same class. For both compilations in this exercise, the pattern was very similar. The clusters of higher values for both percentage of senior citizens and total population are concentrated in the Northeastern area of Miami-Dade, and the lowest values are in the Western and Southeastern areas.

  • The Standard Deviation classification method forms classes by adding and subtracting the standard deviation from the mean of the dataset. This method considers how data values are distributed along a number line and will inherently contain no gaps in its legend. It is a much less approachable classification method, as it requires not only that the map maker ensure the data is normally distributed, but also that the map viewer have a basic understanding of statistics. There is no real pattern shared between the standard deviation maps in this exercise, and only the map based on total population of senior citizens has been normalized.

  • The Natural Breaks classification method uses an algorithm to create class ranges so that values in the same class are as similar as possible. This minimizes within-class variance while maximizing between-class variance. It will naturally group outliers into their own classes, which helps emphasize extreme values visually. The caveat is that it can also group a large number of data values into just one or two classes. Between compilations in this exercise, there is a relatively loose pattern of low and high values in both percentage of senior citizens and total population of seniors, trending from low in the Western part of Miami-Dade to high in the Northeastern area of the county.
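As a rough illustration of how the four methods above choose their class breaks, here is a minimal Python sketch. It is not what ArcGIS Pro runs internally, and the function names and sample values are made up for illustration rather than taken from the Miami-Dade data. Each function returns upper class limits. The standard deviation version assumes breaks at one standard deviation below the mean, the mean, and one above, and the natural breaks version is a simple exact search that minimizes within-class squared deviation.

```python
import statistics
from bisect import bisect_left


def equal_interval_breaks(values, k):
    """Upper class limits that split the data range into k equal-width classes."""
    lo, hi = min(values), max(values)
    return [lo + (hi - lo) * i / k for i in range(1, k + 1)]


def quantile_breaks(values, k):
    """Upper class limits that put roughly len(values) / k observations in each class."""
    ordered = sorted(values)
    n = len(ordered)
    # -(-a // b) is integer ceiling division
    return [ordered[-(-n * i // k) - 1] for i in range(1, k + 1)]


def std_dev_breaks(values):
    """Breaks at mean - 1 sd, mean, mean + 1 sd (an assumed interval of one sd)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [mean - sd, mean, mean + sd]


def natural_breaks(values, k):
    """Upper class limits minimizing total within-class squared deviation."""
    x = sorted(values)
    n = len(x)
    # Prefix sums let us get the squared deviation of any slice x[i:j] quickly.
    s1, s2 = [0.0], [0.0]
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best cost of splitting the first j sorted values into c classes
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    cut[c][j] = i
    # Walk back through the stored cut points to recover each class's top value.
    breaks, j = [], n
    for c in range(k, 0, -1):
        breaks.append(x[j - 1])
        j = cut[c][j]
    return breaks[::-1]


def class_counts(values, breaks):
    """Number of observations in each class, given upper class limits."""
    counts = [0] * len(breaks)
    for v in values:
        counts[min(bisect_left(breaks, v), len(breaks) - 1)] += 1
    return counts


# Made-up sample values with one outlier (not the census tract data).
sample = [1, 2, 3, 10, 11, 12, 50]
```

With this sample and three classes, `class_counts(sample, equal_interval_breaks(sample, 3))` leaves the middle class empty, which is the same "missing class" effect described for the Equal Interval maps, while the natural breaks version places the outlier in a class of its own.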




I believe the best method for quickly understanding the distribution of the data in this exercise was the Quantile classification method. Both maps display the density of the senior citizen population in almost identical patterns. The data is easy for the average audience to understand, and there are no missing classes on the map.



  
