Understanding Types of Data
A brief overview of the categories and types of data
Having a good understanding of different data types and the most common statistics used when analyzing data is essential for any data analytics professional. In this article, I want to focus on how we can identify the different kinds of data so that it becomes easier to decide how to proceed correctly with further analysis.
What is Data? Why is it important?
Definitions of data vary from source to source. Two common ones are:
“Facts and statistics collected together for reference or analysis.”
“A collection of facts, such as numbers, words, measurements, observations or just descriptions of things.”
Data can take many forms, such as numbers in a spreadsheet, a table within a database, a text file, or a collection of images or videos. We can use it to understand and improve nearly every facet of our lives and to make better decisions toward our goals.
Types of Data
If we were to stand by the side of a road and watch the cars passing by, we could:
- Count the number of cars that passed by.
- Notice the different types of cars that passed by.
- Wonder whether more cars passed on weekends or on weekdays.
- Ask whether these numbers differ between Mondays and other days of the week.
- Ask which makes or models of cars pass most often.
- Ask whether these numbers would change if we were watching another road.
All of the above questions introduce us to two main data types:
Quantitative Data — Quantitative data takes on numeric values that allow us to perform mathematical operations.
e.g. Number of Cars (from the example above)
Other possible examples could include Age of a person, Income, Height, etc.
Categorical Data — Categorical data are used to label a group or set of items.
e.g. Types of Cars (from the example above)
Other possible examples could include the Zip Code of an area (it is written with digits, but it labels a place rather than measuring a quantity), Marital Status of a person, etc.
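To make the distinction concrete, here is a minimal Python sketch using only the standard library. The values and variable names are made up for illustration and are not real observations. It shows that arithmetic such as an average is meaningful for quantitative data, while categorical data is usually summarized by counting labels.

```python
from collections import Counter
from statistics import mean

# Hypothetical roadside observations (made-up values for illustration).
cars_per_hour = [12, 15, 9, 20]                    # quantitative
car_types = ["sedan", "SUV", "sedan", "truck"]     # categorical
zip_codes = ["94103", "10001", "94103", "60601"]   # categorical, despite the digits

# Arithmetic is meaningful for quantitative data.
average_cars = mean(cars_per_hour)

# For categorical data we count how often each label occurs instead.
type_counts = Counter(car_types)
zip_counts = Counter(zip_codes)

print(average_cars)
print(type_counts.most_common(1))
print(zip_counts["94103"])
```

Storing the zip codes as strings is a deliberate choice here: it prevents anyone from accidentally averaging them as if they were quantities.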
Types of Categorical Data
- Ordinal: Data that follows a ranked ordering. The variables have natural, ordered categories, but the distances between the categories are not known.
For example: based on our research or knowledge, we could rate these cars on how popular they are in the market. Ratings could range from Very Poor to Very Good, or from 1 Star to 5 Stars.
- Nominal: Data that does not have an order or ranking. It is used to label variables without providing any quantitative value, and it is the simplest scale of measurement.
For example: the make of a car.
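The difference between ordinal and nominal data also affects which summaries make sense. The sketch below is only an illustration with made-up ratings and makes; `RATING_ORDER` is a hypothetical name introduced here. Because ordinal labels have an order, sorting them and taking a median are meaningful, whereas for nominal labels the most frequent value is the natural summary.

```python
from collections import Counter
from statistics import median_low

# Ordinal: the labels have a natural order, so we record it explicitly.
RATING_ORDER = ["Very Poor", "Poor", "Average", "Good", "Very Good"]
ratings = ["Good", "Very Good", "Average", "Good", "Poor"]  # made-up ratings

# Sorting by rank is meaningful for ordinal data.
ranked = sorted(ratings, key=RATING_ORDER.index)

# The median only needs the order, not the distances between categories.
rank_positions = [RATING_ORDER.index(r) for r in ratings]
median_rating = RATING_ORDER[median_low(rank_positions)]

# Nominal: no order exists, so we report the most frequent label instead.
makes = ["Toyota", "Ford", "Toyota", "Honda"]  # made-up makes
most_common_make = Counter(makes).most_common(1)[0][0]

print(ranked)
print(median_rating)
print(most_common_make)
```

Note that the sketch never averages the rating positions: since the distances between categories are unknown, a mean of ranks would suggest more precision than the data supports.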
Types of Quantitative Data
Quantitative Data can be either Continuous or Discrete.
Continuous Data: can be divided into ever smaller units. Continuous data can take any numeric value, including integers, decimal values, and negative real numbers. It has an infinite number of possible values within any given range.
For example: the age of a car can be measured in years, months, days, hours, or seconds, and even then there are smaller units we could measure it in.
Discrete Data: only has countable values. There may potentially be an infinite number of those values, but each is distinct and there's no grey area in between.
For example: the number of cars we see on the road is discrete. We can't break that count into smaller units of anything.
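One way to see the difference is to look for a value halfway between two observations. The sketch below uses made-up numbers and a small helper, `midpoint`, introduced here purely for illustration. Python floats are only a finite approximation of truly continuous values, so treat this as an analogy rather than a formal demonstration.

```python
def midpoint(a, b):
    """Return the value halfway between a and b."""
    return (a + b) / 2

# Continuous: car age in years (made-up values). Between any two ages
# there is always another valid age, however close together they are.
age_low, age_high = 3.0, 4.0
age_between = midpoint(age_low, age_high)      # halfway between 3 and 4 years
age_closer = midpoint(age_low, age_between)    # halfway again, still a valid age

# The same age can be re-expressed in finer units.
age_in_months = age_between * 12

# Discrete: a count of cars (made-up values). The midpoint of two
# consecutive counts is not itself a possible count.
count_low, count_high = 3, 4
count_between = midpoint(count_low, count_high)
is_valid_count = float(count_between).is_integer()

print(age_between, age_closer, age_in_months)
print(count_between, is_valid_count)
```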
- Identifying data types is very important because it tells us which analyses we can perform and which plots we can build.
- There are two main types of data, each with its own sub-categories: Quantitative (Continuous or Discrete) and Categorical (Ordinal or Nominal).
This article was originally published on Medium.