When running an experiment or conducting a survey we can end up with hundreds, thousands or
even millions of values in the resulting data set. So many values can be overwhelming, and we need to
reduce them or represent them in a way that is easier to understand and communicate.
Statistics is about summarising data. The methods of statistics allow us to represent the essential
information in a data set while disregarding the unimportant information. We have to be careful to make
sure that we do not accidentally throw away some of the important aspects of a data set.
By applying statistics properly we can highlight the important aspects of data and make the data easier to
interpret. By applying statistics poorly or dishonestly we can hide important information and lead
people to draw the wrong conclusions.
In this chapter we will look at a few numerical and graphical ways in which data sets can be represented, to
make them easier to interpret.
10.1 Collecting data (EMA6X)
- Data
Data refers to the pieces of information that have been observed and recorded from
an experiment or a survey.
The word data is the plural of the word datum, and
therefore one should say, “the data are” and not “the data is”.
We distinguish between two main types of data: quantitative and qualitative.
- Quantitative data
Quantitative data are data that can be written as numbers.
Quantitative data can be discrete or continuous. Discrete quantitative data can be represented by
integers and usually occur when we count things, for example, the number of learners in a class,
the number of molecules in a chemical solution, or the number of SMS messages sent in one day.
Continuous quantitative data can be represented by real numbers, for example, the height or mass of a
person, the distance travelled by a car, or the duration of a phone call.
- Qualitative data
Qualitative data are data that cannot be written as numbers.
Two common types of qualitative data are categorical and anecdotal data. Categorical data can come
from one of a limited number of possibilities, for example, your favourite cooldrink, the colour
of your cell phone, or the language that you learnt to speak at home.
Anecdotal data take the form of an interview or a story, for example, when you ask someone what their
personal experience was when using a product, or what they think of someone else's behaviour.
Categorical qualitative data are sometimes turned into quantitative data by counting the number of
times that each category appears. For example, in a class with \(\text{30}\) learners, we ask
everyone what the colours of their cell phones are and get the following responses:
black | black | black | white | purple | red | red | black | black | black
white | white | black | black | black | black | purple | black | black | white
purple | black | red | red | white | black | orange | orange | black | white
This is a categorical qualitative data set since each of the responses comes from one of a small
number of possible colours.
We can represent exactly the same data in a different way, by counting how many times each colour
appears.
Colour | Count
black | \(\text{15}\)
white | \(\text{6}\)
red | \(\text{4}\)
purple | \(\text{3}\)
orange | \(\text{2}\)
This is a discrete quantitative data set since each count is an integer.
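The counting step can also be done by a computer. The following is a minimal Python sketch, assuming the 30 responses have been typed into a list in the order given above; the variable names `responses` and `counts` are chosen here for illustration only.

```python
from collections import Counter

# The 30 responses from the class, in the order listed above.
responses = [
    "black", "black", "black", "white", "purple",
    "red", "red", "black", "black", "black",
    "white", "white", "black", "black", "black",
    "black", "purple", "black", "black", "white",
    "purple", "black", "red", "red", "white",
    "black", "orange", "orange", "black", "white",
]

# Counter tallies how many times each distinct colour appears.
counts = Counter(responses)

# Print the colours from the most common to the least common.
for colour, count in counts.most_common():
    print(colour, count)
```

The list `responses` holds categorical qualitative data, while the tallies in `counts` are integers, which is why the second representation is discrete quantitative.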
Worked example 1: Qualitative and quantitative data
Thembisile is interested in becoming an airtime reseller to his classmates. He would
like to know how much business he can expect from them. He asked each of his
\(\text{20}\) classmates how many SMS messages they sent during the previous
day. The results were:
\(\text{20}\) | \(\text{3}\) | \(\text{0}\) | \(\text{14}\) | \(\text{30}\) | \(\text{9}\) | \(\text{11}\) | \(\text{13}\) | \(\text{13}\) | \(\text{15}\)
\(\text{9}\) | \(\text{13}\) | \(\text{16}\) | \(\text{12}\) | \(\text{13}\) | \(\text{7}\) | \(\text{17}\) | \(\text{14}\) | \(\text{9}\) | \(\text{13}\)
Is this data set qualitative or quantitative? Explain your answer.
The number of SMS messages is a count represented by an integer, which means that it is
quantitative and discrete.
Worked example 2: Qualitative and quantitative data
Thembisile would like to know which cellular provider is the most popular among
learners in his school. This time Thembisile randomly selects \(\text{20}\)
learners from the entire school and asks them which cellular provider they
currently use. The results were:
Cell C | Vodacom | Vodacom | MTN | Vodacom
MTN | MTN | Virgin Mobile | Cell C | 8-ta
Vodacom | MTN | Vodacom | Vodacom | MTN
Vodacom | Vodacom | Vodacom | Virgin Mobile | MTN
Is this data set qualitative or quantitative? Explain your answer.
Since each response is not a number, but one of a small number of possibilities, these are
categorical qualitative data.
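To answer Thembisile's original question, the categorical responses can be counted per provider. The sketch below is one possible way to do this in Python, assuming the 20 responses are entered in the order given above; the names `providers`, `most_popular` and `votes` are illustrative.

```python
from collections import Counter

# The 20 responses from the randomly selected learners.
providers = [
    "Cell C", "Vodacom", "Vodacom", "MTN", "Vodacom",
    "MTN", "MTN", "Virgin Mobile", "Cell C", "8-ta",
    "Vodacom", "MTN", "Vodacom", "Vodacom", "MTN",
    "Vodacom", "Vodacom", "Vodacom", "Virgin Mobile", "MTN",
]

# Tally the number of learners per provider.
counts = Counter(providers)

# most_common(1) returns a list holding the single most frequent entry.
most_popular, votes = counts.most_common(1)[0]
print(most_popular, votes)
```

For this sample the most frequent response is Vodacom, with 9 of the 20 learners. Since only 20 learners were asked, this describes the sample and is only an estimate for the whole school.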
Textbook Exercise 10.1
The following data set of dreams that learners have was collected
from Grade 12 learners just after their final exams.
\(\{\text{"I want to build a bridge!"; "I want to help the sick."; "I want running water!"}\}\)
Categorise the data set.
This data set cannot be written as numbers and so must be
qualitative.
This data set is anecdotal since it takes the form of a story.
Therefore the data set is qualitative anecdotal.
The following data set of the number of sweets in a packet was collected from
visitors to a sweet shop.
\(\{23; 25; 22; 26; 27; 25; 21; 28\}\)
Categorise the data set.
This data set is a set of numbers and so must be quantitative.
This data set is discrete since it can be represented by integers and
is a count of the number of sweets.
Therefore the data set is quantitative discrete.
The following data set of questions answered correctly was collected
from a class of maths learners.
\(\{3; 5; 2; 6; 7; 5; 1; 2\}\)
Categorise the data set.
This data set is a set of numbers and so must be quantitative.
This data set is discrete since it can be represented by integers and
is a count of the number of questions answered correctly.
Therefore the data set is quantitative discrete.
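The reasoning used in these three answers can be sketched as a small Python function. This is only a rough heuristic based on how the values are stored, and the function name `categorise` is hypothetical. It assumes counts are stored as integers and measurements as decimal numbers, and it cannot tell categorical data from anecdotal data, which still requires reading the responses.

```python
def categorise(data_set):
    """Roughly classify a data set by the type of its values (a heuristic)."""
    if len(data_set) == 0:
        raise ValueError("the data set is empty")

    def is_number(value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    # Anything that cannot be written as numbers is qualitative.
    if not all(is_number(value) for value in data_set):
        return "qualitative"
    # Assumption: integer values are counts, so the data are discrete.
    if all(isinstance(value, int) for value in data_set):
        return "quantitative discrete"
    return "quantitative continuous"


dreams = ["I want to build a bridge!", "I want to help the sick.", "I want running water!"]
sweets = [23, 25, 22, 26, 27, 25, 21, 28]
correct_answers = [3, 5, 2, 6, 7, 5, 1, 2]

for name, data_set in [("dreams", dreams), ("sweets", sweets), ("correct answers", correct_answers)]:
    print(name, "->", categorise(data_set))
```

The type of a value is not always a reliable guide. Heights rounded to the nearest centimetre would be stored as integers even though height is continuous, so the context of the data must always be considered.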