
One of the things researchers are interested in is the relationship
that variables have with each other. For instance, there
is a relationship between the age of a chicken and the
number of eggs that the chicken will lay in a week. If
you were in the egg selling business, you would want to
know what that relationship is. You would want to know if
older chickens lay more eggs or fewer eggs and you would
want to know if that egg laying pattern changes at a
particular age. The relationship between
the age of chickens and the egg laying behavior of
chickens is what researchers refer to as a correlation.
In other words, there is a correlation between
"chicken age" and "egg production".
This correlation is represented by a number called a
correlation coefficient. The coefficient is calculated with
a mathematical formula from data collected in the study of the
variables involved. This number gives us a better idea of how
particular variables are related and whether the relationship is
strong enough to warrant further consideration.
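
To make the calculation concrete, here is a minimal sketch in Python. It assumes the coefficient meant here is the common Pearson correlation coefficient, which the text does not state. The function name pearson_r and the chicken data are hypothetical, made up only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of each point's deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, not from any real study
age_years = [1, 2, 3, 4, 5]        # variable X: chicken age
eggs_per_week = [6, 6, 5, 3, 2]    # variable Y: egg production

r = pearson_r(age_years, eggs_per_week)
print(round(r, 2))
```

With this made-up data the result is negative and close to -1, which would correspond to older chickens laying fewer eggs.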
Caution:
The correlation only describes the relationship between the
variables, that is, how the variables behave in relation to
each other. You cannot use a correlation to infer cause
and effect, even with a strong correlation.
NOTE:
The correlation between variables
in a study is represented by a statistical calculation
(correlation coefficient) expressed as a number which has
a value ranging between -1 and +1.
Interpreting the correlation
The sign of the correlation (- or +)
tells us the direction of the relationship between the
variables. In other words:

A negative correlation (- sign) tells us
that as one variable X increases
in value, the other variable Y
decreases in value, as in the negative
correlation plot. Using our chicken
example, it means that as chickens get older they
tend to lay fewer eggs.
Inverse Relationship 

A positive correlation (+ sign) means that
as one variable X increases in
value, the other variable Y also
increases in value, or as one variable
decreases in value, the other variable also
decreases in value. Using our chicken example, it
means that as chickens get older they tend to lay more
eggs.
Direct Relationship 
In other words, a positive correlation means that the
variables have a direct relationship (changing in the
same direction, as X increases in value Y
also tends to increase in value) and a negative
correlation means that the variables have an inverse
relationship (changing in opposite directions, as X
increases in value Y tends to decrease
in value).
The number associated with the correlation
(always a decimal number such as .95 or .40) tells us the
strength of the correlation. Strong correlations are more
useful for prediction than weak ones. Therefore, we always
look at both parts of the correlation to get a better
understanding of the relationship between the variables.
Look at the sign: is it positive or negative? Then look
at the value (or number) of the correlation: is it close
to 1 or close to 0? Ignoring the sign, the closer the value
is to 1, the stronger the correlation.
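
The two-part reading described above (sign for direction, number for strength) can be sketched in Python as follows. The function name describe_correlation is hypothetical, and the sketch deliberately returns the raw strength rather than labels such as "strong" or "weak", because the text gives no cutoff values.

```python
def describe_correlation(r):
    """Split a correlation coefficient into its direction and its strength."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if r > 0:
        direction = "positive (direct)"
    elif r < 0:
        direction = "negative (inverse)"
    else:
        direction = "none"
    # Strength ignores the sign: -.95 is as strong as +.95
    strength = abs(r)
    return direction, strength

print(describe_correlation(-0.95))
print(describe_correlation(0.40))
```

Here -0.95 is read as a negative (inverse) relationship with a strength of 0.95, and 0.40 as a positive (direct) relationship with a strength of 0.40.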
Stronger
correlations (.80, .90, .95) are represented by less
variability between data points. That means that we are
better able to predict the value of Y
when given the value of X.

In our chicken example, that
would be important, because if there is a strong
negative correlation between "chicken
age" (variable X) and
"egg production" (variable Y)
we would be able to predict when to make chicken
soup. 
Do you understand that point? If not, email the instructor.
The closer the dots (data points) are to the general
tendency line (the line represents an average or tendency
direction), the stronger the correlation. As the variability
between data points increases, the correlation decreases in
strength, as shown by the graph for correlation .40
below.

Notice how the data points are spread out from the
general tendency line. This makes it very
difficult to predict the value of Y
when X is 3. Look at the scatter plot for .40
and notice that when X is 3 (horizontal values),
Y could be anywhere between 1 and 5 (vertical
values).

Look also at the correlation graph for .95
and notice how the data points are closely
arranged along the general tendency line. This
indicates very little variability and much
better predictability.
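
The link between scatter around the tendency line and predictability can be illustrated with a short Python sketch. It assumes the general tendency line is an ordinary least-squares line, which the text does not specify. The function names and both data sets are hypothetical, invented for illustration, and are not the data behind the plots.

```python
import math

def fit_line(xs, ys):
    """Least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def spread_around_line(xs, ys):
    """Typical vertical distance of the points from the fitted line
    (square root of the mean squared distance)."""
    slope, intercept = fit_line(xs, ys)
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical data sets sharing the same X values
xs = [1, 2, 3, 4, 5]
tight_ys = [1.1, 1.9, 3.2, 3.9, 5.0]   # points hug the line (strong correlation)
loose_ys = [3, 1, 5, 2, 4]             # points scatter widely (weak correlation)

print(spread_around_line(xs, tight_ys))
print(spread_around_line(xs, loose_ys))
```

The first number is much smaller than the second, so a prediction of Y from X read off the line is typically far closer to the actual value for the tightly clustered data.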
Caution: As noted earlier, the correlation only
describes how the variables behave in relation to
each other. You cannot use a correlation to infer cause
and effect, even with a strong correlation. In our
chicken example, "chicken age" is correlated
with "egg production" but this does not
mean that "chicken age" causes
"egg production."
In another example, there is a correlation between
"power outage" and "birth rate 9 months
later", but this does NOT mean that "power
outage" causes "pregnancy." Do you get the
point?
See additional information in the text regarding
correlations, or
email the instructor with your questions on
this research issue.
See sample scatter plots
below for selected correlations. Can you tell which
are strong and which are weak?

