The 4 differences between reliability and validity (in science)
These two theoretical constructs are widely used to know whether measurement tools work.
Since in colloquial language they have very similar meanings, it is easy to confuse the terms reliability and validity when we talk about science and, specifically, psychometrics.
With this text we intend to elucidate the main differences between reliability and validity. We hope it will be useful to clarify this common doubt.
What is reliability?
In psychometrics, the concept of "reliability" refers to the precision of an instrument. Specifically, reliability coefficients inform us of the consistency and stability of the measurements taken with that tool.
The higher the reliability of an instrument, the lower the amount of random and unpredictable errors that will appear when using it to measure certain attributes. Reliability excludes predictable errors, i.e., those that are subject to experimental control.
According to classical test theory, reliability is the proportion of the variance that is explained by true scores. Thus, the direct score on a test would be composed of the sum of the random error and the true score.
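The decomposition described above can be illustrated with a small simulation. This is only a sketch under assumed values: the function name, the normal distributions, the mean of 100 and the standard deviations of 10 (true scores) and 5 (random error) are hypothetical choices, not taken from any real test. With those values the proportion of variance explained by true scores should come out close to 100 / (100 + 25) = 0.8.

```python
import random
import statistics


def simulate_reliability(n_people=5000, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate observed = true + error and return var(true) / var(observed).

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # True scores: the stable attribute level of each person.
    true_scores = [rng.gauss(100.0, true_sd) for _ in range(n_people)]
    # Random error: unpredictable noise with mean zero.
    errors = [rng.gauss(0.0, error_sd) for _ in range(n_people)]
    # Direct (observed) score = true score + random error.
    observed = [t + e for t, e in zip(true_scores, errors)]
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


# Value expected from the assumed standard deviations.
theoretical = 10.0 ** 2 / (10.0 ** 2 + 5.0 ** 2)
estimate = simulate_reliability()
print(f"theoretical reliability: {theoretical:.2f}")
print(f"simulated reliability:   {estimate:.2f}")
```

In real data the true scores are never observed directly, which is why reliability has to be estimated indirectly with the procedures described later in the text.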
The two main components of reliability are temporal stability and internal consistency. The first concept indicates that the scores change little when measured on different occasions, while internal consistency refers to the degree to which the items that make up the test measure the same psychological construct.
Thus, a high reliability coefficient indicates that the scores on a test fluctuate little internally and over time and, in short, that the instrument is largely free of random measurement error.
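Internal consistency is often summarized with Cronbach's alpha. The text does not name a specific index, so treating alpha as the index here is an assumption, and the response data below are invented purely for illustration. The sketch computes alpha from the variances of the individual items and the variance of the total score.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(item_scores[0])  # number of items in the test
    # Transpose so that each tuple holds all responses to one item.
    items = list(zip(*item_scores))
    item_variances = [statistics.variance(column) for column in items]
    total_variance = statistics.variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 people answering 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values closer to 1 suggest that the items vary together, which is what we would expect if they measure the same construct.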
Definition of validity
When we talk about validity, we refer to whether the test correctly measures the construct it is intended to measure. This concept is defined as the relationship between the score obtained in a test and another related measure; the degree of linear correlation between both elements determines the validity coefficient.
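Since the validity coefficient is described as a degree of linear correlation, it can be sketched as a Pearson correlation between test scores and a related measure. The function name and both score lists are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)


# Hypothetical data: scores on the test and on a related measure
# for the same six people.
test_scores = [12, 15, 9, 20, 17, 11]
related_measure = [30, 34, 25, 41, 39, 27]

validity_coefficient = pearson_r(test_scores, related_measure)
print(f"validity coefficient: {validity_coefficient:.2f}")
```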
Likewise, in scientific research, validity indicates the degree to which the results obtained with a given instrument or in a study can be generalized.
There are different types of validity, depending on the way in which it is calculated; this makes it a term with very diverse meanings. Basically, we can distinguish between content validity, criterion (or empirical) validity and construct validity.
Content validity defines the extent to which the items of a psychometric test are a representative sample of the elements that make up the construct to be evaluated. The instrument must include all the fundamental aspects of the construct; for example, if we want to make an adequate test to measure depression, we must necessarily include items that assess mood and decreased pleasure.
Criterion validity measures the ability of the instrument to predict aspects related to the trait or area of interest. Finally, construct validity aims to determine whether the test measures what it is intended to measure, for example, based on convergence with scores obtained in similar tests.
Differences between reliability and validity
Although these two psychometric properties are closely related, the fact is that they refer to clearly differentiated aspects. Let us see what these differences consist of.
1. The object of analysis
Reliability is a characteristic of the instrument, in the sense that it measures the properties of its component items. Validity, on the other hand, does not refer exactly to the instrument but to the generalizations made on the basis of the results obtained from it.
2. The information they provide
Although it is a somewhat simplistic way of putting it, it is generally stated that validity indicates that a psychometric tool really measures the construct it is intended to measure, while reliability refers to whether it measures it correctly, without errors.
3. How they are calculated
Three main procedures are used to measure reliability: the two halves (split-half) method, the parallel forms method and the test-retest method. The most commonly used is the two halves procedure, in which the items are divided into two groups once the test has been answered; the correlation between the two halves is then analyzed.
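The two halves procedure can be sketched as follows. Two points go beyond the text and should be read as assumptions: the items are split into odd and even positions (only one of several possible ways to form the halves), and the correlation between halves is adjusted with the Spearman-Brown formula, a correction commonly applied because each half is shorter than the full test. The response data are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)


def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation between halves."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability with the Spearman-Brown correction."""
    # Half A: items in positions 1, 3, 5...; half B: positions 2, 4, 6...
    half_a = [sum(row[0::2]) for row in item_scores]
    half_b = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(half_a, half_b))


# Hypothetical responses: 5 people answering 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
reliability = split_half_reliability(responses)
print(f"split-half reliability: {reliability:.2f}")
```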
The parallel or alternative forms method consists of creating two equivalent tests to measure the extent to which the items correlate with each other. The test-retest is simply based on administering the test twice, under conditions as similar as possible. Both procedures can be combined, giving rise to the test-retest with parallel forms, which consists of leaving a time interval between the first form of the test and the second.
Validity, on the other hand, is calculated in different ways depending on the type. In general, all methods are based on the comparison between the score on the objective test and other data from the same subjects in relation to similar traits; the objective is that the test can act as a predictor of the trait.
Among the methods used to assess validity are factor analysis and the multitrait-multimethod matrix technique. Also, content validity is often determined by rational, non-statistical analyses; for example, it includes face validity, which refers to the subjective judgment of experts on the validity of the test.
4. The relationship between the two concepts
The reliability of a psychometric instrument limits its validity: an unreliable instrument cannot be valid, although a reliable one is not necessarily valid. In classical test theory, the validity coefficient of a tool cannot exceed the square root of its reliability coefficient, so a high validity coefficient indirectly tells us that reliability is also high.
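As a rough numerical illustration of this relationship: under classical test theory, the correlation between a test and a criterion is usually taken to be bounded by the square root of the product of their reliabilities. The sketch below computes that ceiling. The function name and the example reliabilities are hypothetical, and assuming a perfectly reliable criterion by default is a simplification.

```python
import math


def max_validity(reliability_test, reliability_criterion=1.0):
    """Upper bound on the validity coefficient given the two reliabilities.

    Assumes the classical test theory bound r_xy <= sqrt(r_xx * r_yy).
    """
    for value in (reliability_test, reliability_criterion):
        if not 0.0 <= value <= 1.0:
            raise ValueError("reliability coefficients must lie between 0 and 1")
    return math.sqrt(reliability_test * reliability_criterion)


# Hypothetical test with reliability 0.81 and a perfectly reliable criterion.
print(f"{max_validity(0.81):.2f}")
# Same test, compared against a criterion with reliability 0.64.
print(f"{max_validity(0.81, 0.64):.2f}")
```

The ceiling says nothing about how valid a test actually is: a very reliable test can still correlate poorly with the criterion.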
(Updated at Apr 13 / 2024)