Explain the Methods of Estimating Reliability

There are four procedures in common use for computing the reliability coefficient (sometimes called the self-correlation) of a test. These are: 

  • Test-Retest (Repetition) 
  • Alternate or Parallel Forms 
  • Split-Half Technique 
  • Rational Equivalence

Test-Retest Method

The test-retest method estimates reliability by administering the same test twice to the same group of students, with a time interval between the two administrations.

The two sets of scores are then correlated; the resulting correlation coefficient indicates how consistent the test scores are over time, and it is therefore regarded as a measure of stability.

The estimate of reliability obtained in this way varies with the length of the interval between the two administrations.

The product-moment method of correlation is a useful tool for estimating reliability from the two sets of scores.

A high correlation between the two sets of scores therefore shows that the test is reliable: the scores obtained in the first administration closely match those obtained in the second administration of the same test.

The time interval is quite significant in this method. If it is too short, say a day or two, the carry-over effect will artificially inflate the consistency of the results; that is, the students will remember some of their answers from the first administration when taking the second.

If the time gap is too long, say a year, the results will be affected not only by variations in testing procedures and conditions, but also by genuine changes in the students over that period.

The retest interval should not exceed six months. A fortnight's (two-week) interval usually provides a satisfactory indication of reliability.
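
To make the procedure concrete, here is a minimal sketch in Python, using hypothetical scores invented purely for illustration, that computes the coefficient of stability as the product-moment correlation between the two administrations:

```python
import numpy as np

# Hypothetical scores of ten students who took the same test twice,
# a fortnight apart (invented data, for illustration only).
first_administration = np.array([62, 71, 55, 80, 67, 74, 59, 85, 70, 63])
second_administration = np.array([65, 69, 58, 82, 64, 76, 61, 83, 72, 60])

# The Pearson product-moment correlation between the two sets of
# scores is the test-retest reliability (coefficient of stability).
r_stability = np.corrcoef(first_administration, second_administration)[0, 1]
print(f"Coefficient of stability: {r_stability:.3f}")
```

A coefficient near 1.0 indicates that students kept nearly the same relative standing across the two sittings; a low coefficient indicates that the scores are not stable over time.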

Alternate or Parallel Forms Method

Parallel-form reliability is also called alternate-form reliability, equivalent-form reliability, or comparable-form reliability. This method employs two parallel or equivalent forms of a test. By parallel forms, we mean forms that are equivalent in content, objectives, format, difficulty level, discriminating power of items, test length, and so on.

Parallel forms have comparable mean scores, variances, and inter-item correlations. That is, the two forms must be homogeneous or comparable in every respect, but no test item may appear in both. The two forms are customarily designated Form A and Form B.

The reliability coefficient may be defined as the coefficient of correlation between the scores on two equivalent forms of a test, the forms being matched in content, objectives, mental processes tested, difficulty level, and other factors.

One form of the test is administered to the students, and then the other form is given to the same group. The two sets of scores are correlated, yielding an estimate of reliability. The reliability obtained in this way is known as the coefficient of equivalence.

Gulliksen (1950) defined parallel tests as tests with equal means, variances, and inter-correlations.

According to Guilford, the alternate-form method indicates both equivalence of content and stability of performance.
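
The sketch below, again with invented scores, illustrates the parallel-forms procedure: it first checks Gulliksen's criteria of approximately equal means and variances, and then computes the coefficient of equivalence as the correlation between the two forms:

```python
import numpy as np

# Hypothetical scores of the same ten students on two parallel
# forms of a test (invented data, for illustration only).
form_a = np.array([48, 55, 61, 42, 58, 66, 51, 70, 45, 59])
form_b = np.array([50, 53, 63, 44, 56, 68, 49, 71, 47, 57])

# Gulliksen's criteria: parallel forms should have (approximately)
# equal means and variances.
print(f"Form A: mean = {form_a.mean():.2f}, variance = {form_a.var(ddof=1):.2f}")
print(f"Form B: mean = {form_b.mean():.2f}, variance = {form_b.var(ddof=1):.2f}")

# The correlation between scores on the two forms is the
# coefficient of equivalence.
r_equivalence = np.corrcoef(form_a, form_b)[0, 1]
print(f"Coefficient of equivalence: {r_equivalence:.3f}")
```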

Split-Half Method or Sub-divided Test Method

The split-half method is an improvement over the two methods described above, as it incorporates both stability and equivalence. Those two procedures can be cumbersome in practice, and it may be impossible to administer the same test repeatedly and obtain identical results.

It is therefore preferable to estimate reliability from a single administration of the test, which avoids these difficulties, limits the effect of memory, and saves money.

This method is best suited to homogeneous tests, since the test is administered to the sample only once. It provides a measure of the internal consistency of the test scores.

All of the test items are normally arranged in increasing order of difficulty, and the test is administered only once to the sample. After administration, the test is divided into two comparable, similar, or equal halves. Scores are then compiled in two groups, one based on the odd-numbered items and the other on the even-numbered items.

For example, suppose a 100-item test is administered. Each individual’s score on the 50 odd-numbered items (1, 3, 5, … 99) and on the 50 even-numbered items (2, 4, 6, … 100) is computed separately. Part ‘A’ consists of the odd-numbered items and Part ‘B’ of the even-numbered items.

After the two scores, one on the odd-numbered and one on the even-numbered items, have been obtained, the coefficient of correlation between them is computed. This is, in effect, a correlation between two halves of the scores collected in a single sitting.
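
Note that the correlation between the two halves describes a test of only half the original length. The Spearman-Brown prophecy formula, r_full = 2r_half / (1 + r_half), is conventionally applied to estimate the reliability of the full-length test from the half-test correlation. The sketch below works through the odd-even split and the correction on a small invented matrix of item responses:

```python
import numpy as np

# Hypothetical responses: 6 students x 10 items, scored 1 (correct)
# or 0 (incorrect); invented data, for illustration only.
responses = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],  # strongest student
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # weakest student
])

# Split into halves: 0-based columns 0, 2, 4, ... are items 1, 3, 5, ...
odd_half = responses[:, 0::2].sum(axis=1)   # score on odd-numbered items
even_half = responses[:, 1::2].sum(axis=1)  # score on even-numbered items

# Correlation between the two half-test scores from a single sitting.
r_half = np.corrcoef(odd_half, even_half)[0, 1]

# Spearman-Brown prophecy formula: reliability of the full-length test.
r_full = 2 * r_half / (1 + r_half)
print(f"Half-test correlation: {r_half:.3f}")
print(f"Full-test reliability (Spearman-Brown): {r_full:.3f}")
```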

Method of Rational Equivalence

This technique is often referred to as “Kuder-Richardson Reliability” or “Inter-Item Consistency.” It is a single-administration approach.

It is determined by the consistency of responses to all the items in the test. The formula devised by Kuder and Richardson (1937) is the most commonly used method of determining this inter-item consistency.

This approach computes the inter-correlations of the test items and the correlation of each item with all of the items in the test. L. J. Cronbach called the resulting index the coefficient of internal consistency.

The method assumes that all items have the same or equal difficulty value, that the correlations between items are equal, that all items measure essentially the same ability, and that the test is homogeneous in character.
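
The sketch below illustrates the Kuder-Richardson formula 20 (KR-20) on a small invented matrix of dichotomously scored items. The formula is r = (k / (k − 1)) × (1 − Σpq / σ²), where k is the number of items, p is the proportion of students passing each item, q = 1 − p, and σ² is the variance of the total scores:

```python
import numpy as np

# Hypothetical responses: 6 students x 8 items, scored 1 (correct)
# or 0 (incorrect); invented data, for illustration only.
responses = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
])

k = responses.shape[1]                # number of items
p = responses.mean(axis=0)            # proportion passing each item
q = 1 - p                             # proportion failing each item
total_scores = responses.sum(axis=1)  # each student's total score
variance = total_scores.var(ddof=1)   # variance of the total scores

# Kuder-Richardson formula 20.
kr20 = (k / (k - 1)) * (1 - (p * q).sum() / variance)
print(f"KR-20 reliability estimate: {kr20:.3f}")
```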