Validity of a Psychological Test
  • Date: 2024-11-03

The idea of test validity is primarily concerned with the fundamental honesty of the test: honesty in the sense of doing what it claims to do. It is a fundamental concern for the link between the goal established, the efforts made and methods used, and what those efforts and methods accomplish. More specifically, validity refers to how well a tool measures what it is supposed to measure.

Validity of a Test

According to Goode and Hatt, a measuring instrument (scale, test, etc.) has validity when it genuinely measures what it promises to measure. The question of validity is complicated and crucial in development research, since it is here, more than anywhere else, that the nature of reality is called into question.

It is feasible to investigate dependability without delving into the nature and significance of one's variable. Validity is not an issue when measuring certain physical traits and relatively simple qualities of people. A pre-school child's anthropometric measures, such as head and chest circumference, can be taken with instruments calibrated in centimetres or inches.

Suppose a child-development extension professional wishes to study the relationship between malnutrition and the intellectual development of pre-school children. There are no fixed rules for measuring the degree of malnutrition, nor any scales or clear-cut physical attributes for measuring intellectual development. In such instances, indirect methods of measuring these properties must be devised. These methods are frequently so indirect that the validity of the measurement and of its product is called into question.

Approaches to the Validation of Measuring Instruments

Every measuring instrument, to be useful, must have some indication of validity. There are four approaches to the validation of measuring instruments −

    Logical Validity / Face Validity

    Jury Opinion

    Known-Group

    Independent Criteria

Logical Validity

This is one of the most popular approaches. It rests on a theoretical or common-sense analysis which concludes simply that, given the items, the nature of the continuum cannot be anything other than what is stated. Logical validation, also known as face validity, is almost always employed, since it arises naturally from the meticulous definition of the continuum and the selection of items.

A measure with logical/face validity bears directly on the type of behaviour in which the tester is interested. Example: the capacity to solve mathematical problems is tested by success in solving a sample of such problems, while reading speed is measured by how much of a chapter a person reads with understanding in a given time. There is a limitation, however: relying on logical and common-sense confirmation alone is not prudent. Such validity claims are at best speculative and seldom definitive. A measuring device must also prove useful in practice, in addition to being logically sound.

Jury Opinion

This is an extension of the logical-validation approach, except that here the reasoning is confirmed by a group of specialists in the field in which the measuring device is used. For example, if a scale to evaluate mental retardation in pre-school children is developed, a jury comprising psychologists, psychiatrists, paediatricians, clinical psychologists, social workers, and teachers may be formed to establish the scale's validity. However, there is a restriction. Experts are also human, and this method can only lead to logical legitimacy. As a result, jury validation is only marginally superior to logical validation.

Known-Group

In this approach, the scale is validated by administering it to groups of people whose standing on the variable in question is already known, and checking that their scores differ in the expected direction. For example, a scale of religious attitudes may be given to one group known for its regular religious practice and to another known to have none; if the scale is valid, the first group should score distinctly higher.

However, there is a restriction. Other variations between the groups, in addition to their known religious practice, might account for the discrepancies in the scale scores.
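A known-group comparison can be sketched as a simple contrast of scale scores between the two groups. The data and the `welch_t` helper below are illustrative assumptions, not taken from the text:

```python
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples
    (does not assume equal group variances)."""
    ma, mb = mean(group_a), mean(group_b)
    va, vb = variance(group_a), variance(group_b)  # sample variances
    na, nb = len(group_a), len(group_b)
    return (ma - mb) / ((va / na + vb / nb) ** 0.5)

# Hypothetical religiosity-scale scores for two groups whose
# religious practice is already known.
known_practising = [42, 45, 39, 47, 44, 41]
known_non_practising = [28, 31, 25, 30, 27, 29]

t = welch_t(known_practising, known_non_practising)
print(f"t = {t:.2f}")  # a large positive t supports known-group validity
```

A large, clearly positive statistic is evidence for validity, but, as noted above, it cannot rule out other differences between the groups as the real cause.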

Independent Criteria

Although this is theoretically the strongest strategy, it is typically problematic in practice. A criterion measure should have four characteristics. They are listed in descending order of importance −

    Relevance − A criterion is considered relevant if standing on the criterion measure corresponds to standing on the scale scores.

    Bias-free − This means that the measure should be one on which everyone has an equal chance of scoring well. Biasing variables include differences in the quality of equipment or working conditions for manufacturing workers, and the quality of education received by students enrolled in different classes.

    Reliability − If the criterion score fluctuates from day to day, so that a person who does well one week may perform poorly the next, or a person who receives a good rating from one supervisor receives a poor rating from another, then there is no way to create a test that predicts that score. Nothing can forecast a measure that is entirely unstable in itself.

    Availability − Finally, in selecting a criterion measure we are constantly confronted with practical issues of convenience and availability.

Any criterion measure chosen must therefore be accepted within realistic limits. When the independent criteria are good, however, this becomes a potent tool and may be the most successful validation procedure.
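When an independent criterion is available, the validity coefficient is ordinarily computed as the correlation between test scores and criterion scores. A minimal sketch, using hypothetical data (the scores and the `pearson_r` helper are illustrative assumptions):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between test scores x and criterion scores y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: aptitude-test scores and a later criterion
# measure (e.g., first-year course grades) for the same examinees.
test_scores = [55, 62, 48, 70, 66, 59, 74, 51]
criterion = [2.8, 3.4, 2.5, 3.2, 3.6, 2.7, 3.5, 3.0]

r = pearson_r(test_scores, criterion)
print(f"validity coefficient r = {r:.2f}")
```

The closer the coefficient is to 1.0, the more closely standing on the test predicts standing on the criterion; coefficients near zero indicate that the test tells us little about the criterion behaviour.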

Factors Affecting the Validity

A large number of factors influence the validity of an evaluation tool. Gronlund (1981) has suggested the following factors −

Factors in the Test Itself

Each test has items. A detailed examination of the test items will reveal whether the test appears to evaluate the subject matter and mental functions that the instructor wishes to assess. The following flaws can hinder test items from operating properly and reduce validity.

    Unclear Directions − If the directions do not make clear to the student how to respond to the items, whether guessing is permitted, and how to record the answers, validity will suffer.

    Difficult Reading Vocabulary and Sentence Structure − Vocabulary and sentence structure too sophisticated for the students taking the exam may interfere with measuring the intended aspects of pupil performance, reducing validity.

    An Inappropriate Level of Difficulty in the Test Items − The tool's validity suffers when the test items have an inappropriate level of difficulty. For example, in criterion-referenced assessments, failing to match the difficulty stipulated by the learning outcome reduces validity.

    Poorly Constructed Test Items − Test items containing accidental clues to the answer tend to measure students' alertness in spotting clues rather than the aspects of performance the test is meant to assess, which ultimately impacts validity.

    Ambiguity − Ambiguity in test-item statements leads to misinterpretation, conflicting interpretations, and confusion. It may occasionally confuse better students more than weaker ones, resulting in negative discrimination. As a result, the test's validity is compromised.

    Test Items Inappropriate for the Outcomes Being Measured − It is common to try to evaluate sophisticated types of achievement (understanding, thinking, skills, and so on) with test forms that are adequate only for testing factual knowledge.

Functioning Content and Teaching Procedure

In performance evaluation, the functioning content of a test item cannot be identified merely by examining the test's form and substance; it also depends on how the material was taught. Tests of complex learning outcomes are valid only if the items function as planned. If the students have had prior experience with the solution to a problem included in the test, the test no longer measures the more complicated mental processes, and its validity suffers as a result.

Factors in Test Administration and Scoring

The test administration and scoring procedures may also affect the validity of the interpretation of the findings. For example, in teacher-made examinations, factors such as insufficient time to finish the test, unfair assistance to specific students, cheating during the examination, and unreliable scoring of essay replies may reduce validity. Similarly, in standardized examinations, failure to follow the standard directions and time limits, unauthorized assistance to students, and errors in scoring diminish validity. Whether it is a teacher-made test or a standardized exam, unfavourable physical and psychological conditions at testing time may also impair validity.

Factors in Pupils' Responses

Certain personal characteristics influence students' responses to the test situation, rendering the interpretation of the test inaccurate. Students who are emotionally upset, lack motivation, or are afraid of the exam situation may not respond properly, which may impair validity. The response set also influences test results: a pupil's score is affected by his test-taking habits, and a response set is a persistent tendency to react to test items in the same way.

Nature of the Group and the Criterion

It has previously been stated that validity is always specific to a given group. Age, gender, ability level, educational experience, and cultural background are all factors that affect test results. As a result, the nature of the validation group should be stated in the test manual.

Another crucial consideration when calculating the validity coefficient is the nature of the criterion used. For example, scores on a scientific aptitude test are likely to predict accomplishment in an environmental studies course more accurately than in a course unrelated to science. Other things being equal, the greater the resemblance between the performance measured by the test and the performance represented in the criterion, the stronger the validity coefficient.
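The earlier point that an unreliable criterion caps the attainable validity coefficient can be made concrete with Spearman's classical attenuation formula, a standard psychometric result not derived in this text; the numbers below are purely illustrative:

```python
from math import sqrt

def attenuated_validity(true_r, rel_test, rel_criterion):
    """Observed validity coefficient implied by a true test-criterion
    correlation and the reliabilities of test and criterion
    (Spearman's attenuation formula, illustrative sketch)."""
    return true_r * sqrt(rel_test * rel_criterion)

# Illustrative values: a true correlation of .70, test reliability .90,
# criterion reliability only .60 (e.g., inconsistent supervisor ratings).
observed = attenuated_validity(0.70, 0.90, 0.60)
print(f"observed validity = {observed:.2f}")
```

Even a test that correlates .70 with the true criterion appears to correlate only about .51 with an unreliable measurement of it, which is why criterion reliability is listed above as a prerequisite for successful validation.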

Conclusion

The degree to which a test measures what it claims to measure is called its validity. A test is valid if the conclusions drawn from it are appropriate, meaningful, and useful. Events outside the laboratory, maturation, testing effects, regression effects, selection, and experimental mortality all threaten an experiment's internal validity. Problems arising from generalizing to other subjects, timeframes, or contexts are examples of threats to external validity. Experimenter bias can be reduced by keeping the experimenter unaware of the conditions or purpose of the experiment and by standardizing the procedure as far as feasible.