Classical test theory

Classical test theory (CTT) is a body of related psychometric theory that predicts outcomes of psychological testing such as the difficulty of items or the ability of test-takers. It is a theory of testing based on the idea that a person's observed or obtained score on a test is the sum of a true score (error-free score) and an error score.[1] Generally speaking, the aim of classical test theory is to understand and improve the reliability of psychological tests.

Classical test theory may be regarded as roughly synonymous with true score theory. The term "classical" not only refers to the chronology of these models but also contrasts them with the more recent psychometric theories, generally referred to collectively as item response theory, which sometimes bear the appellation "modern", as in "modern latent trait theory".

Classical test theory as we know it today was codified by Novick (1966) and described in classic texts such as Lord & Novick (1968) and Allen & Yen (1979/2002). The description of classical test theory below follows these seminal publications.

History

Classical test theory was born only after the following three achievements or ideas were conceptualized:

1. a recognition of the presence of errors in measurements,

2. a conception of that error as a random variable,

3. a conception of correlation and how to index it.

In 1904, Charles Spearman worked out how to correct a correlation coefficient for attenuation due to measurement error and how to obtain the index of reliability needed in making the correction.[2] Some regard Spearman's finding as the beginning of classical test theory (Traub, 1997). Others who influenced the framework of classical test theory include George Udny Yule, Truman Lee Kelley, Fritz Kuder and Marion Richardson (who developed the Kuder–Richardson formulas), Louis Guttman, and, most recently, Melvin Novick, along with others in the quarter century after Spearman's initial findings.

Definitions

Classical test theory assumes that each person has a true score, T, that would be obtained if there were no errors in measurement. A person's true score is defined as the expected number-correct score over an infinite number of independent administrations of the test. Unfortunately, test users never observe a person's true score, only an observed score, X. It is assumed that the observed score equals the true score plus some error:

                X         =       T      +    E
          observed score     true score     error

Classical test theory is concerned with the relations between the three variables $X$, $T$, and $E$ in the population. These relations are used to say something about the quality of test scores. In this regard, the most important concept is that of reliability. The reliability of the observed test scores $X$, which is denoted as $\rho_{XT}^2$, is defined as the ratio of true score variance $\sigma_T^2$ to the observed score variance $\sigma_X^2$:

$$\rho_{XT}^2 = \frac{\sigma_T^2}{\sigma_X^2}$$

Because the variance of the observed scores can be shown to equal the sum of the variance of true scores and the variance of error scores, this is equivalent to

$$\rho_{XT}^2 = \frac{\sigma_T^2}{\sigma_X^2} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}$$

This equation, which formulates a signal-to-noise ratio, has intuitive appeal: The reliability of test scores becomes higher as the proportion of error variance in the test scores becomes lower and vice versa. The reliability is equal to the proportion of the variance in the test scores that we could explain if we knew the true scores. The square root of the reliability is the absolute value of the correlation between true and observed scores.
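The variance-ratio definition of reliability and its link to the correlation between true and observed scores can be checked numerically. The following is a minimal simulation sketch, not taken from the classical texts: it assumes normally distributed true scores and errors, and the sample size, means, and standard deviations are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 100_000                     # hypothetical number of test-takers

true_scores = rng.normal(50.0, 10.0, size=n)  # T, variance about 100
errors = rng.normal(0.0, 5.0, size=n)         # E, independent of T, variance about 25
observed = true_scores + errors               # X = T + E

# Reliability as the ratio of true-score variance to observed-score variance.
reliability = true_scores.var() / observed.var()

# Squared correlation between true and observed scores.
corr_tx = np.corrcoef(true_scores, observed)[0, 1]

print(round(reliability, 3), round(corr_tx ** 2, 3))
```

Under these assumed variances the population reliability is 100 / (100 + 25) = 0.8, so both printed values should be close to 0.8. In real testing the true scores are unobservable, which is why this check is only possible in simulation.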

Evaluating tests and scores: Reliability

Reliability cannot be estimated directly since that would require one to know the true scores, which according to classical test theory is impossible. However, estimates of reliability can be acquired by diverse means. One way of estimating reliability is by constructing a so-called parallel test. The fundamental property of a parallel test is that it yields the same true score and the same observed score variance as the original test for every individual. Because the true score is the expected observed score, and because for a fixed individual the observed score varies only through the error, having parallel tests x and x' means that, for every individual $i$,

$$\mathbb{E}[X_i] = \mathbb{E}[X'_i]$$

and

$$\sigma_{E_i}^2 = \sigma_{E'_i}^2$$

Under these assumptions, it follows that the correlation between parallel test scores is equal to reliability (see Lord & Novick, 1968, Ch. 2, for a proof).
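This result can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions rather than a proof: two hypothetical parallel forms share the same true scores and have independent, normally distributed errors with equal variance, and all numeric values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is reproducible
n = 100_000                     # hypothetical number of test-takers

true_sd, error_sd = 10.0, 5.0
true_scores = rng.normal(50.0, true_sd, size=n)          # same T on both forms

form_a = true_scores + rng.normal(0.0, error_sd, size=n)  # X  = T + E
form_b = true_scores + rng.normal(0.0, error_sd, size=n)  # X' = T + E', equal error variance

# Correlation between the two parallel forms.
parallel_corr = np.corrcoef(form_a, form_b)[0, 1]

# Reliability implied by the assumed variances: sigma_T^2 / (sigma_T^2 + sigma_E^2).
theoretical = true_sd**2 / (true_sd**2 + error_sd**2)

print(round(parallel_corr, 3), theoretical)
```

The correlation between the two forms should land close to the variance ratio, even though neither form's true scores are used in computing it.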

Using parallel tests to estimate reliability is cumbersome because parallel tests are very hard to come by. In practice the method is rarely used. Instead, researchers use a measure of internal consistency known as Cronbach's $\alpha$. Consider a test consisting of $k$ items $u_j$, $j = 1, \ldots, k$. The total test score is defined as the sum of the individual item scores, so that for individual $i$

$$X_i = \sum_{j=1}^{k} U_{ij}$$

Then Cronbach's alpha equals

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{j=1}^{k} \sigma_{U_j}^2}{\sigma_X^2}\right)$$

Cronbach's $\alpha$ can be shown to provide a lower bound for reliability under rather mild assumptions. Thus, the reliability of test scores in a population is always at least as high as the value of Cronbach's $\alpha$ in that population. The method is empirically feasible and, as a result, very popular among researchers. Calculation of Cronbach's $\alpha$ is included in many standard statistical packages such as SPSS and SAS.[3]
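As a rough illustration of how Cronbach's alpha is computed from a matrix of item scores (rows are examinees, columns are items), consider the sketch below. The function name `cronbach_alpha` and the tiny response matrix are hypothetical, chosen only so that the arithmetic can be followed by hand; sample variances are used in place of the population variances in the definition.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a 2-D array: rows = examinees, columns = items."""
    scores = np.asarray(item_scores, dtype=float)
    k = scores.shape[1]                               # number of items
    item_variances = scores.var(axis=0, ddof=1)       # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical right/wrong (1/0) responses of five examinees to four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

For this made-up data the item variances sum to 1.0 and the total-score variance is 2.5, so alpha works out to (4/3)(1 - 1.0/2.5) = 0.8.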

As has been noted above, the entire exercise of classical test theory is done to arrive at a suitable definition of reliability. Reliability is supposed to say something about the general quality of the test scores in question. The general idea is that the higher the reliability, the better. Classical test theory does not say how high reliability is supposed to be. Too high a value for $\alpha$, say over .9, indicates redundancy of items. Around .8 is recommended for personality research, while .9+ is desirable for individual high-stakes testing.[4] These 'criteria' are not based on formal arguments, but rather are the result of convention and professional practice. The extent to which they can be mapped to formal principles of statistical inference is unclear.

Evaluating items: P and item-total correlations

Reliability provides a convenient index of test quality in a single number. However, it does not provide any information for evaluating single items. Item analysis within the classical approach often relies on two statistics: the P-value (proportion) and the item-total correlation (point-biserial correlation coefficient). The P-value represents the proportion of examinees responding in the keyed direction, and is typically referred to as item difficulty. The item-total correlation provides an index of the discrimination or differentiating power of the item, and is typically referred to as item discrimination. In addition, these statistics are calculated for each response option of the often-used multiple-choice item and are used to evaluate items and diagnose possible issues, such as a confusing distractor. Such analysis is provided by specially designed psychometric software.
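The two item statistics can be sketched in a few lines. The example below is illustrative only: the function name `item_statistics` and the response matrix are hypothetical, items are assumed to be scored 1 (keyed response) or 0, and the item-total correlation is the uncorrected version, in which the item itself is included in the total score.

```python
import numpy as np

def item_statistics(item_scores):
    """Return (difficulty, discrimination) for 0/1 item scores.

    Rows are examinees, columns are items. Difficulty is the proportion
    responding in the keyed direction; discrimination is the Pearson
    (point-biserial) correlation between each item and the total score.
    """
    scores = np.asarray(item_scores, dtype=float)
    totals = scores.sum(axis=1)          # total score per examinee
    difficulty = scores.mean(axis=0)     # P-value per item
    discrimination = np.array([
        np.corrcoef(scores[:, j], totals)[0, 1]
        for j in range(scores.shape[1])
    ])
    return difficulty, discrimination

# Hypothetical right/wrong (1/0) responses of five examinees to four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

difficulty, discrimination = item_statistics(responses)
print(difficulty)
print(discrimination.round(3))
```

Note that a higher P-value means an easier item, despite the name "difficulty". An item that every examinee answers the same way has zero variance, so its item-total correlation is undefined; a production analysis would need to handle that case explicitly.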

Alternatives

Classical test theory is an influential theory of test scores in the social sciences. In psychometrics, the theory has been superseded by the more sophisticated models in item response theory (IRT) and generalizability theory (G-theory). IRT is not included in standard statistical packages like SPSS, but SAS can estimate IRT models via PROC IRT and PROC MCMC, and there are psychometric packages for the open-source statistical programming language R (e.g., ltm and mirt for IRT, and CTT for classical analysis). While commercial packages routinely provide estimates of Cronbach's $\alpha$, specialized psychometric software may be preferred for IRT or G-theory. However, general statistical packages often do not provide a complete classical analysis (Cronbach's $\alpha$ is only one of many important statistics), and in many cases, specialized software for classical analysis is also necessary.

Shortcomings

One of the best-known shortcomings of classical test theory is that examinee characteristics and test characteristics cannot be separated: each can only be interpreted in the context of the other. Another shortcoming lies in the definition of reliability in classical test theory, which states that reliability is "the correlation between test scores on parallel forms of a test".[5] The problem is that opinions differ about what parallel tests are. Various reliability coefficients provide either lower-bound estimates of reliability or reliability estimates with unknown biases. A third shortcoming involves the standard error of measurement, which classical test theory assumes to be the same for all examinees. However, as Hambleton, Swaminathan, and Rogers explain, scores on any test are unequally precise measures for examinees of different ability, which makes the assumption of equal errors of measurement for all examinees implausible (Hambleton, Swaminathan, Rogers, 1991, p. 4). A fourth and final shortcoming of classical test theory is that it is test oriented rather than item oriented. In other words, classical test theory cannot help us predict how well an individual or even a group of examinees might do on a test item.[5]
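The standard error of measurement (SEM) is commonly written in terms of the observed-score standard deviation $\sigma_X$ and the reliability $\rho_{XT}^2$. As a worked illustration with made-up numbers (an observed-score standard deviation of 10 and a reliability of 0.84, both hypothetical):

```latex
\mathrm{SEM} = \sigma_X \sqrt{1 - \rho_{XT}^2} = 10 \sqrt{1 - 0.84} = 10 \times 0.4 = 4
```

In this example the single value of 4 score points would be applied to every examinee regardless of ability, which is the assumption the third criticism targets.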

Notes

  1. ^ National Council on Measurement in Education http://www.ncme.org/ncme/NCME/Resource_Center/Glossary/NCME/Resource_Center/Glossary1.aspx?hkey=4bb87415-44dc-4088-9ed9-e8515326a061#anchorC Archived 22 July 2017 at the Wayback Machine
  2. ^ Traub, R. (1997). Classical Test Theory in Historical Perspective. Educational Measurement: Issues and Practice 16 (4), 8–14. doi:10.1111/j.1745-3992.1997.tb00603.x
  3. ^ Pui-Wa Lei and Qiong Wu (2007). "CTTITEM: SAS macro and SPSS syntax for classical item analysis". Behavior Research Methods. 39 (3): 527–530. doi:10.3758/BF03193021. PMID 17958163.
  4. ^ Streiner, D. L. (2003). "Starting at the Beginning: An Introduction to Coefficient Alpha and Internal Consistency". Journal of Personality Assessment. 80 (1): 99–103. doi:10.1207/S15327752JPA8001_18. hdl:11655/5356. PMID 12584072. S2CID 3679277.
  5. ^ a b Hambleton, R., Swaminathan, H., Rogers, H. (1991). Fundamentals of Item Response Theory. Newbury Park, California: Sage Publications, Inc.

References

  • Allen, M.J., & Yen, W. M. (2002). Introduction to Measurement Theory. Long Grove, IL: Waveland Press.
  • Novick, M.R. (1966). The axioms and principal results of classical test theory. Journal of Mathematical Psychology, 3 (1), 1–18.
  • Lord, F. M. & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley Publishing Company.

Further reading

  • Gregory, Robert J. (2011). Psychological Testing: History, Principles, and Applications (Sixth ed.). Boston: Allyn & Bacon. ISBN 978-0-205-78214-7.
  • Hogan, Thomas P.; Brooke Cannon (2007). Psychological Testing: A Practical Introduction (Second ed.). Hoboken (NJ): John Wiley & Sons. ISBN 978-0-471-73807-7.
