Critically evaluate the use of psychometric assessment and research on individual differences in clinical settings.

2nd Year Undergraduate Essay

Psychometric testing may be used to determine capacity, comprehension, disability and resilience in individuals in clinical settings (Lichtenberger & Kaufman, 2013). In the context of psychological and neuropsychological assessment, psychometric tests are used to assess the suitability of therapies or interventions and the availability of resources, although they can also be used in many other settings, such as educational or work contexts (British Psychological Society, 2019; American Psychological Association, 2019). Clinicians use the results of psychometric tests to inform care planning and the overall assessment of needs, for legal purposes, and potentially as evidence in court (Lichtenberger & Kaufman, 2013), but these results must be used in conjunction with other tools to give a fuller picture. The biggest issues with psychometric tests are that they are not always appropriate for individuals with low motivation, that they struggle to compete with physiological tests where comparable, and that they sometimes lack construct validity, particularly in the case of intelligence tests. Therefore, it cannot be overstated that psychometric tests should only be used as part of a wider assessment that accounts for individual differences (British Psychological Society, 2019; American Psychological Association, 2019).

Arguably the biggest problem with psychometric tests is that results are altered by motivation, whether through a lack of motivation or through an ulterior motive. On the one hand, tests require a significant amount of effort from the individual being assessed, which can be extremely difficult for someone suffering from a mental disorder (Perez-Achiaga, Nelson, & Hassiotis, 2009; Strydom & Hassiotis, 2003). In individuals with depression, for example, completing a task that is perceived as hard (such as a lengthy intelligence test) leads to physical and mental responses that impair their ability to perform well and produce accurate results (Silvia et al., 2016; Silvia, Nusbaum, Eddington, Beaty, & Kwapil, 2014; Gendolla, 2000). This is a critical issue in clinical settings specifically, as patients differ vastly in motivation levels, which could influence their psychometric results and therefore alter their overall assessment of needs. The effect can also be seen in the opposite direction: individuals who are given a monetary incentive perform better on intelligence tests, for instance (Duckworth, Quinn, Lynam, Loeber, & Stouthamer-Loeber, 2011). This problem is not exclusive to intelligence tests and has been highlighted in other psychometric tests, such as the Beck Depression Inventory (Beck & Steer, 1987), where individuals can exaggerate symptoms and dramatically affect their results (Thombs et al., 2010).

Nevertheless, psychologists have tried to combat this both by creating shortened versions of frequently used tests and by trying to detect ulterior motives. Short forms such as the shortened version of WAIS-IV (Wechsler, 2010; Axelrod, 2002) have been developed to help patients who have low motivation by making the task less strenuous and daunting. Malingering tests, such as the Test of Memory Malingering (TOMM; Tombaugh, 1996), have been developed to detect ulterior motives. The TOMM focuses on neurological impairments and is a strong example of how to combat malingering in general. It has been validated several times (Weiborn, Orr, Woods, Conover, & Feix, 2003; Webber et al., 2018) and is a strong indicator of faking or exaggerating symptoms. Administering this test should therefore help to distinguish between genuine individual differences in memory impairment and ulterior motives. However, overcoming a lack of motivation by using a quicker test does not always work in practice, mainly because tests often take much longer to administer than expected (Axelrod, 2001). Axelrod (2001), for example, compared actual administration times for WAIS with the manual’s estimate and concluded that 88% of participants took longer than the manual suggested. Moreover, these participants were mostly men (96%) with an average of 12 years of education who had been referred for neuropsychological evaluation (Axelrod, 2001). They were not individuals with severe mental disorders, which suggests that these tests are likely to take even longer for individuals with severe mental disorders or learning disabilities.
Even though this study was based on WAIS-III and not the most recent version of the test, several other researchers have highlighted similar issues with WAIS-IV (Raiford, Coalson, Saklofske, & Weiss, 2010), so perhaps more measures need to be implemented to allow individuals to complete the test in sections, or to prevent extremely lengthy testing sessions.

Some psychometric tests also struggle to fully encompass individual differences within the construct they are trying to assess, and therefore lack construct validity, the most obvious example being intelligence tests. Many researchers have argued that intelligence tests are incomplete because they miss many cognitive features that could be defined as intelligence, such as emotional and social intelligence, and even decision making (Stanovich, 2009). However, this work refers specifically to the fifth edition of the Stanford-Binet Intelligence Scale, which is now mostly out of use due to the popularity of WAIS-IV. The WAIS-IV, as mentioned previously, has in contrast been praised for its construct validity and for the range of psychological profiles against which results can be compared (Collinson et al., 2017; Canivez, Watkins, Good, James, & James, 2017). This is likely because the latest version of WAIS (Wechsler, 2008) tries to encompass four ‘indexes’: verbal comprehension, perceptual reasoning, working memory and processing speed (Wechsler, 2008). These four indexes have strong construct validity in themselves (Lichtenberger & Kaufman, 2013) and seem to represent a wide range of intelligences and abilities, compared to more historic IQ tests that focus more on stereotypical mathematical abilities (Benson, 2003).

Nevertheless, there are still moral quandaries surrounding the clinical emphasis on trying to measure intelligence definitively (Rust & Golombok, 2009). Ultimately, IQ scores can follow a person through care and attach detrimental labels. This can be especially detrimental if an individual’s IQ score falls between standard deviations, as this can lead to them being passed from service to service and can therefore delay them getting help (Robstad, Siebler, Soderhamn, Westergren, & Fegran, 2018). This raises the question of whether intelligence tests are best for the patient or whether there are more efficient approaches, such as dynamic assessment (Haywood & Lidz, 2007) and outcome feedback (Delgadillo et al., 2017). Dynamic assessment, as its name suggests, is a more interactive approach to assessment and involves intervention measures (Haywood & Lidz, 2007). Used in conjunction with outcome feedback, in which psychotherapists regularly talk to patients about their progress (Delgadillo et al., 2017), this could lead to a much more flexible and rounded understanding of an individual’s intellectual ability. Furthermore, some psychologists argue that intelligence tests need to be more flexible because of several fundamental flaws. Firstly, IQ tests are reflective of the society that creates them, and so reflect the same inequalities (Rust & Golombok, 2009, p. 24). A good example of this is the infamous use of ‘ugly’ and ‘pretty’ faces in early Stanford-Binet IQ tests (Nicolas, Andrieu, Croizet, Sanitioso, & Burman, 2013), which highlights the social bias that can enter assessment without test developers even realising it. Early IQ tests also seemed to be Westernised in their understanding of intelligence, stressing computational and mathematical ability as previously mentioned (Benson, 2003).
Intelligence testing arguably seems to be becoming more diverse and inclusive as more cultural IQ profiles are established (Rust & Golombok, 2009) and wider expressions of intelligence are included in WAIS-IV. However, as Nelson and Pontón (2007) argue, accurately establishing which cultural profile an individual should be compared to can be both difficult and reductive. Nevertheless, researchers have stressed that IQ testing should not be abandoned, but rather that it should be made as fair and representative as possible going forward (Rust & Golombok, 2009, p. 25).

Another issue with psychometric tests is that they are not always the most effective form of testing in certain fields, such as neuropsychology, where physiological tests are favoured for diagnosis because they are so definitive. The Psychometric Hepatic Encephalopathy Score (PHES), for example, is widely used to help diagnose individuals who are displaying atypical cognitive abilities, by determining whether their decline in ability matches the profile of hepatic encephalopathy or could be explained by individual differences. The PHES consists of five psychometric tests that aim to detect and diagnose hepatic encephalopathy (Weissenborn, 2013), which is, in simple terms, a decline in brain function due to liver failure. PHES compares psychometric test results to a set of predetermined criteria that are typical for hepatic encephalopathy patients, in order to diagnose hepatic encephalopathy and determine its severity (Nabi & Bajaj, 2015). PHES has many advantages; most importantly, the test is cheap and relatively easy to administer when compared to other diagnostic tools such as brain scans (Nabi & Bajaj, 2015), which require more specialised training. This is particularly important given that current funding cuts make keeping the cost of diagnostic tests as low as possible a high priority.

However, there is a lack of consensus surrounding the consistency and accessibility of PHES, and some researchers have suggested that a physiological test would be more reliable (Felipo et al., 2014). Nabi and Bajaj (2015) highlight, for instance, that PHES is hard to access outside of Europe, particularly because the comparative test results used in diagnosis are European, and therefore give a Eurocentric representation of how the disorder presents itself, which may differ worldwide. Perhaps then, as Felipo et al. (2014) suggest, blood flow measurement would provide a less biased diagnosis and would potentially detect hepatic encephalopathy earlier, as it focuses on the root cause of the condition rather than on a symptom, such as the impaired cognition that PHES measures. A physiological test would be less open to bias and interpretation and would be more clear-cut than a psychometric test. Nevertheless, as previously mentioned, such a test demands specialist tools and a trained tester, which PHES does not, meaning that the budget of the facility and the training of staff would determine which test is most appropriate. Ideally, physiological and psychometric testing would be used side by side to give the most in-depth evaluation of a patient’s condition.

In conclusion, psychometric tests are far from perfect when it comes to accurately encompassing individual differences in clinical settings, but they have improved dramatically over the last century (Rust & Golombok, 2009). Individual differences in motivation still have too much bearing on results, for example, and intelligence tests still struggle to cross cultural boundaries and fully incorporate worldwide notions of intelligence. Lastly, psychometric tests should be used in combination with other tools, such as physiological tests, to get a complete picture of an individual’s needs and abilities.

Bibliography

American Psychological Association. (2019). Understanding psychological testing and assessment. Retrieved from https://www.apa.org/helpcenter/assessment.

Axelrod, B. (2001). Administration duration for the Wechsler Adult Intelligence Scale-III and Wechsler Memory Scale-III. Archives of Clinical Neuropsychology, 16 (3), 293-301.

Axelrod, B. (2002). Validity of the Wechsler Abbreviated Scale of Intelligence and Other Very Short Forms of Estimating Intellectual Functioning. Assessment, 9, 17-23. doi: 10.1177/1073191102009001003.

Beck, A., & Steer, R. (1987). Manual for the Revised Beck Depression Inventory. TX: Psychological Corporation.

Benson, E. (2003, February 3). Intelligent intelligence testing: Psychologists are broadening the concept of intelligence and how to test it. Monitor on Psychology, p. 48.

British Psychological Society. (2019). About Psychological Testing. Retrieved from https://ptc.bps.org.uk/tests-and-testing/psychological-tests/about-psychological-testing.

Canivez, G., Watkins, M., Good, R., James, K., & James, T. (2017). Construct validity of the Wechsler Intelligence Scale for Children – Fourth UK Edition with a referred Irish sample: Wechsler and Cattell–Horn–Carroll model comparisons with 15 subtests. British Journal of Educational Psychology, 87, 383-407. doi: 10.1111/bjep.12155.

Collinson, R., Evans, S., Wheeler, M., Brechin, D., Moffitt, J., Hill, G., & Muncer, S. (2016). Confirmatory Factor Analysis of WAIS-IV in a Clinical Sample: Examining a Bi-Factor Model. Journal of Intelligence, 5, 2-5. doi: 10.3390/jintelligence5010002.

Duckworth, A., Quinn, P., Lynam, D., Loeber, R., & Stouthamer-Loeber, M. (2011). Role of test motivation in intelligence testing. Proceedings of the National Academy of Sciences, 108 (19), 7716-7720. doi: 10.1073/pnas.1018601108.

Felipo, V., Urios, A., Giménez-Garzó, C., Cauli, O., Andrés-Costa, M., González, O. … Montoliu, C. (2014). Non invasive blood flow measurement in cerebellum detects minimal hepatic encephalopathy earlier than psychometric tests. World Journal of Gastroenterology, 20 (33), 11815-11825. doi: 10.3748/wjg.v20.i33.11815.

Fletcher, R., & Hattie, J. (2011). Intelligence and Intelligence Testing. New York: Routledge.

Gendolla, G. H. E. (2000). On the impact of mood on behavior: An integrative theory and a review. Review of General Psychology, 4 (4), 378-408. doi: 10.1037/1089-2680.4.4.378.

Haywood, C. & Lidz, C. (2007). Dynamic Assessment in Practice. Cambridge: Cambridge University Press.

Lichtenberger, E.O. & Kaufman, A.S. (2013). Essentials of WAIS-IV Assessment (2nd ed.). New Jersey: John Wiley & Sons.

Nabi, E., & Bajaj, J. (2013). Useful Tests for Hepatic Encephalopathy in Clinical Practice. Current Gastroenterology Reports, 16 (1). doi: 10.1007/s11894-013-0362-0.

Nelson, N. & Pontón, M. (2007). The Art of Clinical Neuropsychology. In A. Ardila (Ed.) International Handbook of Cross-Cultural Neuropsychology (pp. 45-62). New York: Routledge.

Nicolas, S., Andrieu, B., Croizet, J., Sanitioso, R., & Burman, J. (2013). Sick? Or slow? On the origins of intelligence as a psychological object. Intelligence, 41 (5), 699-711. doi: 10.1016/j.intell.2013.08.006.

Perez-Achiaga, N., Nelson, S., & Hassiotis, A. (2009). Instruments for the detection of depressive symptoms in people with intellectual disabilities: A systematic review. Journal of Intellectual Disabilities, 13, 55-76. doi: 10.1177/1744629509104487.

Raiford, S., Coalson, D., Saklofske, D., & Weiss, L. (Eds.) (2010). WAIS-IV Clinical Use and Interpretation. London: Academic Press.

Robstad, N., Siebler, F., Soderhamn, U., Westergren, T., & Fegran, L. (2018). Design and psychometric testing of instruments to measure qualified intensive care nurses’ attitudes toward obese intensive care patients. Research in Nursing & Health, 41 (6), 525-534. doi: 10.1002/nur.21914.

Rust, J., & Golombok, S. (2009). Modern Psychometrics: The Science of Psychological Assessment. East Sussex: Routledge.

Silvia, P. J., Mironovová, Z., McHone, A. N., Sperry, S. H., Harper, K. L., Kwapil, T. R., & Eddington, K. M. (2016). Do depressive symptoms “blunt” effort? An analysis of cardiac engagement and withdrawal for an increasingly difficult task. Biological Psychology, 118, 52-60. doi: 10.1016/j.biopsycho.2016.04.068.

Silvia, P. J., Nusbaum, E. C., Eddington, K. M., Beaty, R. E., & Kwapil, T. R. (2014). Effort Deficits and Depression: The Influence of Anhedonic Depressive Symptoms on Cardiac Autonomic Activity During a Mental Challenge. Motivation and Emotion, 38 (6), 779-789. doi: 10.1007/s11031-014-9443-0.

Stanovich, K. (2009). What Intelligence Tests Miss: The Psychology of Rational Thought. New Haven: Yale University Press.

Strydom, A. & Hassiotis, A. (2003). Diagnostic instruments for dementia in older people with intellectual disability in clinical practice. Aging & Mental Health, 7 (6), 431-437. doi: 10.1080/13607860310001594682.

Thombs, B. D., Ziegelstein, R. C., Pilote, L., Dozois, D. J., Beck, A. T., Dobson, K. S., Fuss, S., de Jonge, P., Grace, S. L., Stewart, D. E., Ormel, J., … Abbey, S. E. (2010). Somatic symptom overlap in Beck Depression Inventory-II scores following myocardial infarction. The British Journal of Psychiatry, 197, 61-66. doi: 10.1192/bjp.bp.109.076596.

Tombaugh, T.N. (1996). The Test of Memory Malingering. Toronto: Multi-Health Systems.

Webber, T., Bailey, K., Alverson, W., Critchfield, E., Bain, K., Messerly, J. … Soble, J. (2018). Further Validation of the Test of Memory Malingering (TOMM) Trial 1 Performance Validity Index: Examination of False Positives and Convergent Validity. Psychological Injury and Law, 11 (4), 323-335. doi: 10.1007/s12207-018-9335-9.

Wechsler, D. (2008). Wechsler Adult Intelligence Scale (WAIS-IV) (4th ed.). San Antonio, TX: Pearson.

Wechsler, D. (2008). WAIS-IV: Wechsler Adult Intelligence Scale. TX: Psychological Corporation.

Weiborn, M., Orr, T., Woods, S.P., Conover, E., & Feix, J. (2003). A validation of the test of memory malingering in a forensic psychiatric setting. Journal of Clinical and Experimental Neuropsychology, 25 (7), 979-990. doi: 10.1076/jcen.25.7.979.16481.

Weissenborn, K. (2013). Psychometric tests for diagnosing minimal hepatic encephalopathy. Metabolic Brain Disease, 28 (2), 227-229. doi: 10.1007/s11011-012-9336-4.
