Year: 2016 | Volume: 2 | Issue: 1 | Page: 1-5
From rote to reasoning: The paradigm shift required in medical entrance examination and beyond!
Professor of Surgery, Maulana Azad Medical College (University of Delhi) and Associated Lok Nayak Hospital, New Delhi - 110 002, India
How to cite this article:
Lal P. From rote to reasoning: The paradigm shift required in medical entrance examination and beyond! MAMC J Med Sci 2016;2:1-5. Available from: http://www.mamcjms.in/text.asp?2016/2/1/1/174849
Medical education needs drastic change, not only at the earliest stage, when potential doctors are selected, but also as students progress through their subjects and later when they pursue specialty or super-specialty training. At every stage, from school to college and on to professional education, the Indian system relies heavily on rote learning: students mug up a bucket load of facts and vomit them out on the day of the examination, with no need to understand any of them. Whoever does this best wins the race for selection and goes on to become a person who is supposed to exercise judgment in treating human beings, protect them from disease and infirmity, and even take crucial on-the-spot decisions that may save them from death!
It is a pity that the number of legs a centipede has, the type of reproductive system in a spider, and many other such irrelevant facts, all easily retrievable through technology such as “Google,” must be learnt by heart by budding medical students to gain entry into a medical college, because this is what the rote-based entrance examinations ask! What is most interesting is that none of this knowledge would ever be needed by the student in his or her entire medical career. Whether for entrance or for passing from one professional semester to the next during the MBBS course, rote-based multiple choice questions (MCQs) make up our question banks and are used for evaluation. Even the postgraduate entrance examination asks similar rote-based MCQs under tight time limits, which has reduced our doctors to rote learners who rely on previous years' papers and on the rote-based question banks taught to them in coaching classes.
Rote learning is the memorization of information based on repetition or recall of facts. In school education, the two biggest examples of rote learning are the alphabet and numbers, followed by multiplication tables and spelling words. At the high school level, the elements and their chemical numbers are usually memorized by rote. Is rote learning an outdated technique or is there a valid place for its use in the classroom today? Associative learning, metacognition, and critical thinking are increasingly being used as a functional foundation to higher levels of learning in place of rote-based learning. Rote memorization is a learning process that involves repeating information until it is remembered verbatim. Actors and singers often use rote memorization when they have to learn the lines of a play or a song. Students start using rote memorization from early in school, preparing for spelling tests or memorizing definitions of terms, names of presidents, verb forms in a foreign language, and multiplication tables, among other things. Rote memorization is different from meaningful learning, where the material is applied to other ideas and connections are made between concepts. Acronyms and chain mnemonics are routinely used by students toward rote learning.
In education, cramming (also known as mugging or swotting, from swot, akin to “sweat,” meaning “to study with determination”) is the practice of working intensively to absorb large volumes of informational material in short amounts of time. H.E. Gorst stated in his book, “The curse of education,” “as long as education is synonymous with cramming on an organized plan, it will continue to produce mediocrity.” But, why is it that we need to differentiate the two forms of learning? This may be easier to understand when we understand the difference between memorization and intelligence.
The Difference between Memory and Intelligence
The mental ability to memorize is often used as an indicator of intelligence. The two are no doubt strongly linked, but memory is not always a reliable indicator of intelligence, and working memory does not directly determine how intelligent a student is. Most of the time, a deficit in working memory is due to the way learning is structured. We can compare working memory to a filing cabinet in which every piece of information sits in its own separate file, so that finding any one item becomes difficult. Factors such as stress, lack of sleep, and distractions make retrieval even harder. Effective memorization involves categorizing the information into sections, and sections within sections, of the filing cabinet, which makes it easier to retrieve.
On the other hand, general intelligence, often referred to as “g,” is described as being composed of verbal comprehension, perceptual organization, working memory, and processing speed. Working memory is thus only a small component of intelligence, a fact that teachers and others who design and run selection processes need to know well.
Medical Entrance Examinations
Let us look at sample questions from an All India Pre Medical Test preparation series. One from biology reads, “The echinoderms that lack distinct arms are the: (a) Sea Urchins (b) Asteroidea (c) Brittle stars (d) Sea stars.” Another from physics reads, “Which of the following is not a basic conservation law in physics: (a) Conservation of mass (b) conservation of energy (c) conservation of momentum (d) conservation of charge.” Finally, one from chemistry reads, “Set representing the correct order of second ionization potential is (a) Ba > K = Ca (b) K > Ba > Ca (c) K > Ca > Ba (d) Ca > Ba > K.” Let us now compare these with questions from the UK Clinical Aptitude Test (UKCAT), the national examination used for entry to medical schools in the United Kingdom. This examination tests verbal reasoning (VR), quantitative reasoning, abstract reasoning, decision analysis, and situational judgment; nothing is asked from physics, chemistry, botany, or zoology, or from any medical subject that will be taught in medical school itself. In each of the five sections, a series of questions follows a passage whose content must be analyzed and interpreted meaningfully. Let me cite an example with a relatively short passage, from the section on situational judgment. It reads, “A medical student, Cameron, is told by a patient that a senior doctor frequently swears loudly on the ward which makes him so uncomfortable that he does not want to stay in the hospital. Cameron consults a nurse on the ward, and she tells him that she has not ever witnessed this behavior by the senior doctor. The nurse reminds Cameron that the patient might just dislike being in hospital. Cameron is unsure what to do because the senior doctor is marking one of his assessments. How important to take into account are the following considerations for Cameron when deciding how to respond to the situation?
The senior doctor is marking one of his assessments: (a) Very important (b) important (c) of minor importance (d) not important at all.”
The UKCAT contains no curriculum or science content, and it cannot be revised for. As the example above shows, it focuses on exploring the cognitive powers of candidates and other attributes considered valuable in health care professionals. The 2015 format is a 120 min test comprising 44 items on VR in 22 min, 36 items on quantitative reasoning in 25 min, 55 items on abstract reasoning in 14 min, 28 items on decision analysis in 32 min, and finally, 68 items on situational judgment in 27 min. That is a pretty tall order in terms of time per item, considering the detail in the passages and the analysis required!
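To make the time pressure concrete, the time available per item can be worked out from the section figures quoted above. The following is a minimal sketch in Python; the item counts and time limits are those given in the text, while the names `SECTIONS`, `seconds_per_item`, and `summarize` are illustrative assumptions and not part of any official UKCAT material.

```python
# Section sizes and time limits for the 2015 UKCAT format, as quoted in the text:
# section name -> (number of items, minutes allowed).
SECTIONS = {
    "verbal reasoning": (44, 22),
    "quantitative reasoning": (36, 25),
    "abstract reasoning": (55, 14),
    "decision analysis": (28, 32),
    "situational judgment": (68, 27),
}


def seconds_per_item(items: int, minutes: int) -> float:
    """Average time available per item, in seconds."""
    return minutes * 60 / items


def summarize(sections):
    """Return per-section seconds per item, total items, and total minutes."""
    per_item = {name: seconds_per_item(i, m) for name, (i, m) in sections.items()}
    total_items = sum(i for i, _ in sections.values())
    total_minutes = sum(m for _, m in sections.values())
    return per_item, total_items, total_minutes


per_item, total_items, total_minutes = summarize(SECTIONS)

for name, secs in per_item.items():
    print(f"{name}: {secs:.1f} s per item")
print(f"overall: {total_items} items in {total_minutes} min "
      f"({seconds_per_item(total_items, total_minutes):.1f} s per item)")
```

On these figures, the sections add up to 231 items in 120 min, or about 31 s per item on average, ranging from about 15 s per abstract reasoning item to about 69 s per decision analysis item.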
Quite obviously, the two examinations in India and the UK demand completely different skills and test different aptitudes. The former predominantly tests rote knowledge, with a plethora of facts, details, and numbers to be memorized without understanding, whereas the UK test evaluates higher order cognitive skills, assimilation skills, and reasoning. The inherent advantage of this approach is that prospective doctors can be selected without being under pressure to perform or under undue strain to complete a set curriculum. It seems reasonable not to re-examine Grade XI and XII physics, chemistry, and biology, as this knowledge has already been evaluated through the minimum standards required for eligibility. In the UK, however, these minimum standards are A and A+ levels, equivalent to >85% or 90%, in contrast to a measly 60% in the Indian board examinations. Hence, the starting point of eligibility for the UKCAT, in terms of the science scores required, is much higher. That the minimum requirement for taking the Indian examinations is so low makes one ponder whether excellence is being evaluated at all! There is also no point in examining subjects yet to be taught in the MBBS course, as has become the practice in Indian examinations; this is not only ridiculous but also adds undue strain on the students. Whether there is a qualitative difference between candidates selected by the two types of test has not been evaluated in India, but the UKCAT definitely takes the stress of preparation off the students and leaves their Class XI and XII studies much more enjoyable.
In the United States, a revised Medical College Admission Test (MCAT) was introduced in 1991. It includes assessment in VR, biological sciences (BS), physical sciences (PS), and a writing sample (WS). The three multiple-choice sections of the MCAT (VR, BS, and PS) and the WS are designed to assess (1) mastery of basic concepts in biology, chemistry, and physics; (2) facility with scientific problem solving and critical thinking; and (3) communication/writing skills. The MCAT's strong emphasis remains on problem-solving in novel situations, and the fact that half of the MCAT focuses on verbal skills (the other half is based on pure sciences) suggests that it might also predict performance well in more self-directed learning environments. A lot of work has been done on the validity of these testing methods, so as to identify the cohort of students who will keep excelling and succeeding in medical school rather than failing, dropping out, or changing careers. The MCAT has accordingly been validated against undergraduate Grade Point Average and the United States Medical Licensing Examination and found to be fairly accurate. These changes stem from papers which argued that, while traditional intelligence tests have been validated almost entirely against school performance, the evidence that they measure abilities essential to performing well in various life outcomes is weak. It was felt that physicians must be more than holders of vast amounts of scientific information: they should be able to apply the principles of scientific problem solving to evaluate situations critically and to arrive at logical solutions. Mastery of basic concepts in biology, chemistry, and physics, while still considered a prerequisite, was not judged to be a sufficient indicator of success in medical school.
In Australia and New Zealand, selection of students into medical study directly from high school typically uses a combination of academic performance at school, performance on a standardized test (the Undergraduate Medicine and Health Sciences Admission Test [UMAT]), and an interview. In 2009, 14 universities in Australia and New Zealand used the UMAT as part of their selection processes. Devised by the Australian Council for Educational Research, the UMAT comprises three parts: Section 1 (logical reasoning and problem solving), Section 2 (understanding people), and Section 3 (nonverbal reasoning). In Section 1, students are required to exercise reasoning and problem-solving skills. Section 2 assesses a student's ability to understand and think about people; its items are based on passages of text representing specific interpersonal situations. Section 3 consists of abstract items designed to evaluate a student's nonverbal reasoning skills. The predictive validity of the UMAT for medical students' academic performance is assessed periodically, reflecting a widespread desire to ensure that the test selects those students deemed more likely to become better doctors.
Unfortunately, no validity studies have been performed to date in India, even after 67 years of independence and with nearly 400 medical colleges, several standalone institutions of excellence, and seven functioning All India Institutes of Medical Sciences. We do not know whether our tests, conducted at the end of Class XII, picked candidates who went on to become successful practitioners or were successful later in their postgraduate careers. How many dropped out during their MBBS? How many took several more years to complete it? How many changed their careers and are doing something else in life in which they are more contented, successful, and happy? All these people were possibly the result of a poor selection process, and each occupied the valuable seat of an interested candidate who wanted to do medicine but could not because he/she fell just below the last cut-off. And who accounts for the money, resources, and effort that the country spent over 5½ years on making a doctor who never took up medicine as a career?
Medical Schools and Beyond
Taking a cue from this, it is high time we looked at the examinations that take our medical students from their first semester to the ninth, and thereafter at the postgraduate entrance examinations. Rote-based questions rule the roost here too, although there is tremendous scope to introduce problem-based questions that evaluate cognitive skills. For example, in surgery, which is my specialty, a question in present use could read: “All the following scores are used for determining prognosis in Acute Pancreatitis except: (a) Ranson's score (b) Glasgow score (c) serial C-reactive protein levels (d) APACHE II index.” This is a simple fact-based question that adds little to the assessment of the candidate's quality and understanding. A better problem-based question evaluating the higher order cognitive domain could read: “A 35-year-old nonalcoholic male presented to the emergency with 4 days history of upper abdominal pain radiating to the back, vomiting, and abdominal distension. He was afebrile and his pulse was 120/min, blood pressure was 100/70 mmHg, and abdominal examination revealed tenderness and vague upper abdominal lump. Total leucocyte count was 24,000 and serum amylase was 2567 IU/ml. Contrast-enhanced computed tomography (CECT) of the abdomen revealed a bulky edematous pancreas with 40% necrosis. The most appropriate statement regarding this patient is: (a) This is a case of acute interstitial pancreatitis (b) infected pancreatic necrosis requires surgical debridement (c) acute necrotic collection needs to be evaluated further (d) 96 h is too early for CECT to analyze the condition.” The difference between the two questions is easy to see: the second can differentiate candidates who rely on rote knowledge alone from those who understand the concepts and can apply them.
Likewise, questions can be generated for each discipline that evaluate candidates at a deeper plane, assessing their higher order thinking and analytical skills rather than just eliciting factual knowledge. However, teachers have to spend far more time, effort, energy, and intellect to construct such questions. Most examining universities, boards, and test centers seek high-quality questions for their question banks from teachers and professors with a certain number of years of experience in a particular specialty. They would be well advised to seek not more than 15 questions per teacher in a 3-week period, so as to give teachers ample time to prepare and submit quality questions. The present practice of asking each teacher for MCQs in bulk, a task not uncommonly passed on to their residents, should be done away with. University and examination question banks would have to be standardized, completely modified, and restocked with thought-provoking, higher order, analytical, problem-based questions, keeping factual questions to the bare minimum, to make examinations more effective and efficient in screening the right candidates at each stage of their careers. Unless this happens, we will continue to select the wrong students to become doctors and, once they are selected, encourage rote memorization that kills their reasoned thinking and leaves them less able to meet the professional requirements of this not so easy career.
A lot of research is going on presently in constructing and validating tools for knowledge and cognitive assessment of students. Techniques for creating and validating an assessment test that measures the effectiveness of instruction by probing how well that instruction causes students in a class to think like experts about specific areas of science have been described. Three statistical measures are common for collecting evidence of validity: Item difficulty, item discrimination, and reliability of the instrument. Item difficulty measures the percentage of students who answer a question correctly. Item discrimination measures how well a question discriminates between students who have performed well versus students who have performed poorly, on the assessment overall. Reliability checks the uniformity of assessment when the same student takes the test at different points of time. An article in the New England Journal of Medicine has analyzed the various domains which are required for assessment of medical students. These assessments have three main goals: To optimize the capabilities of all learners and practitioners by providing motivation and direction for future learning, to protect the public by identifying incompetent physicians, and to provide a basis for choosing applicants for advanced training.
Assessment can be formative (guiding future learning, providing reassurance, promoting reflection, and shaping values) or summative (making an overall judgment about competence, fitness to practice, or qualification for advancement to higher levels of responsibility). Formative assessments provide benchmarks to orient the learner who is approaching a relatively unstructured body of knowledge. They can reinforce students' intrinsic motivation to learn and inspire them to set higher standards for themselves. Although summative assessments are intended to provide professional self-regulation and accountability, they may also act as a barrier to further practice or training. Psychometric rigor includes reliability (precision), validity (accuracy), and social desirability (transparency). A distinction should be made between assessments that are suitable only for formative use and those that have sufficient psychometric rigor for summative use. This distinction is especially important in selecting a method of evaluating competence for high-stakes assessments (i.e., licensing and certification examinations). Conversely, summative assessments may not provide sufficient feedback to drive learning.
Of all the types of written assessment, MCQs remain the most popular. MCQs provide a large number of examination items that encompass many content areas, can be administered in a relatively short period, and can be graded by computer. These factors make the administration of the examination to large numbers of trainees straightforward and standardized. Formats that ask the student to choose the best answer from a list of possible answers are most commonly used. However, newer formats may better assess processes of diagnostic reasoning. Key-feature items focus on critical decisions in particular clinical cases. Script-concordance items present a situation (e.g., vaginal discharge in a patient), add a piece of information (dysuria), and ask the examinee to assess the degree to which this new information increases or decreases the probability of a particular outcome (acute salpingitis due to Chlamydia trachomatis). MCQs that are rich in context are difficult to write, and those who write them tend to avoid topics, such as ethical dilemmas or cultural ambiguities, that cannot be asked about easily. MCQs may also create situations in which an examinee can answer a question by recognizing the correct option but could not have answered it in the absence of options. This effect, called cueing, is especially problematic when diagnostic reasoning is being assessed, because premature closure (arriving at a decision before the correct diagnosis has been considered) is a common reason for diagnostic errors in clinical practice.
Extended matching items (several questions, all with the same long list of possible answers), as well as open-ended short-answer questions, can minimize cueing. Extended Matching Questions (EMQs) are similar to MCQs but can evaluate a candidate's understanding in far more depth than simple MCQs. Structured essays also preclude cueing. In addition, they involve more complex cognitive processes and allow for more contextualized answers than do multiple-choice questions. When clear grading guidelines are in place, structured essays can be psychometrically robust (reliable, valid, and sensitive). The Intercollegiate Membership examination of the Royal College of Surgeons of the UK (MRCS), which selects candidates after their MBBS for six years of higher surgical training, is based on an Objective Structured Clinical Examination (OSCE) that follows an MCQ/EMQ-based assessment.
There is thus a strong need to reinvent the way we assess our students so that we can objectively differentiate by higher order cognitive domain and not just the theoretical and fact-rich rote-based evaluation. Medical teachers will have to take a lead in this respect and show the way forward for medical entrance examinations as well as progressions of the medical trainees through their subjects in the 5-year period and postgraduate entrance examinations, like our engineering colleagues have done for admission to the Indian Institutes of Technology. Assessment for selection to make doctors should be able to assess the attributes that we want in our doctors and not just rote-based knowledge and facts. Understanding and reasoning in clinical sciences needs to be supported and encouraged to make our medical students better doctors and make their selection process relevant to the careers that they are going to pursue in their lives!
References
1. Gorst HE. The Curse of Education. London: Grant Richards; 1901. p. 5.
2. Conway AR, Kane MJ, Engle RW. Working memory capacity and its relation to general intelligence. Trends Cogn Sci 2003;7:547-52.
3. Kane MJ, Engle RW. Working-memory capacity and the control of attention: The contributions of goal neglect, response competition, and task set to Stroop interference. J Exp Psychol Gen 2003;132:47-70.
4. The UKCAT Official Guide; 2015. Available from: http://www.ukcat.ac.uk. [Last accessed on 2015 Dec 14].
5. Julian ER. Validity of the Medical College Admission Test for predicting medical school performance. Acad Med 2005;80:910-7.
6. Donnon T, Paolucci EO, Violato C. The predictive validity of the MCAT for medical school performance and medical board licensing examinations: A meta-analysis of the published research. Acad Med 2007;82:100-6.
7. McClelland DC. Testing for competence rather than for “intelligence”. Am Psychol 1973;28:1-14.
8. Wiley A, Koenig JA. The validity of the Medical College Admission Test for predicting performance in the first two years of medical school. Acad Med 1996;71(10 Suppl):S83-5.
9. Australian Council for Educational Research. UMAT. Undergraduate Medicine and Health Sciences Admission Test. Available from: http://www.umat.acer.edu.au/. [Last accessed on 2015 Dec 14].
10. Carr SE. Emotional intelligence in medical students: Does it correlate with selection measures? Med Educ 2009;43:1069-77.
11. Adams WK, Wieman CE. Development and validation of instruments to measure learning of expert-like thinking. Int J Sci Educ 2011;33:1289-312.
12. Ding L, Chabay R, Sherwood B, Beichner R. Evaluating an electricity and magnetism assessment tool: Brief electricity and magnetism assessment. Phys Rev Spec Top Phys Educ Res 2006;2:010105.
13. Jenny K. Biology concept assessment tools: Design and use. Microbiol Aust 2010;31:5-8.
14. Epstein RM. Assessment in medical education. N Engl J Med 2007;356:387-96.
15. Ben-David MF. The role of assessment in expanding professional horizons. Med Teach 2000;22:472-7.
16. Sullivan W. Work and Integrity: The Crisis and Promise of Professionalism in America. 2nd ed. San Francisco: Jossey-Bass; 2005.
17. Schuwirth L, van der Vleuten C. Merging views on assessment. Med Educ 2004;38:1208-10.
18. Case S, Swanson D. Constructing Written Test Questions for the Basic and Clinical Sciences. 3rd ed. Philadelphia: National Board of Medical Examiners; 2000.
19. Farmer EA, Page G. A practical guide to assessing clinical decision-making skills using the key features approach. Med Educ 2005;39:1188-94.
20. Charlin B, Roy L, Brailovsky C, Goulet F, van der Vleuten C. The Script Concordance test: A tool to assess the reflective clinician. Teach Learn Med 2000;12:189-95.
21. Frederiksen N. The real test bias: Influences of testing on teaching and learning. Am Psychol 1984;39:193-202.
22. Schuwirth LW, van der Vleuten CP, Donkers HH. A closer look at cueing effects in multiple-choice questions. Med Educ 1996;30:44-9.
23. Graber ML, Franklin N, Gordon R. Diagnostic error in internal medicine. Arch Intern Med 2005;165:1493-9.
24. Schuwirth LW, van der Vleuten CP. Different written assessment methods: What can be said about their strengths and weaknesses? Med Educ 2004;38:974-9.