The Trail Making Test is one of the more reliable brief cognitive assessments available in clinical practice, and the research behind it is substantial enough to take seriously. For detecting Alzheimer’s disease, the TMT-B portion of the test has demonstrated 83.3% sensitivity and 91.8% specificity using a cut-off score of 188.5 seconds — meaning that a patient who takes longer than roughly three minutes to complete Part B is flagged with a high degree of accuracy. Those are meaningful numbers for a test that requires nothing more than a pencil and a sheet of paper. This article also covers how digital versions of the test perform, which populations the test works best for, and where its accuracy starts to break down.
To understand what those statistics mean in practice, consider a memory clinic screening 100 older adults, 50 of whom have confirmed Alzheimer’s disease. With 83.3% sensitivity, the TMT-B would correctly identify about 42 of those 50. With 91.8% specificity, it would correctly clear about 46 of the 50 who are cognitively healthy. That leaves roughly 8 missed diagnoses and 4 false positives — not perfect, but clinically meaningful for a five-to-ten-minute test administered with minimal equipment. For mild cognitive impairment, the accuracy picture is more complicated, as discussed below.
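The arithmetic in the clinic example above can be reproduced in a few lines. This is a minimal sketch, not a clinical tool: the function name `screening_counts` is made up for illustration, and the inputs are the 50/50 split and the sensitivity and specificity figures quoted above.

```python
def screening_counts(n_cases, n_healthy, sensitivity, specificity):
    """Expected confusion-matrix counts for a screening test."""
    true_pos = n_cases * sensitivity      # cases correctly flagged
    false_neg = n_cases - true_pos        # cases missed
    true_neg = n_healthy * specificity    # healthy people correctly cleared
    false_pos = n_healthy - true_neg      # healthy people wrongly flagged
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
    }


# The memory-clinic example: 50 people with Alzheimer's disease, 50 without.
counts = screening_counts(50, 50, sensitivity=0.833, specificity=0.918)
for name, value in counts.items():
    print(f"{name}: {value:.2f} (about {round(value)})")
```

Rounded to whole patients, this gives the 42 correct detections, 8 misses, 46 correct clearances, and 4 false positives described above.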
Table of Contents
- What Does the Research Say About Trail Making Test Accuracy for Cognitive Decline?
- How Accurate Is the Trail Making Test for Detecting Mild Cognitive Impairment?
- How Age and Education Affect Trail Making Test Accuracy
- Digital and AI-Enhanced Trail Making Tests — How Do They Compare?
- Limitations of the Trail Making Test — When Accuracy Breaks Down
- Trail Making Test Accuracy in Specialty Populations
- The Future of Trail Making Test-Based Cognitive Screening
- Conclusion
- Frequently Asked Questions
What Does the Research Say About Trail Making Test Accuracy for Cognitive Decline?
The Trail Making Test comes in two parts. Part A asks the patient to connect numbered circles in sequence as quickly as possible. Part B adds alternating letters and numbers (connect 1, then A, then 2, then B, and so on), which taxes executive function, cognitive flexibility, and working memory simultaneously. Because cognitive decline tends to impair these functions early, Part B has become the more diagnostically useful portion, and most of the accuracy research focuses there. The landmark figures come from a study published in the journal International Psychogeriatrics, which established the 188.5-second cut-off for TMT-B in distinguishing Alzheimer’s disease from normal cognition. That pairing of 83.3% sensitivity and 91.8% specificity is strong by the standards of brief cognitive screening tools.
For comparison, the widely used Mini-Mental State Examination (MMSE) has sensitivity estimates that vary widely depending on the population and cut-off used, sometimes falling below 80% for early-stage dementia. The TMT-B’s specificity in particular — its ability to correctly clear patients who do not have Alzheimer’s — is notably high. However, accuracy varies significantly depending on which stage of decline you are trying to detect. The TMT performs best when there is already moderate-to-severe impairment. For very early mild cognitive impairment, where changes are subtle, sensitivity drops and the test becomes a less reliable standalone screen. This is not a flaw unique to the Trail Making Test — nearly every brief cognitive instrument struggles at the boundary between normal aging and early MCI — but it is worth being explicit about when interpreting results.

How Accurate Is the Trail Making Test for Detecting Mild Cognitive Impairment?
Mild cognitive impairment represents the gray zone between normal aging and dementia, and it is precisely where screening accuracy matters most. Catching MCI early opens a window for intervention, lifestyle modification, and monitoring. The Walking Trail Making Test — a physical adaptation where patients navigate numbered and lettered cones arranged on a floor — has shown 78% sensitivity and 90% specificity for detecting MCI. Those numbers are somewhat lower on the sensitivity side than the figures for Alzheimer’s detection, which reflects the genuine difficulty of identifying early-stage impairment. The 78% sensitivity figure means that roughly one in five people with MCI will not be flagged by the walking version of the test.
In a clinical context, that is a meaningful miss rate. Clinicians who use the TMT as a standalone MCI screen should understand that a passing result does not rule out early decline. The test is best understood as one component within a broader assessment battery, not as a definitive gate.
There is also a language and cultural validation dimension worth noting. In a study published in the Journal of the American Geriatrics Society, a Chinese-language version of the Trail Making Test demonstrated strong diagnostic accuracy for distinguishing MCI from normal aging in Chinese-speaking populations. This matters because cognitive screening tools developed and normed on Western, English-speaking populations do not always translate well. The existence of validated non-English versions extends the test’s clinical utility to broader populations, though local normative data remains essential for accurate interpretation.
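To see why a passing result on the Walking Trail Making Test does not rule out MCI, it helps to turn the 78% sensitivity and 90% specificity figures into predictive values. The sketch below is illustrative only: the prevalence values are assumptions chosen to show the trend, not figures from the research, and `predictive_values` is a made-up helper name.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # P(MCI | flagged)
    npv = true_neg / (true_neg + false_neg)   # P(no MCI | passed)
    return ppv, npv


# ASSUMED prevalence values, for illustration only.
results = {}
for prevalence in (0.10, 0.30, 0.50):
    ppv, npv = predictive_values(0.78, 0.90, prevalence)
    results[prevalence] = (ppv, npv)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under these assumptions, in a setting where half of those tested have MCI, roughly one in five passing results would still belong to someone with MCI. In a low-prevalence setting a passing result is more reassuring, but fewer than half of the flagged patients would actually have MCI.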
How Age and Education Affect Trail Making Test Accuracy
One of the more significant confounders in interpreting Trail Making Test results is that completion time is strongly influenced by age and education level, two variables that slow performance even when no dementia is present. Older adults complete both parts more slowly than younger adults. Less educated individuals complete Part B more slowly, partly because the letter-sequencing component draws on literacy and alphabetical familiarity. If a clinician applies a fixed cut-off like 188.5 seconds to a 78-year-old with an eighth-grade education, they may be flagging normal variation in that population as a sign of pathology. This is why the published specificity figures, however strong they appear in research settings, can erode in practice if age- and education-adjusted norms are not used.
In research studies, cut-off scores are typically derived and validated within defined populations. Apply those same cut-offs to a demographically different patient group and the accuracy profile changes substantially. A clinician working with a population of elderly patients who have limited formal education should be cautious about over-interpreting borderline TMT-B scores without consulting norms appropriate to that group. A concrete example: a 75-year-old woman with six years of formal education who completes TMT-B in 200 seconds falls above the 188.5-second clinical cut-off, triggering concern. But if her age- and education-adjusted percentile rank places her within the typical range for her demographic, that result carries very different clinical meaning than the same score in a 65-year-old college graduate. The raw time is not the whole story.
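The contrast in that example can be sketched in code. Everything numeric in the norms table below is hypothetical and exists only to show the mechanics; real interpretation requires published normative tables. The sketch also assumes completion times are roughly normally distributed within each group, which is a simplification.

```python
from statistics import NormalDist

FIXED_CUTOFF_S = 188.5  # the research cut-off discussed in this article

# HYPOTHETICAL (mean, SD) TMT-B times in seconds, for illustration only.
# These are NOT published norms.
HYPOTHETICAL_NORMS = {
    ("70-79", "<=8 years education"): (180.0, 60.0),
    ("60-69", ">=16 years education"): (85.0, 30.0),
}


def fraction_faster(time_s, mean_s, sd_s):
    """Share of the reference group expected to finish faster than time_s,
    assuming normally distributed completion times (a simplification)."""
    return NormalDist(mean_s, sd_s).cdf(time_s)


raw_time = 200.0
print(f"Exceeds fixed cut-off: {raw_time > FIXED_CUTOFF_S}")
ranks = {}
for group, (mean_s, sd_s) in HYPOTHETICAL_NORMS.items():
    ranks[group] = fraction_faster(raw_time, mean_s, sd_s)
    print(f"{group}: slower than {ranks[group]:.0%} of the reference group")
```

With these made-up norms, the same 200-second time sits comfortably inside the typical range for the older, less educated group but is an extreme outlier for the younger college graduate, even though both exceed the fixed cut-off.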

Digital and AI-Enhanced Trail Making Tests — How Do They Compare?
The traditional paper-and-pencil Trail Making Test has a well-documented limitation: it captures completion time, but loses a great deal of behavioral data along the way. Digital versions of the test change that fundamentally, capturing pen pressure, hesitation patterns, velocity, and directional errors that paper simply cannot record. A hybrid approach combining paper and electronic TMT administration has shown classification accuracy of 0.726 for distinguishing healthy controls from MCI, 0.929 for healthy versus Alzheimer’s disease, and 0.815 for MCI versus Alzheimer’s — figures that speak to the added value of the electronic dimension, particularly for the more severe diagnostic contrast. More recently, a 2025 study published in Alzheimer’s and Dementia applied deep learning models to TMT data combined with demographic variables. The model achieved 87.69% sensitivity and 98.26% specificity in one cohort, and 85.00% sensitivity and 96.75% specificity in a second validation cohort. The specificity figures in particular — approaching 97-98% — represent a meaningful improvement over traditional paper administration.
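One way to compare sensitivity and specificity pairs on a single scale is the likelihood ratio. The sketch below applies the standard formulas to the figures quoted in this article. Because the figures come from different studies and populations, the comparison is illustrative rather than a head-to-head result, and the labels in `reported` are simply descriptive names chosen here.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a sensitivity/specificity pair."""
    lr_positive = sensitivity / (1 - specificity)   # how much a flag raises the odds
    lr_negative = (1 - sensitivity) / specificity   # how much a pass lowers the odds
    return lr_positive, lr_negative


# Sensitivity/specificity pairs as quoted in this article.
reported = {
    "Paper TMT-B, 188.5 s cut-off": (0.833, 0.918),
    "Deep learning, first cohort": (0.8769, 0.9826),
    "Deep learning, validation cohort": (0.8500, 0.9675),
}

ratios = {}
for label, (sens, spec) in reported.items():
    ratios[label] = likelihood_ratios(sens, spec)
    lr_pos, lr_neg = ratios[label]
    print(f"{label}: LR+ {lr_pos:.1f}, LR- {lr_neg:.2f}")
```

The positive likelihood ratio rises from about 10 for the paper cut-off to about 50 and 26 in the two deep learning cohorts, which is driven almost entirely by the higher specificity. The negative likelihood ratios change far less, because sensitivity is similar across all three.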
The practical implication is that clinicians and researchers who have access to digital TMT platforms may be able to draw more confident conclusions from the same fundamental test. The tradeoff is accessibility. Paper administration requires almost no infrastructure. Digital platforms require hardware, software, and often training for both clinicians and patients who may be unfamiliar with touchscreen or tablet interfaces. For patients with motor impairment, arthritis, or tremor, digital administration introduces its own confounders — a problem that eye-tracking based TMT variants are beginning to address. An eye-tracking version of the test demonstrated 87.2% accuracy in differentiating cognitive decline from motor problems, which matters significantly for populations where physical limitations make hand-based testing unreliable.
Limitations of the Trail Making Test — When Accuracy Breaks Down
The Trail Making Test is not a standalone diagnostic tool, and understanding where it fails is as important as knowing its strengths. First, motor slowing, whether from Parkinson’s disease, arthritis, or peripheral neuropathy, can inflate completion times without reflecting any cognitive change. A patient with significant hand tremor may score poorly on TMT-B for purely physical reasons. Clinicians unfamiliar with this limitation may misattribute a slow completion time to cognitive impairment.
Second, anxiety and test-taking pressure can measurably affect performance. Some patients, particularly older adults who are already worried about their cognitive health, experience significant performance anxiety during timed cognitive tests. This is not unique to the Trail Making Test, but it is worth flagging because the test is explicitly timed and the patient is aware throughout. There is no easy correction factor for anxiety-related underperformance.
Third, the test’s accuracy for very early MCI, before functional impairment is clinically apparent, remains limited. The research consistently shows that the TMT performs well for moderate-to-severe cognitive decline and reasonably well for established MCI, but it is not sensitive enough to reliably catch the earliest neurodegenerative changes. For that level of detection, biomarker approaches including cerebrospinal fluid analysis or PET imaging remain necessary. The TMT is a valuable triage and monitoring tool, not a replacement for comprehensive neurological evaluation when early-stage disease is suspected.

Trail Making Test Accuracy in Specialty Populations
The Trail Making Test has been studied in populations beyond typical older adults presenting with memory concerns. In patients with multiple sclerosis, where cognitive impairment — particularly in processing speed and executive function — is a common but sometimes underappreciated complication, the TMT-switching component showed 100% sensitivity and 93.3% specificity for detecting certain cognitive impairments.
That sensitivity figure is notable, though it applies to a specific type of cognitive impairment profile common in MS rather than to dementia broadly, and the sample sizes in MS cognitive studies tend to be smaller. This finding has a practical implication for neurologists managing MS patients: the Trail Making Test may deserve routine inclusion in MS follow-up appointments, not just in memory clinic settings. Cognitive decline in MS often goes undetected until it is functionally significant, and a brief, inexpensive test with high sensitivity in that population represents a genuine clinical opportunity.
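The caution about small samples can be made concrete with a confidence interval. The sketch below uses the Wilson score interval, a standard method for proportions, to show how much uncertainty surrounds an observed sensitivity of 100%. The sample sizes are hypothetical, since the number of impaired patients in the MS study is not given here.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# HYPOTHETICAL numbers of impaired patients, every one correctly detected.
intervals = {}
for n in (15, 30, 100):
    intervals[n] = wilson_interval(successes=n, n=n)
    low, high = intervals[n]
    print(f"{n}/{n} detected: 95% CI {low:.1%} to {high:.1%}")
```

If all 15 of 15 impaired patients were detected, the lower bound of the interval would sit near 80%, so a perfect observed sensitivity in a small sample is compatible with a true sensitivity well below 100%.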
The Future of Trail Making Test-Based Cognitive Screening
The trajectory of Trail Making Test research points toward integration with AI and digital platforms as the primary direction for improving accuracy. The 2025 deep learning study is unlikely to be the last word on this. As digital tablet-based testing becomes more widespread in clinical and research settings, the rich behavioral data captured during test performance — hesitation times, error patterns, stroke velocity — will feed increasingly sophisticated models. The gap between paper TMT accuracy and AI-enhanced TMT accuracy is likely to widen further.
What this means for everyday clinical practice is less clear in the near term. Digital platforms will take years to become standard in community memory clinics, primary care offices, and geriatric practices, particularly in under-resourced settings. In the meantime, the paper Trail Making Test remains a valuable, low-cost cognitive screening tool, one that performs meaningfully well when interpreted with appropriate normative data and within a broader clinical assessment context. A test that dates back to the 1940s still earns its place in the clinic; the research is simply making it better.
Conclusion
The Trail Making Test, particularly Part B, offers clinically meaningful accuracy for detecting cognitive decline — especially for established Alzheimer’s disease, where sensitivity reaches 83.3% and specificity reaches 91.8%. For mild cognitive impairment, accuracy is somewhat lower, and for very early-stage decline, the test is best used as one component of a broader battery rather than a definitive screen.
Age, education level, motor function, and the specific normative data being applied all significantly affect how results should be interpreted. In the studies discussed here, digital and AI-enhanced versions of the test have outperformed traditional paper administration, and the 2025 deep learning findings suggest that specificity approaching 98% may be achievable in structured research settings. For clinicians, the practical takeaway is to use the Trail Making Test as a reliable, inexpensive first-line screening tool while remaining alert to its limitations, and to watch the digital enhancement space, which is moving quickly and may soon change what is achievable in routine clinical practice.
Frequently Asked Questions
Is the Trail Making Test the same as a neuropsychological evaluation?
No. The Trail Making Test is a brief screening instrument, typically taking five to ten minutes. A full neuropsychological evaluation involves a battery of tests administered over several hours and provides a much more detailed profile of cognitive strengths and weaknesses. The TMT is often included within a full evaluation, but is not a substitute for one.
Can someone practice and improve their Trail Making Test score?
Yes, to a degree. Repeated administration of the same test version can produce practice effects, meaning scores may improve simply from familiarity with the task rather than genuine cognitive improvement. This is why clinicians using the TMT for longitudinal monitoring sometimes use alternate forms or interpret repeated scores with caution.
What score on TMT-B should raise concern?
The research cut-off most commonly cited is 188.5 seconds for distinguishing Alzheimer’s disease from normal cognition, but this figure applies to specific demographic populations. Clinicians should use age- and education-adjusted normative tables rather than applying a single fixed cut-off across all patients.
Does the Trail Making Test detect all types of dementia?
The test is sensitive primarily to impairments in processing speed, executive function, and cognitive flexibility — functions affected across multiple dementia types. However, its accuracy profile has been studied most thoroughly for Alzheimer’s disease. Its performance in detecting frontotemporal dementia, Lewy body dementia, or vascular dementia specifically is less comprehensively characterized.
Can the test be done at home?
Standard paper versions are sometimes available online, but home administration is not recommended for clinical purposes. Accurate scoring requires a trained administrator, proper timing, standardized conditions, and comparison against appropriate normative data. An unsupervised home result has no clinical validity.





