Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: transforming the practice of medicine. Future Healthc. J. 8, e188–e194 (2021).
Teo, Z. L. et al. Generative artificial intelligence in medicine. Nat. Med. 31, 3270–3282 (2025).
Rajpurkar, P., Chen, E., Banerjee, O. & Topol, E. J. AI in health and medicine. Nat. Med. 28, 31–38 (2022).
Wu, J. et al. Vision-language foundation model for 3D medical imaging. NPJ Artif. Intell. 1, 17 (2025).
Ong, J. C. L. et al. Large language model as clinical decision support system augments medication safety in 16 clinical specialties. Cell Rep. Med. 6, 102323 (2025).
Ke, Y. H. et al. Clinical and economic impact of a large language model in perioperative medicine: a randomized crossover trial. NPJ Digit. Med. 8, 462 (2025).
Abdulnour, R. -E. E., Gin, B. & Boscardin, C. K. Educational strategies for clinical supervision of artificial intelligence use. N. Engl. J. Med. 393, 786–797 (2025).
Budzyń, K. et al. Endoscopist deskilling risk after exposure to artificial intelligence in colonoscopy: a multicentre, observational study. Lancet Gastroenterol. Hepatol. 10, 896–903 (2025).
Kosmyna, N. et al. Your brain on ChatGPT: accumulation of cognitive debt when using an AI assistant for essay writing task. Preprint at https://arxiv.org/abs/2506.08872 (2025).
Leong, Y. H., Nambiar, L., Tay, V. Y. J., Lie, S. A. & Yuhe, K. Feasibility of a specialized large language model for postgraduate medical examination preparation: single-center proof-of-concept study. JMIR Form. Res. 9, e77580 (2025).
Klimova, B. & Pikhart, M. Exploring the effects of artificial intelligence on student and academic well-being in higher education: a mini-review. Front. Psychol. 16, 1498132 (2025).
Diaz, E. A. et al. Diabetic retinopathy screening among federally qualified health center patients using point-of-care AI: DRES-POCAI: a trial protocol: DRES-POCAI: a trial protocol. JAMA Netw. Open 8, e2538114 (2025).
Pak, A. et al. Mixed methods evaluation of a clinical decision support system to reduce variation in healthcare. NPJ Digit. Med. 8, 781 (2025).
American Medical Association. 2 in 3 physicians are using health AI—up 78% from 2023. https://www.ama-assn.org/practice-management/digital-health/2-3-physicians-are-using-health-ai-78-2023 (2025).
Tufts University School of Medicine. How medical faculty and students are using AI today. https://medicine.tufts.edu/news-events/news/how-medical-faculty-and-students-are-using-ai-today (2025).
Ke, Y. H. et al. Real-world deployment and evaluation of PEri-operative AI CHatbot (PEACH): a large language model chatbot for peri-operative medicine. Anaesthesia 81, 62–71 (2025).
Ong, A. Y. et al. Flight rules for clinical AI: lessons from aviation for human–AI collaboration in medicine. NPJ Digit. Med. 9, 201 (2026).
Lea, A. S. Cognitive aids, artificial intelligence, and deskilling in medicine: the history of an enduring anxiety. NEJM AI 3, 1 (2025).
Meshaka, R. & Arthurs, O. J. Are we too reliant on medical imaging?. Br. J. Hosp. Med. 83, 1–3 (2022).
Litman, R. S. et al. Monitoring. in Smith’s Anesthesia for Infants and Children 8th edn (eds Davis, P. J., Cladis, F. P. & Motoyama, E. K.) 322–343, https://doi.org/10.1016/b978-0-323-06612-9.00011-0 (Elsevier, 2011).
Bjork, E. L. & Bjork, R. A. Making things hard on yourself, but in a good way: creating desirable difficulties to enhance learning. in Psychology and the Real World: Essays Illustrating Fundamental Contributions to Society (ed. Gernsbacher, M. A.) 56–64 (Worth Publishers, 2011).
Ericsson, K. A., Krampe, R. T. & Tesch-Römer, C. The role of deliberate practice in the acquisition of expert performance. Psychol. Rev. 100, 363–406 (1993).
Sweller, J. Cognitive load during problem solving: effects on learning. Cogn. Sci. 12, 257–285 (1988).
Sweller, J., Ayres, P. & Kalyuga, S. The expertise reversal effect. in Cognitive Load Theory Vol. 1, 155–170, https://doi.org/10.1007/978-1-4419-8126-4_12 (Springer, 2011).
Bastani, H. et al. Generative AI without guardrails can harm learning: evidence from high school mathematics. Proc. Natl. Acad. Sci. USA. 122, e2422633122 (2025).
Gerlich, M. AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies 15, 6 (2025).
Wang, A. et al. Generative AI for medical education: insights from a case study with medical students and an AI tutor for clinical reasoning. in Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems 1–8 https://doi.org/10.1145/3706599.3721208 (ACM, 2025).
Steenhof, N., Woods, N. N., Van Gerven, P. W. M. & Mylopoulos, M. Productive failure as an instructional approach to promote future learning. Adv. Health Sci. Educ. Theory Pract. 24, 739–749 (2019).
Goddard, K., Roudsari, A. & Wyatt, J. C. Automation bias: a systematic review of frequency, effect mediators, and mitigators. J. Am. Med. Inform. Assoc. 19, 121–127 (2012).
Griot, M., Hemptinne, C., Vanderdonckt, J. & Yuksel, D. Large language models lack essential metacognition for reliable medical reasoning. Nat. Commun. 16, 642 (2025).
Ruskin, K. J., Corvin, C., Rice, S. C. & Winter, S. R. Autopilots in the operating room: safe use of automated medical technology. Anesthesiology 133, 703–716 (2020).
Braarud, P. Ø Measuring cognitive workload in the nuclear control room: a review. Ergonomics 67, 849–865 (2024).
Fletcher, G. et al. Anaesthetists’ Non-Technical Skills (ANTS): evaluation of a behavioural marker system. Br. J. Anaesth. 90, 580–588 (2003).
Vaccaro, M., Almaatouq, A. & Malone, T. When combinations of humans and AI are useful: a systematic review and meta-analysis. Nat. Hum. Behav. 8, 2293–2303 (2024).
Agarwal, N., Moehring, A., Rajpurkar, P. & Salz, T. Combining human expertise with artificial intelligence: experimental evidence from radiology. SSRN Electron. J. https://doi.org/10.2139/ssrn.4505053 (2023).
Fletcher, L. & Carruthers, P. Metacognition and reasoning. Philos. Trans. R. Soc. Lond. B Biol. Sci. 367, 1366–1378 (2012).
Ainge, L. E., Edgar, A. K., Kirkman, J. M. & Armitage, J. A. Developing clinical reasoning along the cognitive continuum: a mixed methods evaluation of a novel Clinical Diagnosis Assessment. BMC Med. Educ. 25, 31 (2025).
Morley, J. et al. The ethics of AI in health care: a mapping review. Soc. Sci. Med. 260, 113172 (2020).
European Society of Medicine. Bridging global health AI divide with local wisdom. https://esmed.org/bridging-global-health-ai-divide-with-local-wisdom/ (2025).
Monteith, S. et al. Artificial intelligence and deskilling in medicine. Br. J. Psychiatry https://doi.org/10.1192/bjp.2025.10496 (2026).
Wen, X. & Thamotharampillai, T. When is it ethically defensible for a medical practitioner to deviate from clinical practice guidelines? Ann. Acad. Med. Singap. https://doi.org/10.47102/annals-acadmedsg.2025189 (2025).
US Food and Drug Administration. Software as a medical device (SaMD) https://www.fda.gov/medical-devices/digital-health-center-excellence/software-medical-device-samd (2025).
Ke, Y. et al. Mitigating cognitive biases in clinical decision-making through multi-agent conversations using large language models: simulation study. J. Med. Internet Res. 26, e59439 (2024).
Ten Cate, O. Competency-based postgraduate medical education: past, present and future. GMS J. Med. Educ. 34, Doc69 (2017).
Shah, N., Desai, C., Jorwekar, G., Badyal, D. & Singh, T. Competency-based medical education: an overview and application in pharmacology. Indian J. Pharmacol. 48, S5–S9 (2016).
