Stravoris Medical MCQ - Preclinical Bundle (Second Edition)
The foundational basic sciences: 64,398 curated multiple-choice questions across 7 preclinical subjects, mapped to a structured curriculum and ready to use out of the box.
Subjects: Biochemistry, Anatomy, Cell Biology & Histology, Physiology, Pathology, Microbiology, Pharmacology
7 subjects, 64,398 questions, 396 topics
Formats
Every bundle ships in three formats:
- Parquet - for fast loading into data pipelines
- JSONL - one question per line, ready for instruction-tuning workflows
- Excel - a human-readable view with each question, its options, the correct answer, and explanations laid out for browsing
Each question carries a stable ID, its subject and syllabus topic, four options, the correct answer, two explanations (a reference explanation and a training explanation), and a Bloom's taxonomy level. Every subject uses the same schema.
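As a rough illustration of working with the JSONL format, here is a minimal Python sketch that checks each record for the fields described above and turns it into a prompt/completion pair. The key names (`id`, `question`, `subject`, `topic`, `options`, `answer`, `reference_explanation`, `training_explanation`, `bloom_level`) are assumptions made for this example, as is the idea that the answer is stored as the text of the correct option. Check the free samples for the actual keys before relying on it.

```python
import json

# Hypothetical key names: the listing describes these fields but does not
# state the actual keys used in the files.
REQUIRED_FIELDS = {
    "id", "question", "subject", "topic", "options", "answer",
    "reference_explanation", "training_explanation", "bloom_level",
}


def iter_jsonl(lines):
    """Yield one parsed record per non-empty line (one question per line)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    missing = sorted(REQUIRED_FIELDS - set(record))
    problems = [f"missing field: {name}" for name in missing]
    options = record.get("options")
    if not isinstance(options, list) or len(options) != 4:
        problems.append("expected exactly four options")
    elif record.get("answer") not in options:
        # Assumes the answer is stored as option text, not as an index.
        problems.append("answer is not one of the options")
    return problems


def to_prompt(record):
    """Format a validated record as a lettered prompt and a one-letter completion."""
    letters = "ABCD"
    lines = [record["question"]]
    lines += [f"{letter}. {text}" for letter, text in zip(letters, record["options"])]
    answer_letter = letters[record["options"].index(record["answer"])]
    return {"prompt": "\n".join(lines), "completion": answer_letter}
```

To use it, open the JSONL file and pass the file object to `iter_jsonl`, skip or log any record for which `validate_record` returns problems, and feed the rest through `to_prompt`. The prompt layout is only one possible choice; the two explanation fields could be appended to the completion if the training setup calls for rationales.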
Quality
Every question - original and synthetic - is curated against 24 published item-writing standards (NBME, Haladyna-Downing), mapped to a structured topic taxonomy, and tagged with a Bloom's level.
Don't take our word for it. Free samples of every subject are available on Hugging Face - browse the questions before you buy:
- Machine-readable: stravoris/medical-mcq-dataset
- Human-readable: stravoris/medical-mcq-dataset-readable
Updates
Your purchase includes three years of updates, counted from the purchase date. Updates cover improvements to the dataset and any new subjects added to this bundle, and are delivered to the email address used at purchase.
Sources & attribution
This dataset is derived in part from MedMCQA (Pal, Umapathi, & Sankarasubbu, 2022), released under the Apache License 2.0. We gratefully acknowledge the MedMCQA authors.
License
This dataset is a curated, proprietary work. The commercial license permits AI/ML model training, internal R&D, and distribution of trained model weights. Redistribution of the dataset itself is not permitted; the Enterprise tier expands the grant for embedding the dataset in redistributed products. See stravoris.com/license for full terms.
Other options
Individual subjects are available on request - contact sales@stravoris.com.
Need the complete corpus (all 20 subjects) or a redistribution license? Enterprise licensing is available - contact sales@stravoris.com.