In May 2022 I became the first person to graduate from Duke University with an MD and a PhD in Computer Science. The process took eight years: three years of medical school and five years of graduate school. One of the most common questions I heard from other students and occasionally professors was “Why are you doing medical school and a Computer Science PhD?” Well, here is a post answering that question! I’ve formatted it as an FAQ based on many conversations I’ve had on the topic. If you’re interested in medicine and computer science, I hope you will consider this training pathway!

Why did you decide to do an MD/PhD? Did you always know you wanted to do an MD/PhD? Did you always like computer science?

The short answer: I decided to do an MD/PhD because by the end of college I wanted both to be a doctor and to do research. I didn’t learn about MD/PhD programs until I was a senior in college, and I became interested in computer science during college.

The long answer:

When I was a toddler, the first time someone asked me, “What do you want to be when you grow up?” I apparently answered “truck driver” because I loved trucks (especially cement mixers). In elementary school I wanted to be a writer. In high school I wanted to be a marine biologist or a doctor. By the time I got to my freshman year of college, I wanted to be a veterinarian, and signed up for the biology major. The summer after freshman year, I volunteered at a veterinary clinic and ended up deciding that for various reasons veterinary medicine was not for me (although I do love animals).

During my freshman year at Cornell I got involved in research because I thought it would be fun to do science. My first-ever research position was with Dr. John Hermanson in the Cornell vet school. Dr. Hermanson taught me how to dissect bats, and I wrote about which muscles of the bat larynx could have first contributed to the evolution of echolocation. (Dr. Hermanson also had the coolest office/lab ever, full of animal skeletons and papers, including some Tupperwares with animal skeletons actively being cleaned by beetles.) I will always be grateful for Dr. Hermanson’s mentorship throughout college.

During my middle year of college, I became fascinated with genetics and joined the laboratory of Dr. Alan Nixon. Per unwritten tradition in many wet labs, my first task as a research assistant was washing glassware. I remember standing in front of a deep metal sink for hours, listening to music on my iPod and rinsing each tissue culture bottle three times to make sure I removed every trace of soap. I was then given a project to optimize gene transfer into equine mesenchymal stem cells, with the ultimate goal of using genetically engineered cells to treat orthopedic injuries in horses. Horses, especially racehorses, often need orthopedic treatment, and also are a great model system for humans since their bones are of comparable size. I ultimately increased the gene transfer efficiency (specifically, non-viral transfection efficiency) from 10% to 54%, and completed an undergraduate honors thesis. There were parts of research I liked very much (thinking about problems) and parts I didn’t like very much (pipetting for ten hours a day into 96-well plates).

During my first semester of senior year I decided I was taking a gap year after graduation because, honestly, I was burned out. I graduated from Cornell in three years, which meant that every semester was intense. I knew I wanted to have a medical career but had ruled out veterinary medicine, so I decided to apply to human medical schools during my gap year. Because of the timing of my decision I was late signing up for the MCAT. The only date I could take it was the middle of January, and the only location available was out of state. I am not a car person but I still remember the bright blue 2013 Ford SUV that I rented to drive the 200-odd miles from Cornell to the testing center during a blizzard. At one point I finished an achingly slow drive across miles of snow-covered backcountry roads only to reach a steep hill, and I thought, “I am literally going to die in a car crash on the way to a standardized test.” I didn’t have time to turn back, so I crept down that winding road as slowly as possible, and thankfully the four-wheel drive kept me from slipping. I did some last-minute studying with flashcards strewn all over a Burger King table, slept in a hotel, and then drove to the testing center. When I arrived at the GPS address for the exam, I found myself next to a nondescript strip mall with no signs on any of the stores, brown paper covering all the windows, and weeds growing up out of the asphalt. I panicked, thinking that the place was abandoned and my GPS had misled me, but after pacing around I saw a tiny sticker in the lower corner of one of the store windows indicating it was a testing center. So I pushed open the brown-paper-covered door and found myself in a Prometric facility.

Also during my senior year, I discovered that I like computer science. This was a weird realization for me because I grew up in Redmond, Washington, home to Microsoft, and I had been determined for years to avoid CS because “CS is what everybody does.” In the fall of my senior year, after years of my dad insisting that I try CS, I finally took CS 101. I was absolutely shocked that I really liked it.

At that point, by sheer luck, I stumbled across an info page for MD/PhD programs. I had never heard of an MD/PhD program before but it sounded perfect, because I wouldn’t have to choose between medicine, computer science, and research. I decided that I wanted to do medical school and a computational PhD. I imagined that a computational PhD would preserve the parts of research I liked (thinking about a problem, reading the literature, taking steps to solve a problem) while replacing the parts I didn’t like (pipetting) with something that I did like (programming). In spring 2013 I graduated from Cornell with a B.A. in Biological Sciences and a concentration in computational biology.

I applied for MD/PhD programs during my gap year, while I worked full-time as a computational biology research assistant at the University of Pennsylvania with Dr. Muredach Reilly and Dr. Mingyao Li. My project focused on predicting long noncoding RNAs involved in cardiovascular disease, and working on it allowed me to verify that I did indeed like computational research. I also got to write my first-ever scientific paper! You can see it here. You’ll note from the publication date that it actually ended up getting published nearly two years after I “finished” the project, which was my first lesson in timelines of scientific publishing.

During my gap year I applied to every Medical Scientist Training Program (“MSTP”; an NIH-funded MD/PhD) that offered a computational option for a PhD – computer science, computational biology, mathematical biology, or biomedical engineering. Applying to an MSTP involves completing a medical school application and various university-specific supplemental applications, but once you are accepted, you are part of a unified program that includes both MD and PhD degrees. MSTPs are administratively housed within medical schools and at the time I applied, the vast majority of MSTPs forbade computational PhDs, so I only applied to the 17 programs that would allow me to do a PhD in an area of my interest. I believe that this policy has been relaxed/removed at many universities so now there are more MD/PhD programs out there that allow computational PhDs. I decided to join the Duke MD/PhD program and then spent the next 8 years at Duke!

What was the application process like for the MD/PhD program?

You apply to MD/PhD programs through the medical school application process which starts multiple months earlier than the graduate school application process. A typical timeline is about 1.5 years from the start of the MD/PhD application process to matriculation. I started applying for MD/PhD programs in spring 2013 and I matriculated in fall 2014.

There is a primary medical school application which includes grades, essays, letters of recommendation, and descriptions of your activities. If you are applying for MD/PhD programs you must write extra essays in the primary application (“why do you want a PhD,” “why do you want to be an MD/PhD,” and so on). The primary application is mostly the same for most medical schools – you tick off which schools you want to apply for and pay their fees.

The secondary applications are school-specific. Some schools have only an MD secondary. Some schools have an MD secondary and an MD/PhD secondary. Others have an MD secondary, MD/PhD secondary, and PhD secondary. The secondaries contain a lot of essays. It is good to receive secondary applications because it means you are still being considered for acceptance. There is often topical overlap between the essays from different universities so I found it helpful to sort the essay prompts of all the secondaries by topic, write one essay for each topic, and then change the essay’s length to match the word limits of the various universities. One popular secondary application prompt the year I applied was “tell us about your greatest challenge.”

After the secondaries there are interviews. Most interviews take place over two days: an MD day which is the regular MD interview day, and an MD/PhD day in which you have several interviews with faculty members. With your secondary you submit the names of faculty you would like to interview with, and the school tries to get you interviews with those faculty. Sometimes you interview with the people you requested and other times you don’t. These faculty interviews are focused on research.

Do you have to choose a PhD research advisor when you apply for MD/PhD programs?

No. You choose your PhD research advisor after you finish the first two years of medical school. I chose my research advisor, Dr. Lawrence Carin, by the end of my 3rd year in the MD/PhD program (i.e. by the end of my first year of graduate school). This is quite different from most PhD-only applications, where you choose your PhD research advisor during the initial application process.

What is the structure of an MD/PhD program? How do you complete both degrees?

In most MD/PhD programs, you do 2 years of MD lecture + x years of PhD + 2 years of MD clinical clerkships.

Duke is unique in that it compresses the 2 years of lecture into 11 months and then you jump right into 13 months of clerkships, so the structure for the Duke MD/PhD program is 1 year of MD lecture + 1 year of MD clerkships + x years of PhD + 1 year of MD clerkships.

A natural question is why medical school gets split in half by the PhD. I think there are two good reasons:

  • By completing part of medical school before your PhD, you have more medical insights throughout your PhD, which can affect what kinds of research projects you pursue and how you approach them.
  • By alternating between the two degrees, it’s more likely that you will finish both. If the degrees were completed in sequence, there would be a strong temptation to leave after completing one of the degrees.

What are the requirements for an MD?

Every medical school has different requirements. Medical school requirements also change at a faster rate than PhD requirements, because MD courses and clerkships are always being updated as the field of medicine advances. During my time at Duke, the MD curriculum changed quite a bit between my first year in the program and my last year. The current requirements for the Duke School of Medicine can be found here. When I completed my MD, the requirements were:

  • First year: a predefined lecture sequence. It lasted 11 months and grades were pass/fail. The courses included Human Structure and Function, Brain and Behavior, and Body and Disease. Topics covered included molecular biology, cell biology, histology, microbiology, anatomy, pharmacology, neuroscience, physiology, and diseases of all organ systems.
  • Second year: 13 months of clinical rotations: Medicine, Surgery, Pediatrics, Obstetrics and Gynecology, Family Medicine, Radiology, Neurology, Psychiatry, and one month of electives (2 electives, 2 weeks per elective.) This year felt more like employment than “school” (although we still had to study and take tests.) On a clerkship, you are assigned your schedule and you show up in the hospital/clinic during your required hours, to interview patients, do physical exams, watch surgeries, write notes, draw blood, etc.
  • Third year: research (a relatively unique aspect of Duke). For MD/PhD students, the third year is the first year of their PhD.
  • Fourth year: at least 7 months of clinical rotations, including multiple clinical electives, plus a capstone course.

What are the requirements for a CS PhD?

Every PhD program is different, but in general, the requirements for a CS PhD will be some combination of graduate-level CS classes (or exams), a qualifying exam (written and/or oral), and a thesis defense (including a presentation and a written dissertation). The detailed requirements of the Duke Computer Science PhD program can be found here. Generally PhD requirements at a particular institution don’t change very much over time, but they can change. When I did my PhD, these were Duke’s requirements:

Eight graduate classes:

  • Four graduate computer science qualification (“quals”) classes, each completed with a B+ or better by the end of the second year. If you’d prefer not to take a quals class you can instead take an exam at the beginning of the semester, and if your score is high enough the exam will count as the quals class.
    • Choose one from: [Algorithms, Computational Complexity]
    • Choose one from: [Architecture, Operating Systems]
    • Choose one from: [Artificial Intelligence, Numerical Analysis]
    • Fourth qual is an elective. Can be any of the above, or from a longer list of approved quals electives
  • Two computer science graduate course electives, B- grade or higher
  • Two graduate electives, B- grade or higher

TA-ing at least one semester.

Four research presentations to a committee of professors:

  • Research Initiation Project Proposal (end of 1st year)
  • Research Initiation Project Defense (end of 2nd year)
  • Preliminary Exam (end of 3rd year)
  • Thesis Defense (end of PhD)

Why did you decide to do your PhD in Computer Science?

My original plan upon matriculating at Duke was to do a PhD in Computational Biology since that was my undergrad concentration. However, when I did a side-by-side comparison of the Computational Biology PhD requirements and the Computer Science PhD requirements, I realized that they were approximately equivalent in terms of total number of classes, but the computer science requirements aligned better with the topics I wanted to learn more about during my PhD. Also, several of the research labs I was interested in joining for my PhD were part of the Computer Science department. So I decided to pursue a Computer Science PhD.

I am a student getting a degree in computer science. I want to spend my entire career focused on medical applications. Should I get an MD for background knowledge even if I don’t want to do a medical residency or be a practicing physician?

I’d recommend it! If you are determined to work on medical problems, then an MD will be incredibly valuable and the time it will take to get one will be completely worth it. My medical background has radically changed my perspective on what kinds of problems I want to work on. I’ve seen many papers where the authors propose some machine learning model to solve a medical problem and the model’s performance is great – but the “medical problem” they are solving isn’t a real problem at all from the perspective of anyone in the medical field. This is fine for the researchers if they only care about the model, and the “medical application” is just icing on the cake. This is not fine if the researchers actually wanted to impact medicine.

What is it like to be an MD student?

Medical school is highly structured. You are assigned classes, exams, labs, clinical rotations, homework, and teams. If you complete all of the assigned tasks, you will pass. A lot of medical school is about showing up, having a good attitude, and studying what you are required to study. There is a lot of memorization. It is not possible to “reason out all of pharmacology” from first principles the way you might derive a formula in math; instead you just have to rote memorize the names of the drugs and what they do. It feels a lot like studying a foreign language. It was great that first year at Duke was pass/fail because that provided more flexibility – it allowed each student to spend more time studying the subjects that were more interesting to them, and less time on the subjects that were less interesting to them.

The hardest part of medical school for me was not academic, it was emotional. It is emotionally difficult to witness people suffering and dying. That sounds obvious written down, but a lot of portrayals of doctors in the media don’t show the part where the doctor runs to the bathroom after a terrible patient outcome and cries in a locked stall. Every single person who goes to medical school will be emotionally affected by bad patient outcomes.

The best part of medical school for me was when I formed connections with patients and was able to meaningfully participate in their care. I also enjoyed the academic side of medicine, learning about the many thousands of human diseases and their treatments.

What is it like to be a PhD student?

Graduate school is EXTREMELY different from medical school. Graduate school felt so much more flexible to me than medical school. Within the CS PhD requirements, I got to choose what classes to take, what research group/laboratory to join, to some extent what projects I worked on, how I worked on those projects, what hours I worked, what days of the week I worked, to some extent who I worked with, and so on. I even had a flexible location, and so I ended up working from home frequently because it was easier for me to concentrate in complete silence.

For me the hardest part of graduate school was the uncertainty and the delayed gratification. Successful completion of a PhD program depends on conducting research, which is inherently unpredictable and will ALWAYS have more dead ends than publishable results. A PhD is one very long stretch of delayed gratification, because you don’t get your degree until the very end, and nobody is going to “round up” and award you a PhD for “75% of a PhD.” And if you’ve done 2 years of medical school beforehand, that means it’ll be about 7 years in school before you get either of your terminal degrees. It does start to feel LONG.

A PhD is a mind game. You start a new research project – will you get good results? You get stuck in a rut in a research project – will you ever get out of the rut? You sign up for a class in a subject you’re not too familiar with – will you do well? You get good results on your research project – will your advisor/colleagues expect you to do 500 more experiments before you’re “done”? You write a paper – will it get published? And on and on.

The third year of a PhD is notoriously rough. If you ask any third-year PhD student how they think it’s going, and if they answer honestly, the answer is usually “horrible” (with an unspoken undercurrent of “I think I’m never going to graduate”). Usually by third year you have spent enough time in the program to feel tired but not enough time working on your challenging problem to have the publications you want. In my third year, I didn’t have any papers yet.

There is a big difference between the kinds of research projects that undergraduate students get assigned and the kinds of research projects that graduate students get assigned. Undergraduate research projects are generally carefully crafted to be accomplishable by undergraduate students in a relatively short amount of time. They are generally low-risk and well-defined. So, that way, as the undergrad, you get a taste of research success. Graduate student projects are typically larger-scale, more ambiguous, and higher risk, meaning you will fail and fail and fail until finally, some speck of something looks interesting, and you pursue it and then boom, you have publishable results. But it takes a lot of wandering in the dark before you find your way out of the cave.

My favorite parts of graduate school were studying a subject in depth, and the schedule/location flexibility. I liked that I could learn a lot and for the most part learn it on my own terms. I’ve heard the saying “in medical school you learn nothing about everything and in graduate school you learn everything about nothing” and it’s basically true – medical school is concerned with breadth of knowledge, and graduate school is concerned with depth of knowledge. I enjoyed acquiring “deep knowledge” in my chosen topics during graduate school.

What did you work on during your PhD?

My PhD advisor was Dr. Lawrence Carin. I will always be grateful to Dr. Carin for taking a chance on me and letting me join his research group even though I was new to the field of machine learning. My main research project focused on machine learning methods development for automatic interpretation of chest CT scans. Chest CT scans are huge grayscale volumetric images of the chest, and show the heart, lungs, and other structures. I developed the RAD-ChestCT dataset, which at the time of publication was the largest volumetric medical imaging dataset in the world. Dataset development included creation of an end-to-end Python pipeline for transforming DICOM images into 3D numpy arrays. I designed SARLE, a natural language processing framework to automatically extract presence/absence abnormality labels from free-text radiology reports, so that these labels could be used to train a volume classifier. I then created CT-Net, a convolutional neural network that was the first machine learning model to predict multiple diverse abnormalities within a volumetric medical image. I became interested in explainable AI and created AxialNet, an explainable CNN for detecting CT abnormalities. During my study of gradient-based neural network explanation methods, I discovered a problem with Grad-CAM, which is a popular AI explanation method that has been cited over 10,000 times. To solve this problem, I developed HiResCAM, a new AI explanation method for CNNs that can reliably highlight which regions of an image were used to make a particular prediction. Papers related to my main thesis project are available here:

My dissertation is also focused on machine learning for CT scans: Towards Fully Automated Interpretation of Volumetric Medical Images with Deep Learning
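
To give a flavor of what “extracting presence/absence abnormality labels from free-text radiology reports” can look like, here is a toy Python sketch. It is not SARLE itself and should not be read as how SARLE works internally: the vocabulary (`ABNORMALITY_TERMS`), the negation cues (`NEGATION_PATTERN`), and the function names are all hypothetical, made up for illustration. The sketch assumes a simple rule: a label is marked present if one of its phrases appears in a sentence without a negation cue earlier in that same sentence.

```python
import re

# Hypothetical vocabulary: label name -> phrases that signal that abnormality.
ABNORMALITY_TERMS = {
    "nodule": ["nodule", "nodules"],
    "pleural_effusion": ["pleural effusion", "pleural effusions"],
    "cardiomegaly": ["cardiomegaly", "enlarged heart"],
}

# Hypothetical negation cues; a real system would need a much richer list.
NEGATION_PATTERN = re.compile(r"\b(?:no evidence of|negative for|without|no)\b")


def split_sentences(report):
    """Lowercase the report and split it into rough sentences."""
    pieces = re.split(r"[.;\n]+", report.lower())
    return [piece.strip() for piece in pieces if piece.strip()]


def extract_labels(report):
    """Return {label: 1 or 0}, where 1 means 'mentioned and not negated'."""
    labels = {name: 0 for name in ABNORMALITY_TERMS}
    for sentence in split_sentences(report):
        for name, phrases in ABNORMALITY_TERMS.items():
            for phrase in phrases:
                match = re.search(r"\b" + re.escape(phrase) + r"\b", sentence)
                if match is None:
                    continue
                # Assumption: a negation cue anywhere before the phrase,
                # within the same sentence, negates the phrase.
                if not NEGATION_PATTERN.search(sentence[: match.start()]):
                    labels[name] = 1
    return labels


example_report = (
    "No pleural effusion. A 5 mm nodule is present in the right upper lobe. "
    "Heart size is normal."
)
example_labels = extract_labels(example_report)
print(example_labels)
```

A sketch this simple has obvious blind spots (for example, “no effusion but a nodule is present” would wrongly negate the nodule, and a measurement like “1.5 cm” would be split at the decimal point), which is part of why label extraction from real reports takes careful design. The resulting 0/1 labels are the kind of output that could then be paired with the corresponding CT volumes to train a classifier.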

I also worked on several other research projects during my PhD:

With Dr. Cynthia Rudin: Playing Codenames with Language Graphs and Word Embeddings; Metaphor detection using contextual word embeddings from transformers; A transformer approach to contextual sarcasm detection in twitter

With Dr. Andrew Landstrom: GENESIS: Gene-Specific Machine Learning Models for Variants of Uncertain Significance Found in Catecholaminergic Polymorphic Ventricular Tachycardia and Long QT Syndrome-Associated Genes

With Dr. Michael Cary: Machine Learning Algorithms to Predict Mortality and Allocate Palliative Care for Older Patients With Hip Fracture

What activities did you do during the MD/PhD program?

I sang in the med school a cappella group Major Groove for three years, had a lead role in the 2015 Duke Med Student Faculty Show, “Into the Wards,” took jazz piano lessons from Ed Paolantonio, cycled, walked my dog, wrote (this blog as well as creative writing), gave lectures at Duke on AI/ML and medicine, mentored over 15 undergraduate and Master’s students on research projects, taught at Duke SPLASH, TA’d Professor Rudin’s Data Science Competition course for 2 semesters, and founded my company, Cydoc.

What are you doing now that you’ve graduated?

I am working full time as the CEO of Cydoc, a health tech startup that I founded in 2018. At Cydoc, we’ve created an AI system that interviews a patient through an interactive intake form before their appointment. Then, our software generates a medical note, enabling a streamlined appointment and faster note writing. If you’re a student interested in a software engineering internship, or if you’re a physician interested in a free trial of Cydoc in your practice, feel free to send me a message! I’m extremely excited about the potential for carefully designed AI systems and human-centric user interface design to improve patient care. It’s been a privilege to work on Cydoc and I’m excited to see where the next few years lead!

Summary

Pursuing a medical degree and a PhD in Computer Science was an adventure, and I will always be grateful for the opportunity to study both fields in depth. I hope that this post has been helpful for anyone considering a similar career path. If you’d like to talk with me about the MD/CS PhD road, feel free to reach out!