I slide back into the MRI machine, adjust the mirror above the lacrosse helmet-like setup holding my skull steady so that I can see the screen positioned behind my head, then I resume my resting position: video game button pad and emergency abort squeeze ball in my hands, placed crosswise across the breast bone like a mummy.
My brain scan and the results of this MRI battery, if they were not a demo, would eventually be fed into a machine learning algorithm. A team of scientists and researchers would use it to help potentially discover how human beings respond to social situations. They want to compare healthy people’s brains to those of people with mental health disorders. That information might help make correct diagnoses for mental health disorders and even find the underlying physical causes. But the ultimate goal is to find the most effective intervention for any given mental health disorder.
The idea is simple: use an algorithm to tease out actionable insights, putting data to feelings.
Mental health disorders haunt a sizable portion of humanity at any given time. According to the World Health Organization, depression alone afflicts roughly 300 million people around the globe, making it one of the main causes of disability in the world. The organization estimates bipolar disorder is present in roughly 60 million people, schizophrenia in 23 million.
The question is whether the current model is a viable answer. Are we diagnosing the best way? Right now, diagnosis is based on the display of symptoms categorized into mental health disorders by professionals and collected in the Diagnostic and Statistical Manual of Mental Disorders (the DSM), which is now on its fifth iteration. Can the machine learning approach provide a better answer?
First up is the structural MRI, essentially a soft tissue X-ray. The extremely noisy scan takes five minutes. Next: the functional MRI, which will actually show my brain, well, functioning. The fMRI needs my brain to perform a task, and so I play a game.
My scans, if I were a real subject, would go in the mental health disorder category: borderline personality disorder. In fact, I had a pretty bad borderline episode the night before and morning of my scan, so this chance to look inside felt well timed, like getting hit by an ambulance.
For the Virginia Tech team looking at my brain, computational psychiatry had already teased out new insights while they were working on a study published in Science in 2008. During the study, they found that my fellow borderliners seem to care more about reciprocity — I help you, you help me — than neurotypical people, the opposite of the team’s initial hypothesis. For what it’s worth, this supports my own experience; it is a personal failing that I tend to view friendships too transactionally, often with maddening currencies like “caring.”
After 15 minutes or so of playing the game, I slide from my sarcophagus. My brain has been imaged. I look at it on the computer screen, rendered in grayscale.
I’ve seen the enemy.
The Fralin Biomedical Research Institute at Virginia Tech Carilion, home to the Human Neuroimaging Laboratory, is in downtown Roanoke. The HNL is host to a fast-growing field, computational psychiatry, that applies the tools of computer science to psychiatry. The hope is that machine learning will lead to a more data-driven understanding of mental illness.
This science was not possible until very recently. The algorithms Tech uses are decades old, and the fMRI they are combined with was invented in 1990. But the computing power required to make them useful is only now available, as is a newer willingness to combine scientific disciplines in novel ways for novel problems.
Psychiatry is seeking to measure the mind, which is not quite the same thing as the brain. So it relies on having people quantify how they feel. Clinical diagnostic surveys are quite accurate overall, but they are still prone to some inaccuracies. What one person considers a 3 on a 1-to-10 sadness scale, for example, could be another person’s 7 and yet another’s 10, and none of them are wrong. The language for accurately measuring pain just isn’t consistent.
Mental health disorders are also amorphous things, with overlapping symptoms among different diagnoses. But by combining the neuroimaging of the fMRI with a trove of data, a machine learning algorithm may be able to learn how to diagnose disorders with speed and accuracy. Researchers hope to discover physical symptoms of mental disorders and track within the body the effectiveness of various interventions.
My first day at Fralin, I’m met in the spacious lobby by research coordinators Doug Chan and Whitney Allen, as well as Mark Orloff, a translational biology, medicine, and health doctoral student. We arrive at the Human Neuroimaging Laboratory past security card doors and a lobby, which, like any other medical lobby, has a pile of magazines on the waiting room table.
Past the lobby are doctors’ individual offices. Other members of the lab work out of a large bullpen, desks and computers and succulents. The MRI machines are further down the hall. On the other side of the window and door separating us from the machines, Orloff picks up a tiny model of a brain the color of Fun-Tak — a 3D-printed representation, he says, of his own brain. It’s about as large as a well-fed adult hamster.
“Life size,” jokes Allen.
Nearby, there are survey rooms, complete with police interrogation-style one-way mirrors and microphones so the researchers can watch patients be clinically interviewed. There are rooms where players can compete in social games with other players online to help gather more data from subjects around the world.
Surrounding the researchers are the tools key to their work. In the bullpen, the conference room, and on whiteboards, windows, and walls are mathematical formulas in every color of the marker rainbow. Math as wallpaper, as background radiation.
Pearl Chiu has jet black hair and a bearing of quiet confidence. She pauses to think before she speaks, and radiates a teacher’s joy in discussing her work. She’s the only clinically trained psychologist in the lab who has direct experience with patients in a clinical setting, and she arrived at machine learning from a distinctly human place. “As I was seeing, working with, patients, I was just frustrated with how little we knew about what is happening,” Chiu says. She believes bringing in machines to detect patterns may be a solution.
One thing is clear to Chiu: “What we have now just isn’t working.”
Survey responses, functional and structural MRIs, behavioral data, speech data from interviews, and psychological assessments are all fed into the machine learning algorithm. Soon, saliva and blood samples will be added as well. Chiu’s lab hopes to pluck the diagnostic signal from this noise.
The fMRI scans provide the algorithm with neurological information, allowing the machine to learn what parts of the brain are lighting up for certain stimuli, building a comparison for healthy controls. The algorithm can find new patterns in our social behaviors, or see where and when a certain therapeutic intervention is effective, perhaps providing a template for preventative mental health treatment through exercises one can do to rewire the brain. Unfortunately, fMRI — like any tool — has its faults: it can give false positives. The most egregious example was a scan of a dead salmon that… showed brain activity.
A person coming into the lab will first take their clinical survey, before completing tasks — like playing behavioral games — in and out of the MRI. Their genetic info is gathered. Once all the data has been taken, it’s fed into the algorithms, which spit out a result. Quick and dirty results are available within minutes — more detailed results could take weeks. Strong models also make for faster data-crunching. A subject whose clinical interview points to depression, for example, will be processed more quickly if the researchers use a depression model.
Chiu wants to use these scans to help patients get better treatment. Perhaps, she says, this method can identify patterns that clinicians don’t notice or can’t access through the brain alone. By making mental health disorders more physical, Chiu hopes to help destigmatize them as well. If it can be diagnosed as objectively and corporeally as heart disease, would depression or bipolar disorder or schizophrenia carry the same shame?
With those patterns in hand, Chiu imagines the ability to diagnose more acutely, say, a certain kind of depression, one that regularly manifests itself in a specific portion of the brain. She imagines the ability to use the data to know that one person’s specific type of depression regularly responds well to therapy, while another is better treated with medicine.
Currently, the lab focuses on “disorders of motivation,” as Chiu calls them: depression and addiction. The algorithms are developing diagnostic and therapeutic models that the researchers hope will have a direct application in patients’ lives. “How do we take these kinds of things back into the clinic?” Chiu asks.
Machine learning is crucial to getting Chiu’s work out of the lab and to the patients they are meant to help. “We have too much data, and we haven’t been able to find these patterns” without the algorithms, Chiu says. Humans can’t sort through this much data — but computers can.
As in Chiu’s lab, the machine learning algorithms — specifically algorithms that learn by trial and error — are crucial for helping Brooks King-Casas, associate professor at the Fralin Biomedical Research Institute at VTC, figure out which combination matters out of the thousands and thousands of variables his lab is measuring.
King-Casas looks celestial, his dark hair dusted with silver and his glasses the color of the deep night sky, and when he speaks, he uses his hands as punctuation marks. In a big-picture sense, King-Casas’ lab is focused on social behaviors. They are studying the patterns, nuances, feelings, and engaged brain regions of interpersonal interaction. The lab has a particular interest in the differences in those patterns (and nuances, feelings, and engaged brain regions) between people with mental health disorders and those without. Between someone clinically healthy and someone with, say, borderline personality disorder, for whom social relationships are spider traps.
Someone like me.
“I’m interested in dissecting how people make decisions, and the ways in which that varies across different psychiatric disorders,” King-Casas says.
The lab is building quantitative models which parse the components of the decision-making process, hopefully pinpointing where that process goes awry. By atomizing interaction, King-Casas hopes to put numbers to feelings, studying social behavior as we would cellular behavior. The data could potentially tell us how someone with borderline personality disorder values the world, versus someone unafflicted.
“We need these reinforcement learning algorithms to take a hundred choices that you make, and parse them into three numbers that capture all of that,” King-Casas says. Without the algorithms, he says, such a distillation is not even possible. Even in something as simple as a two-choice task, the lab has as many as ten models that could explain how choices are being made.
“Think about the brain as a model,” King-Casas says. “What we do is we take everybody’s behavior and we say ‘okay, which model best captures the choices that you made?’”
What the lab is trying to do is discover the algorithms of the computational brain.
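To make that kind of distillation concrete, here is a minimal sketch in Python of the general technique, not the lab's actual code. It assumes a textbook Q-learning agent with just two parameters (a learning rate, `alpha`, and a choice-consistency term, `beta`; both names are illustrative). It simulates a hundred choices in a two-option game, recovers the parameters with a coarse grid search, and then asks whether that model explains the choices better than a coin flip, using the Bayesian information criterion. The reward probabilities, the grid, and the models being compared are all assumptions chosen for illustration.

```python
import math
import random


def choice_probability(q, beta):
    """Probability of picking option 1 (softmax over two options)."""
    return 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))


def simulate_choices(alpha, beta, n_trials=100, p_reward=(0.2, 0.8), seed=0):
    """Simulate a hypothetical two-choice task played by a Q-learning agent."""
    rng = random.Random(seed)
    q = [0.0, 0.0]  # the agent's running estimate of each option's value
    choices, rewards = [], []
    for _ in range(n_trials):
        c = 1 if rng.random() < choice_probability(q, beta) else 0
        r = 1.0 if rng.random() < p_reward[c] else 0.0
        q[c] += alpha * (r - q[c])  # learn from the prediction error
        choices.append(c)
        rewards.append(r)
    return choices, rewards


def neg_log_likelihood(alpha, beta, choices, rewards):
    """How poorly a given (alpha, beta) explains the observed choices."""
    q = [0.0, 0.0]
    nll = 0.0
    for c, r in zip(choices, rewards):
        p1 = choice_probability(q, beta)
        p = p1 if c == 1 else 1.0 - p1
        nll -= math.log(max(p, 1e-12))
        q[c] += alpha * (r - q[c])
    return nll


def fit(choices, rewards):
    """Coarse grid search; returns (best_nll, alpha, beta)."""
    best = None
    for i in range(1, 20):
        alpha = i / 20
        for j in range(1, 21):
            beta = j / 2
            nll = neg_log_likelihood(alpha, beta, choices, rewards)
            if best is None or nll < best[0]:
                best = (nll, alpha, beta)
    return best


def bic(nll, n_params, n_trials):
    """Bayesian information criterion: lower means a better trade-off."""
    return 2 * nll + n_params * math.log(n_trials)


choices, rewards = simulate_choices(alpha=0.3, beta=4.0)
nll_rl, alpha_hat, beta_hat = fit(choices, rewards)
nll_random = len(choices) * math.log(2)  # a model that flips a fair coin

bic_rl = bic(nll_rl, 2, len(choices))
bic_random = bic(nll_random, 0, len(choices))
print("recovered alpha, beta:", alpha_hat, beta_hat)
print("BIC (learning model):", bic_rl, " BIC (coin flip):", bic_random)
```

A hundred button presses go in and two numbers come out. Repeating the fit with other candidate models and comparing their scores is one simple way to ask which model best captures a person's choices.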
Humans are biased, and that carries over to the algorithms we write, too. It is tempting to believe that algorithms make judgments based on impartial data, but this isn’t true. The data is collected and shaped by people who come with their own biases. And even the tools used to collect that data have shortfalls that can bias the data as well.
A diagnosis found by a machine learning pattern would mean little if the bias is in the programming. Psychiatry, in particular, has a history of gender bias, which continues to this day: being a woman makes you more likely to be prescribed psychotropic drugs, the World Health Organization notes.
Even something as basic as pain is colored by gender. A 2001 study published in The Journal of Law, Medicine & Ethics found that women report more pain, more frequent pain, and longer experiences of pain, yet are treated less aggressively than men. They are met with disbelief and hostility, the report concludes, until they essentially prove they are as sick as a male patient.
Unsurprisingly, race plays a role in medical treatment. There’s the problem of access: whiter, more affluent communities have better resources. But even when black people have equal access to medicine, they tend to be undertreated for pain. A 2016 study by the University of Virginia found that medical students had ridiculous, and potentially dangerous, misconceptions about black people, like that their nerve endings are less sensitive. Inequitable treatment afflicts Latinx, Native American, and Asian and Pacific Islander patients as well.
How can the researchers at VTCRI ensure that their machine is not learning our biases?
“That’s a really, really, really tough question,” Chiu says. In this work, interviewers do not know a subject’s mental health history, or what treatments they may be receiving. The data analyst is blind as well. Basically, everyone involved is “blind to as many things as possible.”
Chiu considers her presence a help as well. The team has a diverse array of students, researchers, and scientific backgrounds. Chiu is acutely aware of what’s at stake: if the diagnostic and custom treatment guidelines her lab’s algorithms discover are infected with the same human biases already at work in society, they will simply codify — and perhaps even strengthen — those biases.
The technical aspects of the machine learning algorithms’ data, such as the visual stimuli used in the functional MRI scans, must be carefully controlled with biases accounted for as well.
Chiu lab research programmer Jacob Lee, speaking over video chat, helped explain the challenge. There are lots of things to consider, including human biases, that can affect the data quality, Lee tells me.
One issue is that the amount of time between the “events of interest” in the fMRI machine must be carefully planned to ensure clean results. Lee explains the challenges: The machine gets a snapshot of the brain every two seconds. But getting the right window of time is crucial. To make sure that the researchers are measuring the response, they have to account for the lag time it takes for the blood to get to the correct part of the brain, which is what the machine is truly measuring. That limits neuroimaging and creates the intervals between the scans.
The triggers themselves must be carefully thought through; different cultures think of certain colors or numbers differently. The stimuli include showing images meant to spur attention and emotion from the International Affective Picture System database or asking subjects to rate risks.
The small number of subjects — sometimes tens of people — in fMRI studies could also be misleading. That’s why the lab is trying to share data to increase the size and diversity of cohorts. (The imaging lab at Tech has scanned over 11,000 hours since they opened, Chiu writes in an email. To help ensure privacy, they do not collect numerical data on subjects.) The Human Neuroimaging Lab currently works and shares data with University College London, Peking University, in the western suburbs of Beijing, and the Baylor College of Medicine. Additionally, they are currently collaborating with researchers at the University of Hawai’i at Hilo.
However, the fMRI scanners are almost all located in developed countries, while most of the world’s population is not. Add in that most of the cohorts being studied are tipped toward population centers and college students — an easily accessible pool of subjects — and the data seems even less indicative of the world.
The fMRI has its problems: for instance, scientists are not truly looking at the brain, according to Science Alert. What they are looking at is a software representation of the brain, divided into units called voxels. A Swedish team led by Anders Eklund at Linköping University decided to test the three most popular statistical software packages for fMRI against a human data set. What they discovered is that the rate of false positives the three produced was higher than expected. The findings, published in the Proceedings of the National Academy of Sciences of the United States of America in June 2016, are cause for caution.
The paper’s initial alarm about invalidating 40,000 fMRI-based research papers was overblown, later corrected to closer to 3,500. Still, as Vox explained, neuroscientists do not believe fMRI is a broken tool — it merely needs continued sharpening. Making scans more accessible and more accurate will be key to a clinical application of the techniques.
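One general way false positives creep into voxel-by-voxel analysis can be shown with a toy simulation. This is a sketch under stated assumptions, not a reconstruction of the Linköping analysis, which concerned cluster-level statistics in real software packages. Here every "voxel" is pure random noise with a known spread, each one gets a simple z-test, and the function name, voxel count, and scan count are all hypothetical. Testing thousands of voxels at the usual 5 percent threshold flags hundreds of them by chance alone, while a Bonferroni correction (dividing the threshold by the number of tests) removes nearly all of them.

```python
import random
from statistics import NormalDist


def count_false_positives(n_voxels=10_000, n_scans=20, alpha=0.05, seed=1):
    """Count noise-only voxels that look 'active' with and without correction."""
    rng = random.Random(seed)
    standard_normal = NormalDist()
    # Two-sided cutoffs: the usual one, and a Bonferroni-corrected one.
    z_uncorrected = standard_normal.inv_cdf(1 - alpha / 2)
    z_corrected = standard_normal.inv_cdf(1 - alpha / (2 * n_voxels))

    uncorrected = corrected = 0
    for _ in range(n_voxels):
        # Pure noise with unit variance: no voxel contains a real signal.
        samples = [rng.gauss(0.0, 1.0) for _ in range(n_scans)]
        z = (sum(samples) / n_scans) * n_scans ** 0.5
        if abs(z) > z_uncorrected:
            uncorrected += 1
        if abs(z) > z_corrected:
            corrected += 1
    return uncorrected, corrected


uncorrected, corrected = count_false_positives()
print("noise voxels flagged without correction:", uncorrected)
print("noise voxels flagged with Bonferroni correction:", corrected)
```

The catch is that stricter thresholds also hide real but faint signals, which is part of why the choice of statistical method matters so much.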
“All of that hardware renovation is super, super valuable,” Adam Chekroud, a scientist whose work in computational psychiatry has been published in influential journals like The Lancet, says in a phone interview. Chekroud has worked in machine intelligence before, using algorithms which proved accurate in predicting the specific antidepressant with the best chance of success. A firm believer that clinical application is the most important part of the field, Chekroud is the founder of, and chief scientist for, Spring Health, which aims to bring the technologies to the patients.
Beyond buggy fMRI, computational psychiatry faces ethical, spiritual, practical, and technological issues. Immediate issues include the huge stores of intensely personal data necessary for the algorithms, which could prove irresistible to hackers. Consent is a question as well: can a depressed person, for example, be considered to be in sound enough mind to consent? If we create models for mental health disorders, are we not also creating a model for normality, which can be used as a cudgel as well as a tool? Who gets to define what “normal” is?
Paul Humphreys, Commonwealth professor of philosophy at the University of Virginia, where he studies the philosophy of science, raises another fascinating concern: Machine learning presents a black box problem similar to the brain itself. We can train an algorithm to recognize a cat by feeding it enough data, but we cannot quite determine yet how it decides what a cat is. This presents a risk of miscommunication between scientists and their machine learning results since scientists have only a partial understanding of what their models are saying. Can we trust that the machine’s definition of a mental illness is close enough to our own?
Further complicating matters is the lack of ground truth in psychiatric data sets, a human-vetted training set with which we can test the machine’s learning.
“You need at least one, truly independent, well-powered verification,” says Steven Hyman in a phone interview. The director of the NIMH from 1996 to 2001, where he pushed for neuroscience and genetics to be incorporated into psychiatry, Hyman is now a core institute member and director of the Stanley Center for Psychiatric Research at the Broad Institute.
A machine learning algorithm that diagnoses, say, skin cancer has a training set of samples which have been biopsied and cataloged, leaving no doubt as to whether they are malignant or not. But there is no biopsy for mental health disorders, at least not yet. “And you’d be surprised by how often people forget that,” Hyman says.
The future of computational psychiatry provides its own problems, problems that seem fantastical now but may threaten the field later. If the real-time brain-scanning capabilities the field is working on do become cheap, easy, and accurate for specific thought patterns and scenarios, one can imagine a world wherein we can basically monitor thoughts, an ability which is ripe for abuse.
Perhaps most concerning of all is the potential for computational psychiatry to join the long, notorious list of sciences used to disenfranchise people. If we can put numbers and biomarkers to feelings, what becomes of the soul? What makes us a human being, instead of a complex organic model?
“It’s showing that there is no ghost in the machine. It’s just a machine,” Chandra Sripada, an associate professor with a joint appointment in philosophy and psychiatry at the University of Michigan, says by phone. Sripada believes the fear is perhaps unfounded. It comes up in other, older branches of psychiatry, including B. F. Skinner’s behaviorism.
“Any comprehensive theory of psychology, there’s a worry that it’s going to take away soul, and the mysterious, and the aspects of who we are that we want to be kind of forever protected from explanation,” Sripada says.
While computational models do offer the possibility of diagnosis and treatment, scientists are walking a tightrope. They are, after all, working with people and don’t want to undermine the patients’ own experiences. People want to be viewed as human beings; their social and environmental factors are crucial. It’s dangerous to ignore those things or to imagine they won’t matter for treatment.
“What you’re calling the soul is sort of an inescapable component of treating many people,” Humphreys, the philosophy professor, says.
Understanding what a mental health disorder even is proves surprisingly difficult. As Gary Greenberg, DSM and pharmaceutical-model psychiatric skeptic, points out to The Atlantic, the term “disorder” was used to specifically avoid the term “disease,” which implies a level of base physiological understanding that is lacking in psychiatry.
“The way we do diagnosis today is really pretty limited,” says Tom Insel, co-founder of Mindstrong Health and director of the National Institute of Mental Health (NIMH) from 2002 to 2015, in a phone interview. “It’s a little bit like trying to diagnose heart disease without using any of the modern instruments, like an EKG, cardiac scans, blood lipids, and everything else.”
The hope is that computational psychiatry can provide the equivalent to those tools. Current understanding of mental health disorders is murky. The common explanation in the public consciousness that some sort of chemical imbalance is to blame — especially in the case of depression — has been left by the wayside in favor of thinking of the brain as operating on circuits. When a problem arises in said circuits, we have a mental health disorder.
The problem with psychiatry, to Insel, is the current lack of biomarkers. Acute clinical observation has led to a taxonomy of afflictions, which he feels is a critical aspect of the field that psychiatry does particularly well, but which, without neurological underpinnings, is simply not enough. “It’s necessary, but not sufficient,” Insel says.
Current NIMH director Joshua Gordon agrees. The NIMH’s push toward more objective measures in the field began under director Steven Hyman’s leadership from 1996 to 2001. It was further propelled by Insel, and it’s now having money poured into it by Gordon, with the goal of providing concrete, objective data to help sharpen diagnoses and better provide treatment. Gordon believes the criticism that the DSM model is meant to steer people toward medicine is incorrect. The best practice is to use any intervention that is effective. That being said, the diagnoses can fall short.
“We have to acknowledge in psychiatry that our current methods of diagnosis — based upon the DSM — our current methods of diagnosis are unsatisfactory anyway,” Gordon says by phone.
Further complicating matters is the diversity of mental health disorders. There is a brain chemical composition that is associated with some depressed people, Greenberg says, but not all who meet the DSM criteria. Add to this the problem that many disorders present as a spectrum — to my more recent borderline personality disorder diagnosis, my psychiatrist also added shades of bipolar. And since the disorders were categorized without a basis in biology, Greenberg points out, one would need to discover a perfect one-to-one relationship between disorders in multiple people presenting in multiple ways that all stem from one issue in the brain to confirm the DSM model.
“That would just be incredible luck,” Greenberg says over the phone.
And what of the environmental factors? Some psychiatric disorders can be caused by external events — deaths, breakups, change in financial status, a big move, stress — which can be alleviated by time and action.
“A disorder like depression is many, many illnesses,” Insel said. “It’s like fever. There’s lots of ways to get a fever. There’s lots of ways to get major depressive disorder. We, today, don’t go beyond just taking someone’s temperature and saying ‘this person has a fever, therefore we need something to bring down the fever.’ So everybody goes on an antidepressant.”
What we are seeing right now is not a model. Whitney Allen, the research coordinator, has taken my place in the un-silent tomb. She’s imagining two different scenarios. One is the steak dinner she’d buy if she were given $50 today. Her teeth rending flesh, the taste of it, the feeling of it between incisors and tongue and gums. The second is the shoes she would get if she were given $100 a year from now. She’s imagining her father handing her the shoe box, the weight of it in her hand. Her focused thoughts are actually moving something, a slider across a screen. She can see it with the little mirror I used, so she knows how well she is thinking about the present and the future. Behind the glass on a computer screen, a storm of blue and red voxels light up like fireworks in her brain, and for a brief flash, every two seconds, the lid of the black box inside our skulls feels slightly opened.
Allen was asked to project her brain into the future, or focus on the immediate present, in an attempt to help find out what goes on under the hood when thinking about instant or delayed gratification, knowledge which could then be used to help rehabilitate people who cannot seem to forgo the instant hit, like addicts. Working in conjunction with the Addiction Recovery Research Center up on the third floor, Stephen LaConte’s lab is using real-time fMRI scans to provide neural feedback to subjects.
Harshawardhan Deshpande, a biomedical engineering grad student working on his PhD in LaConte’s lab, explains the experiment’s purpose. If addicts have a short temporal window — an issue projecting themselves into the future and understanding those consequences — they may be able to train themselves to better think in the long term. The neural feedback helps the subjects know how well they are doing at elongating that temporal window.
“In the near future, we can try to rehabilitate the ability of that participant to think about the future,” Deshpande says.
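The trade-off Allen was imagining, $50 today versus $100 a year from now, can be put into numbers with a hyperbolic discounting curve, a standard model in the behavioral literature. The sketch below is illustrative only and is not taken from the LaConte lab. The function names and the discount rate `k` are assumptions: a larger `k` stands in for a shorter temporal window, because it shrinks the felt value of a delayed reward faster.

```python
def discounted_value(amount, delay_days, k):
    """Felt value today of a reward delivered after a delay (hyperbolic form)."""
    return amount / (1.0 + k * delay_days)


def prefers_later(k, sooner=(50.0, 0), later=(100.0, 365)):
    """True if someone with discount rate k would wait for the larger reward."""
    return discounted_value(*later, k) > discounted_value(*sooner, k)


def indifference_k(sooner_amount, later_amount, delay_days):
    """Discount rate at which an immediate and a delayed reward feel equal."""
    return (later_amount / sooner_amount - 1.0) / delay_days


patient_k = 0.001    # hypothetical: a long temporal window
impulsive_k = 0.01   # hypothetical: a short temporal window

print("patient chooser waits for $100:", prefers_later(patient_k))
print("impulsive chooser waits for $100:", prefers_later(impulsive_k))
print("break-even k for $50 now vs $100 in a year:", indifference_k(50.0, 100.0, 365))
```

In these terms, elongating someone's temporal window amounts to nudging their `k` downward, so that the shoes a year away start to outweigh the steak tonight.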
In addition to the addiction work, the LaConte lab has teamed with Zachary Irving, a philosophy professor at the University of Virginia’s Corcoran Department of Philosophy whose focus is the philosophy of cognitive science. Irving and LaConte are using the real-time fMRI to attempt to discern when, and in what way, a subject’s mind is wandering. The hope is that real-time fMRI, paired with categories developed in the humanities, gets closer than the currently available tools to studying how people feel about their own experiences.
“Our goal is to have that algorithm be able to detect in real time, by just looking at your neural activity, detect whether your mind is wandering or not,” Irving says over the phone.
Maybe all this, the collateral damage of psychiatry and its current mode, can be mitigated — maybe it can be stopped.