At first glance, this is an example of a major technical victory. Through careful development and testing, an AI model successfully augmented the doctors’ abilities to diagnose their patients. But a new report from the Data & Society Research Institute argues this is only half the story. The other half is the amount of skillful social labor that the clinicians leading the project needed to perform in order to integrate the tool into their daily workflows. This included not only designing new communication protocols and creating new training materials but also navigating workplace politics and power dynamics.
The case study is an honest reflection of what it really takes for AI tools to succeed in the real world. “It was really complex,” says Madeleine Clare Elish, a cultural anthropologist who examines the impacts of AI and a co-author of the report.
Innovation is supposed to be disruptive. It shakes up old ways of doing things to achieve better outcomes. But rarely in conversations about technological disruption is there an acknowledgement that disruption is also a form of “breakage.” Existing protocols turn obsolete; social hierarchies get scrambled. Making an innovation work within existing systems requires what Elish and her co-author Elizabeth Anne Watkins call “repair work.”
During the researchers’ two-year study of Sepsis Watch at Duke Health, they documented numerous examples of this disruption and repair. One of the primary ones was the way in which the tool challenged the medical world’s deeply-ingrained power dynamics between doctors and nurses.
In the early stages of the tool design, it became clear that Rapid Response Team (RRT) nurses would need to be its primary users. Though a patient’s attending physician is typically in charge of evaluating and making sepsis diagnoses, they don’t have time to continuously monitor another app on top of their existing duties in the emergency department. In contrast, the main responsibility of an RRT nurse is to continuously monitor patient wellbeing and provide extra assistance where needed. Checking the Sepsis Watch app naturally fitted into their workflow.
But here came the challenge. Once the app flagged a patient as high risk, a nurse would need to call the patient’s attending physician (known in medical speak as an “ED attending”). Not only did these nurses and attendings often have no prior relationship because they spent their days in entirely different sections of the hospital, but the protocol also represented a complete reversal of the typical chain of command in any hospital. “Are you kidding me?” one nurse recalled thinking after learning how things would work. “We are going to call ED attendings?”
But this was indeed the best possible solution, arrived at after years of consultation between many stakeholders across the hospital. So the project team went about repairing the “disruption” in various big and small ways. The head nurses hosted informal pizza parties to build excitement and trust about Sepsis Watch among their fellow nurses. They also developed communication tactics to smooth over their calls with the attendings, such as making only one call per day to discuss multiple high-risk patients at once, timed for when the physicians were least busy.
On top of that, the project leads began regularly reporting the impact of Sepsis Watch to the clinical leadership. The project team discovered that not every hospital staff member believed sepsis-induced death was a problem at Duke Health. Doctors especially, who didn’t have a bird’s-eye view of the hospital’s statistics, were far more occupied with the emergencies they were dealing with day to day, like broken bones and severe mental illness. As a result, some found Sepsis Watch a nuisance. But for the clinical leadership, sepsis was a huge priority, and the more they saw Sepsis Watch working, the more they helped grease the gears of the operation.
Elish identifies two main factors that ultimately helped Sepsis Watch succeed. First, the tool was built for a hyper-local, hyper-specific context: the emergency department at Duke Health and nowhere else. “This really bespoke development was key to the success,” she says, which flies in the face of typical AI norms.
Second, throughout the tool’s development process, the team regularly sought feedback from nurses, doctors, and other staff up and down the hospital hierarchy. This not only made the tool more user-friendly but also cultivated a small group of committed staff to help champion its success. It also made a difference that the project was led by Duke Health’s own clinicians, says Elish, rather than by technologists who had parachuted in from a software company. “If you don’t have an explainable algorithm,” she says, “you need to build trust in other ways.”
These lessons are very familiar to Marzyeh Ghassemi, an incoming assistant professor at MIT who studies machine learning applications for health care. “All machine-learning systems that are ever intended to be evaluated on or used by humans must have socio-technical constraints at front of mind,” she says. Especially in clinical settings, which are run by human decision-makers and involve caring for humans at their most vulnerable, she adds, “the constraints that people need to be aware of are really human and logistical constraints.”
Elish hopes her case study of Sepsis Watch convinces researchers to rethink how to approach medical AI research and AI development at large. So much of the work being done in the space right now focuses on “what AI might be or could do in theory,” she says. “There’s too little information about what actually happens on the ground.” But for AI to live up to its promised benefits, the social integration of the tools is as important as the technical development.
Her work also raises serious questions. “Responsible AI must require attention to local and specific context,” she says. “My reading and training teaches me: you can’t just develop one thing in one place and then roll it out somewhere else.”
“So the challenge is actually to figure how we keep that local specificity while trying to work at scale,” she adds. That’s the next frontier of AI research.