AI outperforms clinicians' judgment in triaging postoperative patients for intensive care
Artificial intelligence (AI) in the form of a machine-learned algorithm correctly triaged the vast majority of postoperative patients to the intensive care unit in its first proof-of-concept application in a university hospital setting. The accuracy of this computer-generated algorithm is leading surgeons to envision active use of AI in the real-time acquisition of clinical information from a patient's electronic medical record to more reliably determine whether a patient needs intensive or routine postoperative care. Findings from the pilot study of the algorithm were presented at the American College of Surgeons Clinical Congress 2019.
Currently, surgical teams rely on clinical judgment to decide which patients need postoperative intensive care. There is no single set of fixed criteria or standardized postoperative pathway for making the determination.
Clinicians tend to over-triage: when in doubt, they err on the side of caution and send a patient to intensive care. However, over-triaging may put a patient in the ICU who doesn't need to be there. "In those cases, the patient may be unnecessarily exposed to multidrug-resistant bacteria and have an increased overall length of stay. On the other hand, under-triaging means a patient that should have been in the ICU is sent to a recovery or step-down unit, and the opportunity for quick rescue of a deteriorating condition is delayed because monitoring is not as intense," said Marcovalerio Melis, MD, FACS, an associate professor of surgery, New York University Langone Hospital System, New York City, and coauthor of the pilot study.
AI is starting to be used to help patients triage their symptoms so they can decide whether they should go to the emergency department or seek treatment in another setting, such as an urgent care center. It is now beginning to be applied in surgery and has potential for generating comprehensive databases about surgical techniques and practices and their outcomes and providing evidence-based, real-time clinical support.
The pilot study used the random forest form of machine learning, which analyzes large amounts of data, searches for correlations among variables, evaluates options, and finds solutions for a complex problem. Random forest constructs many decision trees, each a flow chart of the questions and answers that lead to a decision, and pools the predictions of all the trees to reduce variability and increase reliability.
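The pooling idea can be illustrated with a very small random forest written in plain Python. This is a minimal sketch and not the model used in the study: the trees are one-question "stumps," the two features (heart rate and units of blood transfused) are chosen only for illustration, and the eight training rows are invented toy data, not patient records. Each tree is trained on a random resample of the rows and a random subset of the features, and the forest takes a majority vote.

```python
import random
from collections import Counter

def train_stump(rows, labels, features):
    """Pick the one-question split (feature, threshold) with the fewest training errors."""
    best = None
    for f in features:
        for threshold in sorted({r[f] for r in rows}):
            for high_label in (0, 1):  # label predicted when the value exceeds the threshold
                errors = sum(
                    (high_label if r[f] > threshold else 1 - high_label) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, high_label)
    return best[1:]

def predict_stump(stump, row):
    f, threshold, high_label = stump
    return high_label if row[f] > threshold else 1 - high_label

def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw rows with replacement. Simplification for this
        # tiny data set: redraw if the sample happens to contain only one class.
        while True:
            idx = [rng.randrange(len(rows)) for _ in rows]
            if len({labels[i] for i in idx}) == 2:
                break
        # Each tree sees only a random subset of the features.
        feats = rng.sample(range(n_features), k=max(1, n_features // 2))
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Majority vote across all trees."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Invented toy data: (heart rate in bpm, units transfused); label 1 = ICU, 0 = routine care.
rows = [(72, 0), (80, 1), (65, 0), (90, 2), (125, 5), (118, 6), (130, 5), (140, 7)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = train_forest(rows, labels)
print(predict_forest(forest, (150, 8)))  # 1
print(predict_forest(forest, (60, 0)))   # 0
```

A production model would use deeper trees, far more data, and a tested library rather than hand-written code; the sketch only shows why averaging many simple trees gives a steadier answer than any single one.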
The resulting algorithm included 87 clinical variables and 15 specific criteria related to the appropriateness of admission to the ICU within 48 hours of surgery. An admission to the ICU was considered appropriate if at least one of these criteria was met. The criteria included: intubation for more than 12 hours, reintubation, respiratory or circulatory arrest, call for rapid response or code, blood pressure below 100/60 mmHg for two consecutive hours, heart rate below 60 or above 110 bpm for two consecutive hours, use of pressors, placement of a central venous line or Swan-Ganz catheter, echocardiogram, new onset of cardiac arrhythmia, myocardial infarction, return to the operating room, blood transfusion of more than 4 units, or readmission to the ICU after a prior admission.
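The "appropriate if any criterion is met" rule can be sketched in a few lines of Python. The class and field names below are hypothetical, the sustained blood pressure and heart rate criteria are assumed to arrive as already-computed yes/no flags, and only the criteria named in this article are included; it is an illustration of the rule, not the study's code.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PostopCourse:
    """First 48 postoperative hours for one patient (hypothetical field names)."""
    intubation_hours: float = 0.0
    reintubated: bool = False
    respiratory_or_circulatory_arrest: bool = False
    rapid_response_or_code: bool = False
    hypotension_2h: bool = False          # BP below 100/60 mmHg for two consecutive hours
    abnormal_heart_rate_2h: bool = False  # HR below 60 or above 110 bpm for two consecutive hours
    pressors_used: bool = False
    central_line_or_swan_ganz: bool = False
    echocardiogram: bool = False
    new_arrhythmia: bool = False
    myocardial_infarction: bool = False
    returned_to_operating_room: bool = False
    transfusion_units: int = 0
    icu_readmission: bool = False

def criteria_met(course: PostopCourse) -> List[str]:
    """Return the names of every criterion this patient satisfied."""
    checks = {
        "intubation > 12 h": course.intubation_hours > 12,
        "reintubation": course.reintubated,
        "respiratory or circulatory arrest": course.respiratory_or_circulatory_arrest,
        "rapid response or code": course.rapid_response_or_code,
        "sustained hypotension": course.hypotension_2h,
        "sustained abnormal heart rate": course.abnormal_heart_rate_2h,
        "pressors": course.pressors_used,
        "central line or Swan-Ganz catheter": course.central_line_or_swan_ganz,
        "echocardiogram": course.echocardiogram,
        "new arrhythmia": course.new_arrhythmia,
        "myocardial infarction": course.myocardial_infarction,
        "return to operating room": course.returned_to_operating_room,
        "transfusion > 4 units": course.transfusion_units > 4,
        "ICU readmission": course.icu_readmission,
    }
    return [name for name, met in checks.items() if met]

def icu_admission_appropriate(course: PostopCourse) -> bool:
    """ICU admission counts as appropriate when at least one criterion is met."""
    return len(criteria_met(course)) > 0

patient = PostopCourse(intubation_hours=14, transfusion_units=5)
print(criteria_met(patient))
print(icu_admission_appropriate(patient))
```

Returning the list of satisfied criteria, rather than a bare yes or no, keeps the reason for each classification visible when decisions are reviewed afterward.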
Researchers prepared a questionnaire to prospectively ask clinicians how they would evaluate the need for intensive care for each patient. "We asked clinicians which is the best pathway for each patient: should the patient go to the post-acute care unit, a regular floor, or the ICU? We asked the machine the same question and compared the results," explained Francesco Maria Carrano, MD, a postdoctoral research fellow at NYU Langone and first author of the study.
Artificial intelligence correctly triaged 41 of the 50 patients in the study (82 percent). Surgeons had a triage accuracy rate of 70 percent (35 patients), intensivists 64 percent (32 patients), and anesthesiologists 58 percent (29 patients). The rate of incorrect triage decisions was lowest for AI (18 percent), followed by 30 percent for surgeons, 36 percent for intensivists, and 42 percent for anesthesiologists.
The rate of under-triage was similar for AI (12 percent) and surgeons (10 percent); the rate of over-triage was much lower for AI (6 percent) than for the clinicians, whose rates ranged from 20 percent to 40 percent. Further, AI achieved a positive predictive value of 50 percent and a negative predictive value of 86 percent.
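The AI figures reported above fit together arithmetically, which a short calculation can show. The sketch below rests on assumptions the article does not spell out: that the 12 percent and 6 percent rates are shares of all 50 patients, that an under-triaged patient is a false negative (needed the ICU but was sent elsewhere), and that an over-triaged patient is a false positive. Under that reading, it searches for the split of the 41 correct decisions that reproduces the 50 percent positive and 86 percent negative predictive figures.

```python
TOTAL_PATIENTS = 50
CORRECT = 41  # AI decisions that matched the criteria-based outcome

# Assumption: the reported rates are percentages of all 50 patients.
false_negatives = 12 * TOTAL_PATIENTS // 100  # under-triage, 12 percent
false_positives = 6 * TOTAL_PATIENTS // 100   # over-triage, 6 percent

# Correct and incorrect decisions should account for every patient.
assert CORRECT + false_negatives + false_positives == TOTAL_PATIENTS

# Try every split of the 41 correct decisions into true positives and true
# negatives, keeping the ones that match both reported predictive figures.
matches = []
for true_positives in range(CORRECT + 1):
    true_negatives = CORRECT - true_positives
    ppv = 100 * true_positives / (true_positives + false_positives)
    npv = 100 * true_negatives / (true_negatives + false_negatives)
    if round(ppv) == 50 and round(npv) == 86:
        matches.append((true_positives, true_negatives))

print(matches)  # [(3, 38)]
```

Only one split fits: 3 true positives and 38 true negatives, alongside 3 over-triaged and 6 under-triaged patients. This is a reconstruction from the published percentages, not a table taken from the study itself.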
Although the algorithm in this study clearly outperformed clinicians' judgment, it is a first step. The surgical researchers plan to apply the algorithm to other populations of patients and include other demographic and clinical features. "The majority of the patients in this study were men in our hospital. We would like to expand study of the algorithm to women and patients in other hospitals," said Dr. Carrano.
"The algorithm will be improved and perfected as the machine analyzes more patients, and testing at other sites will validate the AI model. Certainly, as shown in this study, the concept is valid and may be extrapolated to any hospital," said Dr. Melis.