ACM CHI 2023

Comparing Zealous and Restrained AI Recommendations in a Real-World Human-AI Collaboration Task

Chengyuan Xu1 Kuo-Chin Lien2 Tobias Höllerer1
1 UCSB   2 Appen

Given two AIs whose recommendations achieve similar overall performance (F1 score), a "zealous" AI would prioritize recall by giving more recommendations (in this study, face detections), even low-confidence ones. In contrast, a "restrained" AI would provide only high-precision recommendations. Which AI can help the human teammate annotate faces in a video anonymization task in less time and with higher recall? Will collaborating with the two AIs affect user skills differently?
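To make the tradeoff concrete, here is a minimal sketch with hypothetical detection counts (not data from the study): two face detectors can land on nearly the same F1 score while one favors precision and the other favors recall.

```python
# Hypothetical detection counts, for illustration only (not the study's data):
# two detectors with a similar F1 score but opposite precision/recall profiles.

def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# "Restrained" AI: fewer, high-confidence detections -> high precision, lower recall.
print(precision_recall_f1(tp=80, fp=5, fn=20))   # ~(0.94, 0.80, 0.86)

# "Zealous" AI: more detections, including low-confidence ones -> high recall, lower precision.
print(precision_recall_f1(tp=95, fp=25, fn=5))   # ~(0.79, 0.95, 0.86)
```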

We share findings and design suggestions from a large-scale, real-world user study on when to use a "zealous" or a "restrained" AI to boost human-AI team performance or to avoid hurting user skills.

Paper Highlights [9:14]

Preview Video [0:30]

Abstract

When designing an AI-assisted decision-making system, there is often a tradeoff between precision and recall in the AI's recommendations. We argue that careful exploitation of this tradeoff can harness the complementary strengths in the human-AI collaboration to significantly improve team performance. We investigate a real-world video anonymization task for which recall is paramount and more costly to improve. We analyze the performance of 78 professional annotators working with a) no AI assistance, b) a high-precision "restrained" AI, and c) a high-recall "zealous" AI in over 3,466 person-hours of annotation work. In comparison, the zealous AI helps human teammates achieve significantly shorter task completion time and higher recall. In a follow-up study, we remove AI assistance for everyone and find negative training effects on annotators trained with the restrained AI. These findings and our analysis point to important implications for the design of AI assistance in recall-demanding scenarios.

TL;DR: Careful exploitation of the tradeoff between AI precision and recall can harness the complementary strengths in the human-AI collaboration to significantly improve team performance. Naively pairing humans with an AI system designed for autonomous settings can have a negative training effect on users.
