
Comparative validation of surgical phase recognition, instrument keypoint estimation, and instrument instance segmentation in endoscopy: Results of the PhaKIR 2024 challenge

Tobias Rueckert
David Rauber
Raphaela Maerkl
Leonard Klausmann
Suemeyye R. Yildiran
Max Gutbrod
Danilo Weber Nunes
Alvaro Fernandez Moreno
Imanol Luengo
Danail Stoyanov
Nicolas Toussaint
Enki Cho
Hyeon Bae Kim
Oh Sung Choo
Ka Young Kim
Seong Tae Kim
Gonçalo Arantes
Kehan Song
Jianjun Zhu
Junchen Xiong
Tingyi Lin
Shunsuke Kikuchi
Hiroki Matsuzaki
Atsushi Kouno
João Renato Ribeiro Manesco
João Paulo Papa
Tae-Min Choi
Tae Kyeong Jeong
Juyoun Park
Oluwatosin Alabi
Meng Wei
Tom Vercauteren
Runzhi Wu
Mengya Xu
An Wang
Long Bai
Hongliang Ren
Amine Yamlahi
Jakob Hennighausen
Lena Maier-Hein
Satoshi Kondo
Satoshi Kasai
Kousuke Hirasawa
Shu Yang
Yihui Wang
Hao Chen
Santiago Rodríguez
Nicolás Aparicio
Leonardo Manrique
Juan Camilo Lyons
Olivia Hosie
Nicolás Ayobi
Pablo Arbeláez
Yiping Li
Yasmina Al Khalil
Sahar Nasirihaghighi
Stefanie Speidel
Daniel Rueckert
Hubertus Feussner
Dirk Wilhelm
Christoph Palm
Main: 32 Pages
16 Figures
Bibliography: 5 Pages
17 Tables
Abstract

Reliable recognition and localization of surgical instruments in endoscopic video recordings are foundational for a wide range of applications in computer- and robot-assisted minimally invasive surgery (RAMIS), including surgical training, skill assessment, and autonomous assistance. However, robust performance under real-world conditions remains a significant challenge. Incorporating surgical context, such as the current procedural phase, has emerged as a promising strategy to improve robustness and interpretability. To address these challenges, we organized the Surgical Procedure Phase, Keypoint, and Instrument Recognition (PhaKIR) sub-challenge as part of the Endoscopic Vision (EndoVis) challenge at MICCAI 2024. We introduced a novel, multi-center dataset comprising thirteen full-length laparoscopic cholecystectomy videos collected from three distinct medical institutions, with unified annotations for three interrelated tasks: surgical phase recognition, instrument keypoint estimation, and instrument instance segmentation. Unlike existing datasets, ours enables joint investigation of instrument localization and procedural context within the same data while supporting the integration of temporal information across entire procedures. We report results and findings in accordance with the BIAS guidelines for biomedical image analysis challenges. The PhaKIR sub-challenge advances the field by providing a unique benchmark for developing temporally aware, context-driven methods in RAMIS and offers a high-quality resource to support future research in surgical scene understanding.
