Evaluating Interactive 2D Visualization as a Sample Selection Strategy for Biomedical Time-Series Data Annotation
Machine learning models in biomedical research depend on accurately labeled data, yet annotating biomedical time-series data is difficult and time-consuming. A recent study published on arXiv (2603.26592v1) investigates whether the choice of which samples to annotate can improve the process. The study compares three sample selection strategies: random sampling (RND), farthest-first traversal (FAFT), and a graphical user interface-based method in which annotators explore complementary 2D visualizations (2DVs) of high-dimensional data.
The researchers evaluated the sample selection methods on four classification tasks drawn from two domains: infant motility assessment (IMA) and speech emotion recognition (SER). Twelve annotators, grouped into experts and non-experts, labeled data under a restricted annotation budget. After the annotation phase, additional experiments assessed the efficacy of each sampling method.
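The two non-interactive baselines named above, RND and FAFT, can be illustrated with a short sketch. This is not the study's implementation: it assumes each sample is already represented as a numeric feature vector, uses Euclidean distance, and starts FAFT from an arbitrary first point. The function names and parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def random_sampling(points, budget, seed=0):
    """RND: pick `budget` distinct sample indices uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(len(points)), budget)


def farthest_first_traversal(points, budget, start=0):
    """FAFT: greedily pick the sample farthest from all samples chosen so far.

    Assumption: Euclidean distance on feature vectors; `start` is the
    (arbitrary) index of the first selected sample.
    """
    selected = [start]
    # Distance from every sample to its nearest selected sample.
    nearest = [math.dist(p, points[start]) for p in points]
    nearest[start] = -1.0  # mark as selected so it is never picked again
    while len(selected) < min(budget, len(points)):
        nxt = max(range(len(points)), key=nearest.__getitem__)
        selected.append(nxt)
        nearest = [min(d, math.dist(p, points[nxt])) for d, p in zip(nearest, points)]
        nearest[nxt] = -1.0
    return selected


# Toy 2D feature vectors standing in for high-dimensional sample embeddings.
points = [(0, 0), (1, 0), (10, 0), (10, 10), (0.5, 0.5)]
print(random_sampling(points, budget=3))
print(farthest_first_traversal(points, budget=3))
```

On the toy data, FAFT spreads its picks across the extremes of the feature space, while RND ignores the geometry entirely. The 2DV method differs from both in that a human chooses the samples while exploring the visualizations, so it is not reducible to a fixed rule like these.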
Key Findings from the Study
- Performance Across Tasks: The study found that the 2DV method consistently outperformed the other sampling strategies when aggregating labels across all classification tasks. In the context of IMA, 2DV was particularly effective in capturing rare classes, showcasing its potential for handling complex data scenarios.
- Variability in Label Distribution: Despite these benefits, 2DV also increased the variability in label distribution among annotators. The variability arose from the restricted annotation budget and reduced classification performance when models were trained on individual annotators’ labels.
- Expert vs. Non-Expert Performance: For the SER task, the 2DV method outperformed the other methods when used by expert annotators. Interestingly, it matched the performance of expert annotators in settings involving non-experts, suggesting that the method is accessible to less experienced annotators.
- Risk Analysis: A failure risk analysis showed that RND was the safest selection strategy when the number of annotators or their expertise was uncertain. The 2DV method carried the highest risk because of its variability in label distribution.
- User Experience: Post-experiment interviews indicated that annotators found the 2DV method more engaging and enjoyable, enhancing their overall experience during the annotation task.
In conclusion, the findings suggest that 2DV-based sampling is a promising approach for annotating biomedical time-series data, particularly when the annotation budget is not severely constrained. When the budget is tight or the annotator pool is uncertain, the risk analysis points to RND as the safer choice.
