Lazy or Efficient? Towards Accessible Eye-Tracking Event Detection Using LLMs
arXiv:2604.13243v1 (cross-listed)
Introduction
Gaze event detection plays a crucial role in vision science, human-computer interaction, and applied analytics. It identifies key visual events, such as fixations and saccades, in eye-tracking data. However, current workflows often require specialized programming skills and careful handling of diverse raw data formats, which creates significant barriers for researchers and practitioners.
Challenges with Current Workflows
Traditional gaze event detection methods such as I-VT (identification by velocity threshold) and I-DT (identification by dispersion threshold) are effective in controlled environments. Nevertheless, they have inherent limitations:
- High sensitivity to preprocessing techniques and parameter settings.
- Inaccessibility for users without advanced technical skills.
- Increased time and effort required for data handling and analysis.
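To make the parameter sensitivity concrete, here is a minimal I-VT sketch in Python. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function name `ivt_labels`, the 30 deg/s default threshold, and the toy gaze trace are all hypothetical, and gaze positions are assumed to be already converted to degrees of visual angle.

```python
import math


def ivt_labels(t, x, y, velocity_threshold=30.0):
    """Label each gaze sample as 'fixation' or 'saccade' by point-to-point velocity.

    Assumptions (not from the paper): t is in seconds, x and y are in degrees
    of visual angle, and velocity_threshold is in degrees per second.
    """
    n = len(t)
    if n < 2 or not (n == len(x) == len(y)):
        raise ValueError("need at least two samples and equal-length inputs")

    labels = []
    for i in range(1, n):
        dt = t[i] - t[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Speed over the interval that ends at sample i.
        speed = math.hypot(x[i] - x[i - 1], y[i] - y[i - 1]) / dt
        labels.append("saccade" if speed > velocity_threshold else "fixation")

    # The first sample has no preceding interval, so it reuses the next label.
    return [labels[0]] + labels


# Hypothetical 250 Hz trace: a small drift, a fast jump, then drift again.
t = [0.000, 0.004, 0.008, 0.012, 0.016, 0.020]
x = [0.00, 0.02, 0.04, 1.00, 2.00, 2.02]
y = [0.0] * 6

for threshold in (3.0, 30.0, 300.0):
    print(threshold, ivt_labels(t, x, y, velocity_threshold=threshold))
```

On this toy trace, the 30 deg/s setting marks two samples as saccade, 300 deg/s marks none, and 3 deg/s marks all of them. The data is identical in each run; only the parameter changes, which is the sensitivity the first bullet describes.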
Introducing a New Approach
This paper presents a code-free pipeline, driven by a large language model (LLM), that streamlines eye-tracking event detection. Users interact with the system through natural language instructions instead of writing code, which removes much of the burden of traditional workflows. The framework's capabilities include:
- Raw Data Inspection: The system inspects raw eye-tracking files to deduce their structure and associated metadata.
- Automated Routine Generation: It generates executable routines for data cleaning and detector implementation based on simple user prompts.
- Event Labeling: The generated detector is applied to accurately label fixations and saccades.
- Result Reporting: It provides results along with explanatory reports for better understanding and analysis.
- Iterative Optimization: Users can iteratively refine their analyses by modifying their initial prompts.
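The raw data inspection step can be pictured with a small rule-based stand-in. In the paper the LLM deduces the file structure; the sketch below only illustrates the kind of result that step might hand to later stages. The alias table, the function name `infer_columns`, and the sample export are hypothetical and do not come from the paper.

```python
import csv
import io

# Hypothetical alias table; real eye-tracker exports use many other names.
COLUMN_ALIASES = {
    "time": ("time", "timestamp", "t"),
    "x": ("x", "gaze_x", "gazepointx"),
    "y": ("y", "gaze_y", "gazepointy"),
}


def infer_columns(text):
    """Guess the delimiter and the time/x/y columns from a file's header row."""
    lines = text.splitlines()
    if not lines:
        raise ValueError("empty input")

    # Pick the candidate delimiter that appears most often in the header.
    delimiter = max([",", "\t", ";"], key=lines[0].count)
    header = next(csv.reader(io.StringIO(text), delimiter=delimiter))
    normalized = [name.strip().lower() for name in header]

    columns = {}
    for role, aliases in COLUMN_ALIASES.items():
        for index, name in enumerate(normalized):
            if name in aliases:
                columns[role] = header[index].strip()
                break

    missing = [role for role in COLUMN_ALIASES if role not in columns]
    return {"delimiter": delimiter, "columns": columns, "missing": missing}


# Hypothetical tab-separated export.
sample = "Timestamp\tGaze_X\tGaze_Y\tPupil\n0.000\t512.3\t384.1\t3.2\n"
print(infer_columns(sample))
```

A fixed alias table like this breaks as soon as a vendor renames a column, which is the kind of format handling the pipeline is meant to take off the user's hands.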
Evaluation and Results
The proposed framework was evaluated on established public benchmarks. The LLM-driven approach reached accuracy comparable to traditional detection methods while reducing the technical overhead that such analyses usually involve.
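This summary does not say which metric the comparison used. As one possible way to quantify "comparable accuracy", the sketch below computes sample-level Cohen's kappa between reference labels and detector output. The metric choice, the function name `cohen_kappa`, and the label sequences are assumptions made for illustration; they are not results from the paper.

```python
from collections import Counter


def cohen_kappa(reference, predicted):
    """Chance-corrected agreement between two equal-length label sequences."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("sequences must be non-empty and of equal length")

    n = len(reference)
    observed = sum(r == p for r, p in zip(reference, predicted)) / n

    # Agreement expected by chance, from each sequence's label frequencies.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)

    if expected == 1.0:
        # Both sequences use a single identical label; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: the detector starts the saccade one sample early.
reference = ["fixation"] * 6 + ["saccade"] * 4
predicted = ["fixation"] * 5 + ["saccade"] * 5
print(round(cohen_kappa(reference, predicted), 3))
```

Unlike raw accuracy, kappa discounts agreement that would occur by chance. That matters for gaze data, where fixation samples typically far outnumber saccade samples.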
Conclusion
This framework makes eye-tracking research more accessible. By lowering the barriers to entry, it lets a broader audience run eye-tracking studies, and its flexibility and ease of use offer a promising alternative to the code-intensive workflows that have traditionally dominated the field.
Overall, the work points towards more efficient and inclusive methods in eye-tracking research and encourages further innovation in the domain.
