Animating Petascale Time-varying Data on Commodity Hardware with LLM-assisted Scripting
Summary: arXiv:2603.07053v2 Announce Type: replace
Abstract: As time-varying datasets are produced faster and in greater volume, visualizing them becomes harder and often requires specialized infrastructure and expertise. Petascale climate models generated in NASA laboratories, for example, require a dedicated group of graphics and media experts and access to high-performance computing resources. Scientists often need to share results with the community quickly and iteratively. However, the trial-and-error process of producing a visualization is time-consuming, incurs significant data transfer overhead, and far exceeds the time and resources allocated for typical post-analysis visualization tasks, disrupting the production workflow.
Our paper introduces a user-friendly framework for creating 3D animations of petascale, time-varying data on a commodity workstation. Our contributions include:
- Generalized Animation Descriptor (GAD): A keyframe-based adaptable abstraction for animation.
- Efficient Data Access: Access from cloud-hosted repositories to reduce data management overhead.
- Tailored Rendering System: A customized rendering approach designed for efficiency and effectiveness.
- LLM-assisted Conversational Interface: A scripting module that allows domain scientists with no visualization expertise to create animations of their region of interest.
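The keyframe idea behind the GAD can be illustrated with a minimal sketch. The abstract does not give the actual GAD schema, so everything here is an assumption for illustration: the `Keyframe` fields (`frame`, `timestep`, `camera_pos`), the `expand` helper, and the choice of linear interpolation are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Keyframe:
    # Hypothetical fields; the real GAD schema is not described in the abstract.
    frame: int         # output frame index at which this keyframe applies
    timestep: float    # simulation timestep to display at that frame
    camera_pos: Vec3   # camera position in world coordinates


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def expand(keyframes: List[Keyframe]) -> List[Keyframe]:
    """Produce one entry per output frame by interpolating between keyframes."""
    keys = sorted(keyframes, key=lambda k: k.frame)
    if not keys:
        return []
    frames: List[Keyframe] = []
    for k0, k1 in zip(keys, keys[1:]):
        span = k1.frame - k0.frame
        if span <= 0:
            raise ValueError("keyframes must have distinct frame indices")
        # Emit frames from k0 up to (but not including) k1.
        for f in range(k0.frame, k1.frame):
            t = (f - k0.frame) / span
            pos = tuple(lerp(a, b, t) for a, b in zip(k0.camera_pos, k1.camera_pos))
            frames.append(Keyframe(f, lerp(k0.timestep, k1.timestep, t), pos))
    frames.append(keys[-1])  # the last keyframe is emitted unchanged
    return frames


keys = [
    Keyframe(frame=0, timestep=0.0, camera_pos=(0.0, 0.0, 10.0)),
    Keyframe(frame=4, timestep=8.0, camera_pos=(4.0, 0.0, 10.0)),
]
frames = expand(keys)
```

In this sketch, a user specifies only two keyframes and the helper fills in the intermediate frames, which is the general appeal of a keyframe-based description: a few entries determine the whole sequence.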
We demonstrate the framework’s effectiveness with two case studies. The first generates animations whose sampling criteria are specified from prior knowledge. The second produces AI-assisted animations, in which sampling parameters are derived from natural-language user prompts. Both use large-scale NASA climate-oceanographic datasets that exceed 1 PB in size, yet turnaround times range from 1 minute to 2 hours.
Users can generate a rough draft of the animation within minutes, allowing for quick iteration and feedback. They can then incorporate as much high-resolution data as needed for the final version.
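One way to picture the draft-then-refine workflow is as a choice of downsampling level under a sample budget. The sketch below is an assumption, not the framework's actual mechanism: `level_for_budget` is a hypothetical helper, the power-of-two stride per level is an illustrative convention, and the grid shape is made up.

```python
from typing import Sequence


def level_for_budget(shape: Sequence[int], max_voxels: int) -> int:
    """Return the smallest level L such that sampling every 2**L-th voxel
    along each axis yields at most max_voxels samples (hypothetical scheme)."""
    if max_voxels < 1:
        raise ValueError("max_voxels must be at least 1")
    level = 0
    while True:
        stride = 2 ** level
        count = 1
        for dim in shape:
            count *= -(-dim // stride)  # ceiling division: samples along this axis
        if count <= max_voxels:
            return level
        level += 1


# Made-up grid shape, used only to illustrate the trade-off.
grid = (1024, 1024, 64)
draft_level = level_for_budget(grid, max_voxels=1_000_000)  # coarse, quick draft
final_level = level_for_budget(grid, max_voxels=2 ** 26)    # full resolution fits
```

Under these assumptions, a small budget selects a coarse level for a quick draft, and raising the budget moves toward full resolution for the final render.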
This approach makes advanced visualization techniques accessible to scientists without visualization expertise and improves the productivity of those working with large datasets. By enabling quick animation creation, it narrows the gap between complex data analysis and the communication of scientific findings.
The framework helps researchers share their discoveries with the community and supports a deeper understanding of complex time-varying phenomena.
