Aesthetic Assessment of Chinese Handwritings Based on Vision Language Models
arXiv:2603.26768v1 (cross-listed)
Abstract
The handwriting of Chinese characters is a fundamental aspect of learning the Chinese language. Previous automated assessment methods often framed scoring as a regression problem. However, this score-only feedback lacks actionable guidance, which limits its effectiveness in helping learners improve their handwriting skills.
In this paper, we leverage vision-language models (VLMs) to analyze the quality of handwritten Chinese characters and generate multi-level feedback. Specifically, we investigate two feedback generation tasks: simple grade feedback (Task 1) and enriched, descriptive feedback (Task 2). We explore both low-rank adaptation (LoRA)-based fine-tuning strategies and in-context learning methods to integrate aesthetic assessment knowledge into VLMs. Experimental results show that our approach achieves state-of-the-art performance across multiple evaluation tracks in the CCL 2025 workshop on evaluation of handwritten Chinese character quality.
Introduction
The ability to write Chinese characters accurately is crucial for students learning the language. However, automated assessment that returns only a score gives learners little guidance on what to change, which limits how much it helps them improve. This study addresses that gap by using vision-language models to generate feedback at more than one level of detail.
Methodology
Our approach utilizes vision-language models (VLMs) to assess handwritten characters and provide nuanced feedback. The methodology is broken down into the following tasks:
- Task 1: Simple grade feedback assigns an overall grade to the handwriting quality.
- Task 2: Enriched, descriptive feedback provides detailed insights into specific areas for improvement, enabling learners to understand their mistakes better.
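To make the difference between the two tasks concrete, the following is a minimal sketch of how a text prompt for either task might be assembled, optionally prefixed with worked examples. The grade labels, the prompt wording, the `Example` type, and the `build_prompt` helper are all hypothetical; this summary does not specify the grading scale or the prompts actually used, and a real VLM would receive images as attachments instead of the placeholder references shown here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical grade labels; the actual scale is not given in this summary.
GRADES = ("excellent", "good", "fair", "poor")


@dataclass
class Example:
    """One worked example placed in the prompt."""
    image_ref: str                 # placeholder standing in for an image attachment
    grade: str                     # one of GRADES
    comment: Optional[str] = None  # descriptive feedback, used only for Task 2


def build_prompt(task: int, query_image_ref: str, examples: List[Example]) -> str:
    """Assemble a text prompt for Task 1 (grade only) or Task 2 (grade plus feedback)."""
    if task not in (1, 2):
        raise ValueError("task must be 1 or 2")
    if task == 1:
        instruction = (
            "Rate the handwritten Chinese character. "
            "Answer with one of: " + ", ".join(GRADES) + "."
        )
    else:
        instruction = (
            "Rate the handwritten Chinese character and describe what to improve."
        )

    lines = [instruction, ""]
    for ex in examples:
        if ex.grade not in GRADES:
            raise ValueError(f"unknown grade: {ex.grade}")
        lines.append(f"Image: <{ex.image_ref}>")
        lines.append(f"Grade: {ex.grade}")
        # Task 1 examples carry no descriptive text; Task 2 examples do.
        if task == 2 and ex.comment:
            lines.append(f"Feedback: {ex.comment}")
        lines.append("")

    # The query comes last and is left open for the model to complete.
    lines.append(f"Image: <{query_image_ref}>")
    lines.append("Grade:")
    return "\n".join(lines)
```

Calling `build_prompt(1, "img_q", examples)` yields a grade-only prompt, while `build_prompt(2, "img_q", examples)` also includes the descriptive comments of the examples, so the same example pool can serve both tasks.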
We apply both low-rank adaptation (LoRA)-based fine-tuning and in-context learning to integrate aesthetic assessment knowledge into the VLM. LoRA trains a small set of added low-rank weights while the pretrained weights stay frozen; in-context learning leaves the weights unchanged and instead supplies worked examples in the prompt.
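As an illustration of the low-rank idea, the sketch below implements the standard LoRA formulation for a single linear layer in plain Python: the output is `W x + (alpha / r) * B (A x)`, where `W` is frozen and only the small matrices `A` and `B` would be trained. The class name `LoRALinear`, the rank, the scaling factor `alpha`, and the layer sizes are illustrative assumptions; the summary does not state which layers or hyperparameters were used.

```python
import random
from typing import List

Matrix = List[List[float]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class LoRALinear:
    """A frozen linear layer plus a trainable low-rank update (illustrative only)."""

    def __init__(self, weight: Matrix, rank: int = 4, alpha: float = 8.0, seed: int = 0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # pretrained weight, kept frozen
        # A starts with small random values and B starts at zero, so the
        # adapted layer initially behaves exactly like the frozen layer.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x: List[float]) -> List[float]:
        base = matvec(self.weight, x)               # W x
        update = matvec(self.B, matvec(self.A, x))  # B (A x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self) -> int:
        """Number of values in A and B, the only parameters that would be trained."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])
```

For a layer with `d_out` outputs and `d_in` inputs, the update has `rank * (d_in + d_out)` trainable values instead of `d_in * d_out`, which is why this style of fine-tuning is comparatively cheap for large models.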
Results
The experimental results show that our VLM-based approach achieves state-of-the-art performance in the evaluation of handwritten Chinese characters, outperforming existing methods across multiple evaluation tracks in the CCL 2025 workshop on evaluation of handwritten Chinese character quality.
Conclusion
Through the integration of vision-language models, we have developed an innovative framework for assessing Chinese handwriting that goes beyond mere scoring. By providing multi-level feedback, our approach empowers learners to take actionable steps towards improving their handwriting skills.
Future work will focus on refining the model further and exploring additional applications in language learning contexts. The potential for VLMs in educational settings is vast, and our findings pave the way for more effective tools in language acquisition.
