Think Before Writing: Feature-Level Multi-Objective Optimization for Generative Citation Visibility
arXiv:2604.19113v1 (cross-listed)
Abstract: Generative answer engines expose content through selective citation rather than ranked retrieval, fundamentally altering how visibility is determined. This shift calls for new optimization methods beyond traditional search engine optimization. Existing generative engine optimization (GEO) approaches primarily rely on token-level text rewriting, offering limited interpretability and weak control over the trade-off between citation visibility and content quality.
In response to these challenges, we propose FeatGEO, a feature-level, multi-objective optimization framework that abstracts webpages into interpretable structural, content, and linguistic properties. Instead of directly editing text, FeatGEO optimizes over this feature space and utilizes a language model to realize feature configurations into natural language. This approach effectively decouples high-level optimization from surface-level generation.
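As a rough illustration of what optimizing over a feature space with two objectives can look like, the sketch below enumerates a small grid of feature configurations and keeps the Pareto-optimal ones. Everything in it is an assumption made for illustration: the three feature names, the two placeholder scoring functions, and the exhaustive grid search are hypothetical and are not taken from the paper, which may use a different search procedure.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class FeatureConfig:
    # Hypothetical features, one per family named in the abstract.
    num_headings: int    # structural
    num_statistics: int  # content
    formality: float     # linguistic, 0.0 (casual) to 1.0 (formal)


def visibility_score(cfg: FeatureConfig) -> float:
    # Placeholder objective standing in for a query to a generative engine.
    return 0.5 * cfg.num_statistics + 0.3 * cfg.num_headings + 0.2 * cfg.formality


def quality_score(cfg: FeatureConfig) -> float:
    # Placeholder objective that penalizes pages stuffed with statistics.
    return 5.0 - 0.4 * cfg.num_statistics - 0.1 * abs(cfg.num_headings - 3) + cfg.formality


def dominates(a, b) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(configs):
    """Return the configurations that no other configuration dominates."""
    scored = [(c, (visibility_score(c), quality_score(c))) for c in configs]
    return [c for c, s in scored if not any(dominates(t, s) for _, t in scored)]


candidates = [
    FeatureConfig(h, s, f)
    for h, s, f in product(range(1, 6), range(0, 5), (0.2, 0.8))
]
front = pareto_front(candidates)

for cfg in front:
    print(cfg, round(visibility_score(cfg), 2), round(quality_score(cfg), 2))
```

Each configuration on the front represents a different trade-off between the two objectives; choosing one of them is a separate decision from generating the text that realizes it.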
Key Features of FeatGEO
- Feature-Level Optimization: By optimizing structural, content, and linguistic properties instead of individual tokens, FeatGEO keeps each change interpretable.
- Decoupling Optimization and Generation: Separating the high-level search over features from text generation gives more explicit control over the trade-off between citation visibility and content quality.
- Improved Citation Visibility: In the reported experiments, FeatGEO consistently improves citation visibility, which is what determines exposure in generative answer engines.
- Content Quality Maintenance: These visibility gains come with content quality that is maintained or improved.
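The decoupling in the list above can be pictured as a two-stage pipeline: an optimizer picks a feature configuration, and a separate step turns that configuration into instructions for a language model. The sketch below covers only the second stage and stops at building the prompt. The function name `build_realization_prompt`, the feature names, and the prompt wording are hypothetical and are not the paper's actual prompts.

```python
def build_realization_prompt(page_text: str, features: dict) -> str:
    """Turn a target feature configuration into rewrite instructions.

    The returned string would be sent to a language model of your choice;
    no model is called here.
    """
    # Sort by feature name so the same configuration always yields the same prompt.
    targets = [f"- {name}: {value}" for name, value in sorted(features.items())]
    return (
        "Rewrite the page below so that it matches these target properties.\n"
        "Preserve all factual claims.\n"
        + "\n".join(targets)
        + "\n\nPAGE:\n"
        + page_text
    )


prompt = build_realization_prompt(
    "Solar panels convert sunlight into electricity.",
    {"num_headings": 3, "formality": "high"},  # hypothetical feature names
)
print(prompt)
```

Because the optimizer only ever sees the feature dictionary, the search procedure and the language model can be swapped independently of each other.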
Experimental Validation
To validate FeatGEO, the authors ran experiments on GEO-Bench across three generative engines. FeatGEO significantly outperforms token-level baselines on both citation visibility and content quality. This matters because token-level methods offer only weak control over the trade-off between the two.
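To make "citation visibility" concrete, here is a minimal sketch of one plausible way to measure it: the share of an answer's words that appear in sentences citing a given source. It assumes the engine marks citations as bracketed numbers such as `[1]` placed before the sentence-ending punctuation, and it splits credit evenly when a sentence cites several sources. This is an illustrative metric only, not necessarily how the benchmark or the paper defines visibility.

```python
import re
from collections import defaultdict


def citation_word_share(answer: str) -> dict:
    """Map each cited source id to its share of the answer's words."""
    words_per_source = defaultdict(float)
    total_words = 0
    # Split into sentences at ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        cited = set(re.findall(r"\[(\d+)\]", sentence))
        # Drop the citation markers before counting words.
        n_words = len(re.findall(r"\w+", re.sub(r"\[\d+\]", " ", sentence)))
        total_words += n_words
        for source_id in cited:
            # Split the sentence's words evenly among the sources it cites.
            words_per_source[int(source_id)] += n_words / len(cited)
    if total_words == 0:
        return {}
    return {src: count / total_words for src, count in words_per_source.items()}


answer = (
    "Solar panels convert sunlight into electricity [1]. "
    "Costs fell sharply over the last decade [2]. "
    "Both trends are expected to continue [1][2]."
)
shares = citation_word_share(answer)
print(shares)
```

Comparing this share for a page before and after optimization, over many queries, gives a simple view of whether the page has become more visible in the generated answers.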
Insights on Citation Behavior
Further analyses revealed that citation behavior is more strongly influenced by document-level content properties than by isolated lexical edits. In other words, the overall properties of a document matter more than swapping individual words or phrases. Moreover, the learned feature configurations generalized well across language models of different scales, which indicates that the framework is robust.
Conclusion
FeatGEO moves generative engine optimization from token-level rewriting to an interpretable feature space. By optimizing over structural, content, and linguistic properties, it addresses the limited interpretability and weak trade-off control of token-level approaches while improving citation visibility and maintaining content quality. As generative engines continue to evolve, methods like FeatGEO may become increasingly relevant to how content is surfaced and cited.
For more information, refer to the original paper on arXiv (arXiv:2604.19113v1).
