Discover Intent2Tx, a benchmark that uses real-world data to evaluate how well LLMs translate natural-language intents into Ethereum blockchain transactions.
Explore how rubric wording and metric choices affect measurement risk in supervised financial NLP, using the JF-ICR framework to support reliable model evaluation.