Cards Against LLMs: Benchmarking Humor Alignment in Large Language Models

Source: arXiv:2604.08757v1 (cross-listed)

Abstract

Humor is one of the most culturally embedded and socially significant dimensions of human communication, yet it remains largely unexplored as a dimension of Large Language Model (LLM) alignment. In this study, five frontier language models play the same Cards Against Humanity (CAH) games as human players. The models select the funniest response from a slate of ten candidate cards across 9,894 rounds. While all models exceed the random baseline, alignment with human preference remains modest. More striking is that models agree with each other substantially more often than they agree with humans. We show that this preference is partly explained by systematic position biases and content preferences, raising the question of whether LLM humor judgment reflects genuine preference or structural artifacts of inference and alignment.

Introduction

Humor is hard to evaluate in AI systems because it depends heavily on culture and social context. This study measures how closely Large Language Models (LLMs) match human humor preferences, using the party game Cards Against Humanity (CAH) as the test bed.

Methodology

To investigate humor alignment, five frontier LLMs played the same CAH games as human players. In each of 9,894 rounds, a model chose the funniest response from a slate of ten candidate cards. Because the models and the humans faced the same rounds, each model's picks can be compared directly with the human choices and with the picks of the other models.
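A minimal sketch of how this kind of round-level comparison could be scored. It assumes each round is stored as the index (0 to 9) of the card the human picked and the index the model picked. The function names and toy data are illustrative assumptions, not the paper's code or results.

```python
from typing import Sequence

NUM_CANDIDATES = 10  # slate size reported in the study


def agreement_rate(model_picks: Sequence[int], human_picks: Sequence[int]) -> float:
    """Fraction of rounds where the model chose the same card as the human."""
    if len(model_picks) != len(human_picks):
        raise ValueError("pick lists must cover the same rounds")
    if not human_picks:
        raise ValueError("no rounds to score")
    matches = sum(m == h for m, h in zip(model_picks, human_picks))
    return matches / len(human_picks)


def random_baseline(num_candidates: int = NUM_CANDIDATES) -> float:
    """Expected agreement if one card is picked uniformly at random."""
    return 1.0 / num_candidates


# Toy data (hypothetical): five rounds, card indices 0..9.
human = [3, 7, 0, 9, 4]
model = [3, 2, 0, 5, 4]

print(agreement_rate(model, human))  # 0.6
print(random_baseline())             # 0.1
```

With one human-preferred card per round (an assumption here), a uniform random picker would match it in about one round out of ten, which is the bar a model has to clear.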

Findings

  • All five models exceeded the random baseline.
  • Even so, alignment with human preferences was only modest.
  • The models agreed with each other substantially more often than they agreed with human players.
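The contrast between model-to-human and model-to-model agreement can be sketched as two averages over the same rounds. The example below is a hedged illustration with made-up picks for three hypothetical models (`model_a`, `model_b`, `model_c`); the study's actual agreement metric may differ.

```python
from itertools import combinations
from statistics import mean


def match_rate(a, b):
    """Fraction of rounds where two pickers chose the same card."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def agreement_summary(picks, human_key="human"):
    """Return (mean model-human agreement, mean pairwise model-model agreement)."""
    models = [name for name in picks if name != human_key]
    model_human = mean(match_rate(picks[m], picks[human_key]) for m in models)
    model_model = mean(
        match_rate(picks[a], picks[b]) for a, b in combinations(models, 2)
    )
    return model_human, model_model


# Toy data (hypothetical): six rounds, card indices 0..9.
picks = {
    "human":   [3, 7, 0, 9, 4, 1],
    "model_a": [3, 2, 0, 5, 4, 8],
    "model_b": [3, 2, 0, 5, 6, 8],
    "model_c": [1, 2, 0, 5, 4, 8],
}

model_human, model_model = agreement_summary(picks)
print(f"model-human: {model_human:.3f}, model-model: {model_model:.3f}")
```

In this toy data the models match each other far more often than they match the human, which is the qualitative pattern the findings describe.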

Discussion

The findings raise questions about what LLM humor judgments actually reflect. The substantial inter-model agreement suggests that these systems share biases or preferences that do not necessarily match human tastes. According to the authors, this agreement is partly explained by systematic position biases (favoring cards in particular slots of the slate) and by shared content preferences. So although the models pick the human-preferred card more often than chance, their choices may be shaped by structural artifacts of inference and alignment rather than by a genuine sense of what people find funny.
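One simple way to probe for position bias is to tally which slot in the slate a model picks and compare the tally with a uniform spread. The sketch below computes the share of picks per slot and a Pearson chi-square statistic against the uniform expectation. It is an assumed diagnostic for illustration, not necessarily the analysis used in the paper, and the toy picks are invented.

```python
from collections import Counter


def position_distribution(picks, num_positions=10):
    """Share of picks that landed on each slot of the slate."""
    counts = Counter(picks)
    total = len(picks)
    return [counts.get(p, 0) / total for p in range(num_positions)]


def chi_square_uniform(picks, num_positions=10):
    """Pearson chi-square statistic against a uniform choice over slots."""
    expected = len(picks) / num_positions
    counts = Counter(picks)
    return sum(
        (counts.get(p, 0) - expected) ** 2 / expected
        for p in range(num_positions)
    )


# Toy data (hypothetical): a model that favors the first slot.
picks = [0, 0, 0, 0, 0, 0, 1, 2, 3, 4]

print(position_distribution(picks)[0])  # 0.6
print(chi_square_uniform(picks))        # 30.0
```

A larger statistic means the picks deviate more from a uniform spread over slots. Shuffling card order between runs and checking whether a model's pick follows the card or the slot would be a natural follow-up check.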

Implications

These results matter wherever LLMs are expected to handle humor in social settings. Knowing how models judge humor can inform their design and deployment in contexts ranging from entertainment to customer service. The results also point to a need for alignment strategies that better capture the nuances of human humor.

Conclusion

The study benchmarks humor alignment in LLMs using a structured game format. Its central result is that models beat chance but align only modestly with human players, while agreeing with each other far more often. Further work is needed to separate genuine humor preference from position and content biases, and to build models that track the subtleties of human interaction as well as they track language.



Lazarus Omolua
https://richlyai.com/blog