The Human Creativity Benchmark – Evaluating Generative AI in Creative Work
Opening excerpt (first ~120 words)
1.0 Introduction

When professional creatives evaluate AI-generated work, their judgments produce two distinct signals. The first is convergence: evaluators agree on what works, revealing shared best practices like readable typography, functional layout, and correct visual hierarchy. The second is divergence: evaluators disagree, and that disagreement reflects genuine differences in taste, aesthetic direction, and creative intent. Most AI benchmarks treat the second signal as noise to be resolved. This paper proposes a framework for measuring both.

This distinction matters because creative work has no ground truth. The dimensions on which experts disagree — aesthetic direction, mood, conceptual risk — are not reducible to miscalibration or error [1][2].
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Contralabs.
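To make the convergence/divergence distinction from the excerpt concrete, here is a minimal sketch of one plausible way to separate the two signals from a matrix of expert ratings. The function name, the normalization by rating range, and the example scores are illustrative assumptions; the excerpt does not specify the paper's actual metrics.

```python
# Hypothetical sketch: separating convergence and divergence signals from
# expert ratings. Convergence is approximated here as low per-dimension
# spread across evaluators; divergence as high spread. The paper's own
# definitions may differ.
import numpy as np

def convergence_divergence(ratings: np.ndarray):
    """ratings: shape (n_evaluators, n_dimensions), e.g. 1-5 scores for
    dimensions such as typography, layout, mood.
    Returns per-dimension (convergence, divergence) scores in [0, 1]."""
    lo, hi = ratings.min(), ratings.max()
    span = max(hi - lo, 1e-9)              # guard against constant ratings
    spread = ratings.std(axis=0) / span    # normalized disagreement per dimension
    divergence = spread                    # high spread: genuine taste differences
    convergence = 1.0 - spread             # low spread: shared best practices
    return convergence, divergence

# Example: 4 evaluators scoring 3 dimensions (typography, layout, mood)
scores = np.array([
    [5, 4, 2],
    [5, 4, 5],
    [4, 4, 1],
    [5, 3, 4],
])
conv, div = convergence_divergence(scores)
print("convergence:", conv.round(2))  # high on typography and layout
print("divergence: ", div.round(2))   # high on mood
```

Under these assumptions, dimensions like typography and layout would surface as convergence signals, while a dimension like mood would surface as divergence, which is the kind of disagreement the framework treats as information rather than noise.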