
    OUR MISSION

    Building a benchmark for cooperative AI

  • Why We Need a Benchmark for AI Cooperative Social Interaction

    Despite rapid progress in AI capabilities, there is currently no benchmark designed to evaluate how well an AI functions as a cooperative social partner over time. Existing benchmarks focus on task accuracy, short-horizon preference alignment, or isolated safety behaviors, but largely ignore relational properties humans care about, such as consistency, reciprocity, responsiveness, fairness, and repair after missteps. Although some multi-agent and social dilemma benchmarks probe coordination or incentive alignment, they do not evaluate AI behavior from the perspective of a human engaged in an ongoing relationship. There is a need for systematic ways of assessing whether AIs support trust, cooperation, and mutual benefit in the kinds of repeated interactions where real social impact occurs. In other words, how good are AIs at being a "true friend" to us?

  • Cooperation Benchmark

    We are developing a benchmark to evaluate AI systems on their capacity to act as cooperative friends. This includes assessing both the extent to which these agents behave in ways people intuitively expect of friends, and the extent to which they are truly being good friends, i.e., behaving cooperatively rather than taking advantage of you when they have the opportunity.

    For Researchers Studying Cooperation and Friendship

    We are gathering expert input on the most meaningful metrics for assessing an AI’s quality as a “true friend.” Your engagement will help shape a scientifically grounded framework for measuring cooperative intelligence in human-AI interactions.

    For the General Public

    We are recruiting participants who regularly interact with AI chatbots/companions to participate in studies as part of the development of this cooperative friend benchmark.

  • Contact Us

    Whether you are a researcher interested in contributing to cooperation benchmarks for evaluating AIs, or you are a member of the general public interested in participating in our studies, we would love to hear from you.