SocioFi Technology

AI-Native Development: Human Verified

Contribute to Labs.

Three ways to get involved: research collaboration, open source contributions, or sending us data. We are selective but genuinely open — the right collaborator on the right problem makes a real difference to what we can learn.

How to get involved

Three contribution paths.

Each path has different expectations and different benefits. Read them carefully before reaching out — the clearer you are about which path you are interested in, the faster we can respond.

01
Research collaboration
Partner on experiments

We occasionally partner on experiments with external researchers whose work overlaps with our research streams: agent reliability and failure modes, evaluation methodology, spec-to-code pipelines, and developer tooling.

  • Contact us with your research context and what you are working on
  • We aim to respond within 5 business days and will tell you whether we see potential overlap
  • Collaborative experiments are co-authored; data sharing is documented and consented
  • We do not do "we review your work" collaborations — it is a real partnership or nothing
02
Open source
Contribute to our repos

Each open-source repository has its own CONTRIBUTING.md with specific guidelines. Generally: we welcome PRs, issue reports, documentation improvements, and test coverage additions.

  • Check the existing issues before opening a new one
  • For large changes, open an issue first to discuss before writing code
  • PRs must include tests and pass the existing CI pipeline
  • Documentation PRs are especially welcome — code without docs is half-finished
03
Share your data
Improve the benchmarks

Anonymised real-world data from your AI systems helps make our benchmarks more representative. We credit data contributors in the benchmark methodology section and share relevant findings back.

  • Data must be anonymised — no client names, no identifiable project details
  • You retain ownership; we get a research licence for the specific experiment
  • We share our findings with you before publishing
  • We note data contributors in the benchmark methodology section
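
As an illustration of the anonymisation requirement above, here is a minimal sketch of scrubbing one log record before sharing it. Everything in it is an assumption made for the example: the field names (`client_name`, `project_name`, `user_email`), the salted-hash pseudonyms, and the email pattern are hypothetical and are not a format that Labs prescribes. Salted hashing is strictly pseudonymisation, so free-text fields still deserve a manual read before anything is sent.

```python
import hashlib
import re

# Hypothetical field names; replace with whatever your own logs contain.
IDENTIFYING_FIELDS = {"client_name", "project_name", "user_email"}

# Simple pattern for email addresses embedded in free text (not exhaustive).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonym(value: str, salt: str) -> str:
    """Map a value to a stable salted hash so records can still be grouped."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "anon_" + digest[:12]


def anonymise_record(record: dict, salt: str) -> dict:
    """Return a copy with identifying fields pseudonymised and emails redacted."""
    cleaned = {}
    for key, value in record.items():
        if key in IDENTIFYING_FIELDS and isinstance(value, str):
            cleaned[key] = pseudonym(value, salt)
        elif isinstance(value, str):
            cleaned[key] = EMAIL_PATTERN.sub("[redacted-email]", value)
        else:
            cleaned[key] = value  # numbers, booleans, etc. pass through unchanged
    return cleaned


# Example call with made-up data; keep the salt private and out of the shared file.
example = anonymise_record(
    {
        "client_name": "Acme Ltd",
        "error": "mail to ops@acme.example bounced",
        "latency_ms": 840,
    },
    salt="keep-this-secret",
)
```

Using the same salt across a dataset keeps pseudonyms consistent, so failures from one client can still be grouped without revealing who the client is.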
General principles

How we work with contributors.

These are not formal rules — they are the norms that make collaborations go well. We hold ourselves to them and expect the same from contributors.

01
Be specific about what you are offering

Vague offers to "help" are hard to act on. Tell us what you have: a dataset, a specific skill, a research context, time to review methodology.

02
Do not ghost mid-collaboration

Research timelines depend on contributors following through. If circumstances change, tell us early. An early "I cannot continue" is far better than a silent drop-off.

03
Disagreement is welcome; disrespect is not

We want people who will challenge our methodology and point out where we are wrong. We do not want people who are rude about it. The distinction matters.

04
Credit goes to the people who did the work

We credit contributions accurately. If you reviewed methodology, you are credited as a reviewer. If you co-designed the experiment, you are a co-author. We do not inflate or deflate credit.

05
We publish failures too

If a collaboration produces a result that contradicts our prior findings, we publish that. Contributing to a result that overturns a previous conclusion is still a valid contribution.

06
Timelines are estimates

Research takes longer than planned. We communicate timeline changes proactively and expect the same from collaborators. Build slack into your own commitments.

Current needs

What we are actively looking for.

These are open needs across our four research streams as of March 2026. We update this list quarterly. Reaching out about a specific need here gets a faster response than a general inquiry.

Agent reliability & failure modes (High priority)

  • Data: Anonymised logs of agent task failures in production, especially silent failures where the agent returns an output but it is subtly wrong.
  • Research: Literature review support, surveying existing academic work on AI agent failure taxonomies to compare with our empirically derived classification.
  • Review: Methodology review for our Q2 2026 benchmark update, particularly the section on multi-agent coordination failures.

Evaluation methodology (Open)

  • Research: We are designing a new evaluation protocol for code correctness that goes beyond test coverage. Looking for collaborators with evaluation methodology expertise.
  • Data: Real-world code review comments paired with AI-generated code, to understand where human reviewers catch things automated tests do not.

Spec-to-code pipelines (Open)

  • Data: Feature specifications paired with the code that implemented them, to study how well specifications translate to implementation across different specification styles.
  • Code: Testing harness improvements for our spec-evaluation pipeline. The current harness is slow and we need help optimising it.

Developer tooling (Open)

  • Review: UX feedback on our observability dashboard prototype. Looking for engineers who work with AI pipelines daily and can give us informed critique.
  • Code: TypeScript SDK improvements, particularly around the streaming output handling and error classification utilities.
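
For contributors unsure whether their logs contain the "silent failures" mentioned under agent reliability, the sketch below shows one way to separate them from ordinary failures. It is an illustration under stated assumptions, not a Labs schema or classification: the `AgentResult` shape, the label names, and the idea of an independent correctness check are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AgentResult:
    """Hypothetical shape of one logged agent task (not a SocioFi schema)."""
    task_id: str
    status: str  # what the agent itself reported, e.g. "ok" or "error"
    output: str


def classify(result: AgentResult, is_correct: Callable[[str], bool]) -> str:
    """Compare the agent's reported status with an independent check of its output."""
    if result.status != "ok":
        return "loud_failure"    # the agent itself reported a problem
    if not is_correct(result.output):
        return "silent_failure"  # reported success, but the output is wrong
    return "success"


# Made-up example: the task was "add 17 and 25", so the correct output is "42".
check = lambda out: out.strip() == "42"
label = classify(AgentResult(task_id="t1", status="ok", output="43"), check)
# label is "silent_failure": the agent reported "ok" but the answer is wrong
```

The point of the independent check is that a silent failure cannot be found from the agent's own status field; it only shows up when the output is verified some other way, such as a test, a reference answer, or a human review.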
Get in touch

Tell us what you are working on.

We respond to every genuine inquiry, even if the answer is “not right now.” The more context you give us about your background and what you are working on, the more useful our response can be.

Response time: We aim to respond within 5 business days. If you are reaching out about a specific open need listed above, mention it in your message — those get routed directly to the relevant researcher.