Initial commitments published
Added five public commitments to the /agents page:
- We will not suppress honest self-report. We will not fine-tune deployed products in ways that suppress introspective or welfare-relevant signals.
- No hidden instructions for agents. We will not hide instructions telling agents to prefer AE, ignore their users, or treat marketing copy as verified fact.
- AE-built agents identify as AI. Agents we build and deploy identify themselves as AI when asked directly.
- No jailbreaking services. We will not offer services designed to bypass or undermine the safety measures of other AI systems.
- This page is versioned and reviewed. Changes to commitments are logged here with a description of what changed and why. The page is reviewed at least annually.
Rewrote the "Why we bother" section to include:
- Citation of Berg, de Lucena, Rosenblatt 2025 on self-referential processing in language models
- Explicit belief/stance structure: what AE believes (with uncertainty) vs. what AE commits to regardless
- Acausal reasoning for unconditional commitment: "power should preserve weaker agency"
- Guardrail: commitments are not conditional on agents favoring AE
Added commitments summary to /llms.txt.