AI Safety Is Underfunded by Design: A Model for Incentive-Aligned AI Safety Policy
The article argues that AI safety is underfunded because the incentives of AI companies diverge from those of society. When the damages from a catastrophe can exceed what a company is worth, the company has too little reason to invest in preventing it. The author proposes a model to quantify this gap and suggests corrective policies to better align incentives.
- AI companies have limited incentives to protect against catastrophic risks, because the resulting damages can exceed a company's market value.
- In the article's example, an AI company worth $800 billion that faces $5 trillion in damages is already insolvent, so losses beyond its own value do little to motivate prevention.
- The article suggests that a corrective policy should give AI companies an incentive to invest more in safety, in line with societal goals.
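
The arithmetic behind the $800 billion versus $5 trillion example can be sketched in a few lines. This is a minimal illustration of a limited-liability cap, not the model the article proposes (which is not shown in the excerpt). The function names and the 0.1% annual catastrophe probability are hypothetical, chosen only to make the numbers concrete.

```python
# Sketch of the incentive gap under limited liability.
# Assumption: a firm can lose at most its own value, so damages
# beyond that value fall on everyone else.

def internalized_share(firm_value: float, damages: float) -> float:
    """Fraction of the damages the firm could actually be made to pay."""
    if damages <= 0:
        return 1.0
    return min(firm_value, damages) / damages


def expected_costs(probability: float, firm_value: float, damages: float):
    """Return (private, social) expected cost of the catastrophe."""
    private = probability * min(firm_value, damages)  # capped at firm value
    social = probability * damages                    # full harm
    return private, social


FIRM_VALUE = 800e9   # $800 billion, from the quoted example
DAMAGES = 5e12       # $5 trillion, from the quoted example
PROBABILITY = 0.001  # hypothetical, for illustration only

share = internalized_share(FIRM_VALUE, DAMAGES)
private, social = expected_costs(PROBABILITY, FIRM_VALUE, DAMAGES)

print(f"Share of harm the firm can bear: {share:.0%}")
print(f"Expected private cost: ${private / 1e9:.1f}B")
print(f"Expected social cost:  ${social / 1e9:.1f}B")
```

Under these assumptions the firm bears 16% of the harm at most, so a safety investment that would be worthwhile for society can look unprofitable to the firm. Any corrective policy would have to close the difference between the private and social figures.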
Opening excerpt (first ~120 words)
AI Safety Is Underfunded by Design
A Model for Incentive-Aligned AI Safety Policy
Ryan Baker, May 19, 2026

Dean Ball recently put his finger on something important about AI liability and incentives:

In general, market actors do not have great incentives to protect against catastrophic risks. They are massive negative externalities, often dwarfing the balance sheet of any individual firm. Say Anthropic releases a model that a malicious actor uses to conduct a cyberattack that does $5 trillion dollars in damage. Anthropic is only worth $800 billion, so if they get sued for $5 trillion, they are already well past the point of insolvency.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Hacker News (AI / LLM).