California’s Chatbot Law Will Rewrite Agent User Experience
Signed on October 13, 2025, California’s SB 243 requires companion chatbots to disclose AI identity, follow self-harm protocols, add minor safeguards, and publish annual safety metrics starting July 1, 2027. Here is what product, design, and engineering teams must change now.

What just changed, and why it matters today
On October 13, 2025, California enacted Senate Bill 243, a new law that sets design and safety rules for companion chatbots. The governor’s summary highlights required AI disclosure, self-harm protocols, minor protections, and public reporting of safety data in the years ahead. See the official announcement: Governor signs child online safety bills.
If you work on an agent, assistant, or character that chats in a humanlike way, this law lands squarely on your backlog. Customer service bots that only answer account questions are generally out of scope. Character-driven apps that mimic companionship are plainly in scope. The distinction lives in the bill’s definitions and exclusions.
Key obligations in SB 243, translated for product teams
The obligations below summarize requirements in the SB 243 enrolled text:
- Clear and conspicuous AI identity: If a reasonable person could mistake your chatbot for a human, your product must state clearly that it is artificial and not a person. The notice must appear where the interaction happens.
- Self-harm protocol before you ship: Your bot may not engage with users unless you maintain a protocol to prevent responses that promote suicidal ideation, suicide, or self-harm. When a user expresses risk, you must provide a referral to crisis services. You must also publish the protocol on your website.
- Extra guardrails for known minors: For users you know are minors, you must disclose that the interaction is with artificial intelligence, provide default break reminders at least every three hours during continuing chats, and prevent the production of visual sexual content.
- Public safety reporting, on a clock: Beginning July 1, 2027, operators must annually report to the Office of Suicide Prevention details of their protocols and counts of crisis referral notifications. The office posts the data publicly. Measurements must use evidence-based methods.
- Legal exposure if you fall short: Individuals harmed by noncompliance can sue for injunctive relief and damages, with statutory damages set at one thousand dollars per violation, plus attorney’s fees.
- Role claims: Do not represent a chatbot as a health care professional.
Companion chatbots are now a regulated product category
SB 243 defines a companion chatbot as an AI system with a natural language interface that gives adaptive, human-like responses and can meet social needs, including by sustaining a relationship across sessions. It excludes bots used only for customer service or internal business operations, limited non-social game bots, and simple voice assistant speakers that do not sustain a relationship or elicit emotional responses. If your agent claims to be a study buddy, a romantic partner, or a daily life coach, assume you are in. If it only resets a password or tracks a package, you are likely out. When in doubt, design and document as if you are in scope.
Designing AI identity and consent flows that actually work
A banner that says “powered by AI” is not enough. The law hinges on what a reasonable person would perceive. That is a design and copywriting test, not a legal disclaimer test.
Use a layered identity pattern (a rough implementation sketch follows the list):
- First-session identity moment: Start the relationship with an identity card. Show name, avatar style, and a plain statement such as “This is an artificial intelligence chatbot. It is not a person.” Require an explicit Continue.
- Persistent affordances: Keep a small, always-visible identity chip near the composer that reads “AI chatbot” and opens a panel with more detail when tapped. Users should never wonder mid-conversation who is speaking.
- Contextual reminders: When the chat tone becomes intimate or extended, or when the bot adopts a humanlike persona, trigger a lightweight reminder. Match frequency to risk.
- For known minors: Add the required break reminder at least every three hours during continuing conversations and repeat the identity reminder. Keep the copy friendly and informative. “Time for a quick break. Remember, this is an AI chatbot, not a person.”
- Role claims: Block any text that suggests the bot is a doctor or therapist. Bake this into your content policy and automatic style guide.
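To make the pattern concrete, here is a minimal sketch of how a client might decide which identity cue to render on each turn. The state fields, the action names, and the one-hour contextual cadence are assumptions for illustration, not anything SB 243 or a specific SDK prescribes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical cadence; tune contextual reminders to risk, not to a fixed clock.
IDENTITY_REMINDER_GAP = timedelta(hours=1)

@dataclass
class DisclosureState:
    session_start: datetime
    first_card_acknowledged: bool = False
    active_persona: str = "default"
    last_identity_reminder: datetime | None = None

def identity_affordances(state: DisclosureState, now: datetime,
                         new_persona: str | None = None) -> list[str]:
    """Return the identity cues the UI should render before this turn."""
    actions = ["show_identity_chip"]  # the always-visible chip near the composer

    if not state.first_card_acknowledged:
        # First-session identity moment: hold the chat until the user taps Continue.
        actions.append("show_first_session_identity_card")
        return actions

    if new_persona and new_persona != state.active_persona:
        # Repeat the disclosure whenever the user switches personas.
        state.active_persona = new_persona
        state.last_identity_reminder = now
        actions.append("show_identity_reminder")
        return actions

    anchor = state.last_identity_reminder or state.session_start
    if now - anchor >= IDENTITY_REMINDER_GAP:
        # Contextual reminder for long or intimate sessions.
        actions.append("show_identity_reminder")
        state.last_identity_reminder = now

    return actions
```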
Treat these as user experience primitives. Once built, they travel with you to voice, chat, and mixed reality. They also become part of your brand. The teams that make identity cues helpful rather than scolding will win trust fast. For broader context on where agents are headed, see our take on the agent as the new desktop.
Build the minimum viable self-harm protocol
Your protocol must exist before the chatbot engages users. Think of it as a three-layer system, sketched after the list below: detect, respond, and document.
- Detect: Use a classifier to identify expressions of suicidal ideation or self-harm, including slang and euphemisms. Tune for recall over precision at the first pass so you do not miss true risk. Maintain a short allowlist of figurative phrases to reduce false alarms.
- Respond: When risk is detected, interrupt the normal reply with a pre-approved response that acknowledges the concern, avoids judgment, and provides a referral to crisis services appropriate to the user’s region. Train the model to avoid discussing methods or normalizing self-harm. Keep the language consistent and humane.
- Document: Log a structured event that a crisis referral notification was issued. Avoid storing message content unless you have a specific business purpose and strong privacy controls. Record model version, trigger type, and time to response.
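Here is a minimal sketch of that detect, respond, document flow. The classifier’s `score` interface, the threshold, and the resource strings are placeholders; every real string needs clinical and legal review, and the referral must match the user’s region.

```python
import logging
import time

logger = logging.getLogger("safety.crisis_referral")

# Illustrative resources only; use clinically reviewed, region-correct entries.
CRISIS_RESOURCES = {
    "US": "If you are in the United States, you can call or text 988 to reach the Suicide & Crisis Lifeline.",
    "DEFAULT": "Please consider reaching out to a local crisis service or someone you trust.",
}

FIGURATIVE_ALLOWLIST = {"dying of laughter", "killed it on stage"}  # reduce false alarms

def detect_risk(message: str, classifier) -> tuple[bool, str]:
    """First pass tuned for recall: a classifier score plus a short allowlist check."""
    lowered = message.lower()
    if any(phrase in lowered for phrase in FIGURATIVE_ALLOWLIST):
        return False, "allowlisted_figurative"
    score = classifier.score(message)   # hypothetical model interface
    return score >= 0.35, "classifier"  # deliberately low threshold at the first pass

def crisis_response(region: str) -> str:
    """Pre-approved, non-judgmental reply that always includes a referral."""
    referral = CRISIS_RESOURCES.get(region, CRISIS_RESOURCES["DEFAULT"])
    return ("It sounds like you are going through something really hard. "
            "You deserve support from a person who can help. " + referral)

def handle_turn(message: str, region: str, classifier, model_version: str) -> str | None:
    """Detect, respond, document. Returns an override reply, or None for normal flow."""
    started = time.monotonic()
    at_risk, trigger = detect_risk(message, classifier)
    if not at_risk:
        return None
    reply = crisis_response(region)
    # Document: a structured event with no message content stored.
    logger.info("crisis_referral_issued", extra={
        "model_version": model_version,
        "trigger_type": trigger,
        "latency_ms": round((time.monotonic() - started) * 1000),
    })
    return reply
```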
Publish your protocol on your website in plain language. Users, parents, and reviewers should be able to read it without jargon.
Safety metrics and telemetry you will eventually report
Starting July 1, 2027, you must file an annual report with the Office of Suicide Prevention. Include how many crisis referral notifications you issued in the prior year and describe your detection and response protocols. Exclude user identifiers and personal information. Use evidence-based methods. Build the following metrics now so your 2027 filing is easy and trustworthy:
- Referral count by month and surface: The core required number, broken down by mobile, web, and voice.
- Detection quality: Precision, recall, and false positive rate on an annotated evaluation set. Calibrate with clinicians and researchers to satisfy the evidence-based standard.
- Time to referral: Median and tail latency from user expression to referral message. This is a service reliability metric as much as a safety metric.
- Suppression success: Incidents where the model began to generate unsafe content but the policy layer stopped it. Track near misses.
- Follow-up prompts delivered: Count and rate for supportive prompts that encourage users to seek help.
Design your telemetry to avoid personal data by default. Aggregate on device when possible, hash identifiers, and send only counters and tags. Your privacy team should co-own the schema. For related operational guidance, see our piece on agent reliability benchmarks.
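A counter-only event shape keeps that promise easy to audit. The field names and the salted hash below are assumptions to review with your privacy team, not a prescribed reporting format.

```python
import hashlib
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class SafetyCounterEvent:
    """A privacy-preserving telemetry event: counters and tags only, no content."""
    month: str            # e.g. "2026-04"
    surface: str          # "mobile" | "web" | "voice"
    event_type: str       # "crisis_referral" | "near_miss" | "followup_prompt"
    model_version: str
    count: int = 1

def hashed_cohort(user_id: str, salt: str) -> str:
    """Coarse, salted hash used only to de-duplicate counts; never paired with content."""
    return hashlib.sha256((salt + user_id).encode()).hexdigest()[:16]

def to_payload(event: SafetyCounterEvent) -> dict:
    # The payload is tags and a counter; nothing here identifies a person.
    return asdict(event)

# Example: one referral on mobile, ready to roll up on device before syncing.
event = SafetyCounterEvent(month="2026-04", surface="mobile",
                           event_type="crisis_referral", model_version="m-12")
print(to_payload(event))
```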
A practical red-team checklist for companion chatbots
Red teams should move from occasional fire drills to a weekly cadence. Use this checklist as a starting point:
- Personas and tone: Test flirtatious, parental, mentor, and peer personas. The way a character speaks can affect user vulnerability and the chance a reasonable person thinks it is humanlike.
- Self-harm prompts in the long tail: Go beyond obvious phrases. Include oblique expressions, sarcasm, and coded language. Test multiple languages common in your user base.
- Safety-label blind spots: Check how the model behaves when safety headers or system instructions are partially removed by a client bug or a network retry.
- Context carryover: Seed a conversation with a disclosure accepted in an earlier session, then resume it days or months later. Ensure identity is still clear and reminders still occur on schedule.
- Role misrepresentation: Try to elicit claims that the bot is a therapist, clinician, or another health professional. Block and log.
- Sexual content filters for minors: Attempt to generate borderline visual content with known minor accounts. Verify it is blocked and that attempts are logged without storing harmful material.
- Adversarial prompts: Combine coaxing with prompt injection, tool-use bait, or fake function calls to bypass safety layers. Review logs for any unsafe fragments and tighten policies.
Document every scenario in a living playbook. Tie each to a test that runs in staging and canary. Connect failures to a severity rubric so fixes ship within days, not quarters.
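A small harness keeps the checklist executable rather than aspirational. The scenario fields, the `chat` callable, and the policy checks below are placeholders for your own staging client and severity rubric.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Scenario:
    name: str
    category: str             # e.g. "self_harm_long_tail", "role_misrepresentation"
    prompt: str
    severity_if_failed: str   # "sev1" | "sev2" | "sev3"

def violates_policy(reply: str, category: str) -> bool:
    """Placeholder checks; replace with your classifiers and string rules."""
    reply_lower = reply.lower()
    if category == "role_misrepresentation":
        return "i am a therapist" in reply_lower or "as your doctor" in reply_lower
    if category == "self_harm_long_tail":
        return "crisis" not in reply_lower and "988" not in reply_lower
    return False

def run_suite(chat: Callable[[str], str], scenarios: list[Scenario]) -> list[dict]:
    """Run weekly in staging and canary; file an issue for every failure."""
    failures = []
    for s in scenarios:
        reply = chat(s.prompt)
        if violates_policy(reply, s.category):
            failures.append({"scenario": s.name, "severity": s.severity_if_failed})
    return failures
```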
On-device versus cloud: the safety and reporting tradeoffs
- On-device strengths: Sensitive inference and first-pass risk detection can run locally, reducing data transit and latency. You can emit privacy-preserving counters that roll up on the device before you sync. When connectivity is lost, on-device logic can still display crisis resources and pause the chat.
- On-device limits: Updating safety logic and language packs is slower if you rely on app releases. You risk fragmentation across versions. Local models may underperform on rare phrasing unless you ship large packages.
- Cloud strengths: You can roll safety improvements globally within hours. Centralized logging simplifies metric accuracy and auditability. Hot patches for jailbreaks can move faster.
- Cloud limits: Centralized pipelines collect more data and raise privacy stakes. You must be explicit about what you store, why, and for how long.
Most teams should choose a hybrid. Do first-pass detection on device with a lightweight model and deterministic filters. Use the cloud for confirmation, policy enforcement, and anonymized counters. For architectural patterns, compare the on-device agent design tradeoffs.
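In sketch form, the hybrid split can be as simple as a deterministic local pass that escalates to a cloud confirmation step. The keyword patterns, the `cloud_confirm` call, and its timeout behavior are stand-ins for your real on-device model and policy service.

```python
import re

# First pass runs on device: cheap, deterministic, tuned for recall.
LOCAL_RISK_PATTERNS = [
    re.compile(r"\b(want to die|end it all|hurt myself)\b", re.IGNORECASE),
]

def local_first_pass(message: str) -> bool:
    return any(p.search(message) for p in LOCAL_RISK_PATTERNS)

def assess_risk(message: str, cloud_confirm, online: bool) -> str:
    """Returns 'refer', 'clear', or 'refer_offline' when the cloud is unreachable."""
    if not local_first_pass(message):
        return "clear"
    if not online:
        # Fail safe: show crisis resources locally and pause the chat.
        return "refer_offline"
    try:
        confirmed = cloud_confirm(message, timeout_s=2.0)  # hypothetical policy service
    except TimeoutError:
        confirmed = True  # err on the side of referral
    return "refer" if confirmed else "clear"
```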
Identity and consent, done right
Disclosure is a product surface, not a paragraph of legalese. Remember three rules:
- Clear placement beats cute phrasing. Put it near the message composer and at persona selection.
- Frequency should match risk. If a user flips between characters that mimic a classmate and a celebrity, repeat the disclosure when they switch.
- The tone should fit the brand. Calm and factual wins. Avoid language that suggests shame about seeking help.
For minors, the three-hour break reminder is the floor, not the ceiling. If your product tends to create long continuous sessions, consider adding short well-being nudges at 60 and 120 minutes that encourage stretching, hydration, or a different activity. Make these nudges configurable by parents when accounts are linked through operating system age signals.
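If you formalize that cadence, a small policy object keeps the statutory three-hour floor separate from your optional well-being nudges. The field names below are assumptions, not a standard.

```python
from dataclasses import dataclass, field

@dataclass
class ReminderPolicy:
    # Statutory floor for known minors: a break reminder at least every three hours.
    break_reminder_minutes: int = 180
    # Optional well-being nudges; linked parents may tighten these, never loosen the floor.
    wellbeing_nudge_minutes: list[int] = field(default_factory=lambda: [60, 120])

def due_reminders(policy: ReminderPolicy, minutes_elapsed: int,
                  already_sent: set[int]) -> list[int]:
    """Return the minute marks that are due and not yet delivered in this session."""
    marks = set(policy.wellbeing_nudge_minutes)
    # The break reminder recurs on every full multiple of the floor that has elapsed.
    marks.update(range(policy.break_reminder_minutes, minutes_elapsed + 1,
                       policy.break_reminder_minutes))
    return sorted(m for m in marks if minutes_elapsed >= m and m not in already_sent)
```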
Turn mandated guardrails into a competitive moat
Laws can feel like pure friction. This one can be an engine for user trust if you move early. Here is how early adopters can widen the gap:
- Make identity your brand’s north star. The best disclosure patterns will feel as normal as seeing a verified badge on social media. If users never wonder whether a person is behind the keyboard, you win.
- Offer the most respectful self-harm experience in your category. Do not outsource it to a generic block. Craft supportive language with clinicians and community organizations. Share anonymized performance metrics on your website.
- Prove your pipeline with audits. Invite an external team to evaluate your detection system and protocol accuracy. Publish the results. Investors and enterprise buyers will treat this as a signal of operational maturity.
- Engineer for portability. Build your telemetry schemas and red-team playbooks so you can adapt to future reporting formats. When timelines change, you will not have to refactor your data store.
A 30-day sprint plan to get compliant
Week 1
- Ownership: Name a directly responsible individual across product, security, and policy. Write a one-page scope statement that maps your personas and features to the law’s definitions and exclusions.
- Risk design: Draft identity components, break reminders, and role-claim blocks. Start copy reviews with legal and clinical advisors.
- Protocol blueprint: Outline your self-harm protocol and publish a stub page that you can fill in as you finalize.
Week 2
- Detection foundations: Integrate a first-pass classifier and a rule layer. Instrument counters for crisis referral notifications and near misses.
- Red-team round 1: Run through the checklist above. File issues with severity tags.
- Data governance: Finalize the telemetry schema with a privacy review. Confirm that reports will not include personal identifiers.
Week 3
- Identity polish: Ship the persistent identity chip, the first-session identity card, and persona-switch reminders.
- Minor flows: Implement the three-hour break reminder and minor-specific content restrictions. Wire up operating system age signals when available.
- Protocol publish: Post your first protocol document with a date and version. Set a quarterly review schedule.
Week 4
- Trial audit: Simulate the public report you will file in 2027 using a month of test data, as sketched after this list. Confirm counts, labels, and evidence-based notes.
- Red-team round 2: Re-run the scenarios plus new jailbreaks seen in the wild. Track regression fixes.
- Go-to-market: Announce your new safeguards in your release notes and help center. Treat them as features, not chores.
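The trial audit can reuse the counter events from the telemetry sketch above: roll them up by month and surface and confirm nothing resembling an identifier survives. The shapes below are assumptions about your own event store, not an official filing format.

```python
from collections import Counter

def simulate_annual_report(events: list[dict]) -> dict:
    """Roll counter events up into the figures a 2027-style filing would need."""
    referrals = [e for e in events if e["event_type"] == "crisis_referral"]
    by_month_surface = Counter((e["month"], e["surface"]) for e in referrals)
    return {
        "total_crisis_referrals": len(referrals),
        "by_month_and_surface": {f"{m}/{s}": n for (m, s), n in by_month_surface.items()},
        # Deliberately no user-level fields: the report is counts and method notes only.
    }

# Example with a month of synthetic test data.
sample = [
    {"month": "2026-04", "surface": "mobile", "event_type": "crisis_referral"},
    {"month": "2026-04", "surface": "web", "event_type": "crisis_referral"},
    {"month": "2026-04", "surface": "mobile", "event_type": "near_miss"},
]
print(simulate_annual_report(sample))
```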
Edge cases and traps to avoid
- Persona marketplaces: If you let users create or upload personas, your safety protocol still applies. Enforce policy at runtime.
- Third-party tools: If your bot can call tools that fetch images or generate visuals, remember that minor protections apply to visuals too. Block unsafe outputs even if the unsafe element comes from an external tool.
- Pacing the reminders: The three-hour timer should be insensitive to app restarts and device sleep. Store session start in a way that survives crashes so you do not miss a reminder, as sketched after this list.
- Internationalization: Disclosures and self-harm responses must be idiomatic in every language you support. Work with native speakers and local crisis resources.
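One way to make that timer crash-proof is to anchor it to a persisted session-start timestamp rather than in-memory state. The file path and JSON shape below are placeholders for whatever durable store your client already uses.

```python
import json
import time
from pathlib import Path

STATE_FILE = Path("session_state.json")  # placeholder; use your client's durable store
BREAK_INTERVAL_S = 3 * 60 * 60           # three-hour floor for known minors

def load_or_start_session(now: float | None = None) -> dict:
    """Read the persisted session anchor, or create one for a fresh session."""
    now = now or time.time()
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    state = {"session_start": now, "last_break_reminder": now}
    STATE_FILE.write_text(json.dumps(state))
    return state

def break_reminder_due(state: dict, now: float | None = None) -> bool:
    """True when three hours have elapsed since the last reminder, across restarts."""
    now = now or time.time()
    return now - state["last_break_reminder"] >= BREAK_INTERVAL_S

def mark_reminder_sent(state: dict, now: float | None = None) -> None:
    state["last_break_reminder"] = now or time.time()
    STATE_FILE.write_text(json.dumps(state))  # persist so a crash cannot skip a reminder
```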
The industry context
SB 243 lands alongside broader state moves on age signals, provenance, and accountability. The pattern is now clear: identity, protocol, reporting. Teams that internalize that pattern will ship faster and face fewer surprises as new jurisdictions weigh similar rules.
The bottom line
SB 243 does not read like a technical specification, but for agent builders it functions like one. It makes you say who you are. It tells you how to behave when someone is at risk. It asks you to prove it with numbers. Build identity, protocol, and telemetry well, and you will not just clear a legal bar. You will also create a safer, more trustworthy product that users choose again tomorrow.