Part I — The Paradigm Shift

Chapter 3 AI as the Specification Engine

Chapter 3 — AI as the Specification Engine

Let me show you something that would have seemed like science fiction in 2015.

I open a conversation with an AI assistant. I type:

Design a FOSM lifecycle for an invoice in a consulting firm. Include states, events, guards, side-effects, and actors. The firm does retainer and project-based billing. Invoices sometimes require client approval before payment.

Forty seconds later, I have a complete specification. States I hadn’t thought of. Guards that encode business rules I would have missed in a first-pass design. Side-effects that correctly anticipate downstream consequences. A working first draft of a lifecycle that would have taken a senior domain expert two hours to sketch on a whiteboard.

This is not a demo. This is the current state of practice. And it changes everything about why FOSMs are now the right model for business software.

The Historical Barrier

Let me be honest about why FOSM-style modeling wasn’t the default before.

In Chapter 1, we established that CRUD won not because it was better, but because it was easier. The deepest reason it was easier: specification cost.

Designing a correct, complete state machine for a real business object is genuinely hard. You have to know the domain deeply. You have to anticipate edge cases that only emerge in production — what happens when a retainer invoice is partially paid but the retainer period has ended? What happens when a client disputes an invoice that’s already been marked partially paid? What happens when the company restructures and the counterparty entity changes mid-lifecycle?

These aren’t hypothetical edge cases. They’re real situations that any consulting firm’s finance team has navigated. And the only way to get them right in a state machine specification is to either have deep domain expertise yourself, or spend days interviewing the people who do.

Then the business changes. A new billing type is introduced. A new approval requirement is added. The state machine needs revision. Back to the whiteboard. Back to the interviews. Back to the expensive, slow process of translating business knowledge into formal specification.

This is why BPM tools always struggled. Not because workflow engines were technically wrong — the concept was sound. But the cost of maintaining accurate process specifications in a changing business was prohibitive for most organizations.

CRUD sidestepped the problem entirely: there’s nothing to specify upfront. Add a table, add some fields, build some forms. The business rules will live in someone’s head and in the application code. This is a terrible long-term trade, but the short-term savings are real.

That calculation has changed.

What AI Actually Does Well Here

Large language models are, at their core, next-token prediction systems trained on vast quantities of human text — including decades of business documentation, process manuals, software specifications, accounting standards, HR handbooks, legal agreements, and industry best practices.

What this means in practice: LLMs have internalized a remarkable amount of domain knowledge about how businesses work. They know what an invoice is in a consulting context versus a retail context. They know what “approval authority” means in finance. They know what guards should protect a contract execution. They know the difference between a vendor being “suspended” and “terminated,” and what each state implies.

When you ask an LLM to design a FOSM lifecycle, it does something that looks like reasoning but is fundamentally a synthesis of patterns it has seen in thousands of real business process documents. That synthesis is extraordinarily useful — not because it’s always right, but because it’s a strong first draft that a domain expert can evaluate and refine in minutes rather than hours.

This is the new collaboration: you provide the business context, AI provides the structured first draft, you refine it.

The AI is not replacing the domain expert. It’s doing the tedious synthesis work that previously made specification so expensive: drafting the states, naming the events, proposing the guards. The domain expert’s time is now spent reviewing and correcting, not producing from scratch. The ratio of expert time to specification quality has improved by roughly an order of magnitude.

Why FOSMs are Especially AI-Friendly FOSMs have a well-defined structure: a finite, enumerable set of states; named events; boolean guard conditions; explicit side-effects; named actors. This structure is easy for an AI to fill in. Compare this to asking an AI to "design the full application logic for an invoicing system" — an unbounded request that produces unbounded, hard-to-evaluate output. FOSMs give the AI a template with slots. The quality of the output is dramatically higher when the output has structure.

Three Demonstration Lifecycles

Let me show you what AI-generated FOSM specifications look like in practice. These examples are representative of what a capable LLM produces when prompted well. They’re not perfect — we’ll discuss evaluation in a moment — but they’re shockingly good starting points.

Example 1: Invoice Lifecycle (Consulting Firm)

Prompt:

Design a complete FOSM lifecycle for an Invoice in a consulting firm. The firm does retainer and project-based billing. Some invoices require client approval before they can be paid. Include states, events, guards, side-effects, and actors. Format as a structured specification.

AI Output:

FOSM: Invoice (Consulting)

STATES
------
draft           Invoice created, not yet sent to client
pending_approval  Sent to client, awaiting explicit approval (approval-required invoices only)
sent            Sent to client, awaiting payment (or approval waived)
viewed          Client has opened/viewed the invoice
partially_paid  Partial payment received; balance remains
paid            Full payment received
overdue         Past due date without full payment
disputed        Client has formally disputed the invoice
cancelled       Invoice cancelled before payment

EVENTS
------
invoice_sent           Consultant sends invoice to client
approval_requested     System routes invoice for client approval (if required)
client_approved        Client approves invoice for payment
client_rejected        Client rejects invoice (dispute before payment)
invoice_viewed         Client opens the invoice (tracked via read receipt or portal)
payment_received       Payment processor confirms incoming payment
payment_partial        Payment processor confirms partial payment
due_date_passed        Scheduled job fires when due date elapses
dispute_raised         Client formally raises a dispute
dispute_resolved       Both parties agree to resolution
cancelled              Invoice explicitly cancelled

TRANSITIONS
-----------
draft → pending_approval  ON approval_requested
  Guard: invoice.requires_approval? && invoice.client_email.present?
  Actors: system, billing_manager
  Side-effects: Email client with approval link; log timestamp

draft → sent  ON invoice_sent
  Guard: !invoice.requires_approval? && invoice.total > 0
  Actors: account_manager, billing_manager
  Side-effects: Email invoice to client; set sent_at; start due date timer

pending_approval → sent  ON client_approved
  Guard: approver_is_authorized_client_contact?
  Actors: client_contact
  Side-effects: Notify billing team; log approval with approver identity

pending_approval → disputed  ON client_rejected
  Guard: rejection_reason.present?
  Actors: client_contact
  Side-effects: Alert account manager; pause due date timer

sent → viewed  ON invoice_viewed
  Guard: viewer_ip_not_internal?
  Actors: system (via email tracking pixel or portal analytics)
  Side-effects: Log view timestamp; notify account manager if invoice > $10,000

viewed → partially_paid  ON payment_partial
  Guard: amount_received > 0 && amount_received < invoice.total
  Actors: system (Stripe webhook)
  Side-effects: Create Payment record; send partial receipt; update outstanding balance

viewed → paid  ON payment_received
  Guard: amount_received >= invoice.total
  Actors: system (Stripe webhook)
  Side-effects: Create Payment record; send receipt; close associated project billing; update revenue recognition

sent → overdue  ON due_date_passed
  Guard: current_date > due_date && status != :paid
  Actors: system (scheduled job)
  Side-effects: Send overdue notice to client; alert account manager; escalate if > 30 days

partially_paid → paid  ON payment_received
  Guard: total_received >= invoice.total
  Actors: system (Stripe webhook)
  Side-effects: Create final Payment record; send final receipt; close invoice

partially_paid → overdue  ON due_date_passed
  Guard: current_date > due_date
  Actors: system
  Side-effects: Send overdue notice referencing partial payment; escalate

overdue → paid  ON payment_received
  Guard: total_received >= invoice.total
  Actors: system
  Side-effects: Create Payment record; send receipt; log overdue resolution

overdue → disputed  ON dispute_raised
  Guard: dispute_reason.present?
  Actors: client_contact, account_manager
  Side-effects: Pause all automated reminders; alert finance director; open dispute record

disputed → sent  ON dispute_resolved
  Guard: resolution_terms_agreed? && revised_amount_set?
  Actors: finance_director, account_manager
  Side-effects: Update invoice amount if revised; notify both parties; restart due date

[draft|sent|viewed|partially_paid|overdue] → cancelled  ON cancelled
  Guard: cancellation_authorized?
  Actors: billing_manager, finance_director
  Side-effects: Notify client; void in accounting system; log reason

What’s striking about this output:

It correctly identifies pending_approval as a state that only some invoices need (retainer clients often have pre-approved billing, project clients often require explicit approval)
The viewed state is non-obvious — most developers would skip it — but it’s genuinely useful for account managers tracking engagement
The dispute_resolved transition correctly moves back to sent rather than directly to paid, because a resolved dispute often involves a revised invoice amount
The guard viewer_ip_not_internal? on the viewed transition is smart — you don’t want internal testing to trigger view notifications

Could I have designed all of this myself? Yes, eventually. But “eventually” is the key word. The AI gave me a 90% solution in 40 seconds. I spend the next 10 minutes improving the 10%.

Example 2: Hiring Pipeline FOSM

Prompt:

Design a FOSM for a hiring pipeline at a mid-size tech company. The company uses structured interviews. Include an offer negotiation phase and a background check before the final hire. Include states, events, guards, side-effects, actors.

AI Output:

FOSM: Job Application (Hiring Pipeline)

STATES
------
applied             Candidate submitted application
screening           Recruiter actively reviewing application
phone_screen        Scheduled phone screen in progress or completed
interview_panel     Panel interview scheduled or in progress
offer_pending       Hiring decision made; offer being prepared
offer_extended      Formal offer sent to candidate
offer_negotiating   Candidate has countered; negotiation in progress
background_check    Offer accepted; background check initiated
hired               Background check passed; candidate hired
rejected            Candidate rejected at any stage
withdrawn           Candidate withdrew their application
on_hold             Pipeline paused (role put on hold, not cancelled)

EVENTS
------
application_received    Application submitted through ATS
shortlisted             Recruiter marks application for phone screen
phone_screen_scheduled  Phone screen scheduled with candidate
phone_screen_passed     Recruiter marks phone screen as pass
phone_screen_failed     Recruiter marks phone screen as fail
panel_interview_scheduled  Panel interview scheduled
interview_completed     All interviewers have submitted scorecards
hiring_decision_made    Hiring committee approves proceeding to offer
offer_drafted           Recruiter finalizes offer letter
offer_sent              Offer formally sent to candidate
candidate_countered     Candidate submits counteroffer
negotiation_concluded   Both parties agree to final terms
offer_accepted          Candidate formally accepts offer
offer_declined          Candidate declines offer
background_check_initiated  Background check ordered
background_check_passed  Background check returns clear
background_check_failed  Background check returns disqualifying result
hire_confirmed          HR confirms onboarding initiated
rejected_at_stage       Any rejection at current stage
application_withdrawn   Candidate withdraws

SELECTED TRANSITIONS
--------------------
screening → phone_screen  ON phone_screen_scheduled
  Guard: recruiter_confirmed_availability? && candidate_confirmed_availability?
  Actors: recruiter, recruiting_coordinator
  Side-effects: Calendar invite to candidate; notify hiring manager

interview_panel → offer_pending  ON hiring_decision_made
  Guard: all_scorecards_submitted? && quorum_recommends_hire? && headcount_approved?
  Actors: hiring_manager, recruiting_director
  Side-effects: Notify recruiter to prepare offer; log hiring decision with scorecard summary

offer_extended → offer_negotiating  ON candidate_countered
  Guard: counter_is_within_policy_bands? || escalation_approved?
  Actors: candidate (via portal), recruiter
  Side-effects: Alert hiring manager and HR; pause offer expiry timer

offer_negotiating → background_check  ON offer_accepted
  Guard: final_terms_documented? && offer_signed?
  Actors: candidate
  Side-effects: Order background check; notify IT for provisioning prep; notify payroll

background_check → hired  ON hire_confirmed
  Guard: background_check_passed? && start_date_confirmed? && offer_letter_signed?
  Actors: hr_manager
  Side-effects: Create Employee record; trigger onboarding workflow; notify all teams; send welcome email

background_check → rejected  ON background_check_failed
  Guard: disqualifying_result_reviewed_by_hr?
  Actors: hr_manager
  Side-effects: Notify candidate per legal requirements; rescind offer; update headcount forecast

[any non-terminal state] → on_hold  ON role_put_on_hold
  Guard: business_justification_documented?
  Actors: hr_director, cfo
  Side-effects: Notify candidate professionally; pause all pipeline timers; set review date

The AI correctly models offer_negotiating as a distinct state (most CRUD systems would just call this “offer_extended” and add a notes field). The guard on candidate_countered — checking whether the counter is within compensation bands — is something a senior HR professional would add but a developer might miss. The on_hold state, which can be reached from any non-terminal state, correctly handles the real-world situation where hiring freezes.

Example 3: Vendor Management Lifecycle

Prompt:

Design a FOSM for vendor lifecycle management at a mid-size company with procurement controls. Include the RFP/evaluation phase, contract management, and ongoing performance monitoring. Include suspension for compliance failures.

AI Output:

FOSM: Vendor (Lifecycle Management)

STATES
------
prospect              Vendor identified but not yet engaged
rfp_issued            Request for proposal sent to vendor
proposal_received     Vendor has submitted proposal
under_evaluation      Procurement committee is evaluating proposal
approved              Vendor approved; legal preparing contract
contract_negotiation  Contract terms under negotiation
active                Contract signed; vendor is in active use
performance_review    Scheduled or triggered performance review
suspended             Vendor activity paused (compliance or performance issue)
terminated            Vendor relationship ended
blacklisted           Vendor barred from future engagement

SELECTED TRANSITIONS
--------------------
prospect → rfp_issued  ON rfp_sent
  Guard: rfp_approved_by_procurement? && budget_code_assigned?
  Actors: procurement_manager
  Side-effects: Email RFP to vendor contact; log RFP version; set response deadline

under_evaluation → approved  ON vendor_approved
  Guard: evaluation_score >= threshold? && no_disqualifying_flags? && required_certifications_verified?
  Actors: procurement_committee (minimum 2 members)
  Side-effects: Notify vendor; notify legal to begin contract prep; update vendor registry

active → suspended  ON compliance_failure_flagged
  Guard: compliance_issue_documented? && reviewed_by_legal?
  Actors: compliance_officer, legal_team
  Side-effects: Block new purchase orders for this vendor; notify finance; notify affected project managers; set resolution deadline

suspended → active  ON compliance_resolved
  Guard: compliance_officer_confirmed_resolution? && legal_sign_off?
  Actors: compliance_officer
  Side-effects: Re-enable purchase orders; notify affected teams; log resolution

suspended → terminated  ON termination_decision_made
  Guard: 30_day_notice_served? || material_breach_confirmed?
  Actors: cfo, legal_director
  Side-effects: Notify vendor formally; trigger contract termination clause; notify all open PO owners; schedule data deletion

active → blacklisted  ON blacklist_decision_made
  Guard: board_approved? && legal_risk_assessment_complete?
  Actors: board_resolution
  Side-effects: Update vendor registry; notify all business units; document grounds for blacklisting (legal hold)

The blacklisted state — which requires board approval — is a detail only someone who has dealt with real vendor governance issues would include. The guard requiring “minimum 2 members” for vendor approval reflects procurement control requirements. These are the kinds of domain-specific nuances that make the difference between a toy spec and a production-ready one.

Evaluating AI-Generated Lifecycles

AI output is a starting point, not a finished product. Here’s how to evaluate what you get.

What to accept without major changes:

The core set of terminal states (the endpoints) — LLMs are good at identifying these
The names of the happy-path transitions — these tend to be accurate
Common side-effects like “send email” and “log event” — these are almost always right

What to refine:

Guard conditions that are too simple (e.g., payment_received? when you need payment_amount >= invoice.total && payment_cleared_fraud_check?)
States that conflate two distinct stages — watch for states like “under review” that might actually need to be split into “legal review” and “compliance review”
Actor assignments that are too broad — “admin” as an actor is almost always a sign the AI didn’t think carefully about authorization

Red flags:

A transition that goes backward in the lifecycle without a guard that requires an explicit override and documented reason (reversals should be rare and audited, not default behavior)
Guards that reference fields that don’t exist on the object — the AI sometimes invents guards based on what should be true rather than what’s in your system
Missing edge cases for time-based events — LLMs often underspecify the “what if this never happens” paths (the deadline_passed events, the on_hold escape hatches)
Actors that are too vague — “management” is not an actor; “finance_director” is

The evaluation process takes 10–20 minutes for a typical business object. Compare that to the 2–4 hours a traditional specification session would require. The productivity gain is real, and it compounds across every object in your system.

The Expert Is Still Essential AI generates plausible specifications based on general domain knowledge. You know your specific business. The difference between a plausible invoice lifecycle and the correct invoice lifecycle for your consulting firm involves details the AI can't know: your approval thresholds, your specific client relationship types, your contractual obligations, your accounting software's limitations. The expert review step is not optional. It's where the value is created.

The Virtuous Cycle

Here is the argument from the FOSM paper that I want to spend a moment on, because it’s the most important insight in this whole framework:

AI makes FOSM practical. FOSM makes AI safe. Each enables the other.

We’ve established the first part: AI removes the specification bottleneck. But the second part — FOSM makes AI safe — is equally important and less obvious.

When you deploy an AI agent in a CRUD system, you’re giving it access to an arbitrary field-edit interface. The AI can change any field to any value. It can mark an invoice as paid without a payment. It can move a job application to hired without an interview. The action space is essentially unbounded, and the consequences of mistakes are hard to contain.

When you deploy an AI agent in a FOSM system, the picture is completely different. The AI can only trigger transitions that are defined in the state machine. Each transition has guards that must pass. The actor is logged. The current state constrains the available actions to a finite, well-defined set.

This is what the FOSM paper means by “bounded contexts for human-AI collaboration.” The FOSM is the guardrail. The AI operates inside a well-defined action space where mistakes are detectable, reversible, and auditable.

The practical implication: you can deploy AI agents to assist with invoice processing, hiring pipeline management, vendor evaluation, and dozens of other business workflows with far more confidence when those workflows are modeled as FOSMs. The AI isn’t operating on a raw database. It’s operating on a structured lifecycle with explicit rules.

AI generates FOSM specs
    ↓
FOSM provides bounded action space
    ↓
AI agents operate safely within bounded space
    ↓
AI agents generate better specs from real lifecycle experience
    ↓
(repeat)

This virtuous cycle is not theoretical. We’ll implement it directly in Part V when we build the AI bot layer for the Inloop Runway application.

How Software Development Has Changed

I want to step back and be honest about the context we’re working in, because it shapes how you should read the rest of this book.

Since 2024, software development has undergone a practical transformation that is only partly visible in industry headlines. The change is not that AI writes all the code — it’s that the economics of code production have fundamentally shifted.

Engineers who work with AI coding agents — Claude Code, OpenAI Codex, GitHub Copilot in its agentic forms — report a consistent experience: they spend most of their time writing prompts, reviewing AI-generated code, making architectural decisions, and providing domain context. They spend a fraction of their time writing code from scratch.

This is not exaggeration. A senior Rails engineer who would previously spend a week implementing a complete FOSM lifecycle for a new business object — model, state machine, transitions, actors, guards, side-effects, tests — can now do it in a day. The AI writes the boilerplate. The engineer writes the clarity: the FOSM specification that tells the AI what to build.

The most valuable skill in software development in 2026 is not syntax fluency. It is the ability to think clearly about what you want a system to do, at a level of precision that an AI can act on. FOSM specifications are exactly this: precise, structured, unambiguous descriptions of business behavior that AI coding agents can translate directly into working code.

This is why Part I of this book exists. You cannot tell an AI “build me an invoicing system.” You can tell an AI “here is a complete FOSM specification for an invoice lifecycle with 12 states and 28 transitions; implement this using the FOSM engine in our Rails 8 application.” The second instruction produces production-quality code. The first produces a demo.

The Developer as Architect Garry Tan noted that "Rails was designed for people who love syntactic sugar, and LLMs are sugar fiends." This is true in the narrow sense that Rails' conventions give LLMs exactly the scaffolding they need to produce correct code. But the deeper point is that convention-over-configuration is what LLMs need to operate effectively in a codebase at all. When the code follows predictable patterns, the LLM can navigate it, extend it, and test it reliably. FOSM provides the same convention for business logic that Rails provides for application structure.

What This Chapter Is Not

This is the last theory chapter. Chapter 4 begins the code.

I’ve deliberately kept this chapter free of implementation details, because the concepts matter independent of the implementation. The argument that FOSM + AI is a better paradigm for business software is not contingent on Rails, or Ruby, or any specific technology stack. It’s a modeling argument. It would be equally true if we were building in Python, Java, or TypeScript.

But from Chapter 4 onward, we’re going to build something concrete: a complete, production-capable business management platform using Rails 8, a hand-rolled FOSM engine, and AI integration. We’ll take the NDA lifecycle we designed in Chapter 2 and implement it first — states, transitions, guards, side-effects, the full picture in code.

By the time we reach Part V, we’ll have an AI bot that can discuss, trigger, and analyze FOSM transitions in natural language. The bot operates on the bounded action space of the FOSM — it can’t do what the state machine doesn’t allow, and everything it does is logged.

That’s the full picture. FOSM as paradigm. AI as specification engine. Rails 8 as implementation platform. Together, they make it possible for one developer to build what previously required an enterprise team.

Chapter Summary

The specification problem kept state machine modeling impractical for 30 years. Designing complete lifecycles was too expensive and too slow.

AI has solved the specification problem. LLMs can generate complete FOSM specs — states, events, guards, side-effects, actors — in minutes, for any business domain. The three examples in this chapter — invoices, hiring pipelines, vendor management — demonstrate the quality of AI-generated output and what skilled evaluation looks like.

The relationship between AI and FOSM is symbiotic: AI makes FOSM practical by removing the specification bottleneck; FOSM makes AI safe by providing bounded action spaces. The FOSM paper identifies this virtuous cycle as the central mechanism of the new paradigm.

Software development in 2026 is primarily a clarity and architecture discipline. Engineers write specifications. AI writes code. The developers who will build the most are those who can write the clearest specifications — and FOSM gives you the vocabulary to do that for any business object.

From here, we build.

In Chapter 4, we meet Rails 8 — the framework that makes the “one-person framework” vision real, and why convention-over-configuration is exactly what AI-assisted development needs.

What We Built

The historical barrier to FOSM adoption was specification cost: designing complete state machines required deep domain expertise, days of whiteboard sessions, and expensive revision cycles.
LLMs have removed this barrier. They can reason fluently about business domains and generate complete FOSM specifications — states, events, guards, side-effects, actors — in minutes.
AI-generated lifecycles are a strong starting point, not a finished product. The developer's job shifts to domain expert, specification reviewer, and architect.
There is a virtuous cycle: AI generates FOSM specs faster; FOSM makes AI deployments safer by providing constrained action spaces; safer AI generates better software.
The developer-as-prompt-engineer is now the norm, not the exception. Engineers mostly write clarity, not code. The code is reviewed, not authored.
This chapter is the last theory chapter. From here, we build.