
Measurement Standard · Version 1.0 · April 2026

The AXD Metrics Standard

Seven KPIs that measure what matters in agentic experience design - from discovery visibility and agent-assisted conversion to trust erosion and absent-state outcome quality.

Why this standard exists

You cannot improve what you cannot measure

The AXD Metrics Standard defines seven Key Performance Indicators that span the full lifecycle of agentic experience - from merchant-side discovery and conversion through to agent-side delegation, trust, and absent-state quality. Each KPI includes a precise formula, benchmark tiers, diagnostic signals, and mapping to the AXD Practice frameworks.

The first three KPIs (AIR, AACR, CSAS) measure the merchant side of agentic commerce: whether your business is visible to agents, whether agent traffic converts, and whether you can attribute transactions to specific AI surfaces. The remaining four KPIs (DCR, TEI, IFR, ASOS) measure the agent side: whether delegated tasks complete, whether trust is maintained, whether interrupts are calibrated, and whether absent-state outcomes meet human intent.

Overview

The seven KPIs at a glance

#    KPI                                        Direction
01   Assistant Inclusion Rate (AIR)             Higher ↑
02   Agent-Assisted Conversion Rate (AACR)      Higher ↑
03   Cross-Surface Attribution Score (CSAS)     Higher ↑
04   Delegation Completion Rate (DCR)           Higher ↑
05   Trust Erosion Index (TEI)                  Lower ↓
06   Interrupt Frequency Ratio (IFR)            Calibrated ⟷
07   Absent-State Outcome Score (ASOS)          Higher ↑
01

Discovery phase · Merchant-side

Assistant Inclusion Rate

AIR

The percentage of monitored AI assistant queries in your product category where your brand, product, or service is included in the agent's recommendation set. AIR measures whether your business is visible to the agentic layer - whether, when an agent is asked about your category, you appear in the answer.

Formula

AIR % = (Agent queries including your brand/product as a recommendation ÷ Total relevant agent queries monitored) × 100

Minimum 100 queries per measurement period across at least three AI surfaces (ChatGPT, Perplexity, Gemini, Copilot, etc.).

How to measure

Select 100+ representative queries across your product categories. Run them against a minimum of three AI assistant surfaces at regular intervals (weekly or fortnightly). Record whether your brand appears in the recommendation set for each query.

AIR is not a single number - it varies significantly across AI surfaces. Report per-surface and aggregate. Track longitudinally to identify trends.
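
To make the calculation concrete, here is a minimal Python sketch of the per-surface and aggregate computation. It assumes each monitoring run is logged as a (query, surface, included) tuple; the data shape and names are illustrative, not prescribed by this standard.

```python
from collections import defaultdict

def air_by_surface(results):
    """Compute AIR per AI surface and in aggregate.

    `results` is an iterable of (query, surface, included) tuples, where
    `included` is True if the brand appeared in the recommendation set.
    """
    counts = defaultdict(lambda: [0, 0])  # surface -> [included, total]
    for _query, surface, included in results:
        counts[surface][0] += int(included)
        counts[surface][1] += 1
    per_surface = {s: 100 * inc / total for s, (inc, total) in counts.items()}
    included_total = sum(inc for inc, _ in counts.values())
    queries_total = sum(total for _, total in counts.values())
    aggregate = 100 * included_total / queries_total if queries_total else 0.0
    return per_surface, aggregate

# Toy panel: six monitored queries across two surfaces
results = [
    ("best running shoes", "chatgpt", True),
    ("best running shoes", "perplexity", False),
    ("trail shoes under $150", "chatgpt", True),
    ("trail shoes under $150", "perplexity", False),
    ("waterproof hiking boots", "chatgpt", False),
    ("waterproof hiking boots", "perplexity", True),
]
per_surface, aggregate = air_by_surface(results)
print(per_surface)                         # chatgpt ~66.7, perplexity ~33.3
print(f"Aggregate AIR: {aggregate:.1f}%")  # Aggregate AIR: 50.0%
```

Note how the aggregate (50%) conceals a large per-surface delta in the toy data - exactly why the standard requires per-surface reporting alongside the aggregate.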

Benchmark tiers

Poor

<5%

Invisible to the agentic layer. Agents do not include your brand in recommendations. Structured data, entity authority, and content freshness all require immediate attention.

Developing

5–25%

Intermittent visibility. Appearing in some queries but not consistently. Likely present in one AI surface but absent from others. Schema and content gaps are the probable cause.

Proficient

25–60%

Consistent presence across multiple AI surfaces. Structured data is comprehensive and regularly updated. Entity authority is established in your primary categories.

Exemplary

>60%

Dominant agentic visibility. Your brand is a default recommendation in your category. Structured data is comprehensive, fresh, and semantically rich. Entity graphs are well-connected.

Raises AIR

Complete JSON-LD product markup, product data coverage >90%, established entity authority in knowledge graphs, regular content publication with structured data, multi-surface optimisation strategy.

Watch for

AIR varies significantly across AI surfaces. A brand may score 40% on ChatGPT but 5% on Perplexity. Always measure per-surface and investigate the delta - it reveals which surfaces your structured data strategy is reaching.

Reduces AIR

Inconsistent entity naming across platforms, missing or incomplete Schema.org markup, absence from major product directories and knowledge bases, stale content with outdated product information.

02

Transaction phase · Merchant-side

Agent-Assisted Conversion Rate

AACR

The percentage of agent-referred sessions that result in a completed transaction. AACR measures whether your commerce infrastructure can convert agent-driven traffic into revenue. It is the agentic equivalent of e-commerce conversion rate, but applied specifically to sessions where an AI agent has referred, recommended, or directly facilitated the purchase.

Formula

AACR % = (Transactions completed via agent referral or agent-assisted checkout ÷ Total agent-referred sessions) × 100

Segment by AI surface (ChatGPT, Perplexity, Copilot, etc.) to identify surface-specific conversion gaps.

How to measure

Identify agent-referred sessions through UTM parameters, referrer headers, or API-based attribution. Track the full session from agent referral through to transaction completion.

AACR requires attribution infrastructure (see CSAS). Without reliable surface identification, AACR cannot be accurately segmented.
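
A minimal sketch of the segmented calculation, assuming agent-referred sessions have already been identified and tagged with an originating surface (see CSAS); the dict keys are illustrative.

```python
from collections import defaultdict

def aacr_by_surface(sessions):
    """Compute AACR per AI surface from agent-referred session records."""
    tally = defaultdict(lambda: [0, 0])  # surface -> [conversions, sessions]
    for s in sessions:
        tally[s["surface"]][0] += int(s["converted"])
        tally[s["surface"]][1] += 1
    return {surface: 100 * conv / total for surface, (conv, total) in tally.items()}

sessions = [
    {"surface": "chatgpt", "converted": True},
    {"surface": "chatgpt", "converted": False},
    {"surface": "perplexity", "converted": False},
    {"surface": "perplexity", "converted": False},
]
print(aacr_by_surface(sessions))  # {'chatgpt': 50.0, 'perplexity': 0.0}
```

A surface-level gap like the one above points to a surface-specific checkout blocker rather than a general conversion problem.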

Benchmark tiers

Poor

<0.5%

Agent traffic is arriving but not converting. Checkout infrastructure is likely incompatible with agent-mediated sessions. Human authentication steps, CAPTCHA barriers, or session-based pricing are probable blockers.

Developing

0.5–2%

Some agent-assisted conversions occurring, likely through traditional web checkout rather than protocol-based transactions. Conversion path exists but is not optimised for agent mediation.

Proficient

2–5%

Agent-assisted checkout is functional. Protocol integration (ACP, UCP) is enabling direct agent transactions. Conversion rates are approaching traditional e-commerce benchmarks for the category.

Exemplary

>5%

Agent-native checkout infrastructure is fully operational. Agents can complete transactions without browser-based checkout. Real-time inventory, dynamic pricing, and payment tokenisation are all agent-accessible.

Raises AACR

ACP/UCP protocol integration, real-time inventory API availability, agent-compatible payment tokenisation, browserless checkout capability, structured product data with pricing and availability.

Watch for

AACR is the diagnostic bridge between AIR and revenue. High AIR with low AACR means agents are recommending you but your checkout infrastructure cannot convert agent-mediated sessions. This is the most common pattern in early agentic commerce.

Reduces AACR

Human-only authentication steps in checkout, CAPTCHA or bot-detection blocking agent sessions, inconsistent pricing between API and storefront, stale inventory data, absence from agent payment sandboxes.

03

Attribution phase · Merchant-side

Cross-Surface Attribution Score

CSAS

The percentage of agent-assisted transactions for which you can reliably identify which AI surface (ChatGPT, Perplexity, Gemini, Copilot, etc.) originated or influenced the purchase. CSAS measures your attribution infrastructure's ability to track the agentic customer journey across multiple AI surfaces.

Formula

CSAS % = (Agent-assisted transactions with verified surface attribution ÷ Total agent-assisted transactions) × 100

Unattributed agent transactions default to 'direct' or 'dark traffic' in most analytics platforms, making them invisible to marketing attribution.

How to measure

Implement server-side event collection with surface-specific UTM parameters. Use referrer header analysis, API-based attribution, and first-party cookie strategies to identify the originating AI surface.

CSAS requires investment in attribution infrastructure before agent traffic volumes grow. Retrofitting attribution is significantly harder than building it from the start.
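
The sketch below shows one plausible attribution decision in Python: prefer an explicit utm_source, fall back to the referrer host, and treat everything else as dark traffic. The host-to-surface mapping is illustrative - real referrer values vary by surface and change over time.

```python
from urllib.parse import urlparse, parse_qs

# Illustrative referrer-host to surface mapping - maintain and verify your own.
REFERRER_HOSTS = {
    "chatgpt.com": "chatgpt",
    "www.perplexity.ai": "perplexity",
    "gemini.google.com": "gemini",
    "copilot.microsoft.com": "copilot",
}

def attribute_surface(landing_url, referrer):
    """Return the originating AI surface for a session, or None if dark."""
    utm = parse_qs(urlparse(landing_url).query).get("utm_source", [None])[0]
    if utm in REFERRER_HOSTS.values():
        return utm
    host = urlparse(referrer).hostname if referrer else None
    return REFERRER_HOSTS.get(host)

def csas(transactions):
    """CSAS % = attributed agent transactions / all agent transactions x 100."""
    attributed = sum(1 for t in transactions
                     if attribute_surface(t["landing_url"], t["referrer"]))
    return 100 * attributed / len(transactions) if transactions else 0.0

txns = [
    {"landing_url": "https://shop.example/p/1?utm_source=perplexity", "referrer": ""},
    {"landing_url": "https://shop.example/p/2", "referrer": "https://chatgpt.com/"},
    {"landing_url": "https://shop.example/p/3", "referrer": ""},  # dark traffic
]
print(f"CSAS: {csas(txns):.0f}%")  # CSAS: 67%
```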

Benchmark tiers

Poor

<20%

Most agent-assisted transactions are unattributed. Marketing spend cannot be allocated to agentic channels. Agent ROI is unmeasurable. Attribution infrastructure needs immediate investment.

Developing

20–50%

Partial attribution in place. Some AI surfaces are identifiable (typically those with clear referrer headers) but others remain dark. Server-side collection is incomplete.

Proficient

50–80%

Majority of agent-assisted transactions are attributed to specific AI surfaces. Server-side event collection is operational. Marketing can allocate spend to agentic channels with reasonable confidence.

Exemplary

>80%

Comprehensive cross-surface attribution. Agent journey mapping is operational across all major AI surfaces. Marketing attribution models include agentic channels as first-class touchpoints.

Raises CSAS

Server-side event collection, surface-specific UTM parameter strategies, first-party cookie attribution, API-based referral tracking, dedicated agentic channel taxonomy in analytics platform.

Watch for

CSAS below 20% is normal in early implementation. The goal at this stage is to have measurement infrastructure in place before agent traffic scales. Prioritise coverage breadth over attribution precision.

Reduces CSAS

Client-side-only analytics (blocked by agent sessions), missing referrer header capture, generic UTM parameters that don't distinguish AI surfaces, reliance on last-click attribution models.

04

Delegation phase · Agent-side

Delegation Completion Rate

DCR

The percentage of delegated tasks that complete to their stated outcome without human abandonment, override, or unrecoverable failure. DCR measures the end-to-end reliability of the delegation lifecycle - from intent specification through execution to outcome delivery.

Formula

DCR % = (Delegated tasks completing to stated outcome without abandonment or override ÷ Total delegated tasks initiated) × 100

Distinguish between user-initiated abandonment (trust failure), system-initiated failure (capability failure), and human override (calibration failure). Each has different design implications.

How to measure

Log all task lifecycle events: delegation initiation, constraint specification, execution milestones, completion confirmation, and any abandonment or override events. Calculate DCR from the ratio of clean completions to total initiations.

Begin with a single agent task type to establish baseline measurement methodology before expanding to the full task portfolio.
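
A minimal sketch of the calculation, keeping the three failure modes from the formula note separate so each can drive its own design response; the event names are illustrative.

```python
from collections import Counter

# Terminal lifecycle events - names are illustrative, not prescribed.
COMPLETED = "completed"
ABANDONED = "user_abandoned"      # trust failure
FAILED = "system_failed"          # capability failure
OVERRIDDEN = "human_overridden"   # calibration failure

def dcr(task_outcomes):
    """Return DCR % plus a breakdown of terminal outcomes.

    `task_outcomes` holds one terminal event per initiated task.
    """
    counts = Counter(task_outcomes)
    total = sum(counts.values())
    rate = 100 * counts[COMPLETED] / total if total else 0.0
    return rate, counts

outcomes = [COMPLETED, COMPLETED, ABANDONED, COMPLETED, OVERRIDDEN, FAILED]
rate, breakdown = dcr(outcomes)
print(f"DCR: {rate:.0f}%")   # DCR: 50%
print(dict(breakdown))       # each failure mode has different design implications
```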

Benchmark tiers

Poor

<40%

Most delegated tasks fail to complete. Users are abandoning or overriding agent actions frequently. Intent specification, constraint encoding, or execution reliability are fundamentally broken.

Developing

40–65%

Completion rates vary by task complexity. Simple, well-bounded tasks complete reliably; complex multi-step delegations generate failures. Constraint encoding and exception handling need attention.

Proficient

65–85%

Reliable completion across standard task portfolio. Failures are concentrated in edge cases and novel task types. Progressive delegation is working - users are expanding scope based on demonstrated reliability.

Exemplary

>85%

High-fidelity delegation lifecycle. Failures are rare, well-handled, and informing ongoing improvement. Users trust the agent with increasingly complex and consequential tasks.

Raises DCR

Plan Preview before execution, ambiguity negotiation at delegation time, explicit constraint encoding, progressive delegation (simple tasks first), well-designed exception handling and recovery paths.

Watch for

DCR can be artificially inflated by scope narrowing - if the agent only accepts tasks it knows it can complete, DCR rises but user value falls. Monitor task acceptance rate alongside DCR.

Reduces DCR

Underspecified intent at delegation, missing constraint encoding, absent exception handling, silent failure modes where the agent fails without notification, and no recovery path for partial completions.

05

Relationship phase · Agent-side

Trust Erosion Index

TEI

The percentage of active agent users who reduce their agent's autonomy level, revoke previously granted permissions, or disengage from agent interaction within a 30-day measurement window. TEI is a lagging indicator - it measures the consequence of trust failures that have already occurred.

Formula

TEI % = (Users who reduced agent autonomy, revoked permissions, or disengaged within 30 days ÷ Total active agent users in the measurement cohort) × 100

Lower is better. Measure at 30-, 60-, and 90-day intervals to distinguish between acute trust events and gradual erosion patterns.

How to measure

Track autonomy level changes, permission revocations, and engagement frequency for each user over rolling 30-day windows. A user who reduces autonomy from 'full' to 'supervised' or revokes a previously granted permission counts as a trust erosion event.

TEI requires a defined autonomy model with measurable levels. Without explicit autonomy tiers, trust erosion cannot be quantified - only inferred from engagement decline.
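
A minimal sketch of the windowed calculation, assuming an explicit autonomy model with ordered tiers and a per-user event log; the tier names and event fields are illustrative.

```python
from datetime import date, timedelta

AUTONOMY_LEVELS = {"supervised": 0, "semi_autonomous": 1, "full": 2}  # illustrative tiers

def is_erosion_event(event):
    """True if the event reduces autonomy, revokes a permission, or disengages."""
    if event["type"] == "autonomy_change":
        return AUTONOMY_LEVELS[event["to"]] < AUTONOMY_LEVELS[event["from"]]
    return event["type"] in ("permission_revoked", "disengaged")

def tei(users, window_end, window_days=30):
    """TEI % = users with >= 1 erosion event in the window / active users x 100."""
    window_start = window_end - timedelta(days=window_days)
    eroded = sum(
        1 for u in users
        if any(is_erosion_event(e) for e in u["events"]
               if window_start <= e["date"] <= window_end)
    )
    return 100 * eroded / len(users) if users else 0.0

users = [
    {"events": [{"type": "autonomy_change", "from": "full", "to": "supervised",
                 "date": date(2026, 4, 10)}]},
    {"events": [{"type": "permission_revoked", "date": date(2026, 3, 1)}]},  # outside window
    {"events": []},
]
print(f"TEI: {tei(users, date(2026, 4, 30)):.0f}%")  # TEI: 33%
```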

Benchmark tiers

Poor

>40%

Severe trust erosion. Nearly half of users are reducing agent autonomy or disengaging. Likely caused by a systemic failure: silent errors, actions outside mandate, or poor Explainability. Immediate investigation required.

Developing

20–40%

Significant trust erosion concentrated in specific user segments or task types. Pattern analysis needed to identify whether the cause is onboarding failure, capability mismatch, or communication breakdown.

Proficient

5–20%

Moderate trust erosion within expected range. Some users are recalibrating autonomy levels based on experience - this is healthy. Investigate cases where erosion leads to full disengagement.

Exemplary

<5%

Minimal trust erosion. Users are maintaining or increasing agent autonomy over time. Trust calibration is working. Relationship temporality is positive.

Reduces TEI

Accurate capability representation during onboarding, Plan Preview before consequential actions, proactive communication when operating near constraint boundaries, well-designed Failure Architecture with honest error reporting.

Watch for

TEI of 0% is not necessarily ideal - it may indicate users are not engaged enough to form trust expectations. Some trust recalibration is healthy. Focus on preventing erosion that leads to permanent disengagement.

Increases TEI

Silent failures (agent fails without notification), actions outside stated mandate, poor Explainability (user cannot understand why agent acted), overpromising during onboarding, and missing recovery paths after errors.

06

Active operation · Agent-side

Interrupt Frequency Ratio

IFR

The number of agent-initiated human interrupts per 100 autonomous actions completed. IFR measures whether the agent is calibrating its interrupt behaviour appropriately - asking for human input when genuinely needed, and operating autonomously when confidence is justified.

Formula

IFR = (Agent-initiated human interrupts ÷ Total autonomous actions completed) × 100

Segment by action consequence level. The target is not zero - the target is appropriate calibration. High-consequence actions should have higher IFR than routine operations.

How to measure

Log every agent-initiated interrupt (confirmation request, clarification query, escalation) and every autonomous action completed without interrupt. Calculate the ratio per 100 actions.

Segment IFR by action consequence level (low, medium, high). A well-calibrated agent should show higher IFR for high-consequence actions and lower IFR for routine operations.
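
A minimal sketch of the segmented calculation. It treats each logged action as either an agent-initiated interrupt or an autonomous completion, and reports interrupts per 100 autonomous completions per consequence tier; the field names are assumptions.

```python
from collections import defaultdict

def ifr_by_consequence(actions):
    """Interrupts per 100 autonomous completions, per consequence tier."""
    tally = defaultdict(lambda: [0, 0])  # tier -> [interrupts, autonomous]
    for a in actions:
        idx = 0 if a["interrupted"] else 1
        tally[a["consequence"]][idx] += 1
    return {tier: (100 * interrupts / autonomous if autonomous else float("inf"))
            for tier, (interrupts, autonomous) in tally.items()}

actions = (
    [{"consequence": "high", "interrupted": True}] * 2
    + [{"consequence": "high", "interrupted": False}] * 8
    + [{"consequence": "low", "interrupted": True}]
    + [{"consequence": "low", "interrupted": False}] * 99
)
print(ifr_by_consequence(actions))  # high: 25.0, low: ~1.0 - a calibrated profile
```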

Benchmark tiers

Over-interrupting

>25

Agent is interrupting too frequently. Users experience 'confirmation fatigue' and begin ignoring or auto-approving interrupts - defeating their purpose. Autonomy confidence thresholds need recalibration.

Developing

10–25

Interrupt frequency is high but not yet causing fatigue. Agent is likely using a conservative interrupt strategy appropriate for early deployment. Monitor for user bypass behaviour.

Proficient

3–10

Interrupt frequency appropriate to action consequence. Human attention directed to genuinely uncertain or high-consequence decisions.

Exemplary

<3

Agent operating with high autonomy confidence. Interrupts are rare, well-timed, and almost never rejected by human reviewers. Relationship history is informing calibration.

Optimises IFR

Consequence-weighted interrupt thresholds, accumulated relationship history informing confidence, well-designed interrupt UX that respects human attention, progressive autonomy expansion based on demonstrated reliability.

Watch for

IFR must be interpreted alongside TEI. Low IFR with high TEI means the agent is not interrupting enough - it is making autonomous decisions that erode trust. High IFR with low TEI means the agent is over-cautious but maintaining trust.

Degrades IFR

Flat interrupt thresholds that don't account for consequence level, missing relationship history, no learning from previous interrupt outcomes, interrupt UX that is disruptive or poorly timed.

07

Execution phase · Agent-side

Absent-State Outcome Score

ASOS

The percentage of autonomous actions completed while the human principal is absent that achieve their intended outcome without requiring human correction after the fact. ASOS is the master metric of the AXD discipline: it operationalises the Third Founding Principle - Absence is the Primary Use State - as a measurable, improvable performance standard. It asks, simply: how well does the agent do when no one is watching?

Formula

ASOS % = (Absent-state actions achieving intended outcome without post-execution correction ÷ Total absent-state autonomous actions completed) × 100

'Correction' includes: user reversal of agent action, user complaint about agent outcome, human override of a completed action, or outcome diverging from stated mandate.

How to measure

Identify absent-state periods through session inactivity signals, explicit 'I'm away' delegation modes, or scheduled autonomous operation windows. Track every action taken in these periods and assess outcome quality at the return state.

ASOS requires a defined outcome specification for each delegation - the intended result must be recorded at delegation time so it can be evaluated at execution time. This is why Intent Architecture is the first AXD framework: without a stated outcome, ASOS cannot be measured.
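
A minimal sketch of the return-state evaluation, assuming each absent-state action carries its delegation-time outcome specification and a list of any correction events observed afterwards; all names are illustrative.

```python
# Correction event types from the formula note - names are illustrative.
CORRECTION_EVENTS = {"reversed", "complaint", "overridden", "mandate_divergence"}

def asos(absent_actions):
    """ASOS % = absent-state actions with no post-execution correction / all x 100."""
    if not absent_actions:
        return 0.0
    clean = sum(1 for a in absent_actions
                if not (set(a["corrections"]) & CORRECTION_EVENTS))
    return 100 * clean / len(absent_actions)

actions = [
    {"stated_outcome": "reorder filters by Friday", "corrections": []},
    {"stated_outcome": "book cheapest flight", "corrections": ["reversed"]},
    {"stated_outcome": "archive resolved tickets", "corrections": []},
]
print(f"ASOS: {asos(actions):.0f}%")  # ASOS: 67%
```

The stated_outcome field is what makes the evaluation possible at all: it is the delegation-time record against which the return state is judged.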

Benchmark tiers

Poor

<60%

Most absent-state actions require human correction. Agent is not ready for autonomous operation. Absent-state design is not functioning. Immediate operational review required.

Developing

60–80%

Absent-state quality varies by task type. Simple, bounded tasks perform well; complex multi-step tasks generate corrections. Constraint encoding and failure recovery need attention.

Proficient

80–92%

Absent-state operation reliable across standard task portfolio. User confidence in autonomous operation is justified. Residual corrections are informing ongoing calibration.

Exemplary

>92%

Agent operating with high fidelity to human intent across absent-state periods. Accumulated memory and context are improving ASOS over time. Relationships have genuine temporality.

Raises ASOS

Precise outcome specification at delegation time, complete constraint encoding, well-designed return-state narratives, Failure Architecture that prevents silent errors, accumulated agent memory improving decision quality over time.

Watch for

ASOS and DCR together tell the full story. High DCR with low ASOS means tasks are completing but producing wrong outcomes - a goal specification failure. High ASOS with low DCR means quality is high but scope is too narrow.

Reduces ASOS

Underspecified outcome criteria at delegation, missing constraint encoding, absent memory and context continuity across sessions, silent failure modes, and Absent-State Audit methodology not applied in testing.

Deployment roadmap

Implementation sequence

Not all seven KPIs should be implemented simultaneously. The sequence below reflects both logical dependency and organisational readiness. Teams in early agentic deployment should begin with the merchant-side KPIs (01–03) before instrumenting the agent relationship KPIs (04–07).

1

Establish discovery baseline

Before anything else, measure whether you are visible. Run your first AIR test panel using a minimum of 100 representative queries across three AI surfaces. This establishes the discovery baseline against which all subsequent work is measured.

AIR
2

Stand up attribution infrastructure

Implement server-side event collection and surface-specific UTM parameters before agent traffic volumes grow. Attribution infrastructure is far harder to retrofit than to build from the start. CSAS below 20% at this stage is normal; the goal is to have measurement in place before traffic scales.

CSAS
3

Diagnose the checkout gap

Once AIR and CSAS are instrumented, measure AACR to identify whether agent traffic is converting. In most organisations at this stage, AACR will be significantly below traditional e-commerce conversion rates. The delta reveals the checkout infrastructure gap and defines the protocol integration roadmap.

AACR · CSAS
4

Instrument delegation quality

Once active agentic systems are in production, instrument DCR by logging all task lifecycle events. Begin with a single agent task type to establish baseline measurement methodology before expanding to the full task portfolio.

DCR
5

Monitor trust and autonomy dynamics

Implement TEI and IFR measurement as a pair. TEI without IFR cannot distinguish between trust erosion caused by excessive interruption and trust erosion caused by excessive autonomy. Both KPIs are required for accurate diagnosis.

TEI · IFR
6

Assess absent-state quality

ASOS is the capstone measurement. It requires intent specification infrastructure (step 4), functional absent-state operation, and a return-state assessment methodology. Implement ASOS measurement once the preceding five KPIs are established.

ASOS