Intent Is the Interface

Why designing for screens doesn't scale—and what comes next

The screen was a constraint we mistook for the product.

We design interfaces. Pixel arrangements on glass. Then we redesign them for smaller glass. Then for round glass on wrists. Then for no glass at all—voice, chat, agents.

                                 FEATURE
                                    │
                                    │
            ┌───────────┬───────────┼───────────┬───────────┐
            │           │           │           │           │
            ▼           ▼           ▼           ▼           ▼

         desktop     mobile      watch       voice       chat  



                                    ×

                         ┌─────────────────────┐
                         │                     │
                         │    3  urgency       │
                         │    3  permission    │
                         │                     │
                         │    ─────────────    │
                         │    45 designs       │
                         │                     │
                         └─────────────────────┘

                              for one feature

If you're shipping to more than two surfaces, you've felt this.


We're designing the wrong unit.

An interface is an appearance. An appearance of what?

A capability. The thing users can actually do.

"Request a ride" is a capability. The app screen is one appearance. The voice command is another. The watch tap is another.

Decades refining shadows. The object casting them got neglected.

Think of capabilities as cells in a petri dish. They exist before you observe them. The microscope determines what you see—not what's there. We'll come back to this.


Capabilities decompose into intents.

Audit any interface. Every action maps to a small set of patterns. Here are eight. There may be more, but these cover most of what we build:

         read       ────○        consume
         create     ───○┼        make new  
         update     ───○↺        modify
         delete     ───○×        remove
         send       ───○──▶      execute
         search     ───○?        find
         filter     ───○≡        narrow
         save       ───○◆        persist

Uber's "Request Ride"—three intents:

              ───○┼    create     destination
              ────○    read       price
              ───○──▶  send       confirm

Same capability. Three projections:

        ╭─────────────────────────────────────────────────────────╮
        │                                                         │
        │                                                         │
        │                    R E Q U E S T                        │
        │                       R I D E                           │
        │                                                         │
        │                   ───○┼  ────○  ───○──▶                  │
        │                                                         │
        │                          │                              │
        │                          │                              │
        │            ┌─────────────┼─────────────┐                │
        │            │             │             │                │
        │            ▼             ▼             ▼                │
        │                                                         │
        │       ┌─────────┐  ┌─────────┐  ┌─────────┐             │
        │       │         │  │         │  │         │             │
        │       │ mobile  │  │  voice  │  │  watch  │             │
        │       │         │  │         │  │         │             │
        │       │  form   │  │ "get me │  │   tap   │             │
        │       │ button  │  │ a ride" │  │   last  │             │
        │       │         │  │         │  │   dest  │             │
        │       └─────────┘  └─────────┘  └─────────┘             │
        │                                                         │
        │                                                         │
        │              same intents, different expressions        │
        │                                                         │
        ╰─────────────────────────────────────────────────────────╯
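In code, the unit looks something like this. A minimal sketch in TypeScript; the names (Intent, Capability, requestRide) are illustrative, not a standard:

    // The eight intent patterns from the table above.
    type Intent =
      | "read" | "create" | "update" | "delete"
      | "send" | "search" | "filter" | "save";

    // A capability is a named behavior plus the intents that compose it.
    // Nothing here says screen, layout, or widget.
    interface Capability {
      name: string;
      intents: Intent[];
    }

    // "Request a ride", decomposed as above: create a destination,
    // read a price, send a confirmation.
    const requestRide: Capability = {
      name: "request-ride",
      intents: ["create", "read", "send"],
    };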

Now add context.

send during calm = button with confirmation.

send during crisis = full-screen, no confirmation, haptic feedback.

Same intent. Different situation. Different expression.

Context isn't screen width. That's one dimension. There are at least six:

         someone late for a flight, using voice


         surface      ○──────────────────────────────●   voice
                      desktop                        


         attention    ○──────────────────────────────●   glance
                      deep                           


         urgency      ○──────────────────────────────●   crisis
                      calm                           


         expertise    ○────────────────●─────────────○   
                      novice           ▲             expert
                                       │
                                   competent


         permission   ○──────────────────────────────●   owner
                      viewer                         


         temporal     ○──────────────────────────────●   realtime
                      async
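
The same six dimensions, sketched as a type. The field names and scales are assumptions; a real system would carry more:

    // One value per dimension. Continuous dimensions are 0..1 here
    // purely for illustration.
    interface Context {
      surface: "desktop" | "mobile" | "watch" | "voice" | "chat";
      attention: number;                             // 0 deep, 1 glance
      urgency: number;                               // 0 calm, 1 crisis
      expertise: "novice" | "competent" | "expert";
      permission: "viewer" | "editor" | "owner";
      temporal: "async" | "near-realtime" | "realtime";
    }

    // The scenario above: someone late for a flight, using voice.
    const lateForFlight: Context = {
      surface: "voice",
      attention: 1,
      urgency: 1,
      expertise: "competent",
      permission: "owner",
      temporal: "realtime",
    };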

The interface isn't designed. It's derived.

        ╭─────────────────────────────────────────────────────────╮
        │                                                         │
        │                                                         │
        │     intent:send   +   urgency:crisis   +   voice        │
        │                                                         │
        │                                                         │
        │                          │                              │
        │                          │                              │
        │                          ▼                              │
        │                                                         │
        │                  ┌ ─ ─ ─ ─ ─ ─ ─ ┐                      │
        │                                                         │
        │                  │   size    max  │                     │
        │                      confirm skip                       │
        │                  │   prompt  "Book?"                    │
        │                      timeout 3s   │                     │
        │                                                         │
        │                  └ ─ ─ ─ ─ ─ ─ ─ ┘                      │
        │                                                         │
        │                                                         │
        │                   rules transform intent                │
        │                                                         │
        ╰─────────────────────────────────────────────────────────╯
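Derivation, sketched: a pure function from intent plus context to expression parameters, using the Context type above. The thresholds and the Expression shape are assumptions; the crisis branch reproduces the box:

    // What a projection needs to render a `send`. Still no pixels:
    // parameters any surface can interpret.
    interface Expression {
      size: "min" | "default" | "max";
      confirm: "require" | "skip";
      prompt: string;
      timeoutSeconds: number;
    }

    // Rules transform intent. Same intent, different context,
    // different expression.
    function deriveSend(ctx: Context): Expression {
      if (ctx.urgency > 0.8) {
        // Crisis: take the whole surface, drop the confirmation.
        return { size: "max", confirm: "skip", prompt: "Book?", timeoutSeconds: 3 };
      }
      // Calm: a button with a confirmation step.
      return {
        size: "default",
        confirm: "require",
        prompt: "Confirm booking?",
        timeoutSeconds: 30,
      };
    }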

AI broke two assumptions.

First: interfaces live on screens. Conversation has no viewport. An agent has no interface at all—until it needs to show you something.

Second: users initiate.

                             T H E   O L D   M O D E L

        ╭─────────────────────────────────────────────────────────╮
        │                                                         │
        │                                                         │
        │         ┌──────┐                       ┌───────────┐    │
        │         │      │                       │           │    │
        │         │ user │ ─────initiates─────▶  │ interface │    │
        │         │      │                       │           │    │
        │         └──────┘                       └───────────┘    │
        │              ▲                               │          │
        │              │                               │          │
        │              └─────────  waits  ─────────────┘          │
        │                                                         │
        │                                                         │
        ╰─────────────────────────────────────────────────────────╯

                             T H E   N E W   M O D E L

        ╭─────────────────────────────────────────────────────────╮
        │                                                         │
        │                                                         │
        │         ┌──────┐                       ┌───────────┐    │
        │         │      │                       │           │    │
        │         │agent │ ───────acts─────────▶ │   task    │    │
        │         │      │                       │           │    │
        │         └──────┘                       └───────────┘    │
        │              │                               │          │
        │              │                               │          │
        │              │         ┌──────┐              │          │
        │              │         │      │              │          │
        │              └───────▶ │ user │ ◀──surfaces──┘          │
        │                        │      │                         │
        │                        └──────┘                         │
        │                            │                            │
        │                            │                            │
        │                        responds                         │
        │                                                         │
        │                                                         │
        ╰─────────────────────────────────────────────────────────╯

For fifty years, design assumed you start things. Click. Tap. Navigate. The interface waits.

Agents invert this. They act, then surface for approval. You respond.

You're not initiating. You're reacting. The interface isn't a tool you operate—it's a negotiation you manage.

A reactive interface can't be a static layout. It needs context: what surface, what you're doing, how urgent, whether you need full control or just yes/no.

Systems built around screens will bolt AI on awkwardly.

Systems built around intents will treat AI as another projection surface—one where the agent initiates and the human responds.
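
What that projection surface might accept from an agent, continuing the sketches above. The names and thresholds are assumptions; the shape is the point: the agent proposes, rules decide how it surfaces, the human responds:

    // The agent renders nothing. It emits a proposal: a capability
    // it wants to exercise, with a summary and its confidence.
    interface Proposal {
      capability: Capability;     // e.g. requestRide
      summary: string;            // what the agent intends to do
      confidence: number;         // 0..1
    }

    // Whatever surfaces, the human's part is a response, not an
    // initiation.
    type HumanResponse = "approve" | "reject" | "take-over";

    // The system, not the agent, decides how a proposal surfaces.
    // High confidence in a crisis: act and inform. Middling
    // confidence: ask yes/no. Low confidence: hand over full control.
    function surfaceProposal(
      p: Proposal,
      ctx: Context,
    ): "inform" | "ask" | "hand-off" {
      if (p.confidence > 0.9 && ctx.urgency > 0.8) return "inform";
      if (p.confidence > 0.5) return "ask";
      return "hand-off";
    }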


Why hasn't anyone done this?

Because it's hard.

State management across projections. Context detection that guesses wrong. UI that shifts and feels uncanny. Rule systems that explode in complexity.

These are real problems. Unsolved.

But "hard" isn't "impossible." And the alternative—45 interfaces per feature, designed by hand—is already impossible. We just pretend otherwise.


The shift:

                    layouts       ─────────▶       rules

                    screens       ─────────▶       capabilities

                    initiating    ─────────▶       reacting

The screen isn't dying. But its monopoly is over.

Voice has no viewport. Chat has no fold. Agents have no chrome. The interfaces that survive won't be designed for screens. They'll be the ones that never assumed a screen in the first place.


Stop designing how things look.

Start designing what things do.

Intent is the interface now.


Remember the petri dish.

                              ○───────────○
                             ╱             ╲
                            ╱               ╲
                           ○        ○────────○
                            ╲      ╱
                             ╲    ╱
                              ○──○



                       capabilities as cells
                       context as lens
                       surfaces as projections

Capabilities live here—breathing, connected. Adjust the lens, see different projections. This is what design tools look like when they stop assuming screens.