Cap URN Structure
Product structure, direction tags, identity cap, normalization, and partial order
Product Structure
A Cap URN is a triple over the Tagged URN domain $U$:

$$C = U \times U \times U$$

For a Cap URN $c \in C$:

$$c = (i, o, y)$$

Where:
- i: Input dimension (the in tag value, a Media URN)
- o: Output dimension (the out tag value, a Media URN)
- y: Non-direction cap-tags (op, ext, model, language, etc.)
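As a rough illustration of the triple, the sketch below models (i, o, y) as a plain Rust struct and projects a flat, already-canonical tag map onto it. The names `CapTriple` and `project` are hypothetical and are not part of the library's API; the real `CapUrn` type may be organized differently.

```rust
use std::collections::BTreeMap;

/// Hypothetical model of the triple (i, o, y); not the library's `CapUrn` type.
#[derive(Debug, Clone, PartialEq)]
pub struct CapTriple {
    pub input: String,                  // i: value of the `in` tag
    pub output: String,                 // o: value of the `out` tag
    pub tags: BTreeMap<String, String>, // y: every remaining (non-direction) tag
}

/// Projects a flat tag map onto the three dimensions.
/// Returns `None` if `in` or `out` is missing, i.e. the map is not in canonical form.
pub fn project(mut flat: BTreeMap<String, String>) -> Option<CapTriple> {
    let input = flat.remove("in")?;
    let output = flat.remove("out")?;
    // Whatever is left after removing the direction tags is the y dimension.
    Some(CapTriple { input, output, tags: flat })
}
```

Given a map holding in, out, and op, `project` returns a triple whose `tags` map contains only op.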
String Representation
Canonical Form
A Cap URN serializes as:
cap:in="<media-urn>";out="<media-urn>";<cap-tags>
Examples:
cap:in=media:;out=media:
cap:in="media:pdf";op=extract;out="media:object"
cap:in="media:textable;form=scalar";op=prompt;out="media:textable;form=map"
Direction Tags
The in and out tags are required in canonical form:
| Tag | Purpose | Default |
|---|---|---|
| in | Input media type | media: (any) |
| out | Output media type | media: (any) |
Non-Direction Tags
All other tags form the y dimension:
| Common Tag | Purpose |
|---|---|
| op | Operation name |
| ext | File extension hint |
| model | Model identifier |
| language | Language code |
| constrained | Constrained output flag |
Parsing and Normalization
Cap URN processing distinguishes three forms:
| Form | Description |
|---|---|
| Surface syntax | What users write (may omit in/out) |
| Canonical form | Normalized representation (always has in/out) |
| Validation target | Post-normalization structure that rules check |
Surface Syntax
Users may omit direction tags. These are all valid surface syntax:
cap:op=test
cap:in;op=test
cap:in=*;op=test;out=*
Normalization to Canonical Form
During parsing, missing or wildcard direction tags are filled with media::
| Surface Syntax | Canonical Form |
|---|---|
| cap:op=test | cap:in=media:;op=test;out=media: |
| cap:in;op=test | cap:in=media:;op=test;out=media: |
| cap:in=*;op=test;out=* | cap:in=media:;op=test;out=media: |
The value * in direction tags expands to media::
in=* → in=media:
out=* → out=media:
This ensures media: is the unique identity for “any media type”.
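The direction-tag rule above can be sketched as a small function. This is a minimal illustration, not the library's parser: the name `normalize_direction` is hypothetical, and representing a valueless tag (as in cap:in;op=test) by an empty string is an assumption about how a parser might hand the value over.

```rust
/// The "any media type" Media URN used as the direction default.
const MEDIA_ANY: &str = "media:";

/// Normalizes one direction value (hypothetical helper).
/// `None` models a missing tag; `Some("")` models a valueless tag (assumed encoding).
pub fn normalize_direction(value: Option<&str>) -> String {
    match value {
        // Missing, valueless, and wildcard all collapse to the same identity value.
        None | Some("") | Some("*") => MEDIA_ANY.to_string(),
        // Anything else is kept as written.
        Some(v) => v.to_string(),
    }
}
```

Applying this to both in and out maps all three surface forms in the table to the same canonical direction values.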
Validation Target
Validation rules (CU1, CU2 in Validation Rules) apply to the canonical form, not surface syntax. After normalization:
- in and out are always present
- Their values are valid Media URNs
Quoting
Direction spec values containing ; must be quoted:
cap:in="media:pdf;bytes";op=extract;out="media:object"
Without quotes, media:pdf;bytes would parse incorrectly — the ; would be interpreted as a tag separator.
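To make the quoting rule concrete, here is a minimal quote-aware splitter. It is a sketch under stated assumptions: the function name `split_tags` is hypothetical, it operates on the text after the cap: prefix, and it does not handle escaped quotes, which the real parser may treat differently.

```rust
/// Splits the body of a Cap URN (the part after `cap:`) on `;`,
/// ignoring any `;` that appears inside double quotes.
/// Empty segments (for example from a trailing `;`) are skipped.
pub fn split_tags(body: &str) -> Vec<String> {
    let mut parts = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for ch in body.chars() {
        match ch {
            '"' => {
                // Toggle quoted state and keep the quote in the segment.
                in_quotes = !in_quotes;
                current.push(ch);
            }
            ';' if !in_quotes => {
                // An unquoted `;` ends the current tag.
                if !current.is_empty() {
                    parts.push(std::mem::take(&mut current));
                }
            }
            _ => current.push(ch),
        }
    }
    if !current.is_empty() {
        parts.push(current);
    }
    parts
}
```

With quotes, the in value survives as a single segment; without them, bytes is split off as if it were a separate cap-tag.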
Identity Cap
Definition
The identity cap is:
cap:
Which normalizes to:
cap:in=media:;out=media:
Semantics
The identity cap:
- Accepts any input (in=media:)
- Produces any output (out=media:)
- Has no operation constraints
- Has specificity 0
- Is the top of the Cap partial order
Constant
pub const CAP_IDENTITY: &str = "cap:";
Every capset must include the identity cap (CU1 in Validation Rules).
Dimension Semantics
Input Dimension (i)
The in tag specifies what input the capability accepts:
cap:in="media:pdf";...
Meaning: “This capability requires PDF input.”
Wildcard:
cap:in=media:;...
Meaning: “This capability accepts any input.”
Output Dimension (o)
The out tag specifies what output the capability produces:
cap:...;out="media:object"
Meaning: “This capability produces a JSON object.”
Wildcard:
cap:...;out=media:
Meaning: “This capability may produce any output.”
Cap-Tags Dimension (y)
Non-direction tags specify operation identity and constraints:
cap:...;op=extract;target=metadata
The y dimension is itself a Tagged URN (without prefix), using the same matching semantics.
Accessing Components
Given a Cap URN string, extract:
let cap = CapUrn::from_string("cap:in=media:pdf;op=extract;out=media:object")?;
let input: &str = cap.in_spec(); // "media:pdf"
let output: &str = cap.out_spec(); // "media:object"
let op: Option<&str> = cap.tag("op"); // Some("extract")
| Component | Type | Access |
|---|---|---|
| Input | Media URN string | in_spec() |
| Output | Media URN string | out_spec() |
| Cap-tags | Key-value map | tags, tag(key) |
Specificity
Cap URN specificity is defined in Specificity.
Examples:
cap: → 0
cap:op=extract → 1
cap:in=media:pdf;op=extract;out=media:object → 3
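The authoritative definition lives in Specificity. As a reading aid only, the sketch below uses one counting rule that happens to reproduce the three examples above: count the tags inside each direction's Media URN (so media: counts 0) and add the number of non-direction tags. The rule itself, and the names `media_tag_count` and `specificity_guess`, are assumptions and may not match the actual definition for other inputs.

```rust
/// Counts the tags in a Media URN string such as "media:pdf;bytes".
/// "media:" carries no tags and therefore counts as 0.
fn media_tag_count(media_urn: &str) -> usize {
    media_urn
        .strip_prefix("media:")
        .unwrap_or(media_urn)
        .split(';')
        .filter(|t| !t.is_empty())
        .count()
}

/// ASSUMED rule (not taken from the spec):
/// specificity = tags in `in` + tags in `out` + number of non-direction tags.
pub fn specificity_guess(input: &str, output: &str, cap_tags: &[&str]) -> usize {
    media_tag_count(input) + media_tag_count(output) + cap_tags.len()
}
```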
Partial Order
Cap URNs form a partial order (specialization order) in the product space:
graph TD
A["cap:"] --> B["cap:op=extract"]
B --> C["cap:in=media:pdf;op=extract"]
B --> D["cap:op=extract;out=media:object"]
C --> E["cap:in=media:pdf;op=extract;out=media:object"]
D --> E
E --> F["cap:in=media:pdf;v=2.0;op=extract;out=media:object;target=metadata"]
style A fill:none,stroke:#888
style F fill:none,stroke:#888
The ordering follows from the dispatch relation. Higher nodes are more general (subsume nodes below them).
Relationship to Media URNs
Direction Values Are Media URNs
The in and out tag values are themselves Media URNs:
cap:in="media:pdf;bytes";out="media:object"
        ↑                     ↑
        Media URN             Media URN
Matching Uses Media URN Semantics
When matching direction specs, use Media URN matching:
let provider_in = MediaUrn::from_string("media:bytes")?;
let request_in = MediaUrn::from_string("media:pdf;bytes")?;
// For dispatch: request_in must conform to provider_in
request_in.conforms_to(&provider_in) // true
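One way to build intuition for the result above is to model a Media URN as a set of tags and treat conformance as set inclusion: the request conforms when it carries every tag the provider asks for. This is an assumed simplification, not the library's `conforms_to` implementation; the names `media_tags` and `conforms_to_sketch` are hypothetical, and key=value tags are compared as opaque strings.

```rust
use std::collections::BTreeSet;

/// Splits a Media URN string into its set of tags ("media:" yields the empty set).
fn media_tags(media_urn: &str) -> BTreeSet<&str> {
    media_urn
        .strip_prefix("media:")
        .unwrap_or(media_urn)
        .split(';')
        .filter(|t| !t.is_empty())
        .collect()
}

/// ASSUMED model: `request` conforms to `provider` when the provider's tags
/// are a subset of the request's tags.
pub fn conforms_to_sketch(request: &str, provider: &str) -> bool {
    media_tags(provider).is_subset(&media_tags(request))
}
```

Under this model every Media URN conforms to media:, which matches its role as the "any media type" value, and media:pdf;bytes conforms to media:bytes but not the other way around.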
Common Patterns
Generic Capability
cap:op=transform
Accepts any input, produces any output, performs “transform”.
Typed Transformer
cap:in="media:pdf";op=extract;out="media:object"
Takes PDF, produces object.
Constrained Generation
cap:in="media:textable;form=scalar";op=generate;out="media:textable;form=map";constrained
Takes text prompt, produces structured output with constraints.
Identity (Pass-through)
cap:
The identity morphism. Required in all capsets.
Summary
| Concept | Definition |
|---|---|
| Structure | $C = U \times U \times U$ (product of Tagged URN domain) |
| Components | (in, out, y) |
| Identity | cap: → cap:in=media:;out=media: |
| Direction defaults | Missing or * → media: |
| Canonical form | Always includes in and out |
Cap URNs extend Tagged URNs with three-dimensional structure. The dispatch relation defines how these dimensions interact for routing.