URN Syntax
Tagged URN canonicalization and CAP-specific parsing behavior
Contract
Tagged URNs require a prefix (prefix:...). Prefix and keys are lowercased on parse. Unquoted values are lowercased; quoted values preserve case.
Value-less tags (for example cap:op) parse as * and serialize back in value-less form.
Input: CAP:Op=Extract;Format=pdf
Output: cap:format=pdf;op=extract

Input: cap:key="UPPER"
Output: cap:key="UPPER"
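The contract can be sketched as a small standalone function. This is an illustration only, not the tagged-urn-rs implementation: the name canonicalize is hypothetical, errors are plain strings, quoted values are assumed to contain no ; and no escape sequences, and the quoting rule (quote only when lowercasing would change the value) is a simplifying assumption, since the real smart-quoting rule is not specified here.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: canonicalize a tagged URN string per the contract above.
fn canonicalize(input: &str) -> Result<String, String> {
    // A prefix is required: everything before the first ':'.
    let (prefix, rest) = match input.split_once(':') {
        Some((p, r)) if !p.is_empty() => (p, r),
        _ => return Err("missing prefix".to_string()),
    };

    // BTreeMap iterates in key order, which yields the sorted tag order.
    let mut tags: BTreeMap<String, String> = BTreeMap::new();
    for part in rest.split(';').filter(|p| !p.is_empty()) {
        let (key, value) = match part.split_once('=') {
            // Value-less tag: stored as the wildcard `*`.
            None => (part, "*".to_string()),
            Some((_, "")) => return Err(format!("empty value in {}", part)),
            Some((k, v)) => {
                let v = match v.strip_prefix('"').and_then(|s| s.strip_suffix('"')) {
                    Some(quoted) => quoted.to_string(), // quoted: case preserved
                    None => v.to_lowercase(),           // unquoted: lowercased
                };
                (k, v)
            }
        };
        // Keys are lowercased before insertion, so `a` and `A` collide.
        if tags.insert(key.to_lowercase(), value).is_some() {
            return Err(format!("duplicate key {}", key));
        }
    }

    let body: Vec<String> = tags
        .iter()
        .map(|(k, v)| {
            if v.as_str() == "*" {
                k.clone() // wildcard serializes back in value-less form
            } else if *v != v.to_lowercase() {
                format!("{}=\"{}\"", k, v) // assumed rule: quote to keep case
            } else {
                format!("{}={}", k, v)
            }
        })
        .collect();
    Ok(format!("{}:{}", prefix.to_lowercase(), body.join(";")))
}
```

Under these assumptions, canonicalize("CAP:Op=Extract;Format=pdf") returns cap:format=pdf;op=extract, and cap:key="UPPER" round-trips unchanged.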
CAP Direction Tags: Parser vs Canonical Form
CapUrn::from_string requires the cap: prefix but does not require explicit in/out tags in input text. Missing or wildcard direction tags are normalized to media:.
cap: -> cap:in=media:;out=media:
cap:in -> cap:in=media:;out=media:
cap:out -> cap:in=media:;out=media:
cap:in=media:text;out -> cap:in="media:text";out=media:
cap:in=*;out=* -> cap:in=media:;out=media:
Canonical strings always include both in and out, even when the input omitted them.
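The direction defaults can be illustrated on an already-parsed tag map. This is a minimal sketch, not the capdag code: normalize_directions is a hypothetical name, and the map is assumed to hold lowercased keys with value-less tags stored as *. Serialization is left out because the quoting of direction values is not specified here.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: apply the CAP direction defaults to parsed tags.
fn normalize_directions(tags: &mut BTreeMap<String, String>) {
    for key in ["in", "out"] {
        // A missing tag is inserted as the wildcard first, so the missing
        // and wildcard cases share the same rewrite below.
        let value = tags
            .entry(key.to_string())
            .or_insert_with(|| "*".to_string());
        if value.as_str() == "*" {
            *value = "media:".to_string();
        }
    }
}
```

An explicit direction such as in=media:text is left untouched; only missing or wildcard directions are rewritten to media:.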
Algorithm
- Parse prefix and tags using tagged URN parser.
- Normalize prefix/keys and unquoted values to lowercase.
- Treat value-less tags as wildcard *.
- For CAP URNs, map missing/wildcard in/out to media:.
- Serialize tags in sorted order with smart quoting.
Errors
- Missing prefix: parse error.
- Duplicate key: parse error.
- Empty value after =: parse error.
- Invalid CAP prefix (not cap:): CapUrnError::MissingCapPrefix.
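The CAP prefix check can be sketched in isolation. The enum and function names below are hypothetical stand-ins rather than the capdag types, and the case-insensitive comparison is an assumption based on prefixes being lowercased on parse.

```rust
/// Hypothetical stand-in for the CAP error type; only one variant is modeled.
#[derive(Debug, PartialEq)]
enum SketchCapUrnError {
    MissingCapPrefix,
}

/// Hypothetical helper: return the tag portion if the input has a `cap:` prefix.
fn check_cap_prefix(input: &str) -> Result<&str, SketchCapUrnError> {
    match input.split_once(':') {
        // Assumption: `CAP:` is accepted because prefixes are lowercased on parse.
        Some((prefix, rest)) if prefix.eq_ignore_ascii_case("cap") => Ok(rest),
        // Any other prefix, or no prefix at all, is rejected.
        _ => Err(SketchCapUrnError::MissingCapPrefix),
    }
}
```

In this sketch a URN with a different prefix, such as media:text, is rejected in the same way as input with no prefix at all.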
References
- tagged-urn-rs/src/tagged_urn.rs:75, tagged-urn-rs/src/tagged_urn.rs:94, tagged-urn-rs/src/tagged_urn.rs:138, tagged-urn-rs/src/tagged_urn.rs:238
- capdag/src/urn/cap_urn.rs:75, capdag/src/urn/cap_urn.rs:93, capdag/src/urn/cap_urn.rs:138, capdag/src/urn/cap_urn.rs:153
- tagged-urn-rs/src/tagged_urn.rs:999, tagged-urn-rs/src/tagged_urn.rs:1020, tagged-urn-rs/src/tagged_urn.rs:1770
- capdag/src/urn/cap_urn.rs:721