Canonicalization Specification

v1.0 Stable ZLAR Canonicalization Specification — April 2026

This specification defines the canonical form for ZLAR Governed Action Receipts and audit trail entries. All implementations that produce or verify signatures over ZLAR JSON structures must follow this specification.

This specification adopts RFC 8785 (JSON Canonicalization Scheme) with a constrained schema that eliminates its known failure modes. Any ZLAR-canonical JSON is also RFC 8785-canonical.

Overview

ZLAR signs JSON structures with Ed25519 over the SHA-256 hash of their canonical form. For signatures to verify across implementations (bash, Node.js, Python, Go, Rust), every implementation must produce byte-identical canonical output for the same logical data.

The canonical form is computed by:

  1. Recursively sorting all object keys
  2. Serializing as compact JSON with no whitespace between tokens
  3. Encoding as UTF-8 bytes
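
The three steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a reference implementation: the function name canonicalize is chosen here for the example, and the sketch assumes the input already satisfies the schema constraints (ASCII-only keys, no floats).

```python
import json


def canonicalize(value) -> bytes:
    """Return canonical UTF-8 bytes for a JSON-compatible value.

    Assumes the value already conforms to the constrained schema;
    no validation is performed here.
    """
    text = json.dumps(
        value,
        sort_keys=True,         # step 1: keys sorted at every nesting level
        separators=(",", ":"),  # step 2: compact, no whitespace between tokens
        ensure_ascii=False,     # keep non-ASCII characters literal
    )
    return text.encode("utf-8")  # step 3: UTF-8 bytes


receipt = {"b": 1, "a": {"d": 2, "c": [3, 2]}}
print(canonicalize(receipt))  # b'{"a":{"c":[3,2],"d":2},"b":1}'
```

Note that only object keys are reordered; array elements keep their original order.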

Schema Constraints

ZLAR canonical form applies to JSON structures that conform to these type restrictions:

Allowed types

Prohibited types

Rationale: Floating-point serialization is the primary source of cross-language canonicalization failures. JavaScript's JSON.stringify(100.0) produces "100". Python's json.dumps(100.0) produces "100.0". Eliminating floats from the schema eliminates this entire vulnerability class.

Key Ordering

Object keys must be sorted recursively, at every nesting level, by Unicode code point order. Because the constrained ZLAR schema permits only ASCII keys, this ordering is identical to lexicographic byte order, to the UTF-16 code unit order required by RFC 8785, and to plain ASCII order (in which uppercase letters sort before lowercase).
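
The equivalence of these orderings for ASCII keys can be checked with a short Python sketch. The key lists below are made up for illustration. The second half shows, as a contrast, how code point order and UTF-16 code unit order diverge once keys outside the ASCII range are allowed, which is the situation the ASCII-only constraint avoids.

```python
def utf8_key(key: str) -> bytes:
    return key.encode("utf-8")


def utf16_key(key: str) -> bytes:
    # Big-endian UTF-16 bytes compare the same way as 16-bit code units.
    return key.encode("utf-16-be")


# Hypothetical ASCII-only keys: all three orderings agree.
ascii_keys = ["b", "a", "Z", "_id", "a1"]
by_code_point = sorted(ascii_keys)
by_utf8_bytes = sorted(ascii_keys, key=utf8_key)
by_utf16_units = sorted(ascii_keys, key=utf16_key)
print(by_code_point)  # ['Z', '_id', 'a', 'a1', 'b']

# Outside ASCII the orderings can differ: U+FF5E is a single UTF-16 code
# unit (0xFF5E), while U+1F600 is a surrogate pair starting with 0xD83D.
wide_keys = ["\uff5e", "\U0001f600"]
wide_by_code_point = sorted(wide_keys)
wide_by_utf16_units = sorted(wide_keys, key=utf16_key)
print(wide_by_code_point == wide_by_utf16_units)  # False
```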

Number Representation

Integers must be serialized as decimal digits with no leading zeros, no decimal point or fractional part, and no exponent notation. Negative zero must serialize as 0.
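
A small Python sketch of this rule follows. The helper name serialize_integer is hypothetical, and the sketch does not enforce any range limit, since none is stated in this section.

```python
import json


def serialize_integer(n) -> str:
    """Serialize an integer as plain decimal digits; reject anything else."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(n, bool) or not isinstance(n, int):
        raise TypeError(f"not an integer: {n!r}")
    return json.dumps(n)


print(serialize_integer(100))  # 100
print(serialize_integer(-42))  # -42
print(serialize_integer(-0))   # 0
```

Python integers have no separate negative zero, so -0 is already 0 before serialization. A float such as 100.0 or -0.0 is rejected with a TypeError rather than silently reformatted.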

String Serialization

Strings follow standard JSON escaping (RFC 8259). All non-ASCII characters must be output as literal UTF-8 bytes, not escaped to \uXXXX. This matches JavaScript's JSON.stringify() and RFC 8785.
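
The difference between literal and escaped output can be seen in Python, where json.dumps escapes non-ASCII characters by default and emits them literally only when ensure_ascii=False is passed. The sample strings are illustrative only.

```python
import json

sample = "caf\u00e9 \u2713"  # "café ✓"

literal = json.dumps(sample, ensure_ascii=False)  # canonical behaviour
escaped = json.dumps(sample)                      # Python default, not canonical

print(literal.encode("utf-8"))  # b'"caf\xc3\xa9 \xe2\x9c\x93"'
print(escaped)                  # "caf\u00e9 \u2713"

# Control characters are still escaped per RFC 8259 in both modes.
newline = json.dumps("a\nb", ensure_ascii=False)
print(newline)                  # "a\nb"
```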

Signing Protocol

  1. Remove the signature field from the structure
  2. Canonicalize per this specification → UTF-8 bytes
  3. Hash with SHA-256 → lowercase hex string (64 characters)
  4. Sign the hex string bytes with Ed25519
  5. Encode the 64-byte signature as base64 (v0) or base64url (v1)
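
The five steps can be sketched in Python as follows. Several details are assumptions made for the example rather than statements of this specification: the signature is taken to live in a top-level key named "signature", base64url output keeps its padding, and the Ed25519 operation is passed in as a callable (sign) so that the sketch depends only on the standard library. The function names are hypothetical.

```python
import base64
import hashlib
import json
from typing import Callable


def signing_input(structure: dict) -> bytes:
    """Steps 1-3: strip the signature, canonicalize, hash to lowercase hex."""
    # Assumption: the signature is a top-level key named "signature".
    unsigned = {k: v for k, v in structure.items() if k != "signature"}
    canonical = json.dumps(
        unsigned, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    hex_digest = hashlib.sha256(canonical).hexdigest()  # 64 lowercase hex chars
    return hex_digest.encode("ascii")  # the bytes that get signed


def sign_structure(
    structure: dict, sign: Callable[[bytes], bytes], version: int = 1
) -> str:
    """Steps 4-5: sign the hex string bytes, then encode the signature."""
    signature = sign(signing_input(structure))
    if len(signature) != 64:
        raise ValueError("Ed25519 signatures are 64 bytes")
    # v0 uses standard base64, v1 uses base64url (padding kept: an assumption).
    encode = base64.b64encode if version == 0 else base64.urlsafe_b64encode
    return encode(signature).decode("ascii")
```

Any Ed25519 implementation that maps message bytes to a 64-byte signature can be supplied as sign, for example the sign method of an Ed25519PrivateKey from the cryptography package. Note that the signed message is the 64-character hex string, not the raw 32-byte digest.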

Implementation Guidance

Language            Recommended implementation
Node.js/TypeScript  canonicalize npm package (erdtman) or recursive key-sort + JSON.stringify
Python              rfc8785 package (Trail of Bits) or recursive key-sort + json.dumps(ensure_ascii=False, separators=(',',':'))
Go                  gowebpki/jcs
Java                titanium-jcs or erdtman/java-json-canonicalization
Rust                serde_json_canonicalizer (not serde_jcs, which has a confirmed UTF-16 sorting bug)
Bash                jq -S -c '.' (valid for ZLAR-constrained structures only)

Test Vectors

The 28 test vectors published at tests/fixtures/canonicalization-vectors.json have been verified across Node.js, Python, and bash/jq. Implementations must produce byte-identical output for every vector.


Permanent URL: https://zlar.ai/specs/canonicalization
Source: docs/canonicalization-spec.md
License: Apache 2.0 — ZLAR Inc.