Content Signals
Observe content-use policy signals from multiple sources and record them as structured observations. Signals record what a publisher has expressed; they never enforce access decisions. Each signal takes one of three states: allow, deny, or unspecified.
Package: @peac/mappings-content-signals
Observation Model
Observation-only
Content signals are observations of what a publisher has expressed. PEAC records these signals as evidence but does not enforce them. The consuming application decides how to act on the observed signals.
| State | Meaning |
|---|---|
| allow | The publisher has expressed permission for this use |
| deny | The publisher has expressed denial for this use |
| unspecified | The publisher has not expressed a preference |
Install
pnpm add @peac/mappings-content-signals
Signal Sources
Four signal sources are supported, each with a normative reference:
| Source | Normative Reference | Transport |
|---|---|---|
| robots.txt | RFC 9309 (Robots Exclusion Protocol) | File at site root |
| Content-Usage | IETF AIPREF draft (draft-ietf-aipref-attach-00); Structured Fields per RFC 9651 | HTTP response header |
| Content-Signal | contentsignals.org specification | HTTP response header |
| tdmrep.json | W3C Community Group Final Report (TDM Reservation Protocol); EU DSM Directive 2019/790 Article 4 | JSON file at site root |
Source Precedence
When multiple sources express signals for the same URL, the most specific source takes precedence (DD-137):
tdmrep.json > Content-Signal > Content-Usage (AIPREF) > robots.txt
If tdmrep.json expresses deny for training and robots.txt is permissive, the observation records deny for the training purpose.
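The precedence rule can be illustrated with a minimal sketch. The type names, the `PRECEDENCE` constant, and `resolveByPrecedence` are hypothetical and written for this example only; they are not exports of @peac/mappings-content-signals. The sketch also assumes that a source reporting unspecified defers to the next source in the order.

```typescript
// Hypothetical types for illustration; not the package's actual exports.
type SignalState = 'allow' | 'deny' | 'unspecified';
type SignalSource = 'tdmrep.json' | 'Content-Signal' | 'Content-Usage' | 'robots.txt';

// Highest-precedence source first, following the order stated above.
const PRECEDENCE: readonly SignalSource[] = [
  'tdmrep.json',
  'Content-Signal',
  'Content-Usage',
  'robots.txt',
];

// Return the state from the highest-precedence source that expressed one.
// Assumption: a source reporting 'unspecified' defers to the next source.
function resolveByPrecedence(
  perSource: Partial<Record<SignalSource, SignalState>>,
): { state: SignalState; source?: SignalSource } {
  for (const source of PRECEDENCE) {
    const state = perSource[source];
    if (state === 'allow' || state === 'deny') {
      return { state, source };
    }
  }
  return { state: 'unspecified' };
}

// tdmrep.json denies training while robots.txt is permissive.
const trainingResult = resolveByPrecedence({
  'tdmrep.json': 'deny',
  'robots.txt': 'allow',
});
console.log(trainingResult); // { state: 'deny', source: 'tdmrep.json' }
```

Returning the winning source alongside the state keeps the result auditable, which fits the observation-only model: the record shows which source produced the state.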
Content-Usage Header
The Content-Usage header is parsed as an RFC 9651 Structured Fields Dictionary. Keys form a hierarchy; values are SF Tokens.
| Key | Role | Values |
|---|---|---|
| bots | Parent | y (allow), n (deny) |
| train-ai | Leaf | y (allow), n (deny) |
| train-genai | Leaf | y (allow), n (deny) |
| search | Leaf | y (allow), n (deny) |
Content-Usage: bots=n, train-ai=n, search=y
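To make the key hierarchy concrete, here is a simplified sketch of reading such a header value. It is not the package's parseContentUsage() and not a full RFC 9651 parser: it handles only bare `key=token` members. The function names are hypothetical, and the fallback from an absent leaf key to the parent `bots` is an assumption about how the hierarchy resolves.

```typescript
type UsageState = 'allow' | 'deny' | 'unspecified';

// Simplified parser: handles only bare `key=token` members separated by commas.
// It is NOT a full RFC 9651 parser (no parameters, inner lists, or strings).
function parseUsageDictionary(headerValue: string): Map<string, UsageState> {
  const entries = new Map<string, UsageState>();
  for (const member of headerValue.split(',')) {
    const [rawKey, rawValue] = member.split('=');
    const key = (rawKey ?? '').trim();
    if (key === '') continue;
    const token = (rawValue ?? '').trim();
    // y maps to allow, n maps to deny; anything else is treated as unspecified.
    entries.set(key, token === 'y' ? 'allow' : token === 'n' ? 'deny' : 'unspecified');
  }
  return entries;
}

// Assumption: a leaf key that is absent falls back to the parent key `bots`.
function usageStateFor(entries: Map<string, UsageState>, leafKey: string): UsageState {
  const leaf = entries.get(leafKey);
  if (leaf === 'allow' || leaf === 'deny') return leaf;
  return entries.get('bots') ?? 'unspecified';
}

const usageEntries = parseUsageDictionary('bots=n, train-ai=n, search=y');
console.log(usageStateFor(usageEntries, 'search'));      // allow (leaf overrides parent)
console.log(usageStateFor(usageEntries, 'train-genai')); // deny (falls back to bots)
```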
Creating an Observation
import {
  createObservation,
  parseRobotsTxt,
  parseContentUsage,
  parseTdmRep,
} from '@peac/mappings-content-signals';

// Pre-fetched inputs (no network I/O in this package)
const robotsTxt = 'User-agent: *\nDisallow: /private/';
const contentUsageHeader = 'bots=n, train-ai=n, search=y';
const tdmRepJson = {
  policies: [{
    target: '/',
    tdm_reservation: true,
  }],
};

const observation = createObservation({
  url: 'https://publisher.example.com/article',
  robotsTxt: parseRobotsTxt(robotsTxt),
  contentUsage: parseContentUsage(contentUsageHeader),
  tdmRep: parseTdmRep(tdmRepJson),
});

// observation.signals: Record<CanonicalPurpose, SignalState>
// { train: 'deny', search: 'allow', inference: 'unspecified', ... }
Purpose Mapping
Signal source keys are mapped to canonical PEAC purposes (CanonicalPurpose):
| CanonicalPurpose | Description |
|---|---|
| train | Training AI models on the content |
| search | Indexing for search engine results |
| user_action | Direct user-initiated access |
| inference | Using content during inference (retrieval, grounding) |
| index | Cataloging or indexing content |
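The sketch below shows one way source keys could be folded into a record keyed by canonical purpose. The key-to-purpose table is an assumption made for illustration (this page does not list the actual mapping), as is the rule that deny wins when two keys map to the same purpose. `KEY_TO_PURPOSE` and `toCanonicalSignals` are hypothetical names.

```typescript
type CanonicalPurpose = 'train' | 'search' | 'user_action' | 'inference' | 'index';
type PurposeState = 'allow' | 'deny' | 'unspecified';

// ASSUMED mapping from Content-Usage keys to canonical purposes (illustrative only).
const KEY_TO_PURPOSE = new Map<string, CanonicalPurpose>([
  ['train-ai', 'train'],
  ['train-genai', 'train'],
  ['search', 'search'],
]);

function toCanonicalSignals(
  keyStates: Record<string, PurposeState>,
): Record<CanonicalPurpose, PurposeState> {
  // Every purpose starts as 'unspecified' until a source key says otherwise.
  const signals: Record<CanonicalPurpose, PurposeState> = {
    train: 'unspecified',
    search: 'unspecified',
    user_action: 'unspecified',
    inference: 'unspecified',
    index: 'unspecified',
  };
  for (const [key, state] of Object.entries(keyStates)) {
    const purpose = KEY_TO_PURPOSE.get(key);
    if (purpose === undefined || state === 'unspecified') continue;
    // Assumption: once a purpose is 'deny', a later 'allow' does not override it.
    if (signals[purpose] !== 'deny') signals[purpose] = state;
  }
  return signals;
}

const canonicalSignals = toCanonicalSignals({ 'train-ai': 'deny', search: 'allow' });
console.log(canonicalSignals.train, canonicalSignals.search, canonicalSignals.inference);
// deny allow unspecified
```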
No-Fetch Architecture
All parsers in @peac/mappings-content-signals receive pre-fetched input. The package contains no network I/O and never fetches URLs. This is consistent with the PEAC no-implicit-fetch invariant (DD-55).
Parsers receive pre-fetched input
parseRobotsTxt(), parseContentUsage(), and parseTdmRep() accept raw string or object input. The caller is responsible for fetching that input before calling them.
Schema layer only (DD-141)
The package is a Layer 4 mapping with validation-only semantics. No side effects, no I/O, no state.
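Because the package performs no I/O, the calling code owns every network request. The sketch below shows one possible shape for that caller-side step, with the fetch function injected so it can be stubbed. `collectInputs`, `FetchLike`, and the other names are hypothetical and not part of the package; retrieving tdmrep.json is omitted for brevity.

```typescript
// Minimal structural types so the sketch does not depend on DOM or Node typings.
interface HeaderBagLike {
  get(headerName: string): string | null;
}
interface HttpResponseLike {
  ok: boolean;
  headers: HeaderBagLike;
  text(): Promise<string>;
}
type FetchLike = (url: string) => Promise<HttpResponseLike>;

interface PrefetchedInputs {
  robotsTxt: string | undefined;
  contentUsageHeader: string | undefined;
}

// The caller performs all network I/O and ends up with plain strings
// that can then be handed to the parsers.
async function collectInputs(pageUrl: string, fetchImpl: FetchLike): Promise<PrefetchedInputs> {
  const robotsUrl = new URL('/robots.txt', pageUrl).toString();
  const [robotsRes, pageRes] = await Promise.all([fetchImpl(robotsUrl), fetchImpl(pageUrl)]);
  return {
    robotsTxt: robotsRes.ok ? await robotsRes.text() : undefined,
    contentUsageHeader: pageRes.headers.get('Content-Usage') ?? undefined,
  };
}

// Stubbed fetch so the example runs without touching the network.
const demoFetch: FetchLike = async (url) => ({
  ok: true,
  headers: {
    get: (headerName) =>
      headerName.toLowerCase() === 'content-usage' ? 'bots=n, search=y' : null,
  },
  text: async () =>
    url.endsWith('/robots.txt') ? 'User-agent: *\nDisallow: /private/' : '<html></html>',
});

collectInputs('https://publisher.example.com/article', demoFetch).then((inputs) => {
  console.log(inputs.contentUsageHeader); // bots=n, search=y
});
```

In a runtime with a global fetch, a thin wrapper such as `(url) => fetch(url)` should satisfy `FetchLike`, since the standard Response exposes `ok`, `headers.get()`, and `text()`.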
EU Text and Data Mining
The tdmrep.json source implements the W3C TDM Reservation Protocol, which aligns with EU DSM Directive 2019/790 Article 4. PEAC observes whether a publisher has reserved text and data mining rights but does not enforce compliance. The consuming application is responsible for acting on the observed reservation status.
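As an illustration of turning a reservation into an observed state, the sketch below uses the input shape from the example on this page (`policies`, `target`, `tdm_reservation`). Longest-prefix matching, and treating an explicit non-reservation as allow, are assumptions of this sketch rather than documented behavior of parseTdmRep(). All names are hypothetical.

```typescript
// Shape follows the example input on this page; names are illustrative.
interface TdmPolicyLike {
  target: string;
  tdm_reservation: boolean;
}
interface TdmRepLike {
  policies: TdmPolicyLike[];
}
type TdmState = 'allow' | 'deny' | 'unspecified';

// Assumption: the policy with the longest matching path prefix applies.
function tdmStateForPath(tdmRep: TdmRepLike, urlPath: string): TdmState {
  let best: TdmPolicyLike | undefined = undefined;
  for (const policy of tdmRep.policies) {
    if (!urlPath.startsWith(policy.target)) continue;
    if (best === undefined || policy.target.length > best.target.length) {
      best = policy;
    }
  }
  // No matching policy: the publisher expressed nothing for this path.
  if (best === undefined) return 'unspecified';
  // Assumption: a reservation is observed as deny, an explicit non-reservation as allow.
  return best.tdm_reservation ? 'deny' : 'allow';
}

const sampleTdmRep: TdmRepLike = {
  policies: [
    { target: '/', tdm_reservation: true },
    { target: '/open/', tdm_reservation: false },
  ],
};
console.log(tdmStateForPath(sampleTdmRep, '/article'));        // deny
console.log(tdmStateForPath(sampleTdmRep, '/open/data.html')); // allow
```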
Observation, Not Enforcement
Content signals produce structured observations that can be attached to PEAC receipts as evidence of what a publisher expressed at the time of an interaction. The observation records the signal state; the consuming application decides whether to proceed, request a license, or abort.
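How a consuming application acts on an observed state is its own policy. The sketch below shows one hypothetical policy shape. `UsePolicy`, `NextStep`, and `decideNextStep` are not part of PEAC and only illustrate that the decision lives outside the observation.

```typescript
type ObservedState = 'allow' | 'deny' | 'unspecified';
type NextStep = 'proceed' | 'request_license' | 'abort';

// Hypothetical application policy: what to do for each non-allow state.
interface UsePolicy {
  onDeny: NextStep;
  onUnspecified: NextStep;
}

// The observation supplies the state; the application supplies the policy.
function decideNextStep(state: ObservedState, policy: UsePolicy): NextStep {
  switch (state) {
    case 'allow':
      return 'proceed';
    case 'deny':
      return policy.onDeny;
    case 'unspecified':
      return policy.onUnspecified;
  }
}

const cautiousPolicy: UsePolicy = { onDeny: 'abort', onUnspecified: 'request_license' };
console.log(decideNextStep('deny', cautiousPolicy));        // abort
console.log(decideNextStep('unspecified', cautiousPolicy)); // request_license
```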