Content Signals
The @peac/mappings-content-signals package parses and records content use policy signals from four sources. It follows an observation-only model: signals are recorded as evidence, never enforced.
Content signal observation records what a publisher declared at the time of access. PEAC never enforces these signals -- enforcement is the responsibility of the consuming application.
Install
pnpm add @peac/mappings-content-signals
Signal sources
| Source | Specification | Syntax |
|---|---|---|
| robots.txt | RFC 9309 (Robots Exclusion Protocol) | User-agent / Disallow directives |
| Content-Usage header | IETF AIPREF draft (draft-ietf-aipref-attach-00), Structured Fields per RFC 9651 | HTTP Structured Fields |
| Content-Signal header | contentsignals.org specification | HTTP header |
| tdmrep.json | W3C Community Group Final Report (TDM Reservation Protocol), EU DSM Directive 2019/790 Article 4 | JSON file |
Source precedence
When multiple sources provide conflicting signals, the following precedence applies (highest to lowest):
1. tdmrep.json (most specific, publisher-authored)
2. Content-Signal header (per contentsignals.org)
3. Content-Usage header (IETF AIPREF)
4. robots.txt (least specific, broadest scope)
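The precedence order above can be sketched as a small resolver. This is an illustration only, not the package's implementation: the type names, the `resolveByPrecedence` function, and the rule that an `unspecified` value in a higher-precedence source falls through to the next source are all assumptions of this sketch. The three states and five purposes are the ones this page defines.

```typescript
type SignalState = 'allow' | 'deny' | 'unspecified';
type Purpose = 'train' | 'search' | 'user_action' | 'inference' | 'index';
type SourceName = 'tdmrep.json' | 'Content-Signal' | 'Content-Usage' | 'robots.txt';

// What one source says about each purpose; absent keys mean "no signal".
type SourceSignals = Partial<Record<Purpose, SignalState>>;

// Highest precedence first, matching the documented order.
const PRECEDENCE: readonly SourceName[] = [
  'tdmrep.json',
  'Content-Signal',
  'Content-Usage',
  'robots.txt',
];

const PURPOSES: readonly Purpose[] = ['train', 'search', 'user_action', 'inference', 'index'];

// Hypothetical resolver: for each purpose, take the first explicit
// allow/deny found when walking sources from highest to lowest precedence.
// Assumption: 'unspecified' in a higher source does not mask a lower one.
function resolveByPrecedence(
  sources: Partial<Record<SourceName, SourceSignals>>,
): Record<Purpose, SignalState> {
  const resolved = {} as Record<Purpose, SignalState>;
  for (const purpose of PURPOSES) {
    resolved[purpose] = 'unspecified';
    for (const name of PRECEDENCE) {
      const state = sources[name]?.[purpose];
      if (state === 'allow' || state === 'deny') {
        resolved[purpose] = state;
        break;
      }
    }
  }
  return resolved;
}

const example = resolveByPrecedence({
  'robots.txt': { train: 'allow', search: 'allow' },
  'tdmrep.json': { train: 'deny' },
});
console.log(example.train, example.search, example.inference);
```

In this example the two sources conflict on `train`, and the higher-precedence `tdmrep.json` value is the one recorded.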
Three-state observation model
Every signal resolves to one of three states:
| State | Meaning |
|---|---|
| allow | Publisher explicitly permits the specified purpose |
| deny | Publisher explicitly denies the specified purpose |
| unspecified | No signal found for this purpose |
The unspecified state is not a default allow or deny -- it means no signal was observed. Applications decide how to handle unspecified signals.
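Because the package leaves the decision to the application, a consumer has to choose its own rule for `unspecified`. The following is a minimal sketch of one way to make that choice explicit. `UnspecifiedPolicy` and `mayProceed` are hypothetical names for this illustration and are not exported by the package.

```typescript
type SignalState = 'allow' | 'deny' | 'unspecified';

// Application-level choice, made by the consumer rather than by PEAC.
type UnspecifiedPolicy = 'treat-as-allow' | 'treat-as-deny';

// Hypothetical helper: turn an observed state into an application decision.
function mayProceed(state: SignalState, onUnspecified: UnspecifiedPolicy): boolean {
  switch (state) {
    case 'allow':
      return true;
    case 'deny':
      return false;
    case 'unspecified':
      // No signal was observed, so the application's own policy decides.
      return onUnspecified === 'treat-as-allow';
  }
}

console.log(mayProceed('unspecified', 'treat-as-deny'));
```

Passing the policy as an argument keeps the observed state and the application's decision separate, so the recorded evidence never has to be rewritten to fit the decision.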
Usage
import {
  parseRobotsTxt,
  parseContentUsage,
  parseContentSignal,
  parseTdmRep,
  resolveSignals,
} from '@peac/mappings-content-signals';
// Parse individual sources (all receive pre-fetched input, no network I/O)
const robots = parseRobotsTxt(robotsTxtContent, 'PEACBot');
const aipref = parseContentUsage(contentUsageHeader);
const contentSignal = parseContentSignal(contentSignalHeader);
const tdmrep = parseTdmRep(tdmrepJsonContent);
// Resolve with precedence
const observation = resolveSignals({
  robotsTxt: robots,
  contentUsage: aipref,
  contentSignal: contentSignal,
  tdmRep: tdmrep,
});
// Result: { train: 'deny', search: 'allow', inference: 'unspecified', ... }
All parsers receive pre-fetched input as strings or objects. The package performs no network requests, no DNS lookups, and no file system access. Fetching the source content is the caller's responsibility.
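Since fetching is left to the caller, the sketch below shows one way a caller might collect the four raw inputs before handing them to the parsers. It assumes a runtime with a global `fetch` (for example Node 18+ or a browser), the conventional `/robots.txt` location, and the `/.well-known/tdmrep.json` location used by the TDM Reservation Protocol. `RawSignalInputs`, `sourceLocations`, `textOrNull`, and `fetchRawInputs` are hypothetical names for this illustration, not part of the package.

```typescript
// Hypothetical container for the pre-fetched inputs; null means "not available".
interface RawSignalInputs {
  robotsTxtContent: string | null;
  tdmrepJsonContent: string | null;
  contentUsageHeader: string | null;
  contentSignalHeader: string | null;
}

// Compute where the origin-wide files live for a given resource URL.
function sourceLocations(resourceUrl: string): { robotsTxt: string; tdmRep: string } {
  return {
    robotsTxt: new URL('/robots.txt', resourceUrl).toString(),
    tdmRep: new URL('/.well-known/tdmrep.json', resourceUrl).toString(),
  };
}

// Return the body text for a successful (2xx) response, or null otherwise.
async function textOrNull(url: string): Promise<string | null> {
  const res = await fetch(url);
  return res.ok ? await res.text() : null;
}

// Fetch the resource (for its headers) and the two origin-wide files in parallel.
async function fetchRawInputs(resourceUrl: string): Promise<RawSignalInputs> {
  const { robotsTxt, tdmRep } = sourceLocations(resourceUrl);
  const [resource, robotsTxtContent, tdmrepJsonContent] = await Promise.all([
    fetch(resourceUrl),
    textOrNull(robotsTxt),
    textOrNull(tdmRep),
  ]);
  return {
    robotsTxtContent,
    tdmrepJsonContent,
    // Header lookups are case-insensitive; get() returns null when absent.
    contentUsageHeader: resource.headers.get('content-usage'),
    contentSignalHeader: resource.headers.get('content-signal'),
  };
}

console.log(sourceLocations('https://example.com/articles/1?x=2'));
```

A production caller would likely add timeouts, caching, and error handling for network failures, all of which are omitted here.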
Canonical purposes
Signals map to PEAC's five canonical purposes:
| Purpose | Description |
|---|---|
| train | Use content for AI model training |
| search | Index and display in search results |
| user_action | Display in response to direct user request |
| inference | Use as context for AI inference |
| index | Crawl and index content metadata |
EU TDM compliance
The tdmrep.json source relates to the EU DSM Directive 2019/790 Article 4, which establishes text and data mining (TDM) rights and reservations. The W3C Community Group Final Report defines the machine-readable format for declaring these reservations.
PEAC's content signal observation records whether a TDM reservation was declared. It does not interpret or enforce the legal implications of the reservation. Legal compliance is the responsibility of the consuming application.
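To make "records whether a TDM reservation was declared" concrete, here is a rough sketch of reading a tdmrep.json document for one path. The field names (`location`, `tdm-reservation`, `tdm-policy`) come from the TDM Reservation Protocol report; everything else is an assumption of this sketch, including the `observeReservation` name, the first-match-wins rule, and the simplified matching that treats `location` as a path prefix with an optional trailing `*`. It does not reflect how `parseTdmRep` works internally.

```typescript
type ReservationStatus = 'reserved' | 'not-reserved' | 'no-declaration';

interface ReservationObservation {
  status: ReservationStatus;
  policy?: string; // value of tdm-policy, when the matching rule provides one
}

// Hypothetical helper: report what the file declares for `path`, without
// interpreting what the declaration means legally.
function observeReservation(tdmrepJson: string, path: string): ReservationObservation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(tdmrepJson);
  } catch {
    return { status: 'no-declaration' }; // unreadable file: nothing observed
  }
  if (!Array.isArray(parsed)) return { status: 'no-declaration' };

  for (const rule of parsed) {
    if (typeof rule !== 'object' || rule === null) continue;
    const {
      location,
      'tdm-reservation': reservation,
      'tdm-policy': policy,
    } = rule as Record<string, unknown>;
    if (typeof location !== 'string') continue;
    if (reservation !== 0 && reservation !== 1) continue;

    // Simplified matching (assumption): literal prefix, optional trailing '*'.
    const prefix = location.endsWith('*') ? location.slice(0, -1) : location;
    if (!path.startsWith(prefix)) continue;

    const result: ReservationObservation = {
      status: reservation === 1 ? 'reserved' : 'not-reserved',
    };
    if (typeof policy === 'string') result.policy = policy;
    return result; // first matching rule wins in this sketch
  }
  return { status: 'no-declaration' };
}

const sampleTdmRep = JSON.stringify([
  { location: '/private/*', 'tdm-reservation': 1, 'tdm-policy': 'https://example.com/policy.json' },
  { location: '/', 'tdm-reservation': 0 },
]);
console.log(observeReservation(sampleTdmRep, '/private/report.html').status);
```

The function returns only what was declared, which mirrors the observation-only model: deciding what a reservation permits or forbids stays with the consuming application.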