Since v0.11.2

Content Signals

Observe content-use policy signals from multiple sources and record them as structured observations. Signals record what a publisher has expressed; they never enforce access decisions. Each signal takes one of three states: allow, deny, or unspecified.

Package: @peac/mappings-content-signals

Observation Model

Observation-only

Content signals are observations of what a publisher has expressed. PEAC records these signals as evidence but does not enforce them. The consuming application decides how to act on the observed signals.

| State | Meaning |
| --- | --- |
| allow | The publisher has expressed permission for this use |
| deny | The publisher has expressed denial for this use |
| unspecified | The publisher has not expressed a preference |

Install

pnpm add @peac/mappings-content-signals

Signal Sources

Four signal sources are supported, each with a normative reference:

| Source | Normative Reference | Transport |
| --- | --- | --- |
| robots.txt | RFC 9309 (Robots Exclusion Protocol) | File at site root |
| Content-Usage | IETF AIPREF draft (draft-ietf-aipref-attach-00); Structured Fields per RFC 9651 | HTTP response header |
| Content-Signal | contentsignals.org specification | HTTP response header |
| tdmrep.json | W3C Community Group Final Report (TDM Reservation Protocol); EU DSM Directive 2019/790 Article 4 | JSON file at site root |

Source Precedence

When multiple sources express signals for the same URL, the most specific source takes precedence (DD-137):

Precedence order (highest to lowest)
tdmrep.json  >  Content-Signal  >  Content-Usage (AIPREF)  >  robots.txt

If tdmrep.json expresses deny for training and robots.txt is permissive, the observation records deny for the training purpose.
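
The precedence rule can be sketched for a single purpose as follows. This is an illustration only: the types and the `resolveByPrecedence` helper are hypothetical stand-ins, not exports of the package, and the sketch assumes that a source reporting `unspecified` is skipped so a lower-precedence source can still contribute.

```typescript
// Local stand-ins for illustration; not imported from @peac/mappings-content-signals.
type SignalState = 'allow' | 'deny' | 'unspecified';
type SourceName = 'tdmrep.json' | 'Content-Signal' | 'Content-Usage' | 'robots.txt';

// Highest precedence first, in the order given above.
const PRECEDENCE: readonly SourceName[] = [
  'tdmrep.json',
  'Content-Signal',
  'Content-Usage',
  'robots.txt',
];

// Hypothetical helper: return the state from the highest-precedence source
// that expressed allow or deny for one purpose.
// Assumption: 'unspecified' (or a missing source) falls through to the next source.
function resolveByPrecedence(
  perSource: Partial<Record<SourceName, SignalState>>,
): { state: SignalState; source: SourceName | null } {
  for (const source of PRECEDENCE) {
    const state = perSource[source];
    if (state !== undefined && state !== 'unspecified') {
      return { state, source };
    }
  }
  return { state: 'unspecified', source: null };
}

// The case described above: tdmrep.json denies training, robots.txt is permissive.
const training = resolveByPrecedence({
  'tdmrep.json': 'deny',
  'robots.txt': 'allow',
});
console.log(training); // { state: 'deny', source: 'tdmrep.json' }
```

Returning the winning source alongside the state keeps the result traceable, which may be useful when the observation is later used as evidence.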

Content-Usage Header

The Content-Usage header is parsed as an RFC 9651 Structured Fields Dictionary. Keys form a hierarchy; values are SF Tokens.

| Key | Role | Values |
| --- | --- | --- |
| bots | Parent | y (allow), n (deny) |
| train-ai | Leaf | y (allow), n (deny) |
| train-genai | Leaf | y (allow), n (deny) |
| search | Leaf | y (allow), n (deny) |

Example header:

Content-Usage: bots=n, train-ai=n, search=y
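
To make the key/value shape concrete, here is a simplified reader for headers like the example. It is not the package's `parseContentUsage()` and not a full RFC 9651 parser: parameters are dropped, and inner lists and quoted strings are not handled. The names `readContentUsage` and `stateFor` are hypothetical, and the fallback from a leaf key to `bots` is an assumption about how the hierarchy might be resolved.

```typescript
type SignalState = 'allow' | 'deny' | 'unspecified';

// Simplified reader for values shaped like "bots=n, train-ai=n, search=y".
// Not a complete Structured Fields parser (see the caveats above).
function readContentUsage(header: string): Map<string, SignalState> {
  const result = new Map<string, SignalState>();
  for (const member of header.split(',')) {
    const eq = member.indexOf('=');
    if (eq === -1) continue; // skip members without an explicit value
    const key = member.slice(0, eq).trim();
    if (key === '') continue;
    // Drop any parameters that follow ';'.
    const semi = member.indexOf(';', eq);
    const value = member.slice(eq + 1, semi === -1 ? undefined : semi).trim();
    // A repeated key overwrites the earlier entry (last value wins).
    result.set(key, value === 'y' ? 'allow' : value === 'n' ? 'deny' : 'unspecified');
  }
  return result;
}

// Assumption: a leaf key with no expressed value inherits from the parent key `bots`.
function stateFor(signals: Map<string, SignalState>, leaf: string): SignalState {
  const own = signals.get(leaf);
  if (own !== undefined && own !== 'unspecified') return own;
  return signals.get('bots') ?? 'unspecified';
}

const usage = readContentUsage('bots=n, train-ai=n, search=y');
console.log(stateFor(usage, 'search'));      // allow
console.log(stateFor(usage, 'train-genai')); // deny (inherited from bots under the stated assumption)
```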

Creating an Observation

observe.ts (TypeScript)

```typescript
import {
  createObservation,
  parseRobotsTxt,
  parseContentUsage,
  parseTdmRep,
} from '@peac/mappings-content-signals';

// Pre-fetched inputs (no network I/O in this package)
const robotsTxt = 'User-agent: *\nDisallow: /private/';
const contentUsageHeader = 'bots=n, train-ai=n, search=y';
const tdmRepJson = {
  policies: [{
    target: '/',
    tdm_reservation: true,
  }],
};

const observation = createObservation({
  url: 'https://publisher.example.com/article',
  robotsTxt: parseRobotsTxt(robotsTxt),
  contentUsage: parseContentUsage(contentUsageHeader),
  tdmRep: parseTdmRep(tdmRepJson),
});

// observation.signals: Record<CanonicalPurpose, SignalState>
// { train: 'deny', search: 'allow', inference: 'unspecified', ... }
```

Purpose Mapping

Signal source keys are mapped to canonical PEAC purposes (CanonicalPurpose):

| CanonicalPurpose | Description |
| --- | --- |
| train | Training AI models on the content |
| search | Indexing for search engine results |
| user_action | Direct user-initiated access |
| inference | Using content during inference (retrieval, grounding) |
| index | Cataloging or indexing content |
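
As a rough sketch of what such a mapping could look like, the snippet below folds source keys into a full record of canonical purposes, defaulting every purpose to `unspecified`. The key map is hypothetical (only `train-ai` and `search` are mapped, as suggested by the example output above); the package's actual mapping table is not reproduced here, and `toCanonicalSignals` is an illustrative name.

```typescript
type SignalState = 'allow' | 'deny' | 'unspecified';
type CanonicalPurpose = 'train' | 'search' | 'user_action' | 'inference' | 'index';

// Hypothetical mapping from Content-Usage keys to canonical purposes.
const KEY_TO_PURPOSE = new Map<string, CanonicalPurpose>([
  ['train-ai', 'train'],
  ['search', 'search'],
]);

// Build a complete record so every canonical purpose has an explicit state.
function toCanonicalSignals(
  sourceSignals: Readonly<Record<string, SignalState>>,
): Record<CanonicalPurpose, SignalState> {
  const signals: Record<CanonicalPurpose, SignalState> = {
    train: 'unspecified',
    search: 'unspecified',
    user_action: 'unspecified',
    inference: 'unspecified',
    index: 'unspecified',
  };
  for (const [key, state] of Object.entries(sourceSignals)) {
    const purpose = KEY_TO_PURPOSE.get(key);
    // Keys without a mapping (for example `bots` in this sketch) are ignored.
    if (purpose !== undefined) signals[purpose] = state;
  }
  return signals;
}

const canonical = toCanonicalSignals({ bots: 'deny', 'train-ai': 'deny', search: 'allow' });
console.log(canonical.train, canonical.search, canonical.inference);
// deny allow unspecified
```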

No-Fetch Architecture

All parsers in @peac/mappings-content-signals receive pre-fetched input. The package contains no network I/O and never fetches URLs. This is consistent with the PEAC no-implicit-fetch invariant (DD-55).

Parsers receive pre-fetched input

parseRobotsTxt(), parseContentUsage(), and parseTdmRep() accept raw string or object input. The caller is responsible for fetching.
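
Because fetching is left to the caller, an application needs its own pre-fetch step. The following is one possible shape for that step, not part of the package: `prefetchSignals`, `Fetcher`, and `PrefetchedInputs` are hypothetical names, and the fetcher is injected so the sketch runs without a network. The resulting strings are what would then be passed to `parseRobotsTxt()` and `parseContentUsage()`.

```typescript
// Minimal response shape this sketch relies on.
interface TextResponse {
  ok: boolean;
  headers: { get(name: string): string | null };
  text(): Promise<string>;
}
type Fetcher = (url: string) => Promise<TextResponse>;

interface PrefetchedInputs {
  robotsTxt: string | null;
  contentUsageHeader: string | null;
}

// Hypothetical caller-side helper: fetch robots.txt and the page, then hand
// the raw strings to the parsers. All network I/O stays in the application.
async function prefetchSignals(
  origin: string,
  path: string,
  fetcher: Fetcher,
): Promise<PrefetchedInputs> {
  const [robotsRes, pageRes] = await Promise.all([
    fetcher(`${origin}/robots.txt`),
    fetcher(`${origin}${path}`),
  ]);
  return {
    robotsTxt: robotsRes.ok ? await robotsRes.text() : null,
    contentUsageHeader: pageRes.headers.get('content-usage'),
  };
}

// Stub fetcher so the sketch runs offline; swap in a real one in practice.
const stubFetcher: Fetcher = async (url) => ({
  ok: true,
  headers: {
    get: (name) =>
      name.toLowerCase() === 'content-usage' ? 'bots=n, train-ai=n, search=y' : null,
  },
  text: async () =>
    url.endsWith('/robots.txt') ? 'User-agent: *\nDisallow: /private/' : '<html></html>',
});

prefetchSignals('https://publisher.example.com', '/article', stubFetcher).then((inputs) => {
  console.log(inputs.contentUsageHeader); // bots=n, train-ai=n, search=y
});
```

In a browser or Node 18+, a thin wrapper around the global `fetch` should satisfy the `Fetcher` type, since its response exposes `ok`, `headers.get()`, and `text()`.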

Schema layer only (DD-141)

The package is a Layer 4 mapping with validation-only semantics. No side effects, no I/O, no state.

EU Text and Data Mining

The tdmrep.json source implements the W3C TDM Reservation Protocol, which aligns with EU DSM Directive 2019/790 Article 4. PEAC observes whether a publisher has reserved text and data mining rights but does not enforce compliance. The consuming application is responsible for acting on the observed reservation status.

Links

Observation, Not Enforcement

Content signals produce structured observations that can be attached to PEAC receipts as evidence of what a publisher expressed at the time of an interaction. The observation records the signal state; the consuming application decides whether to proceed, request a license, or abort.
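
A consuming application might turn an observed state into one of those three outcomes along these lines. The policy shown is purely illustrative: `decide`, `Decision`, and `PolicyOptions` are hypothetical names, and how to treat `deny` and `unspecified` is a choice each application makes for itself; PEAC only records the observation.

```typescript
type SignalState = 'allow' | 'deny' | 'unspecified';
type Decision = 'proceed' | 'request_license' | 'abort';

// Hypothetical application-side settings.
interface PolicyOptions {
  licensingAvailable: boolean; // can this application negotiate a license?
  onUnspecified: Decision;     // what to do when the publisher expressed nothing
}

// Example policy: the decision lives in the application, not in PEAC.
function decide(state: SignalState, options: PolicyOptions): Decision {
  switch (state) {
    case 'allow':
      return 'proceed';
    case 'deny':
      return options.licensingAvailable ? 'request_license' : 'abort';
    case 'unspecified':
      return options.onUnspecified;
  }
}

const options: PolicyOptions = { licensingAvailable: true, onUnspecified: 'proceed' };
console.log(decide('deny', options));        // request_license
console.log(decide('unspecified', options)); // proceed
```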