Canonical Reference

BLISP Research Program

A nine-paper research program on admissibility, deterministic execution, semantic coordinates, cross-system transfer, and agent convergence.

Author: Thomas Dionysopoulos · 9 papers · 156 pages · all published with DOIs

Program DOI: 10.5281/zenodo.20459958

Program Abstract

Overview

AI systems increasingly generate computation rather than having humans write it directly. When the generator is stochastic, the execution system must determine which proposals are admissible, which surface forms are equivalent, whether results can be replayed, and where two executions diverge. This program develops a formal and empirical framework for these problems.

Paper 1 establishes the admissibility boundary: a grounding gate that rejects valid-but-unwarranted operations before execution. Paper 2 formalizes the canonical execution boundary: typed specifications, a canonicalization pipeline, 8-layer provenance hashing, and description/identity separation. Paper 3 proves that the operational equivalence is a congruence, enabling a quotient category that gives precise meaning to deterministic execution identity. Paper 4 defines provenance as a semantic factorization with a dependency-indexed composition law, enabling divergence localization and partial replay. Paper 5 measures the empirical fiber structure of 2,200 stochastic proposals under controlled perturbation, demonstrating that surface-form variation is absorbed while provenance-level changes create clean transitions.

Papers 6–7 investigate the semantic structure of operations themselves: a single 7-valued coordinate (DependencyClass) predicts four independent optimizer behaviors at 99.6% accuracy and generalizes to unseen operations at 100%. Paper 8 tests whether this structure transfers to independently-developed systems: the frozen taxonomy predicts execution behavior in Polars and DuckDB at 91.1% combined accuracy, with zero errors from incorrect dependency-shape assignments. Paper 9 asks whether agents reconstruct structurally equivalent execution-identity primitives under task pressure: across three domains and three model families, 7/8 primitives converge above 0.90.

All constructions are operational, registry-relative, and grounded in a running system (BLISP) evaluated in systematic trading research. The architecture is domain-independent; the evaluation is not.

Start Here

Reading Paths by Audience

Researchers

Read Paper 1 (grounding gate), then Paper 2 (execution semantics). Continue with Papers 3–5 for formal quotient semantics, provenance, and fibers; Papers 6–7 for semantic coordinates as predictive objects; Paper 8 for cross-system transfer; and Paper 9 for agent convergence.

Engineers

Start with Paper 1 for the architecture and grounding gate. Paper 2 describes the canonicalization pipeline and 8-layer hashing you would implement. Paper 6 shows how a single coordinate predicts optimizer behavior. Paper 8 demonstrates cross-system portability.

AI Practitioners

Paper 1 addresses valid-but-unwarranted execution in LLM tool use. Paper 5 measures how LLM-generated proposals behave under controlled perturbation. Paper 9 shows that independent agents reconstruct structurally equivalent execution-identity primitives under task pressure.

Investors

Paper 1 establishes the core value proposition. Paper 8 shows empirically that the taxonomy transfers across independently developed systems. Paper 9 demonstrates convergent reconstruction by agents, which suggests the structure is natural and worth materializing. The papers portal provides the complete picture.

If you read only one paper

Read Paper 1: The Grounding Gate. It introduces the core problem (valid-but-unwarranted execution), the grounding gate architecture, and the empirical evaluation. No prerequisites. 13 pages.

Structure

Dependency Graph

Each paper depends on all preceding papers. The program is a single linear chain rather than a branching DAG.

Paper 1 (Grounding Gate, empirical)
--> Paper 2 (Canonical Exec, empirical)
--> Paper 3 (Categories, formal)
--> Paper 4 (Provenance, formal)
--> Paper 5 (Fibers, empirical)
--> Paper 6 (Semantic Structure, empirical)
--> Paper 7 (Predictive Objects, empirical)
--> Paper 8 (Cross-System, empirical)
--> Paper 9 (Convergence, empirical)

Empirical papers contain experiments and data. Formal papers contain definitions, propositions, and proofs.
Green = foundation (Papers 1–5). Cyan = semantic coordinates → transfer → convergence (Papers 6–9).

Papers

All Nine Papers

PAPER 1
The Grounding Gate: Admissibility and Replay Guarantees for AI-Driven Research

AI systems that generate computational pipelines from natural language may propose operations that are structurally valid but semantically unwarranted. This paper presents a grounding gate: a mandatory admissibility boundary between AI-proposed operations and deterministic execution. The system discovers which capabilities match the user's terms by querying a live registry (236 capabilities) and rejects proposals whose names lack discovery evidence. On an evaluation of 30 prompts, unwarranted execution fell from 23.3% to 10.0% (Fisher exact p = 0.027). Replay produces bit-identical hashes across 50 runs. Grounding overhead is under 14 ms.

Prerequisites: None
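
The gate described above can be illustrated with a minimal sketch. This is not the BLISP implementation: the toy registry, the keyword-overlap discovery rule, and the names REGISTRY, discover, grounding_gate, and Verdict are all assumptions made for illustration. It only shows the distinction between an operation that exists in the registry and one that is also warranted by the user's terms.

```python
from dataclasses import dataclass

# Hypothetical toy registry: capability name -> terms that discover it.
# (The real registry is reported to hold 236 capabilities.)
REGISTRY = {
    "rolling_mean": {"rolling", "mean", "average", "moving"},
    "zscore": {"zscore", "standardize", "normalize"},
    "sharpe_ratio": {"sharpe", "ratio", "risk-adjusted"},
}


@dataclass(frozen=True)
class Verdict:
    operation: str
    admitted: bool
    reason: str


def discover(user_terms):
    """Return capabilities whose discovery terms overlap the user's terms."""
    terms = {t.lower() for t in user_terms}
    return {name for name, keys in REGISTRY.items() if keys & terms}


def grounding_gate(user_terms, proposed_ops):
    """Admit a proposed operation only if it exists AND has discovery evidence."""
    evidence = discover(user_terms)
    verdicts = []
    for op in proposed_ops:
        if op not in REGISTRY:
            verdicts.append(Verdict(op, False, "unknown capability"))
        elif op not in evidence:
            # Structurally valid, but nothing in the request warrants it.
            verdicts.append(Verdict(op, False, "valid but unwarranted"))
        else:
            verdicts.append(Verdict(op, True, "grounded"))
    return verdicts


verdicts = grounding_gate(
    ["moving", "average"],
    ["rolling_mean", "sharpe_ratio", "magic_alpha"],
)
for v in verdicts:
    print(v)
```

In this sketch only rolling_mean is admitted: sharpe_ratio is a real capability but was never discovered from the user's terms, and magic_alpha is not in the registry at all.
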
PAPER 2
Canonical Execution Semantics for Stochastic Program Generators

When the generator of computation is stochastic, independently generated programs that represent the same intended computation arrive in different surface forms. This paper presents the canonical execution boundary: an architectural invariant beyond which stochasticity does not propagate. Four mechanisms enforce the boundary: typed specifications, a canonicalization pipeline (278 surface forms to 235 canonical operations), 8-layer execution hashing, and description/identity separation. The evaluation covers 1,200 stochastic LLM generations, 50-run replay determinism, and provenance stability under registry evolution.

Prerequisites: Paper 1
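
A minimal sketch of the canonicalize-then-hash idea follows. It assumes a hypothetical alias table and a single SHA-256 hash over canonical JSON; the paper's actual pipeline and its 8-layer hashing are not reproduced here. The names ALIASES, canonicalize, and execution_hash are illustrative only.

```python
import hashlib
import json

# Hypothetical alias table: surface name -> canonical operation name.
ALIASES = {
    "sma": "rolling_mean",
    "moving_average": "rolling_mean",
    "rolling_mean": "rolling_mean",
}


def canonicalize(spec):
    """Resolve aliases and normalize argument order for one operation spec.

    Only "op" and "args" contribute to identity; a free-text "description"
    is deliberately dropped (description/identity separation).
    """
    name = spec["op"].lower()
    op = ALIASES.get(name, name)
    args = {key: spec["args"][key] for key in sorted(spec["args"])}
    return {"op": op, "args": args}


def execution_hash(spec):
    """Content-address the canonical form (one layer only in this sketch)."""
    payload = json.dumps(canonicalize(spec), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = {"op": "SMA", "args": {"window": 20, "column": "close"}}
b = {
    "op": "moving_average",
    "args": {"column": "close", "window": 20},
    "description": "20-day moving average of close",
}
c = {"op": "sma", "args": {"window": 50, "column": "close"}}

print(execution_hash(a) == execution_hash(b))  # same computation, different surface
print(execution_hash(a) == execution_hash(c))  # different window, different identity
```

Specs a and b differ in operation name, argument order, and description, yet hash identically; spec c changes a parameter and therefore receives a different hash.
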
PAPER 3
Execution Categories for Stochastic Program Generators: Quotient Semantics for Deterministic Executable Identity

The operational equivalence generated by the system's rewrite rules (alias resolution, argument-order normalization, canonical form selection) forms a congruence: equivalent subexpressions remain equivalent under arbitrary well-typed pipeline composition. This is the central formal result of the program. The resulting quotient category gives precise meaning to deterministic execution identity. Content-addressed hashing serves as a computable operational witness of quotient membership. A projection connects stochastic proposals to their execution classes, with fibers measuring collapse from surface diversity to canonical identity.

Prerequisites: Papers 1-2
PAPER 4
Provenance Algebra for Deterministic AI Execution: Replay Semantics for Stochastic Program Generators

Provenance for deterministic execution systems is not metadata but a semantic factorization of execution identity. A provenance map decomposes each execution equivalence class into an 8-layer hash record with declared dependencies. A dependency-indexed composition law establishes that pipeline provenance is determined by stage provenance and the declared dependency map. This enables replay equivalence by hash comparison, divergence localization to specific semantic layers, partial replay of only changed layers, and provenance-preserving registry evolution where discovery aliases are invisible at all eight layers.

Prerequisites: Papers 1-3
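
Divergence localization and partial replay can be sketched as a comparison of two layered hash records. The layer names below are neutral placeholders (the page does not name the eight layers), the chain-shaped dependency map is an assumption, and each layer hash is assumed to cover only that layer's own content. None of this is taken from the paper's definitions.

```python
# Placeholder layer names; the real eight layers are not named on this page.
LAYERS = [f"layer_{i}" for i in range(1, 9)]

# Hypothetical declared dependencies: each layer depends on the one before it.
DEPENDS_ON = {LAYERS[0]: []}
for previous, current in zip(LAYERS, LAYERS[1:]):
    DEPENDS_ON[current] = [previous]


def diverging_layers(record_a, record_b):
    """Localize divergence: layers whose hashes differ between two records."""
    return [layer for layer in LAYERS if record_a[layer] != record_b[layer]]


def replay_equivalent(record_a, record_b):
    """Two executions are replay-equivalent when every layer hash matches."""
    return not diverging_layers(record_a, record_b)


def layers_to_replay(record_a, record_b):
    """Changed layers plus every layer that transitively depends on one."""
    dirty = set(diverging_layers(record_a, record_b))
    for layer in LAYERS:  # LAYERS is listed in dependency order
        if any(dep in dirty for dep in DEPENDS_ON[layer]):
            dirty.add(layer)
    return [layer for layer in LAYERS if layer in dirty]


baseline = {layer: f"hash-{layer}" for layer in LAYERS}
changed = dict(baseline, layer_6="hash-layer_6-v2")

print(diverging_layers(baseline, changed))
print(layers_to_replay(baseline, changed))
```

Under these assumptions a change at layer_6 is localized to that layer, and only layer_6 and its downstream dependents need to be replayed.
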
PAPER 5
Proposal Collapse and Execution Fibers in Stochastic Program Generation

Two distinct kinds of variation emerge when stochastic generators propose executable specifications: surface-form variation (absorbed by canonicalization, intra-fiber) and execution ambiguity (changing execution identity, inter-fiber). Across 2,200 proposals with controlled perturbations, synonym rewording stays within fibers (rho = 0.985), while metric and family substitutions produce zero same-fiber mass (rho = 0.000) with perfect per-variant stability (sigma = 1.000). The execution adjacency graph is sparse (density = 0.095, 10 connected components). The key finding is that provenance-level changes create clean, stable transitions between execution classes, not noisy instability.

Prerequisites: Papers 1-4
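
The fiber picture can be sketched by grouping proposals under a projection to canonical identity. The projection below is a toy (lowercasing plus a hypothetical alias table), and same_fiber_mass is one plausible reading of the rho statistic as the fraction of variants that land in the baseline's fiber; the paper's exact definition may differ.

```python
from collections import defaultdict

# Hypothetical alias table standing in for the real canonicalization pipeline.
ALIASES = {"sma": "rolling_mean", "moving_average": "rolling_mean"}


def project(proposal):
    """Toy projection from a surface form to its canonical identity."""
    key = proposal.strip().lower()
    return ALIASES.get(key, key)


def fibers(proposals):
    """Group surface forms by the execution class they project to."""
    groups = defaultdict(list)
    for proposal in proposals:
        groups[project(proposal)].append(proposal)
    return dict(groups)


def same_fiber_mass(baseline, variants):
    """Fraction of variants that stay in the baseline's fiber (assumed rho)."""
    target = project(baseline)
    return sum(project(v) == target for v in variants) / len(variants)


proposals = ["SMA", "moving_average", "rolling_mean", "ema", " EMA "]
print(fibers(proposals))

# Synonym-style rewording mostly stays inside the fiber ...
print(same_fiber_mass("sma", ["SMA", "moving_average", "rolling_mean", "ema"]))
# ... while substituting a different operation leaves it entirely.
print(same_fiber_mass("sma", ["ema", "EMA"]))
```
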
PAPER 6
The Semantic Structure of Execution: An Empirical Study of Predictive Coordinates in Computational Operations

A single 7-valued coordinate (DependencyClass) classifies operations by data-dependency shape and predicts four independent optimizer behaviors (fusion eligibility, window semantics, pipeline position, and state management) with 99.6% accuracy (243/244 behavior predictions, z = 13.0, p < 10^-38 vs. random baseline). The coordinate is not a descriptive label; it is a predictive object that determines execution behavior from semantic structure alone.

Prerequisites: Papers 1-5
PAPER 7
Semantic Coordinates as Predictive Objects in Time-Series Computation

A frozen taxonomy trained on 61 operations generalizes to 25 unseen operations at 100% accuracy (100/100 holdout predictions) with zero recalibration. Coordinate ablation confirms that the full coordinate is minimal—removing any single dimension degrades prediction. Random baselines with equivalent cardinality achieve chance accuracy. The result establishes semantic coordinates as predictive objects: they predict optimizer behavior, not merely describe it.

Prerequisites: Papers 1-6
PAPER 8
Dependency Shape Predicts Execution Behavior Across Independent Data Processing Systems

A frozen 8-valued dependency-shape taxonomy, built without inspecting either target system, predicts three execution behaviors (streaming, buffering, warmup) in Polars (Rust, morsel-driven) and DuckDB (C++, push-based). Buffering predictions reach 96.7% accuracy in both systems. Combined accuracy across 180 predictions is 91.1%, with zero errors from incorrect dependency-shape assignments. All errors trace to architectural choices and API conventions, not to the taxonomy itself.

Prerequisites: Papers 1-7
PAPER 9
Agents Reconstruct Execution Identity Algebra Under Task Pressure

Independent frontier model families (Anthropic, OpenAI, Google), working on independent domains (finance, SQL, build/CI), reconstruct structurally equivalent execution-identity primitives under task pressure. Nine question tiers of increasing difficulty elicit eight primitives: normalization, canonical identity, equivalence classes, grouping, composite rewriting, replay mappings, computation DAGs, and policy checking. 7/8 primitives converge above 0.90 across 55 runs. Reconstruction is convergent, staged, and expensive (~178,000 tokens per reconstruction). A reference implementation materializes the same eight primitives as persistent, composable, domain-portable infrastructure at zero marginal query cost.

Prerequisites: Papers 1-8
Reference

Reading Order and Artifacts

# | Paper | Type | Pages | DOI | Release
1 | The Grounding Gate | Empirical | 13 | 10.5281/zenodo.20456984 | v1
2 | Canonical Execution Semantics | Empirical | 23 | 10.5281/zenodo.20457255 | v1
3 | Execution Categories | Formal | 14 | 10.5281/zenodo.20457403 | v1
4 | Provenance Algebra | Formal | 15 | 10.5281/zenodo.20457667 | v1
5 | Execution Fibers | Empirical | 12 | 10.5281/zenodo.20457990 | v1
6 | The Semantic Structure of Execution | Empirical | 17 | 10.5281/zenodo.20612709 | v1
7 | Semantic Coordinates as Predictive Objects | Empirical | 14 | 10.5281/zenodo.20706294 | v1
8 | Dependency Shape Predicts Execution Behavior | Empirical | 17 | 10.5281/zenodo.20706086 | v1
9 | Cross-Family Convergence | Empirical | 17 | 10.5281/zenodo.20706156 | v1

Total: 156 pages across 9 papers. All published as open-access working papers under CC-BY-4.0. Each GitHub release contains PDF, LaTeX source, experiment data (where applicable), verification scripts, CITATION.cff, and .zenodo.json.

Reproducibility

Experiment Data

Seven of the nine papers include computational experiments with published datasets.

Paper | Dataset | Artifact
Paper 1 | 30-prompt evaluation (5 categories, 4 families, 9 metrics) | prompts_30.json
Paper 2 | 1,200 LLM generations (30 prompts × 4 temps × 10 reps), replay CSV, provenance CSV | experiment-data.tar.gz
Paper 3 | Theoretical paper, no experiment data | --
Paper 4 | Theoretical paper, no experiment data | --
Paper 5 | 2,200 proposals (1,200 baseline + 1,000 perturbations), fiber stats, adjacency graph | experiment-data.tar.gz
Paper 6 | 61-operation taxonomy, 4 optimizer behavior predictions, conditional MI analysis, holdout data | cargo test
Paper 7 | 25-operation holdout generalization, coordinate ablation, random baseline comparison | cargo test
Paper 8 | 30 operations × 2 systems × 3 behaviors (180 predictions), Polars + DuckDB | reproduce.sh
Paper 9 | 55 runs across 3 model families × 3 domains × 9 question tiers, ~178k tokens per run | reproduce.sh

All datasets are included in their respective GitHub releases. Verification scripts are provided for each paper.
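
The dataset sizes quoted above can be cross-checked with a few lines of arithmetic. The first three products restate factorizations given in the table. The last two (operations × four optimizer behaviors) are an assumption about how the reported prediction counts decompose, not something the page states directly.

```python
# Factorizations stated in the dataset table.
generations = 30 * 4 * 10       # Paper 2: prompts x temperatures x repetitions
proposals = 1200 + 1000         # Paper 5: baseline + perturbations
cross_system = 30 * 2 * 3       # Paper 8: operations x systems x behaviors

# Accuracy reported for Paper 6, recomputed from the quoted fraction.
paper6_accuracy = round(100 * 243 / 244, 1)

# Assumed decomposition: one prediction per operation per optimizer behavior.
paper6_predictions = 61 * 4     # matches the 244 behavior predictions
paper7_holdout = 25 * 4         # matches the 100 holdout predictions

print(generations, proposals, cross_system)
print(paper6_accuracy, paper6_predictions, paper7_holdout)
```
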

Citation

How to Cite

Research Program

To reference the program as a whole:

Dionysopoulos, T. (2026). BLISP Research Program: Admissibility, Deterministic Execution, Provenance, and Capability-Grounded AI Systems. Zenodo. https://doi.org/10.5281/zenodo.20459958

@misc{blisp2026program,
  title        = {BLISP Research Program: Admissibility, Deterministic Execution,
                  Provenance, and Capability-Grounded AI Systems},
  author       = {Dionysopoulos, Thomas},
  year         = {2026},
  doi          = {10.5281/zenodo.20459958},
  publisher    = {Zenodo},
  url          = {https://doi.org/10.5281/zenodo.20459958},
  note         = {9-paper program; all papers published with DOIs}
}

Individual Papers

# | BibTeX Key | DOI
1 | dionysopoulos2026grounding | 10.5281/zenodo.20456984
2 | dionysopoulos2026canonical | 10.5281/zenodo.20457255
3 | dionysopoulos2026categories | 10.5281/zenodo.20457403
4 | dionysopoulos2026provenance | 10.5281/zenodo.20457667
5 | dionysopoulos2026fibers | 10.5281/zenodo.20457990
6 | dionysopoulos2026semantic | 10.5281/zenodo.20612709
7 | dionysopoulos2026predictive | 10.5281/zenodo.20706294
8 | dionysopoulos2026transfer | 10.5281/zenodo.20706086
9 | dionysopoulos2026convergence | 10.5281/zenodo.20706156

Full BibTeX entries with DOI fields are available on each paper card.