Dyadic Systems: Relational Dynamics, Tensor Structure, and Pair-Level Phenomena

Keywords: Icosa, personality assessment, psychometrics, dyadic assessment, relational dynamics, couples, interaction tensor

Abstract

This synthesis consolidates findings from 17 synthetic-evidence studies that collectively benchmark the Icosa model’s dyadic assessment engine across its core computational mechanisms: cross-domain transmission, fault cascade topology, interaction tensor structure, Gateway compatibility, relational basin stability, formation pattern recognition, trap dynamics, and external framework alignment. The studies encompass 9 formula verification benchmarks, 6 synthetic benchmarks, and 2 applied synthetic benchmarks, yielding 61 hypothesis tests across more than 19,000 computationally generated dyadic pairs. Of 61 tests, 49 reached reportable significance with adequate practical effect sizes, 5 were statistically significant but fell below pre-registered practical thresholds, 6 returned null results, and 1 survived FDR but not within-study Holm correction. No circularity flags were raised across any study. Hidden validation arms — present in 15 of 17 studies — confirmed that 11 dyadic constructs captured latent relational information beyond what resonance coupling alone could index, while 4 returned null, establishing clear boundaries around what the engine’s structural metrics can and cannot recover. All evidence is synthetic or model-internal. These results establish implementation fidelity, identify architectural stress points requiring recalibration, and define the external evidence program needed to evaluate operational readiness.

Scope and Evidence Taxonomy

The 17 studies reviewed here operate entirely within the Icosa model’s computational environment. No human relationship outcomes, clinical populations, or external criterion measures were used. The evidence taxonomy governing this synthesis distinguishes four categories, and each study’s findings must be interpreted within its category’s inferential ceiling.

Formula verification (9 studies) tests whether implemented code reproduces the behavior prescribed by the model’s mathematical specifications. A strong result in this category confirms that the engine computes what it claims to compute. It does not confirm that what it computes is clinically useful, externally valid, or predictive of real relational outcomes. The formula verification studies cover: basin stability, cross-band Coherence pairing, formation compatibility, Gateway prediction, interaction types, provision scoring (TMRC), risk-protection balance, tensor structure, and trap dynamics.

Synthetic benchmarks (6 studies) probe emergent properties, boundary conditions, and stress-test behaviors using computationally generated dyadic profiles. Results in this category reveal how the engine behaves at its limits — what emerges from its specifications that was not explicitly programmed, where metrics degrade, and where structural assumptions break. The synthetic benchmarks address: cross-domain flow asymmetry, cross-fault cascade vulnerability, domain channeling, emergent phenomena, shadow alignment, and D40 channel recovery.

Applied synthetic benchmarks (2 studies) test scenario-level usefulness within the simulator, asking whether the engine’s constructs can distinguish meaningfully different relational configurations under controlled conditions. The applied benchmarks cover: D40 lock-versus-buffer forecasting and dyadic template alignment with Gottman, attachment, and EFT frameworks.

All studies used fixed-seed synthetic corpora (predominantly N = 1,000 dyadic pairs; the template alignment study used N = 5,000). All applied Holm-Bonferroni family-wise correction and program-level FDR control. Pre-registered practical significance thresholds varied by analysis type. None of the 17 studies raised circularity audit flags, meaning no result is governed by shared-anchor or expected-formula-check downgrades. This clean circularity profile simplifies interpretation but does not elevate the evidence beyond what synthetic data can support.
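The difference between the two corrections can be illustrated with a minimal sketch. It assumes the standard Holm step-down and Benjamini-Hochberg step-up procedures and, for simplicity, applies both to one family of tests, whereas the studies apply Holm within each study and FDR at program level. The function names and three of the four p-values are hypothetical; p = .026 is the escape-center value reported later in this synthesis.

```python
def holm_reject(p_values, alpha=0.05):
    """Return booleans: True where the Holm-Bonferroni step-down rejects the null."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Compare the (rank + 1)-th smallest p-value with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, every larger p-value fails too
    return reject


def bh_reject(p_values, q=0.05):
    """Return booleans: True where Benjamini-Hochberg rejects the null."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            reject[idx] = True
    return reject


# Hypothetical family of four tests; only 0.026 comes from the source.
family = [0.001, 0.004, 0.026, 0.030]
holm = holm_reject(family)
bh = bh_reject(family)
print(holm, bh)
```

In this toy family the p = .026 test is rejected under FDR control but not under Holm, which is the pattern labelled "FDR-only" in this synthesis.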

Major Patterns

The Engine’s Core Metrics Show High Implementation Fidelity

Across the formula verification studies, the Icosa dyadic engine’s primary constructs reproduce their mathematical specifications with large effect sizes. Three results anchor this conclusion.

First, the interaction tensor’s channel-level alignment predicts dyadic Coherence at r_s = .87 (N = 1,000), confirming that local alignment operations across the 4x5 capacity-domain grid compose faithfully into the system-level integration index. This is the strongest single association in the entire study set.

Second, Gateway compatibility predicts dyadic Coherence at r_s = .77 (N = 1,000), confirming that the nine structurally privileged Gateway centers carry disproportionate weight in determining relational integration, as the model’s geometry prescribes.

Third, the relational grid bottleneck mean predicts dyadic Coherence at r_s = .85 (N = 1,000), establishing that the engine operates as a weakest-link system in which the most constrained interaction channels dominate the global metric when partners occupy different Coherence bands.

These three results are formula verification outcomes. They confirm that code matches specification. They do not confirm that the specifications themselves capture anything about real relationships. But they do establish a necessary precondition: the engine is internally self-consistent at the level of its primary constructs.
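For readers less familiar with the r_s statistic used throughout, the following is a minimal sketch of Spearman's rank correlation on a fixed-seed toy corpus. The scores are hypothetical stand-ins generated for illustration, not Icosa engine output, and the helper names are assumptions.

```python
import math
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's r_s: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


# Fixed-seed toy corpus of 1,000 pairs (hypothetical scores).
rng = random.Random(0)
alignment = [rng.random() for _ in range(1000)]
coherence = [a + rng.gauss(0, 0.1) for a in alignment]
r_s = spearman(alignment, coherence)
print(round(r_s, 2))
```

Because r_s depends only on rank order, it equals 1 for any strictly increasing relationship, linear or not. A high r_s in a formula verification study therefore confirms monotone agreement between a component metric and the composite, not a particular functional form.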

Hidden Validation Arms Reveal Structural Discrimination

Fifteen of 17 studies included hidden validation arms that tested whether the engine’s dyadic constructs captured latent relational information beyond resonance coupling — the simplest available proxy for relational similarity. Eleven hidden arms reached reportable significance. The strongest discriminations were: tensor alignment over resonance coupling for hidden channel potential (delta = 0.44), cross-band bottleneck over coupling (delta = 0.43), Gateway compatibility over coupling (delta = 0.40), and interaction-type bond merge quality over coupling (delta = 0.38).

These results carry more inferential weight than the primary hypothesis tests because the hidden arms pit the engine’s composite metrics against a structurally simpler alternative. The consistent pattern — that multi-component dyadic constructs outperform resonance coupling by margins of 0.27 to 0.44 — indicates that the engine’s geometric architecture preserves information that generic profile similarity discards. This is not external validation, but it is a form of internal construct discrimination that a poorly specified engine would fail.

Four hidden arms returned null: cross-domain asymmetry showed no coupling to destabilization deltas (r_s = .008, p = .806), cross-fault cascade depth did not outperform aggregate burden for global severity, emergent trap count did not predict hidden lock risk beyond individual severity, and shadow alignment did not outperform total alignment for lock risk. These nulls mark specific inferential boundaries that subsequent sections address.
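The source does not define the delta statistic reported for the hidden arms. As a minimal sketch, assume it is the gain in absolute Spearman correlation with a hidden target when a dyadic construct replaces resonance coupling as the predictor. That definition, the function names, and the toy data are all assumptions made for illustration.

```python
import random


def spearman(x, y):
    """Spearman r_s for tie-free data (continuous synthetic scores)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        out = [0] * len(values)
        for rank, idx in enumerate(order):
            out[idx] = rank
        return out

    rx, ry = ranks(x), ranks(y)
    mean = (len(rx) - 1) / 2
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)  # identical for rx and ry without ties
    return cov / var


def hidden_arm_delta(construct, baseline, hidden_target):
    """Assumed definition: |r_s(construct, target)| - |r_s(baseline, target)|."""
    return abs(spearman(construct, hidden_target)) - abs(spearman(baseline, hidden_target))


# Toy data: the construct tracks a latent variable more closely than the baseline.
rng = random.Random(17)
latent = [rng.gauss(0, 1) for _ in range(1000)]
hidden_target = [z + rng.gauss(0, 0.5) for z in latent]
construct = [z + rng.gauss(0, 0.5) for z in latent]
coupling = [z + rng.gauss(0, 2.0) for z in latent]
delta = hidden_arm_delta(construct, coupling, hidden_target)
print(round(delta, 2))
```

Under this reading, a positive delta means the construct recovers more of the hidden target than coupling does, and a delta near zero corresponds to the null arms listed above.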

Protective Constructs Outperform Risk Constructs

A recurring asymmetry across the benchmark suite is that the engine’s protective and integrative metrics behave more consistently than its risk and vulnerability metrics. Harmonic lock score correlated with dyadic Coherence at r_s = .74, functional dyadic capacity (TMRC) with structural safety at r_s = .52, and structural safety itself emerged as the strongest single predictor across the formation compatibility study, outperforming resonance coupling for hidden channel potential.

By contrast, risk constructs showed mixed fidelity. Dysregulated lock score fell below the practical threshold for predicting collision risk (r_s = .097, threshold = .10). Total active relational traps failed the practical threshold for predicting dyadic Coherence (r_s = -.26, threshold = -.40). Cascade asymmetry showed only a small effect on structural safety (r_s = -.14). Enmeshment basin depth produced a weaker-than-expected correlation with identity differentiation loss (r_s = -.30), and the adversarial basin’s association with conflict amplification was small (r_s = .19).

This pattern suggests that the engine’s aggregation logic preserves protective signal more faithfully than it preserves risk signal. One structural explanation: protective metrics (harmonic locks, TMRC, structural safety) aggregate across centered states where the 4x5 grid’s geometric properties are most regular, while risk metrics aggregate across pathological states where the grid’s distributional properties may be more heterogeneous and less amenable to linear or rank-order summary.

Cross-Domain Transmission Produces Emergent Directional Asymmetry

The cross-domain asymmetry benchmark revealed that symmetric adjacency weights produce directional flow asymmetry as an emergent property of profile-level heterogeneity. The absolute magnitude of asymmetry between Physical-to-Emotional/Mental flow and the reverse direction was large (d = 0.967 across 1,000 dyads, among the largest effects in the entire study set), while the directional preference itself was medium (delta = 0.061). Critically, the proposed boundary conditions (domain balance, receiver openness, complementarity) explained only 1.2% of the variance in asymmetry magnitude (R-squared = .012), and asymmetry showed essentially no coupling to independently generated destabilization deltas (r_s = .008, p = .806).

This result has two implications. Architecturally, any dyadic construct that aggregates cross-domain transmission without accounting for directional bias may obscure meaningful structural information. Inferentially, the asymmetry appears structurally self-contained: it is a stable emergent property of how heterogeneous profiles interact with the transmission function, but it does not propagate into the engine’s downstream outcome metrics. Whether this encapsulation reflects a stable property of the model’s design or a gap in how destabilization is computed remains open.
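How a symmetric weight can still yield asymmetric flow is easiest to see in a toy case. The sketch below assumes a hypothetical transmission function in which flow equals the adjacency weight times the sender's activation in the source domain times the receiver's openness in the target domain. This is not the Icosa transmission function, which the source does not specify; the field names and values are illustrative only.

```python
# Symmetric adjacency weight between two domains (hypothetical value).
W = {
    ("physical", "emotional"): 0.6,
    ("emotional", "physical"): 0.6,
}


def flow(sender, receiver, src, dst):
    """Toy cross-partner flow from the sender's src domain to the receiver's dst domain."""
    return W[(src, dst)] * sender["activation"][src] * receiver["openness"][dst]


def asymmetry(sender, receiver):
    """Physical-to-Emotional flow minus Emotional-to-Physical flow."""
    forward = flow(sender, receiver, "physical", "emotional")
    reverse = flow(sender, receiver, "emotional", "physical")
    return forward - reverse


# Homogeneous profiles: equal levels in both domains.
flat = {
    "activation": {"physical": 0.5, "emotional": 0.5},
    "openness": {"physical": 0.5, "emotional": 0.5},
}

# Heterogeneous profiles: levels differ across domains.
partner_a = {
    "activation": {"physical": 0.9, "emotional": 0.3},
    "openness": {"physical": 0.4, "emotional": 0.8},
}
partner_b = {
    "activation": {"physical": 0.2, "emotional": 0.7},
    "openness": {"physical": 0.4, "emotional": 0.8},
}

print(asymmetry(flat, flat))
print(asymmetry(partner_a, partner_b))
```

With flat profiles the two directions cancel exactly. With heterogeneous profiles they do not, even though the weight table is symmetric. In this toy the asymmetry comes entirely from the profiles, which is consistent with the benchmark's description of it as emergent rather than programmed.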

Relational Traps Operate as Distinct Pathways, Not a Cumulative Index

The trap dynamics study produced the most diagnostically informative dissociation in the benchmark suite. Total active relational traps failed as a graded predictor of Coherence (r_s = -.26, short of the -.40 practical threshold), yet two individual traps, pursue-withdraw and inflation-deflation, carried adequate, pathway-specific associations with their designed target outcomes. Structural safety captured hidden channel potential beyond coupling (delta = 0.27), confirming that the safety architecture is not redundant with the trap system.

This result directly challenges any implementation that treats relational trap count as a summary severity index. The engine already encodes pathway-specific discrimination at the trap level; the aggregate score compresses qualitatively different geometric mechanisms into a single number and loses information in the process. The dyadic Coherence formula’s current aggregation pathway for relational traps is a candidate for decomposition.

Template Alignment Confirms Cross-Framework Structural Convergence

The applied synthetic benchmark aligning Icosa constructs with Gottman, attachment, and EFT frameworks produced the broadest positive result set: 5 of 5 hypotheses reached reportable significance across 5,000 dyadic pairs. Complementarity-collision alignment (r_s = .50) tracked Gottman’s foundational ratio principle. Attachment security composites predicted cross-partner emotional regulation (r_s = .44). Relational bottleneck means predicted composite relationship quality (r_s = .47). Transmission efficiency outperformed resonance coupling for hidden channel potential (delta = .29).

The weakest link was repair capacity, which predicted cascade asymmetry at r_s = -.13, a statistically significant but small effect. This indicates that the engine’s Gateway-mediated repair mechanism, while functional, does not yet capture the conditional dynamics that clinical repair theory emphasizes: repair matters most when conflict is dense, a moderation pattern the current benchmark did not test.

These results show that the Icosa model’s geometric primitives compress structural distinctions from three independent theoretical traditions into a single substrate. This is alignment within a synthetic environment, not validation against real clinical data. But it establishes that the model’s dyadic architecture is not structurally disconnected from the constructs that relationship science has independently prioritized.

Null and Below-Threshold Findings

Six tests across 5 studies returned null results. Five tests across 5 studies reached statistical significance but fell below pre-registered practical thresholds. One test (escape-center distribution in the emergent phenomena study) survived FDR but not Holm correction. These results are enumerated in full because they collectively define where the engine’s current implementation underperforms its specification.
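The four reporting categories can be expressed as a small decision rule. The sketch below is an assumed reading of how a test is assigned to a category, not the studies' actual pipeline. The class and function names are hypothetical, and the threshold for the hidden-arm example is a placeholder because the source does not state it.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    name: str
    effect: float            # observed effect size (for example r_s)
    threshold: float         # pre-registered practical threshold
    holm_significant: bool   # survives within-study Holm correction
    fdr_significant: bool    # survives program-level FDR control


def classify(outcome):
    """Assign one of the four reporting categories (assumed rule)."""
    if outcome.holm_significant and outcome.fdr_significant:
        # Compare magnitudes so that negative thresholds such as -.40 also work.
        if abs(outcome.effect) >= abs(outcome.threshold):
            return "reportable"
        return "below_practical_threshold"
    if outcome.fdr_significant:
        return "fdr_only"
    return "null"


examples = [
    Outcome("dysregulated lock -> collision risk", 0.097, 0.10, True, True),
    Outcome("total traps -> Coherence", -0.26, -0.40, True, True),
    Outcome("fault-cross -> Coherence", -0.49, -0.50, True, True),
    # Placeholder threshold: the source does not report one for this arm.
    Outcome("asymmetry hidden arm", 0.008, 0.10, False, False),
]
labels = [classify(o) for o in examples]
for outcome, label in zip(examples, labels):
    print(f"{outcome.name}: {label}")
```

Comparing absolute values is a design choice that lets one rule serve both positively and negatively signed hypotheses. It would need refinement if a test could be significant in the wrong direction.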

Full nulls:

Domain channeling mean flow comparison: observed versus permuted flow difference was exactly zero (delta = 0.000, p = 1.000). Domain channeling carries information only in the joint distributional pattern, not in marginal per-domain magnitudes.
Domain channeling dominant domain hit rate: chi-squared test returned non-significant (V = .124, p = .575). Modal domain assignment does not exceed frequency-matched chance.
Cross-domain asymmetry hidden arm: asymmetry delta uncorrelated with destabilization delta (r_s = .008, p = .806). The emergent asymmetry is structurally self-contained.
Cross-fault cascade hidden arm: cascade depth did not outperform aggregate burden for global severity (delta = -.07, p = .096). Cascade topology indexes relational structure, not individual pathology.
Emergent phenomena hidden arm: emergent trap count did not predict hidden lock risk beyond severity (R-squared = .001, p = .780). Emergent trap presence is categorical, not graded.
Shadow alignment hidden arm: shadow ratio did not outperform total alignment for lock risk (delta = .02, p = .679). Shadow alignment captures interaction dynamics but not structural-level constraints.

Below practical threshold:

Cross-domain asymmetry boundary conditions: R-squared = .012 (threshold: .05). Domain balance, receiver openness, and complementarity collectively account for negligible variance in asymmetry magnitude.
Interaction types fault-cross-to-Coherence: r_s = -.49 (threshold: -.50). A near-miss that may reflect dilution across capacity rows rather than genuine absence of signal.
Provision score TMRC-to-Coherence: r_s = .49 (threshold: .50). The .008 gap (r_s is rounded to two decimals here) marks a near-miss; it suggests TMRC measures something partly distinct from overall Coherence.
Risk-protection dysregulated lock-to-collision: r_s = .097 (threshold: .10). The collision risk formula appears insufficiently sensitive to lock-based entrainment.
Trap dynamics total traps-to-Coherence: r_s = -.26 (threshold: -.40). Aggregate trap count is an unreliable graded predictor.

FDR-only:

Emergent phenomena escape-center distribution: chi-squared test reached p = .026 and V = .183 but did not survive Holm correction. The escape-center distribution difference between observed and expected is directionally present but not confirmed at the family-wise level.
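Assuming the V values reported for the chi-squared tests above denote Cramér's V, the effect size can be computed from a contingency table as sketched below. The function name and example tables are hypothetical and are not the studies' data.

```python
import math


def cramers_v(table):
    """Cramér's V for an r x c table of counts (no empty rows or columns)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))


# Hypothetical tables: perfect association and perfect independence.
perfect = [[10, 0], [0, 10]]
independent = [[10, 10], [10, 10]]
print(cramers_v(perfect), cramers_v(independent))
```

V is bounded between 0 and 1 regardless of table size, which is why it can be compared with a practical threshold while the p-value alone cannot.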

Architectural Implications

The benchmark suite reveals five specific architectural issues that warrant attention before external validation studies.

1. Risk metric recalibration. The consistent underperformance of risk constructs relative to protective constructs suggests that the engine’s aggregation logic for pathological states requires targeted revision. Dysregulated lock score, cascade asymmetry, and aggregate trap count all underperformed their specifications. The most productive repair strategy is not global formula revision but targeted stress testing: generate dyads with artificially elevated risk prevalence and decompose the aggregation pathway to identify where signal attenuates.

2. Trap aggregation decomposition. The dyadic Coherence formula’s single aggregation pathway for relational traps compresses qualitatively different geometric mechanisms. The trap dynamics results demonstrate that specific traps carry distinct predictive profiles that the aggregate score discards. Decomposing this pathway into trap-weighted sub-components would better align the summary metric with the discrimination the engine already encodes at the trap level.

3. Directional flow indexing. Cross-domain asymmetry is a large, stable emergent property that no current summary construct captures. Adding directional flow indices alongside undirected transmission totals would make this information available to downstream consumers (formation family classification, dyadic Coherence, narrative generation) without requiring formula revision.

4. Gateway repair conditioning. Repair capacity’s weak prediction of cascade asymmetry in the template alignment study (r_s = -.13) and the broader near-miss on provision scoring suggest that the Gateway-mediated repair mechanism needs conflict-conditional weighting. Clinical theory predicts repair matters most where conflict is densest. The current uniform aggregation may dilute a conditional signal.

5. Bottleneck-first intervention targeting. The cross-band pairing results establish that unrealized relational capacity concentrates at the floor of the interaction grid. Dyadic centering paths that target the weakest channels first should yield disproportionate gains in Coherence. This is an architectural prescription that can be implemented and verified without external data.
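The bottleneck-first prescription in item 5 can be sketched as a ranking over the 4x5 interaction grid. The grid values, the choice of k, and the definition of the bottleneck mean as the mean of the k weakest channels are assumptions made for illustration; the source does not give the engine's formula.

```python
def weakest_channels(grid, k=3):
    """Return the k lowest-valued channels as ((row, col), value), weakest first."""
    cells = [((r, c), v) for r, row in enumerate(grid) for c, v in enumerate(row)]
    return sorted(cells, key=lambda cell: cell[1])[:k]


def bottleneck_mean(grid, k=3):
    """Assumed summary: the mean of the k weakest channels."""
    weakest = weakest_channels(grid, k)
    return sum(value for _, value in weakest) / len(weakest)


# Hypothetical 4x5 interaction grid (rows = capacities, columns = domains).
grid = [
    [0.82, 0.64, 0.71, 0.90, 0.55],
    [0.47, 0.78, 0.15, 0.66, 0.73],
    [0.91, 0.22, 0.69, 0.58, 0.84],
    [0.60, 0.75, 0.31, 0.88, 0.79],
]

targets = weakest_channels(grid, k=3)
for (row, col), value in targets:
    print(f"target channel ({row}, {col}) with value {value}")
print(round(bottleneck_mean(grid), 3))
```

Under a weakest-link summary of this kind, raising the first target channel changes the bottleneck mean, while raising an already strong channel leaves it untouched. That is the intuition behind intervening at the floor of the grid first.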

Boundaries and Limitations

Three structural limitations constrain what this benchmark suite can establish.

Synthetic data ceiling. All 17 studies used computationally generated profiles. The distributions, correlations, and boundary conditions observed here reflect the D40 simulator’s parameter space, not the parameter space of real human couples. Any property that depends on non-uniform population clustering (attachment style distributions, cultural moderation of relational dynamics, clinical comorbidity patterns) is invisible to these benchmarks.

Uniform random pairing. With the exception of the template alignment study’s targeted framework-matching, all studies used uniform random pairing of synthetic profiles. Real couples do not pair randomly. Assortative mating, attachment-style matching, and social selection effects may amplify or attenuate every effect reported here. The cascade, basin, and trap dynamics results are particularly sensitive to pairing assumptions, since their predicted effects depend on how vulnerability structures interlock.

Absence of longitudinal dynamics. All results are cross-sectional snapshots of the engine’s output given two static profiles. The model’s theoretical framework includes developmental trajectories, centering paths, and state transitions that these benchmarks do not test. Any construct that claims to index relational growth potential (hidden channel capacity, repair resources, catalytic interactions) is benchmarked here only as a static correlate, not as a temporal predictor.

Next-Step Research Priorities

The benchmark suite converges on four research priorities, ordered by their capacity to resolve the most consequential open questions.

Priority 1: Persona-constrained replication. Repeat the core verification studies using persona-template-generated profiles rather than unconstrained parameter sampling. Clinical clustering in the profile space may attenuate or amplify effects that uniform generation averages out. This is the single highest-leverage next step because it tests whether the engine’s properties hold under distributional conditions closer to its intended use case, without requiring human data.

Priority 2: Capacity-row decomposition. Decompose the cross-domain asymmetry, fault-cross-to-Coherence near-miss, and hidden-arm discrimination effects by individual capacity row. The Move row houses 15 of the model’s 50 traps and anchors the theoretically strongest cross-partner channel. Determining whether aggregate effects concentrate in specific rows or distribute uniformly would sharpen both the model’s theoretical architecture and its benchmark coverage.

Priority 3: Conflict-conditional repair testing. Stratify the template alignment benchmark by collision risk level and test whether repair capacity’s association with cascade asymmetry strengthens under high-conflict conditions. A positive result would confirm that the Gateway architecture captures context-dependent activation rather than static structural alignment, moving the framework toward dynamic relational modeling.

Priority 4: External criterion pilot. Design a small-sample pilot (N = 50-100 couples) testing whether the engine’s strongest synthetic-benchmark constructs — Gateway compatibility, bottleneck Coherence, structural safety, complementarity-collision alignment — show any association with externally measured relationship outcomes (satisfaction, stability, observed conflict behavior). This study would provide the first evidence outside the model’s own computational environment and would determine whether synthetic benchmark fidelity translates into operational signal.
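The stratified analysis proposed under Priority 3 could be sketched as follows. The toy data are deliberately constructed so that repair matters more as collision risk rises, so the output demonstrates the procedure rather than any finding. The variable names, the tertile split, and the generating equation are all assumptions.

```python
import random


def spearman(x, y):
    """Spearman r_s for tie-free data (continuous synthetic scores)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        out = [0] * len(values)
        for rank, idx in enumerate(order):
            out[idx] = rank
        return out

    rx, ry = ranks(x), ranks(y)
    mean = (len(rx) - 1) / 2
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)  # identical for rx and ry without ties
    return cov / var


def stratified_spearman(moderator, x, y, n_strata=3):
    """Split pairs into equal-size strata by moderator rank; return r_s per stratum."""
    order = sorted(range(len(moderator)), key=lambda i: moderator[i])
    size = len(order) // n_strata
    results = []
    for s in range(n_strata):
        # The last stratum absorbs any remainder.
        idx = order[s * size:] if s == n_strata - 1 else order[s * size:(s + 1) * size]
        results.append(spearman([x[i] for i in idx], [y[i] for i in idx]))
    return results


# Toy corpus with a built-in moderation effect (not engine output).
rng = random.Random(42)
collision_risk = [rng.random() for _ in range(3000)]
repair = [rng.random() for _ in range(3000)]
cascade_asymmetry = [
    -risk * rep + rng.gauss(0, 0.1) for risk, rep in zip(collision_risk, repair)
]

low, mid, high = stratified_spearman(collision_risk, repair, cascade_asymmetry)
print(round(low, 2), round(mid, 2), round(high, 2))
```

If the real benchmark showed the same ordering, with the repair association strengthening from the low to the high collision-risk stratum, that would be the positive result Priority 3 describes.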

Conclusion

Seventeen synthetic-evidence studies establish that the Icosa dyadic engine computes what its mathematical specifications prescribe, with high fidelity for integrative and protective constructs and identifiable gaps for risk constructs and aggregated trap metrics. The engine’s composite metrics consistently outperform resonance coupling in capturing latent relational information (11 of 15 hidden arms significant, deltas ranging from 0.27 to 0.44), confirming that the 4x5 geometric architecture preserves structural discrimination that simpler similarity measures discard. Cross-framework alignment with Gottman, attachment, and EFT constructs holds at medium-to-large effect sizes within the synthetic environment for four of the five template hypotheses; the fifth, repair capacity, is significant but small (r_s = -.13).

These results do not establish clinical validity, predictive utility, or operational readiness. They establish a verified computational substrate: internally consistent, architecturally legible, and precisely bounded in what it does and does not capture. The five architectural refinements identified — risk metric recalibration, trap aggregation decomposition, directional flow indexing, conflict-conditional repair weighting, and bottleneck-first intervention targeting — are all implementable within the current engine and verifiable with the same synthetic benchmark infrastructure. The four research priorities — persona-constrained replication, capacity-row decomposition, conflict-conditional repair, and external criterion piloting — represent the path from verified architecture to evidence of operational value. The engine is ready for the next phase of testing; the next phase of testing is not synthetic.