Coherence Metrics: Internal Consistency of the Integration Index

Tags: Icosa, personality assessment, psychometrics, coherence score, internal consistency, convergent metrics

Overview

This whitepaper synthesizes the available internal evidence on the Icosa model’s Coherence metric, drawing on one completed study that examines the internal consistency of Coherence against resonance total, the model’s measure of cross-domain dissonance propagation. All evidence presented here derives from synthetic profiles generated within the Icosa computational framework. No human-respondent data, external criterion measures, or clinical outcome data inform these findings. The synthesis is organized around what the study reveals about the architecture’s internal behavior, the boundary conditions it surfaces, the nulls it does and does not produce, and the research priorities that follow.

The Coherence metric is a composite summary score quantifying overall personality integration across the Icosa model’s 20 centers, organized in a 4x5 capacity-by-domain structure. It aggregates five distinct input streams: capacity good-flow contributions, domain stability indices, asymmetric under-penalty weighting, Gateway healing power, and Trap/Basin penalties. Resonance total, the second primary summary metric, captures the cumulative magnitude of dissonance propagation across domain boundaries. The two metrics share a common data source (the 20 Harmony scores) but process those scores through different computational pathways. Understanding their relationship is foundational to determining whether the model’s summary architecture behaves in a disciplined, self-consistent manner.

Benchmark Scope and Evidence Classification

The synthesis draws on a single study:

coherence-convergent (“Internal Consistency: Coherence and Resonance Summary Metrics”). Evidence type: internal consistency. This study generated 500 synthetic profiles using the d40 engine with a fixed random seed (seed = 42) and computed Pearson’s correlation between Coherence and resonance total. It tested one hypothesis (H1): that Coherence and resonance total are negatively correlated, with a pre-specified adequacy threshold of |r| = .40.
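The workflow described above can be sketched in a few lines. Note that the d40 engine's actual generator and the real Coherence and resonance-total formulas are not reproduced in this document, so the generator and both composites below are hypothetical placeholders; only the seed-fixed generation, Pearson correlation, and adequacy-threshold check mirror the study design.

```python
import numpy as np

rng = np.random.default_rng(42)   # fixed seed, mirroring the study's seed = 42
N = 500

# Hypothetical stand-in for the d40 engine: 20 Harmony scores per profile.
harmony = rng.uniform(0.0, 100.0, size=(N, 20))

# Illustrative composites only; the real Coherence and resonance-total
# formulas are not public in this whitepaper.
coherence = harmony.mean(axis=1)
resonance_total = np.abs(np.diff(harmony, axis=1)).sum(axis=1)

# H1 workflow: Pearson's r, checked against the pre-specified |r| = .40 threshold.
r = float(np.corrcoef(coherence, resonance_total)[0, 1])
meets_adequacy = abs(r) >= 0.40
```

With placeholder composites the resulting r is not meaningful; the sketch shows the shape of the analysis, not its result.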

Under the evidence taxonomy governing this synthesis, this study qualifies as a synthetic benchmark showing simulated behavior and boundary conditions. The profiles are computationally generated with unconstrained random inputs. The distributional properties of these synthetic profiles may diverge from those produced by human respondents, particularly at extremes. The study establishes properties of the computational model’s internal architecture; it does not and cannot speak to external criterion validity, diagnostic performance, or clinical utility.

The study’s circularity audit returned clean: no flags raised, no unresolved findings, zero allowed/disallowed overlaps. Coherence and resonance total are computed through separate algorithmic pathways, and no shared-ancestry overlap was detected between them. This means the observed correlation can be interpreted as reflecting a genuine structural relationship between two independently derived summary metrics, rather than an artifact of shared formula components.

One further classification note: the coherence-convergent study was originally assigned to the analysis category “verification” (pre-specified expected |r| > .80), meaning it was designed to confirm that two related metrics behave consistently rather than to discover an unknown relationship. The observed medium effect (r = -.48) fell below that verification threshold, and this is interpreted as evidence of non-redundancy rather than implementation divergence: resonance total is not a direct input to the Coherence formula, so a lower-than-expected correlation reflects architectural independence, not a broken relationship.

Two originally planned hypotheses, H2 (fault lines vs. Coherence) and H3 (grid completion vs. Coherence), were removed prior to analysis as tautological. Both fault-line severity and grid completion are direct components of the Coherence formula itself. Had they been retained, they would almost certainly have produced the inflated correlations characteristic of shared-formula artifacts, which would have been flagged under circularity governance as expected_formula_check results. Their removal strengthens the interpretive clarity of the surviving hypothesis.

Major Findings

H1: Coherence and Resonance Total Are Inversely Correlated at Medium Strength

Coherence and resonance total showed a medium-strength negative correlation: r(498) = -.48, p < .001, 95% CI [-.54, -.41]. The effect size exceeded the pre-specified adequacy threshold of |r| = .40. The result survived all correction methods applied (raw, Holm, and FDR) and met all criteria for final reportable status.

The signal profile for this study is clean: one reportable finding, zero below-practical-threshold, zero exploratory positives, zero FDR-only, zero raw-only, zero nulls, zero not-evaluable. This is the simplest possible signal profile for a single-hypothesis study, with no ambiguous or borderline results to manage.

The medium effect size is the most informative aspect of this result, and its interpretation requires care. A near-perfect inverse correlation (|r| > .80) would have indicated functional redundancy between the two metrics, collapsing what are designed to be two distinct summary perspectives into a single underlying dimension. A trivial correlation (|r| < .20) would have raised concern that the model’s summary architecture produces metrics bearing no consistent relationship to each other. The observed r = -.48 sits between these failure modes, indicating shared but non-redundant variance. Approximately 23% of the variance in resonance total is accounted for by Coherence, leaving roughly 77% of resonance total’s variance attributable to processes outside the Coherence formula.
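Both the shared-variance figure and the reported confidence interval follow directly from r and N. A quick stdlib check, using the standard Fisher z approximation (an assumption about how the interval was computed, though it reproduces the reported bounds):

```python
import math

r, n = -0.48, 500
shared_variance = r ** 2                 # ~= .23 of variance shared

# 95% CI via the Fisher z transform: z = atanh(r), SE = 1 / sqrt(n - 3).
z = math.atanh(r)
se = 1.0 / math.sqrt(n - 3)
ci_lo = math.tanh(z - 1.96 * se)         # ~= -.54
ci_hi = math.tanh(z + 1.96 * se)         # ~= -.41
```

Rounded to two decimals, this recovers the reported [-.54, -.41] interval and the 23% / 77% variance split.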

This partial overlap is architecturally desirable. It indicates that the model’s two primary summary statistics converge enough to confirm coherent internal structure, yet diverge enough to justify their separate existence within the summary architecture.

Distributional Characteristics and Boundary Behavior

The distributional shapes of the two metrics reveal structural properties that extend beyond the central correlation finding.

Coherence scores showed moderate negative skew (skewness = -0.68, excess kurtosis = 0.29), indicating a slight clustering toward higher values with a tail extending into lower Coherence bands. Resonance total was positively skewed (skewness = 1.97, excess kurtosis = 3.98), reflecting a distribution in which most profiles had low-to-moderate dissonance propagation, with a minority exhibiting substantially elevated resonance. Both variables departed from strict normality on Shapiro-Wilk tests (Coherence: W = .967, p < .001; resonance total: W = .666, p < .001), though at N = 500, Pearson’s r remains tolerant of these departures.
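The quoted skewness and excess kurtosis are standard moment-based statistics. A minimal numpy sketch of how such values are computed (population-form moments; the study may have used a bias-corrected variant, and the sample below is an arbitrary right-skewed placeholder, not the study's resonance distribution):

```python
import numpy as np

def skew_and_excess_kurtosis(x):
    """Moment-based (population-form) skewness and excess kurtosis."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = (d ** 2).mean()
    m3 = (d ** 3).mean()
    m4 = (d ** 4).mean()
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Placeholder right-skewed sample standing in for resonance total.
rng = np.random.default_rng(0)
sample = rng.exponential(scale=10.0, size=500)
skew, ex_kurt = skew_and_excess_kurtosis(sample)   # skew > 0, as in the study
```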

The asymmetry between distributions carries an architectural implication. Cross-domain dissonance propagation appears relatively uncommon at moderate-to-high Coherence levels but escalates nonlinearly as Coherence declines. The relationship between the two metrics may be floor-and-ceiling bounded rather than strictly linear: high-Coherence profiles are consistently low on resonance total, but among low-Coherence profiles, the range of resonance total values expands considerably. This is consistent with a model in which personality integration acts as a buffer against dissonance propagation up to a threshold, below which the buffering capacity degrades and propagation dynamics become increasingly variable.

If this pattern holds under further investigation, it suggests that the Coherence formula’s asymmetric under-penalty weighting preferentially captures a class of dysfunction (withdrawal, under-expression) that does not always produce maximal cross-domain propagation. The nonlinear escalation at low Coherence levels parallels the asymmetric valuation described in prospect theory: below a certain integration threshold, each additional unit of dysfunction produces disproportionately larger propagation effects. This is a hypothesis generated by the distributional asymmetry, not a confirmed finding; it requires band-stratified analysis to test directly.

Null and Below-Threshold Results

This study produced no null hypotheses, no below-threshold findings, and no exploratory positives. The single tested hypothesis was confirmed at the pre-specified threshold across all correction methods.

However, the absence of nulls must be interpreted in context. The study tested one deliberately conservative hypothesis about the relationship between two summary metrics known to share a common data source. The probability of a null result for a correlation between two composites derived from the same 20 Harmony scores was low from the outset. The informative question was never “is there a correlation?” but rather “what magnitude is the correlation, and what does that magnitude tell us about architectural redundancy?” The r = -.48 answer to that question is informative, but the absence of nulls should not be mistaken for a broad absence of boundary conditions or failure modes in the Coherence metric.

What remains untested is substantially larger than what has been tested. No study in this synthesis examines: Coherence’s relationship to any external criterion; the stability of Coherence across repeated assessments; the sensitivity of Coherence to targeted interventions; the behavior of Coherence at the extremes of its range; the discriminant validity of Coherence against unrelated constructs; or the Coherence metric’s behavior in human-respondent data as opposed to synthetic profiles. These gaps are expected at this stage of the research program. They are noted here to prevent the single positive finding from being overweighted.

Architectural Implications

Non-Redundancy of Summary Metrics

The 77% unshared variance between Coherence and resonance total has a direct architectural consequence: the two metrics should be interpreted jointly rather than treated as interchangeable. Two profiles occupying the same Coherence Band can differ substantially in their dissonance propagation patterns. A profile with moderate Coherence (Strained band, 44-64) and low resonance total indicates broadly distributed low-grade dysfunction that has not yet cascaded across domain boundaries. The same Coherence score paired with high resonance total indicates fewer impaired centers whose impairments are actively propagating dissonance into adjacent domains. These two profiles would call for different intervention strategies within the Centering Path framework: the first favoring broad-spectrum capacity rebalancing, the second targeting specific propagation pathways.
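The joint-interpretation rule described above can be made concrete as a small dispatcher. The Strained band bounds (44-64) come from the text; the resonance cut-point, the strategy labels, and the function itself are hypothetical placeholders, not the Centering Path framework's actual logic:

```python
def intervention_focus(coherence, resonance_total,
                       strained=(44, 64), high_resonance=50.0):
    """Illustrative joint read of the two summary metrics.

    Only the Strained band (44-64) is taken from the whitepaper; the
    resonance cut-point and strategy labels are invented for illustration.
    """
    if strained[0] <= coherence < strained[1]:
        if resonance_total >= high_resonance:
            # Few impaired centers, actively propagating dissonance.
            return "target specific propagation pathways"
        # Broadly distributed low-grade dysfunction, not yet cascading.
        return "broad-spectrum capacity rebalancing"
    return "outside Strained band: interpret jointly, band by band"
```

The point of the sketch is simply that the same Coherence value routes to different strategies depending on resonance total, which a Coherence-only rule cannot do.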

This architectural implication holds within the synthetic benchmark and is consistent with the computational design of the two metrics. Whether it produces meaningful clinical differentiation in human-respondent profiles is an empirical question that has not been tested.

Centering Path Optimization

The partial independence of resonance total from Coherence suggests that optimizing Coherence alone during Centering Path computation may not fully suppress cross-domain dissonance. If the 77% of unshared variance in resonance total reflects propagation dynamics that the Coherence formula does not capture, path algorithms that incorporate resonance reduction as a secondary optimization objective could identify intervention targets that Coherence-only optimization misses. This is an architectural hypothesis, not a validated recommendation. It would require comparative benchmarking of single-objective vs. dual-objective path optimization to evaluate.
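The single- versus dual-objective contrast can be sketched as a scalarized step score. The linear weighting, the candidate steps, and all numbers below are assumptions for illustration, not the model's path-computation algorithm:

```python
def path_objective(coherence_gain, resonance_drop, w_resonance=0.0):
    """Scalarized path-step score.

    w_resonance = 0 recovers Coherence-only optimization; w_resonance > 0
    is the hypothetical dual-objective variant.
    """
    return coherence_gain + w_resonance * resonance_drop

# Hypothetical candidate steps: (coherence_gain, resonance_drop).
steps = {"rebalance": (5.0, 1.0), "cut_pathway": (2.0, 8.0)}

single = max(steps, key=lambda s: path_objective(*steps[s], w_resonance=0.0))
dual = max(steps, key=lambda s: path_objective(*steps[s], w_resonance=1.0))
# Under the single objective "rebalance" wins; the dual objective instead
# selects "cut_pathway", a target the Coherence-only score never surfaces.
```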

Implications for Narrative Report Generation

The non-redundancy finding also bears on how Coherence and resonance total are used in narrative report generation. If the two metrics were functionally redundant, including both in a narrative report would produce redundant prose — the resonance summary would merely restate what the Coherence summary already conveyed. The 77% unshared variance indicates that this is not the case: a narrative section interpreting resonance total should surface information that a Coherence-only narrative misses. Specifically, resonance total can identify profiles where dysfunction is concentrated in a small number of centers but propagating aggressively across domain boundaries — a pattern that would be invisible to a Coherence-only interpretation, which averages dysfunction across all 20 centers and may mask localized but severe propagation.

This has implications for report architecture. Narrative builders that present Coherence and resonance total in separate sections, with cross-referencing between them, would leverage the non-redundancy demonstrated here. Narrative builders that derive both sections from Coherence alone, or that present resonance total as a mere elaboration of Coherence, would discard the independent information that resonance total contributes. The synthetic evidence supports the former design, though the magnitude of narrative differentiation achievable in practice remains an empirical question dependent on human-respondent profile distributions.

Network-Theoretic Framing

From a network perspective on personality architecture, the result is consistent with a model whose summary statistics function as partially overlapping projections of a common underlying network. In network psychometrics, the relationship between aggregate indices derived from the same node set depends on how each index weights and combines node-level information (Borsboom & Cramer, 2013; Epskamp et al., 2018). Two indices using different aggregation rules should show moderate convergence when the network has coherent structure and diverge when the network is fragmented or chaotic. The Icosa model’s internal consistency, as measured here, indicates that its aggregation rules produce convergent but non-identical summary views of the 20-center system (Cramer et al., 2010). This framing does not add independent evidence; it contextualizes the finding within a broader theoretical framework that may guide future hypothesis generation.
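The aggregation-rule point can be illustrated with a toy: two indices computed from the same node values by different rules converge, but imperfectly. Nothing here reproduces the Icosa formulas; the node model and both aggregation rules are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
nodes = rng.normal(size=(500, 20))        # 500 profiles, 20 node scores

# Rule 1: equal-weight mean of all nodes.
index_mean = nodes.mean(axis=1)

# Rule 2: weighted sum of only the negative (deficit-like) node values,
# loosely echoing asymmetric under-penalty weighting.
weights = np.linspace(0.1, 1.0, 20)
index_deficit = (np.minimum(nodes, 0.0) * weights).sum(axis=1)

# Same node set, different aggregation rules: the two indices correlate
# positively but are far from identical.
r = float(np.corrcoef(index_mean, index_deficit)[0, 1])
```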

What the Effect Size Does Not Tell Us

It is important to delineate what the r = -.48 correlation establishes and what it does not. The correlation establishes that within the synthetic profile space generated by the d40 engine, higher Coherence is reliably associated with lower resonance total, at a magnitude that indicates meaningful but non-redundant overlap. It does not establish that Coherence causes low dissonance propagation, that improving Coherence in a real individual would reduce their resonance total, that the relationship is linear across the full range of either metric, or that the same magnitude would hold in human-respondent data. The correlation is a property of the computational architecture as exercised by a particular synthetic generation process. Treating it as a property of human personality would be an inferential error that the evidence taxonomy explicitly prohibits. Every downstream implication noted in this whitepaper — joint interpretation, dual-objective path optimization, narrative report architecture — is conditional on the relationship holding in human-respondent data, which has not been tested.

Circularity Governance Assessment

The circularity audit for this study was clean. No hypotheses were flagged for shared-formula ancestry, shared-anchor benchmarks, or expected-formula-check status. The removal of H2 and H3 prior to analysis was the primary circularity governance action, and it was the correct one: both hypotheses would have tested tautological relationships between Coherence and its own formula components.

The clean circularity status means the H1 correlation can be interpreted at face value as evidence of a structural relationship between two independently computed summary metrics. No downgrading of evidence strength is required under the governance rules.

For future studies in this domain, circularity governance should be especially vigilant around any hypothesis that correlates Coherence with constructs that feed into the Coherence formula (capacity good-flow, domain stability, Gateway healing power, Trap/Basin counts). Such hypotheses are formula verification checks, not discovery, and must be labeled accordingly. The present study demonstrates that removing tautological hypotheses before analysis produces cleaner, more interpretable results than flagging them post hoc.

Limitations of the Current Evidence Base

The evidence base for this synthesis is narrow in multiple dimensions.

Single study. Only one study has been completed in the coherence domain. The synthesis rests on a single correlation computed from a single synthetic dataset. No replication, no extension, and no converging evidence from different analytic approaches are available.

Synthetic data only. All 500 profiles were computationally generated using the d40 engine with unconstrained random inputs. The distributional properties of synthetic profiles may differ from those of profiles derived from human respondents, particularly in the prevalence of extreme configurations, the joint distribution of Coherence and resonance total at the tails, and the base rates of different Coherence Bands. Findings describe properties of the computational model, not properties of human personality.

Single-seed design. The fixed seed (seed = 42) ensures exact reproducibility but means the results reflect one specific draw from the generative distribution. The stability of the r = -.48 estimate across different seeds has not been established.

No external criteria. No external validator, clinical outcome, or human-judgment measure has been tested against Coherence. The internal consistency demonstrated here is a necessary but not sufficient condition for the metric’s utility in practice.

No band-stratified analysis. The distributional asymmetry observed in the data suggests a nonlinear relationship between Coherence and resonance total that the global Pearson correlation may understate at the extremes. No band-stratified or nonlinear analysis has been performed.

No dyadic extension. The study examined individual profiles only. Whether the Coherence-resonance relationship holds, strengthens, or weakens in dyadic profiles has not been tested.

Research Priorities

The following priorities are ordered by their expected contribution to advancing the coherence evidence base from its current synthetic-only state toward external validation readiness.

Priority 1: Multi-seed stability analysis. Repeat the H1 analysis across 100 random seeds of N = 500 each to establish the confidence interval for the r = -.48 estimate and determine whether it is stable across draws from the synthetic profile space. This is the lowest-cost, highest-value next step: it converts a single-point estimate into a distribution, which is necessary before any downstream analysis can treat the correlation magnitude as a reliable parameter.
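A minimal harness for this multi-seed design might look as follows. The profile generator and both composites are hypothetical stand-ins for the d40 engine and the real metric formulas; only the loop-over-seeds structure is the point:

```python
import numpy as np

def h1_correlation(seed, n=500):
    """One synthetic draw of the H1 correlation (placeholder generator)."""
    rng = np.random.default_rng(seed)
    harmony = rng.uniform(0.0, 100.0, size=(n, 20))
    coherence = harmony.mean(axis=1)
    resonance_total = np.abs(np.diff(harmony, axis=1)).sum(axis=1)
    return float(np.corrcoef(coherence, resonance_total)[0, 1])

# Priority 1: the seed-to-seed spread, not any single point estimate.
rs = np.array([h1_correlation(seed) for seed in range(100)])
r_mean = rs.mean()
r_lo, r_hi = np.percentile(rs, [2.5, 97.5])
```

The interval [r_lo, r_hi] is the deliverable: it converts the single-seed point estimate into a distribution over the generative process.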

Priority 2: Band-stratified nonlinear analysis. Stratify the Coherence-resonance correlation by Coherence Band (Thriving, Steady, Strained, Burdened, Severe) to test whether the relationship strengthens nonlinearly in the lower bands, as the distributional asymmetry predicts. If cross-domain propagation accelerates disproportionately at low Coherence, the resonance total metric may prove most informative precisely where clinical need would be greatest. This analysis can be performed on existing synthetic data and does not require new data collection.
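A band-stratification helper for this analysis could take the following shape. Only the Strained band (44-64) matches the whitepaper; the other cut-points are assumed placeholders, not the published band definitions:

```python
import numpy as np

BANDS = {
    "Severe":   (0, 24),    # assumed cut-points except Strained
    "Burdened": (24, 44),
    "Strained": (44, 64),   # bounds taken from the whitepaper
    "Steady":   (64, 84),
    "Thriving": (84, 101),
}

def band_correlations(coherence, resonance, bands=BANDS, min_n=30):
    """Pearson's r between the two metrics within each Coherence Band."""
    coherence = np.asarray(coherence, dtype=float)
    resonance = np.asarray(resonance, dtype=float)
    out = {}
    for name, (lo, hi) in bands.items():
        mask = (coherence >= lo) & (coherence < hi)
        if mask.sum() >= min_n:   # skip bands too sparse to estimate
            out[name] = float(np.corrcoef(coherence[mask], resonance[mask])[0, 1])
    return out
```

The nonlinearity prediction would show up as |r| growing from the Thriving row down to the Severe row of the returned dictionary.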

Priority 3: Internal discriminant validity. Test Coherence against constructs that should be weakly related to it within the Icosa framework (e.g., specific single-domain scores, individual Harmony center values) to establish that the metric is not simply recapitulating any single input. This would complete the internal consistency picture by demonstrating both convergent validity (with resonance total, already shown) and discriminant validity (against components that should not dominate the composite).

Priority 4: Cross-engine comparison. Replicate the H1 analysis using profiles generated by the c135 and p180 engines to determine whether the Coherence-resonance relationship is stable across different assessment engines or whether it varies with engine-specific distributional properties. Differences across engines would indicate that the relationship is partially an artifact of engine-specific profile generation, while consistency would strengthen the claim that it reflects a stable property of the summary architecture.

Priority 5: Human-respondent replication. Once human-respondent profiles are available in sufficient numbers (minimum N = 200, ideally N = 500), replicate the internal consistency analysis to determine whether the computational model’s metric architecture holds under the distributional constraints of real assessment data. This is the critical bridge from synthetic evidence to operational evidence and cannot be bypassed or approximated.

Priority 6: Dual-objective path optimization benchmark. If Priorities 1-2 confirm the stability and nonlinearity of the Coherence-resonance relationship, conduct a comparative benchmark of single-objective (Coherence only) vs. dual-objective (Coherence + resonance reduction) Centering Path optimization to determine whether the dual-objective approach identifies intervention targets that single-objective optimization misses. This has direct architectural implications for the path computation algorithm.

Conclusion

The coherence evidence base currently rests on a single synthetic benchmark study that demonstrates one clear finding: the Icosa model’s two primary summary metrics, Coherence and resonance total, are internally consistent but not redundant, with a medium inverse correlation (r = -.48) indicating 23% shared variance and 77% independent variance. This is the architecturally desirable outcome for a model that computes two summary statistics intended to capture related but distinct aspects of profile quality.

The finding is clean under circularity governance, statistically stable across all correction methods, and practically significant by the pre-specified threshold. The distributional asymmetry between the two metrics generates a testable hypothesis about nonlinear escalation of dissonance propagation at low Coherence levels that has not yet been investigated.

What the evidence does not establish is equally important. No external criterion validity, no clinical utility, no cross-engine stability, no human-respondent replication, and no band-stratified nonlinear analysis are available. The internal consistency demonstrated here is a necessary foundation for a summary metric, not a sufficient one. The research priorities outlined above trace a path from the current single-study synthetic evidence base toward the external validation that would be required before Coherence could be treated as operationally ready for clinical or applied use.

The most productive immediate next steps are low-cost extensions of the existing synthetic evidence: multi-seed stability analysis and band-stratified nonlinear testing. These require no new data collection and would substantially sharpen the current findings before human-respondent data becomes available.

Downloads

Replication materials for the component studies in this paper.

Internal Consistency: Coherence and Resonance Summary Metrics