Framework Translation: Cross-System Mapping Fidelity

Keywords: Icosa, personality assessment, psychometrics, Big Five, MBTI, Enneagram, framework translation

Scope and Evidence Status

This whitepaper synthesizes two synthetic benchmark studies that quantify information loss when Icosa personality profiles are translated through conventional personality frameworks and reconstructed. Both studies used computationally generated profiles — 1,165 in each case (1,000 random plus 165 clinical archetype configurations) — and deterministic algebraic crosswalk mappings implemented within the Icosa framework. The results characterize the mathematical behavior of these translation pipelines, not their performance with human-collected data. No human outcomes, diagnostic performance claims, or treatment effectiveness findings are reported. All statistical results describe the architecture of the crosswalk translations under idealized synthetic conditions.

The evidence taxonomy for both studies is synthetic benchmark: the profiles are model-generated, the crosswalk mappings are internal to the Icosa system, and the fidelity metrics measure round-trip preservation within the model’s own metric space. Readers should interpret these findings as characterizing the geometry of cross-framework compression — which constructs survive translation and which do not — rather than as validation against an external criterion.

Neither study triggered circularity audit flags. Both used independent synthetic data generation and deterministic mappings without shared benchmark anchors that would require downgrading the evidence.

Study Architecture

The two studies address complementary questions about the same translation problem.

Study 1 (Cross-Framework Compression Benchmark) compared three target frameworks: Big Five, MBTI, and Enneagram. Each Icosa profile was mapped to each framework’s representational space and then reconstructed back into Icosa coordinates. The study tested eight hypotheses using correlation-comparison tests with Holm-Bonferroni correction, asking which framework preserves the most information and where the largest asymmetries appear. The signal profile was 6 reportable, 1 below practical threshold, and 1 null.

Study 2 (Big Five Crosswalk Round-Trip Fidelity) examined the Big Five translation in isolation at three levels of structural granularity: capacity means (the four processing modes), domain means (the five experiential columns), and individual cell values (specific intersections in the 4x5 grid). It tested twelve Pearson correlations with Holm-Bonferroni correction. The signal profile was 12 reportable, with zero nulls and zero below-threshold results.
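The three granularity levels can be made concrete with a minimal sketch. It treats each profile as a 4x5 grid, pushes it through a deliberately crude round trip, and computes Pearson r between original and reconstructed values at the capacity, domain, and cell level. The round trip used here (keep only the five column means, rebuild each cell from its column mean) is an assumption chosen for illustration and is not the Icosa crosswalk, whose equations are not reproduced in this paper. The row and column ordering and all function names are likewise hypothetical.

```python
import math
import random

# Row and column labels; the ordering is an assumption made for this sketch.
CAPACITIES = ["Open", "Focus", "Bond", "Move"]
DOMAINS = ["Physical", "Emotional", "Mental", "Relational", "Spiritual"]


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def capacity_means(grid):
    """Row means of a 4x5 grid (one value per Capacity)."""
    return [sum(row) / len(row) for row in grid]


def domain_means(grid):
    """Column means of a 4x5 grid (one value per Domain)."""
    return [sum(col) / len(col) for col in zip(*grid)]


def toy_round_trip(grid):
    """Hypothetical lossy round trip: keep only the five column means,
    then rebuild every cell as its column mean. NOT the Icosa crosswalk."""
    cols = domain_means(grid)
    return [list(cols) for _ in grid]


def fidelity_by_level(originals, reconstructed):
    """Pearson r between original and reconstructed values at each level."""
    out = {}
    orig_cap = [capacity_means(g) for g in originals]
    rec_cap = [capacity_means(g) for g in reconstructed]
    orig_dom = [domain_means(g) for g in originals]
    rec_dom = [domain_means(g) for g in reconstructed]
    for i, cap in enumerate(CAPACITIES):
        out[f"capacity:{cap}"] = pearson([m[i] for m in orig_cap],
                                         [m[i] for m in rec_cap])
    for j, dom in enumerate(DOMAINS):
        out[f"domain:{dom}"] = pearson([m[j] for m in orig_dom],
                                       [m[j] for m in rec_dom])
    for i, cap in enumerate(CAPACITIES):
        for j, dom in enumerate(DOMAINS):
            out[f"cell:{cap}x{dom}"] = pearson(
                [g[i][j] for g in originals],
                [g[i][j] for g in reconstructed])
    return out


rng = random.Random(0)
originals = [[[rng.random() for _ in DOMAINS] for _ in CAPACITIES]
             for _ in range(1000)]
reconstructed = [toy_round_trip(g) for g in originals]
fidelity = fidelity_by_level(originals, reconstructed)
print(round(fidelity["domain:Mental"], 2),
      round(fidelity["cell:BondxRelational"], 2))
```

In this toy setup the domain means survive essentially intact, while individual cells (with independent random inputs) correlate with their originals at only around .5. The numbers say nothing about the actual crosswalk; the sketch only shows how one profile set yields different fidelity values depending on the level at which it is scored.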

Together, the two studies test 20 hypotheses across 1,165 profiles, producing 18 reportable results, 1 below-threshold result, and 1 null. All reportable results survived both Holm-Bonferroni sequential correction and program-level FDR correction (436 total tests across the research program).
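For readers unfamiliar with the sequential correction named above, the following is a minimal sketch of the Holm-Bonferroni step-down procedure. The p-values are invented for illustration and are not the studies' values, and the alpha of .05 is an assumption, since the family-wise error rate used in the studies is not stated here.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    The smallest p-value is compared with alpha/m, the next with
    alpha/(m-1), and so on; testing stops at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: every larger p-value is retained as well
    return reject


# Eight made-up p-values standing in for a family of eight hypotheses.
example_p = [0.0001, 0.0004, 0.0002, 0.496, 0.03, 0.012, 0.0009, 0.0003]
decisions = holm_bonferroni(example_p)
print(decisions)
```

Note that a p-value of .03 is not rejected in this example even though it is below .05: by the time it is reached, its adjusted threshold is .05/2 = .025.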

Major Patterns

Compression is domain-selective, not uniform

The single most consistent finding across both studies is that cross-framework translation does not lose information uniformly. Round-trip fidelity depends on the structural alignment between the specific Icosa construct and the target framework’s representational architecture. This selectivity is measurable and systematic.

In Study 1, the Mental Domain showed the largest compression asymmetry of any metric. The Big Five outperformed MBTI in preserving Mental Domain means by delta = 0.538 (p < .001, large difference) and outperformed Enneagram by delta = 0.491 (p < .001, moderate difference). These are the two largest effect sizes in the entire study. The Spiritual Domain showed a consistent but smaller Big Five advantage over both MBTI (delta = 0.273) and Enneagram (delta = 0.271), both significant at p < .001.
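The delta values above are differences between two round-trip correlations that share the same original scores. The paper does not specify which correlation-comparison test was used, so the sketch below substitutes a different, generic technique: a paired percentile bootstrap of the difference. The data are simulated and the function name is hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_delta(orig, recon_a, recon_b, n_boot=500, seed=0):
    """Point estimate and 95% percentile interval for r(orig, a) - r(orig, b).

    Profiles are resampled jointly, so the dependence between the two
    correlations (both involve the same originals) is retained.
    """
    rng = random.Random(seed)
    n = len(orig)
    deltas = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        o = [orig[i] for i in idx]
        a = [recon_a[i] for i in idx]
        b = [recon_b[i] for i in idx]
        deltas.append(pearson(o, a) - pearson(o, b))
    deltas.sort()
    lo = deltas[int(0.025 * n_boot)]
    hi = deltas[int(0.975 * n_boot) - 1]
    point = pearson(orig, recon_a) - pearson(orig, recon_b)
    return point, (lo, hi)


# Made-up data: framework A reconstructs with less noise than framework B.
rng = random.Random(1)
orig = [rng.gauss(0, 1) for _ in range(300)]
recon_a = [x + rng.gauss(0, 0.5) for x in orig]
recon_b = [x + rng.gauss(0, 2.0) for x in orig]
delta, (ci_lo, ci_hi) = bootstrap_delta(orig, recon_a, recon_b)
print(f"delta = {delta:.3f}, 95% interval = ({ci_lo:.3f}, {ci_hi:.3f})")
```

An interval that excludes zero corresponds to a reliable fidelity advantage for framework A in this simulated setting. Analytic tests for dependent correlations are an alternative when resampling is not practical.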

In Study 2, which traced the Big Five round-trip at finer resolution, the same pattern appeared within the single framework. Mental Domain fidelity was the highest of the four non-physical domains at r = .58 (large effect), followed by Spiritual (r = .41), Relational (r = .40), and Emotional (r = .36), the latter three all medium effects. The Mental Domain’s advantage likely reflects tighter structural alignment with Openness to Experience and Conscientiousness, which carry strong cognitive-processing content that maps onto the four Mental-column Harmonies (Curiosity, Acuity, Identity, Agency).

Capacity-level fidelity follows a behavioral-observability gradient

Study 2 revealed a clear ordering among the four Capacities. Focus (attending) and Move (expressing) showed the highest round-trip fidelity at r = .65 each (large effects). Open (receiving) was next at r = .54 (large effect). Bond (connecting) was weakest at r = .33 (medium effect), narrowly clearing the practical threshold of r = .30.

This gradient tracks a meaningful structural property: Focus and Move describe processing modes with relatively direct behavioral signatures — attentional allocation and expressive output — that map readily onto Big Five factors developed through lexical and behavioral-descriptor traditions. Bond captures integrative, relational processing that operates between persons rather than within a single behavioral stream. The Big Five encodes some Bond variance through Agreeableness and facets of Extraversion, but distributes it across multiple factors rather than preserving it as a unified construct.

Finer resolution amplifies information loss

Both studies demonstrate that fidelity degrades as measurement resolution increases. In Study 1, the Big Five crosswalk showed significantly lower round-trip fidelity for grid-level cell health (Sensitivity, Open x Physical) than for the corresponding capacity-level mean (Open Capacity), with delta = 0.134 (p < .001, small difference exceeding the 0.10 minimum threshold).

In Study 2, cell-level correlations were systematically lower than their parent capacity or domain correlations. The highest cell-level fidelity was Move x Physical (Vitality) at r = .45, and the lowest was Bond x Relational (Belonging) at r = .23, a small effect. The Belonging cell sits at the intersection of the two Icosa constructs most weakly represented in the Big Five — Bond Capacity and the Relational Domain — and the compression loss compounds at the cell level because both the row and the column are attenuated.

This resolution gradient has direct architectural implications for any system that uses crosswalked profiles to compute higher-order constructs. Icosa Gateway states depend on specific cell values (e.g., the Belonging Gate depends on Bond x Relational), and these cells carry disproportionate reconstruction noise when derived from crosswalked rather than natively assessed data.

The Big Five is the strongest conventional target overall

Study 1 compared three frameworks head-to-head. At the capacity level, the Big Five preserved Open Capacity means with significantly higher fidelity than MBTI (delta = 0.301, p < .001, moderate difference). At the domain level, the Big Five advantages over MBTI ranged from delta = 0.273 (Spiritual) to delta = 0.538 (Mental). Across the resolutions directly tested against MBTI, the Big Five was consistently superior. Against the Enneagram, the Big Five retained clear advantages at the tested domain-level resolutions, but the tested capacity-level comparison was null.

The Enneagram showed an asymmetric profile: it matched Big Five fidelity at the capacity level but lost substantially more information at the domain level. This pattern is discussed further in the null-results section below.

Null and Below-Threshold Results

Two results from Study 1 require explicit treatment as they bound the evidence.

Null: Enneagram capacity-level parity with the Big Five

The Big Five vs. Enneagram comparison for Open Capacity mean fidelity did not reach significance (delta = 0.021, p = .496, below the 0.05 minimum meaningful threshold). The Enneagram crosswalk preserved capacity-level means at a level statistically indistinguishable from the Big Five. This was the most unexpected result in either study.

One structural interpretation is that the Enneagram’s nine types, each defined by a motivational core, may implicitly capture capacity-row variation (how a person processes) more effectively than domain-column variation (where a person experiences). If Enneagram types differentially represent patterns of receptivity, vigilance, attachment, and expressiveness, then the row-averaged capacity mean would be well-reconstructed even though column-specific domain information compresses away. This interpretation is consistent with the Enneagram’s sharp domain-level degradation (where the Big Five advantages of delta = 0.271 to 0.491 all reached significance), but it remains speculative. The null stands: at the capacity-aggregation level, the Enneagram and Big Five crosswalks are statistically interchangeable.

Below practical threshold: Coherence reconstruction

Coherence reconstruction through the Big Five crosswalk produced r(1163) = .74, p < .001 — statistically significant but below the a priori practical threshold of r = .75. This narrow miss (approximately 1.4 percentage points of correlation) points to a meaningful architectural boundary rather than a fluke.

Coherence is computed from the full 20-center state pattern, including asymmetric penalties for over-expression and interactions among Gateways, Traps, and Basins. A five-factor translation flattens the 20-center state space into five continuous scores that carry no information about over/under asymmetry or cross-center interaction patterns. That r still reached .74 indicates a substantial portion of Coherence variance is driven by simple mean levels (which the Big Five does preserve), but the last few percentage points depend on structural configuration that no coarse framework retains. The .75 threshold was set a priori as a strong-fidelity benchmark; a lower threshold (e.g., .70) would have yielded a different conclusion. The result is reported at its declared threshold: Coherence does not meet the benchmark for strong absolute fidelity through any single framework’s crosswalk.
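One way to read the gap between .74 and .75 is against the sampling uncertainty of the estimate itself. The sketch below computes a confidence interval for a Pearson r using the standard Fisher z-transformation. It assumes independent profiles and approximate bivariate normality, and it works from the rounded value reported above, so it is an approximation rather than a reanalysis.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Confidence interval for a Pearson r via the Fisher z-transformation."""
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Reported values: r(1163) = .74, hence N = 1165; benchmark = .75.
lo, hi = fisher_ci(0.74, 1165)
print(f"95% CI: [{lo:.3f}, {hi:.3f}]")
```

With these rounded inputs the interval runs from roughly .71 to .76. This is a statement about estimation precision only; the threshold decision reported above was made on the point estimate against the declared benchmark.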

Distributional Artifacts in Reconstructed Variables

Both studies flagged distributional anomalies in specific reconstructed variables that warrant separate discussion because they represent a form of information loss that correlation coefficients alone do not capture.

In Study 2, the reconstructed Bond Capacity mean showed extreme non-normality (skewness = -1.73, excess kurtosis = 12.68, Shapiro-Wilk W = .35, p < .001). The reconstructed Emotional Domain mean was similarly distorted (skewness = 3.35, excess kurtosis = 12.69, W = .37, p < .001). The reconstructed Bond x Relational cell showed comparable leptokurtic compression (skewness = -1.44, excess kurtosis = 10.15, W = .41, p < .001). In Study 1, the Big Five-reconstructed Coherence variable also showed marked non-normality (skewness = -0.26, excess kurtosis = -1.64, W = .78), though less extreme than the Bond and Emotional reconstructions.

These distributional distortions represent a category of information loss distinct from correlation-based fidelity metrics. A Pearson correlation measures the strength of linear association: whether profiles that score high on the original metric also tend to score high on the reconstructed metric. Because the coefficient is unaffected by rescaling of either variable, it does not indicate whether the spread of the original scores has been preserved. Even where the ordering of profiles is partially preserved, the reconstructed variance may be severely restricted: the crosswalk compression concentrates many profiles into a narrow reconstructed range, producing a leptokurtic or heavily skewed distribution rather than the more uniform spread of the original. Two profiles that differ meaningfully on original Bond Capacity may be mapped to nearly identical reconstructed values.

The practical consequence is that for constructs where these distortions are severe — particularly Bond-row and Emotional-column variables — any downstream computation that assumes approximately normal input distributions, or that depends on the magnitude of differences between profiles rather than their rank order, would be affected by crosswalked data in ways that raw Icosa data would not produce. Normative percentile lookups, group-difference statistics, and threshold-based classification systems would all be sensitive to this distributional compression.
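The distinction between correlation-based fidelity and distributional fidelity can be illustrated with a small simulation. The sketch below applies a hypothetical monotone compression (raising scores on a 0-1 scale to the eighth power) that keeps the ordering of profiles exactly while piling most of them into a narrow range. The transformation is an assumption chosen for clarity, not the behavior of any Icosa crosswalk, and the moment-based skewness and kurtosis formulas may differ slightly from the estimators used in the studies.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def central_moment(xs, k):
    """k-th central moment (population form)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** k for x in xs) / len(xs)


def skewness(xs):
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0


rng = random.Random(0)
original = [rng.random() for _ in range(1165)]
# Hypothetical monotone compression: order is kept exactly, spread is not.
reconstructed = [x ** 8 for x in original]

r = pearson(original, reconstructed)
skew = skewness(reconstructed)
kurt = excess_kurtosis(reconstructed)
share_low = sum(y < 0.1 for y in reconstructed) / len(reconstructed)
print(f"r = {r:.2f}, skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}, "
      f"share below 0.1 = {share_low:.2f}")
```

In this toy case the correlation stays in the large-effect range even though roughly three quarters of the reconstructed values fall into the lowest tenth of the scale. A consumer relying on r alone would not see the compression, which is why skewness, kurtosis, and normality diagnostics are reported separately.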

Synthesis Across Studies

The two studies converge on a structural map of crosswalk fidelity that can be stated concisely. When an Icosa profile passes through a conventional personality framework and back, the preserved signal follows a hierarchy:

Best preserved (large effects, r = .58-.65): Focus and Move Capacity means; Mental Domain means. These constructs have natural analogues in Big Five factor space and survive round-trip compression with moderate-to-high fidelity.
Moderately preserved (medium-to-large effects, r = .40-.54): Open Capacity means; Spiritual and Relational Domain means; higher-fidelity cell values (Move x Physical, Focus x Mental). These constructs lack one-to-one factor correspondences but receive partial signal from distributed Big Five factor loadings.
Weakly preserved (small-to-medium effects, r = .23-.36): Bond Capacity means; Bond x Relational (Belonging) cell values; Emotional Domain means. These constructs sit at the intersection of integrative-relational processing and experiential domains that the Big Five represents only indirectly.
Below threshold: Coherence (r = .74, below the .75 practical benchmark). The global integration index depends on nonlinear interactions among the full 20-center state that no five-factor compression can carry.
Not tested: Physical Domain fidelity (the Big Five lacks a somatic factor); Gateway-level fidelity; individual cell values for 16 of the 20 Harmonies.
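The effect-size labels used in this hierarchy appear to follow the conventional cutoffs for correlations (.10 small, .30 medium, .50 large); that reading is an assumption, since the paper does not state its cutoffs. Under that assumption, the reported Study 2 correlations can be labeled and ranked as follows. The dictionary contains only values quoted in this paper, and the helper name is hypothetical.

```python
# Round-trip correlations reported in this paper for the Big Five crosswalk.
REPORTED_R = {
    "Focus Capacity": 0.65,
    "Move Capacity": 0.65,
    "Mental Domain": 0.58,
    "Open Capacity": 0.54,
    "Move x Physical (Vitality)": 0.45,
    "Spiritual Domain": 0.41,
    "Relational Domain": 0.40,
    "Emotional Domain": 0.36,
    "Bond Capacity": 0.33,
    "Bond x Relational (Belonging)": 0.23,
}


def effect_label(r):
    """Label |r| using conventional cutoffs (.10, .30, .50); an assumption."""
    a = abs(r)
    if a >= 0.50:
        return "large"
    if a >= 0.30:
        return "medium"
    if a >= 0.10:
        return "small"
    return "negligible"


labels = {name: effect_label(r) for name, r in REPORTED_R.items()}
for name, r in sorted(REPORTED_R.items(), key=lambda kv: -kv[1]):
    print(f"{name:32s} r = {r:.2f}  {labels[name]}")
```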

This hierarchy is not arbitrary. It tracks the degree to which each Icosa construct has a natural analogue in Big Five factor space. Behaviorally overt processing modes (Focus, Move) translate well because the Big Five was developed through lexical analysis of behavioral descriptors. Integrative-relational constructs (Bond, Belonging) lose the most signal because they describe between-person processing that disperses across multiple Big Five factors during the forward mapping and cannot be reassembled during reconstruction.

The cross-framework comparison adds a second dimension to this map. MBTI’s categorical type structure imposes additional compression loss relative to the Big Five at every tested level, with the Mental Domain showing the largest degradation. The four Harmonies spanning the Mental column (Curiosity, Acuity, Identity, Agency) encode graded cognitive-processing variation across all four Capacities. MBTI’s binary Thinking/Feeling and Sensing/Intuition axes collapse this graded variation into categorical preferences, discarding the within-type variation that the Icosa model’s Mental Domain preserves. The Enneagram’s motivational archetypes encode cognitive patterns only indirectly, through the lens of core fears and desires rather than through processing-cycle mechanics. The result is that, relative to the Big Five, both MBTI and Enneagram lose roughly half a correlation unit of Mental Domain information in round-trip translation (delta = 0.538 and 0.491, respectively), a compression asymmetry roughly 1.6 to 1.8 times the next-largest difference reported in Study 1 (delta = 0.301 for Open Capacity against MBTI).
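The cost of collapsing a graded score into a binary preference can be shown with a toy round trip. In the sketch below a score on a 0-1 scale is reduced to the side of a cut point and then rebuilt as the midpoint of that half. This is a generic dichotomization example under assumed uniform scores, not the MBTI crosswalk used in Study 1, and the function name is hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def dichotomize_and_rebuild(score, cut=0.5):
    """Toy categorical round trip on a 0-1 scale: keep only the side of the
    cut, then rebuild as the midpoint of that side. Not the MBTI crosswalk."""
    return (cut + 1.0) / 2.0 if score > cut else cut / 2.0


rng = random.Random(0)
graded = [rng.random() for _ in range(1165)]
rebuilt = [dichotomize_and_rebuild(s) for s in graded]

r = pearson(graded, rebuilt)
distinct = len(set(rebuilt))
print(f"r = {r:.2f}, distinct reconstructed values = {distinct}")
```

For uniformly distributed scores the expected correlation after a midpoint split is sqrt(3)/2, or about .87, and every profile on the same side of the cut becomes indistinguishable. A single binary axis therefore imposes a fidelity ceiling before any further compression takes place, which is the mechanism the paragraph above attributes to categorical type structures.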

The Enneagram shows a surprising split: capacity-level parity with the Big Five (the null result), but sharp domain-level degradation, where the Big Five advantages ranged from delta = 0.271 (Spiritual) to delta = 0.491 (Mental). As discussed in the null-results section, this dissociation suggests that the Enneagram’s type structure may encode capacity-row variance (processing mode) more effectively than domain-column variance (experiential content). The interpretation is consistent with the data but remains speculative; it could be tested by examining which Enneagram types map to which Capacity profiles in the crosswalk equations.

Architectural Implications

These results have several implications for systems that use or plan to use crosswalk translations.

Crosswalked profiles are not interchangeable with native profiles. The fidelity gradient means that downstream computations based on crosswalked data will be selectively unreliable. Specifically, any computation that depends on Bond-row cells, Relational-column cells, or the Coherence index will carry disproportionate noise when the input profile was reconstructed from Big Five (or especially MBTI) scores rather than natively assessed. Gateway states for the Belonging Gate (Bond x Relational) and Feeling Gate (Bond x Emotional) would be the least reliable crosswalk-derived constructs.

Framework translations should carry fidelity metadata. Rather than treating a crosswalked profile as equivalent to a native profile, the system should attach per-construct confidence estimates derived from the fidelity map. This allows downstream consumers to weight or discount specific cells based on their known round-trip fidelity.
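A minimal sketch of what per-construct fidelity metadata could look like is given below. The class, function, and key names are hypothetical and do not describe an existing Icosa interface; the fidelity values are the round-trip correlations reported in this paper for the Big Five crosswalk, and the profile scores are invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Round-trip r values reported in this paper for the Big Five crosswalk.
BIG_FIVE_FIDELITY: Dict[str, float] = {
    "capacity:Focus": 0.65,
    "capacity:Move": 0.65,
    "capacity:Open": 0.54,
    "capacity:Bond": 0.33,
    "domain:Mental": 0.58,
    "domain:Spiritual": 0.41,
    "domain:Relational": 0.40,
    "domain:Emotional": 0.36,
    "cell:Move x Physical": 0.45,
    "cell:Bond x Relational": 0.23,
}


@dataclass(frozen=True)
class CrosswalkedValue:
    construct: str
    value: float
    source_framework: str
    fidelity_r: Optional[float]  # None when round-trip fidelity is untested


def annotate(values: Dict[str, float], framework: str,
             fidelity_map: Dict[str, float]) -> List[CrosswalkedValue]:
    """Attach the known round-trip fidelity (if any) to each construct."""
    return [CrosswalkedValue(name, v, framework, fidelity_map.get(name))
            for name, v in values.items()]


def usable(items: List[CrosswalkedValue], min_r: float) -> List[CrosswalkedValue]:
    """Keep constructs whose tested fidelity meets a consumer-chosen floor."""
    return [i for i in items
            if i.fidelity_r is not None and i.fidelity_r >= min_r]


# Made-up reconstructed scores for a single crosswalked profile.
profile = {"capacity:Focus": 0.71, "capacity:Bond": 0.48, "domain:Physical": 0.62}
annotated = annotate(profile, "Big Five", BIG_FIVE_FIDELITY)
trusted = usable(annotated, min_r=0.50)
print([item.construct for item in trusted])
```

Untested constructs, such as the Physical Domain here, carry no fidelity value and are excluded by default, which keeps the absence of evidence visible to downstream consumers instead of silently treating it as full confidence.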

MBTI-derived conversions should not be used for domain-level inference. The Mental Domain compression asymmetry (delta = 0.538 vs. Big Five) means that MBTI-based reconstruction of Mental Domain standing is unreliable enough to disqualify it for domain-level applications. Capacity-level aggregates from MBTI are also substantially worse than from the Big Five (delta = 0.301 for Open Capacity) and, by inference from the Enneagram’s capacity-level parity with the Big Five, worse than from the Enneagram.

The Enneagram’s capacity-level parity does not extend to structural parity. The null result for capacity means could mislead users into treating Enneagram crosswalks as equivalent to Big Five crosswalks. The domain-level data shows they are not: the Enneagram loses substantially more information at the column level.

The Physical Domain and Gateway-level constructs remain untested. The current benchmark does not include the Physical Domain (the Big Five lacks a somatic factor) or Gateway-level metrics (which depend on specific cell configurations). These are critical gaps because Physical health and Gateway states are among the most clinically relevant Icosa constructs.

Limitations

All findings are bounded by the synthetic nature of the evidence.

The profiles were computationally generated across the full Icosa parameter space, including 165 clinical archetype configurations, but they do not capture the distributional properties of real-world personality data. Base-rate skewness toward centered states in non-clinical populations, response-style artifacts, acquiescence bias, restricted range in specific subpopulations, and measurement error at the item level would all introduce additional sources of fidelity loss not modeled here. The fidelity values reported should be treated as upper bounds on what a crosswalk can achieve under idealized conditions.

The crosswalk mappings are deterministic algebraic translations. They do not model the probabilistic, many-to-one nature of real framework translations (where, for instance, multiple Icosa profiles might produce the same MBTI type designation). The round-trip benchmark tests the mathematical invertibility of the mapping equations, not the empirical recoverability of profile information through assessment instruments.

The selection of specific constructs for hypothesis testing (Open Capacity and Physical cell for Study 1’s H1 and H2; four probe cells rather than all 20 for Study 2’s H3) means the full 4x5 fidelity matrix is incompletely characterized. The Bond-row attenuation pattern is clear from the tested constructs, but whether it is uniform across all five Domain columns or concentrated at specific intersections remains an open question for 16 of the 20 cells.

Statistical power was not computed for Study 1’s correlation-comparison tests (power is reported as 0.0 in both studies, indicating that no power analysis was performed). Given the large sample size (N = 1,165) and the magnitude of the detected effects, post-hoc power is not a concern for the significant results, but the null result for Enneagram capacity-level fidelity (H1-a2) should be interpreted with caution: the study may have been underpowered to detect very small differences (delta < 0.05) between frameworks at this aggregation level.

Research Priorities

The following extensions are ordered by their potential to change the current fidelity map or fill its most consequential gaps.

Priority 1: Complete the 4x5 cell-level fidelity matrix. Study 2 tested four probe cells; the remaining 16 Harmonies have unknown round-trip fidelity. A complete matrix would reveal whether Bond-row attenuation is uniform or concentrated at specific domain intersections, and whether any cells in the Open or Focus rows show unexpectedly low fidelity. This is the most direct extension of the current work and requires no new infrastructure.

Priority 2: Gateway-level fidelity benchmark. The Icosa model’s nine Gateways govern Trap escape and Basin disruption — functionally consequential constructs for the model’s intervention logic. If Gateway states compress away in crosswalk translation, the information lost is not merely descriptive. This benchmark would determine whether crosswalked profiles can support Gateway-level inference or whether Gateway computation should be restricted to natively assessed data.

Priority 3: Physical Domain inclusion via six-factor models. The Big Five lacks a somatic factor, leaving the Physical Domain untested. A crosswalk to HEXACO (which adds Honesty-Humility, not itself a somatic factor) or, more directly, a domain-specific supplement with body-awareness content would allow Physical Domain fidelity to be estimated and fill the largest structural gap in the current map.

Priority 4: Human-sample replication. The synthetic fidelity gradient (Focus/Move > Open > Bond; Mental > Spiritual/Relational > Emotional) is a specific, testable prediction. Repeating the round-trip protocol with empirically collected Icosa profiles would establish whether this gradient holds under real distributional conditions, where measurement noise and response tendencies may either preserve or disrupt the observed hierarchy.

Priority 5: Probabilistic crosswalk modeling. The current deterministic mappings assume a one-to-one algebraic translation. A probabilistic model that captures the many-to-one nature of real framework translations (multiple Icosa profiles mapping to the same Big Five score range) would provide more realistic fidelity bounds and could quantify the additional information loss introduced by measurement imprecision.

Summary

Two synthetic benchmarks, testing 20 hypotheses across 1,165 computationally generated Icosa profiles, establish that cross-framework compression is domain-selective: the information lost in round-trip translation through Big Five, MBTI, and Enneagram crosswalks depends systematically on the structural alignment between each Icosa construct and the target framework’s representational architecture. The Big Five was the strongest translation target overall and outperformed MBTI at every tested resolution, but one tested capacity-level comparison with the Enneagram was null. Focus and Move Capacities survive best (r = .65); Bond Capacity survives least (r = .33). The Mental Domain shows the largest cross-framework asymmetry (Big Five advantage over MBTI: delta = 0.538). Coherence reconstruction falls just below the practical threshold (r = .74 vs. required .75). One null result — Enneagram capacity-level parity with the Big Five — constrains the interpretation: aggregate preservation does not guarantee structural preservation. Sixteen of 20 cell-level fidelities, all Gateway-level metrics, and the Physical Domain remain untested. The immediate priority is completing the 4x5 cell-level fidelity matrix and benchmarking Gateway-level compression to determine whether crosswalked profiles can support the model’s most functionally consequential constructs.

Downloads

Replication materials for the component studies in this paper.

Synthetic Benchmark: Cross-Framework Compression in Big Five, MBTI, and Enneagram Translation


Synthetic Benchmark: Big Five Crosswalk Round-Trip Fidelity
