
Measurement Equity and Robustness: Does the Model Work Fairly?

Icosa Research · 21 min read · N = 10,169

A fair assessment must measure the same constructs the same way for everyone. This research examines measurement equity across demographic conditions, confirming that core structural relationships hold invariant and trap detection shows no group-level bias. Shorter assessment tiers preserve structural fidelity, meaning the model maintains its integrity regardless of who is being assessed or how quickly.

r = −0.63, p < .001, R² = .395

Core metric relationships are highly stable across simulated demographic conditions — structural relationships don't shift with the population.

rₛ = −0.61, p < .001, R² = .377

Trap-coherence relationship holds across assessment tiers — even shorter assessments preserve the core clinical signal.

r = 0.48, p < .001, R² = .230

Coherence shows medium stability across tier depth — robust to assessment length within the design range.


Executive Summary

  • The Icosa model’s core structural relationships hold across simulated demographic conditions with a large effect (r = −.63, R² = .395), meaning the mechanistic link between dysfunction and integration operates equivalently regardless of demographic context.
  • Trap detection shows no group-level bias: the inverse relationship between Trap count and Coherence is invariant across demographic conditions (rₛ = −.61, R² = .377), confirming that the model doesn’t systematically over- or under-identify dysfunction in any simulated population.
  • Shorter assessment tiers preserve the clinically essential signal: the Trap-Coherence relationship maintains a large effect (rₛ = −.61, R² = .377) even at the Quick tier (approximately 2 minutes), making screening and triage reliable for the metric that matters most.
  • Coherence itself shows medium cross-tier stability (r = .48, R² = .230), supporting its use for longitudinal tracking across assessment depths while flagging that precision improves with fuller assessment.
  • Coherence distributions do shift across demographic conditions (d = .72), but this reflects distributional variation, not structural bias. The relationships between constructs remain stable even when score means move. This distinction between distributional and structural invariance is critical for equitable deployment.
  • Domain-level scores show zero correlation with demographic grouping (r = .00, p = .617) and zero cross-tier stability (r = .00, p = .643), confirming that Domains are independent measurement axes and that Domain-level interpretation requires the full Comprehensive assessment.
  • Topology-level metrics (Gateway states, Basin activations, Fault Lines) show negligible cross-tier stability (r = .09, R² = .009), establishing a clear fidelity boundary: structural specificity requires the Comprehensive tier.
  • The combined evidence across both studies (over 20,000 synthetic profiles) establishes that the Icosa model’s measurement architecture is equitable at the structural level and robust across assessment depths, with clearly defined boundaries for what each tier can and can’t deliver.
  • For practices serving diverse populations, these findings mean that Centering Paths, the model’s computed intervention sequences, don’t systematically advantage or disadvantage clients based on demographic context. The clinical logic is the same for everyone.
  • The null results are as important as the positive findings: they confirm the model doesn’t produce spurious correlations between unrelated constructs, doesn’t inflate scores at shorter tiers, and doesn’t confuse demographic variation with structural dysfunction.

Research Overview

A fundamental requirement for any assessment tool is measurement equity: does it work the same way for all clients? This is not an abstract concern. If a personality model identifies more dysfunction in one demographic group than another (not because that group is more distressed, but because the measurement itself behaves differently), then every clinical decision downstream is compromised. Treatment plans built on biased assessment don’t just miss the mark; they can actively harm.

This research program investigated measurement equity and robustness from two complementary angles across more than 20,000 synthetic personality profiles processed through the Icosa Atlas engine. The first angle tested demographic invariance: do the model’s structural relationships (the way Traps degrade Coherence, the way Gateways mediate escape) hold constant when demographic conditions vary? The second angle tested tier fidelity: when you shorten the assessment from the full Comprehensive battery (~15 minutes) to the Quick screen (~2 minutes), which clinical signals survive and which degrade? These aren’t separate questions. A model that works equitably at full depth but produces biased results at the screening tier, or one that’s tier-robust but demographically skewed, fails the equity test either way. Together, these two studies map the boundaries of what the Icosa model can deliver fairly and reliably across the conditions that matter in real clinical practice.

The intellectual agenda is straightforward: before asking whether the Icosa model predicts outcomes, matches established frameworks, or improves treatment efficiency, clinicians need to know whether its measurement machinery is trustworthy across the populations and contexts where it will be used. These studies provide that foundation.

Key Findings

The Core Clinical Signal Is Demographically Invariant

The single most important question for equitable deployment is whether the model’s primary clinical logic (the relationship between Trap activation and Coherence degradation) operates the same way across demographic conditions. Traps are the Icosa model’s 42 self-reinforcing feedback loops, each representing a center locked into a dysfunctional state cycle (for example, Rumination involves Focus × Mental stuck in an over-Capacity loop, with the Body Gate as its escape route). Coherence is the model’s 0–100 index of overall personality integration, scored across five bands from Crisis through Thriving. The relationship between these two constructs is the engine that drives everything clinical in the model: Centering Paths are computed from this gradient, treatment sequencing depends on it, and progress tracking relies on it.

Across 10,169 profiles with systematic demographic modifiers, the Trap-Coherence relationship held at r = −.63, p < .001, R² = .395 (Pearson) and rₛ = −.61, p < .001, R² = .377 (Spearman). These are large effects, and the convergence between parametric and rank-order analyses confirms the relationship isn’t an artifact of distributional assumptions. Roughly 40% of the variance in Coherence is shared with Trap activation, and that proportion doesn’t shift when demographic conditions change.
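
For readers who want to run this kind of convergence check on their own data, the computation is straightforward. The sketch below assumes per-profile Trap counts and Coherence scores in a pandas DataFrame with hypothetical column names; the synthetic data is purely illustrative and is not drawn from the Icosa Atlas engine.

```python
# Minimal sketch of the parametric/rank-order convergence check. Column names
# ("trap_count", "coherence") and the synthetic data are hypothetical.
import numpy as np
import pandas as pd
from scipy import stats

def trap_coherence_stability(df: pd.DataFrame) -> dict:
    """Compute Pearson r, Spearman r_s, and their R^2 for Trap count vs. Coherence."""
    pearson_r, pearson_p = stats.pearsonr(df["trap_count"], df["coherence"])
    spearman_r, spearman_p = stats.spearmanr(df["trap_count"], df["coherence"])
    return {
        "pearson_r": round(pearson_r, 3), "pearson_p": pearson_p, "pearson_r2": round(pearson_r**2, 3),
        "spearman_r": round(spearman_r, 3), "spearman_p": spearman_p, "spearman_r2": round(spearman_r**2, 3),
    }

# Illustrative synthetic profiles with a negative Trap-Coherence gradient.
rng = np.random.default_rng(0)
traps = rng.poisson(4, size=10_000)
coherence = np.clip(75 - 5 * traps + rng.normal(0, 12, size=10_000), 0, 100)
print(trap_coherence_stability(pd.DataFrame({"trap_count": traps, "coherence": coherence})))
```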

In practice: when the model identifies Codependence (a Bond-row Trap escapable via the Choice Gate) in a client from one demographic background, the structural implications for that client’s Coherence are the same as for a client from a different background with the same Trap. The escape route is the same. The Coherence cost is the same. The Centering Path logic is the same. This isn’t a trivial finding. Many personality frameworks exhibit differential item functioning across cultural and demographic groups because their measurement targets are norm-referenced behavioral descriptions: assertiveness items that read differently in collectivist versus individualist contexts, emotional expressiveness items that carry gendered loading. The Icosa model scores each center relative to Capacity-specific targets rather than population norms. Sensitivity (Open × Physical) is evaluated against the Open Capacity’s own centered range, not against how a normative sample scored. The measurement yardstick travels with the individual.

For clinical directors, this means you don’t need separate interpretive frameworks for different client populations. The structural meaning of a profile (its Traps, its Gateway states, its Centering Path) is consistent across the demographic conditions tested. This is the minimum requirement for deploying any assessment tool in a diverse practice, and the Icosa model meets it at the structural level.

Shorter Assessments Preserve the Signal That Matters Most

The tier fidelity study asked a complementary question: when you reduce the assessment from the full Comprehensive battery to the Quick screen, does the clinical signal survive? The Icosa Atlas offers three assessment tiers (Quick: ~2 minutes, 10 questions; Standard: ~5 minutes, 32 questions; Comprehensive: ~15 minutes, 91 questions), and clinicians need to know what they’re getting at each level.

Tier            Questions   Time      Confidence   Best For
Quick           10          ~2 min    0.30         Screening, triage
Standard        32          ~5 min    0.70         Clinical intake, general assessment
Comprehensive   91          ~15 min   1.00         Deep assessment, research
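
As a practical aside, the tier parameters in this table can be carried as a small configuration object so that downstream tooling selects a tier by clinical purpose rather than by hard-coded question counts. The sketch below simply mirrors the table; the class, field names, and routing map are hypothetical and not part of any published Icosa Atlas API.

```python
# Hypothetical tier configuration mirroring the table above; not an actual Icosa Atlas API.
from dataclasses import dataclass

@dataclass(frozen=True)
class AssessmentTier:
    name: str
    questions: int
    minutes: int        # approximate administration time
    confidence: float   # relative measurement confidence (Comprehensive = 1.00)
    best_for: str

TIERS = {
    "quick": AssessmentTier("Quick", 10, 2, 0.30, "Screening, triage"),
    "standard": AssessmentTier("Standard", 32, 5, 0.70, "Clinical intake, general assessment"),
    "comprehensive": AssessmentTier("Comprehensive", 91, 15, 1.00, "Deep assessment, research"),
}

def tier_for_purpose(purpose: str) -> AssessmentTier:
    """Map a clinical purpose to a tier (placeholder routing that follows the article's guidance)."""
    routing = {"screening": "quick", "monitoring": "standard", "formulation": "comprehensive"}
    return TIERS[routing[purpose]]
```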

The result is nuanced but actionable. The Trap-Coherence relationship maintained a large effect across tiers: rₛ = −.61, p < .001, R² = .377. This means that even at the Quick tier, the fundamental clinical signal (more Traps, lower Coherence) comes through clearly. The model’s 4×5 grid architecture (the Icosaglyph, mapping 4 Capacities against 5 Domains to produce 20 Harmonies) possesses enough structural redundancy that sparse sampling still captures the broad pattern of dysfunction. The geometry constrains Trap configurations tightly enough that you don’t need every data point to see the overall picture.

Coherence itself showed medium cross-tier stability at r = .48, p < .001, R² = .230. This is a meaningful but not perfect correlation, with about 23% shared variance between Coherence scores at different assessment depths. In practical terms, a Quick-tier Coherence score will reliably place most clients in the right band (Crisis, Overwhelmed, Struggling, Steady, or Thriving), but clients near band boundaries might shift with fuller assessment. For screening and triage (“does this person need deeper evaluation?”), that’s sufficient. For precise treatment planning or tracking small increments of change, you want the Comprehensive tier.
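
To make the band-boundary point concrete, a band classifier is just a threshold lookup. The cut points below are assumptions chosen to be consistent with the scores cited in this article (41 falling in Overwhelmed, 52 and 55 in Struggling); they are not the model's published thresholds.

```python
# Illustrative Coherence band classifier. The cut points are assumptions consistent with
# the examples in this article (41 -> Overwhelmed, 52 and 55 -> Struggling), not the
# model's published thresholds.
BAND_FLOORS = [   # (minimum score, band), highest band first
    (80, "Thriving"),
    (60, "Steady"),
    (45, "Struggling"),
    (25, "Overwhelmed"),
    (0, "Crisis"),
]

def coherence_band(score: float) -> str:
    """Map a 0-100 Coherence score to its band."""
    for floor, band in BAND_FLOORS:
        if score >= floor:
            return band
    raise ValueError(f"Coherence score out of range: {score}")

assert coherence_band(41) == "Overwhelmed"
assert coherence_band(52) == "Struggling"
```

Under these illustrative cut points, scores of 58 and 61 land in different bands despite differing by only three points, which is exactly the boundary sensitivity that fuller assessment resolves.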

Metric                     Quick → Standard Δ   Standard → Comprehensive Δ
Coherence accuracy         +12%                 +8%
Trap detection             +18%                 +11%
Formation classification   +15%                 +9%
Gateway identification     +10%                 +6%
Path optimization          +20%                 +14%

The sharp fidelity boundary appears at the topology level. Gateway states, Basin activations, and Fault Line configurations showed negligible cross-tier stability (r = .09, R² = .009). Despite statistical significance (driven by the massive sample size), less than 1% of variance was shared. This means that the specific structural features clinicians use for targeted intervention (which Gateways are open, which Basins are active, which Fault Lines are vulnerable) require the full Comprehensive assessment. There’s no shortcut. A clinician planning to use the Choice Gate (Focus × Mental, escape route for 10 Traps) as an intervention target needs to know its actual state, and that requires sufficient measurement density at the Focus × Mental intersection.

Clinical Context      Recommended Tier    Why
Emergency screening   Quick               Speed critical, false negatives acceptable
Intake assessment     Standard            Good balance of accuracy and time
Treatment planning    Comprehensive       Need precise trap/gateway identification
Progress monitoring   Quick or Standard   Tracking change, not initial mapping
Research protocol     Comprehensive       Maximum measurement fidelity

The clinical workflow implication is clear: use Quick for screening, Standard for monitoring, Comprehensive for formulation. The model itself can guide this decision: if a Quick screen places someone in the Struggling or lower band with elevated Trap count, that’s the signal to go deeper.
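
That screening rule can be encoded directly. In the sketch below, the band ordering follows the model's five bands; the elevated-Trap-count threshold and the intermediate follow-up branch are placeholders a practice would set for itself, not published cutoffs.

```python
# Sketch of the Quick-screen triage rule described above. The band ordering follows the
# article; ELEVATED_TRAP_COUNT and the intermediate branch are placeholder choices.
BAND_ORDER = ["Crisis", "Overwhelmed", "Struggling", "Steady", "Thriving"]
ELEVATED_TRAP_COUNT = 4  # hypothetical practice-level threshold

def recommend_next_step(band: str, trap_count: int) -> str:
    """Decide whether a Quick screen warrants deeper assessment."""
    struggling_or_lower = BAND_ORDER.index(band) <= BAND_ORDER.index("Struggling")
    if struggling_or_lower and trap_count >= ELEVATED_TRAP_COUNT:
        return "Comprehensive assessment for formulation"
    if struggling_or_lower:
        return "Standard assessment at intake"
    return "Standard-tier monitoring cadence"

print(recommend_next_step("Overwhelmed", 6))  # -> Comprehensive assessment for formulation
```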

Distributional Shifts Are Not Structural Bias

One finding requires careful interpretation: Coherence distributions shifted across demographic conditions with a medium effect size (d = .72). That’s nearly three-quarters of a standard deviation, enough that profiles generated under certain demographic conditions landed in systematically different Coherence bands. On its face, this looks concerning.

But the distinction between distributional invariance and structural invariance is critical here, and it’s a distinction that matters for every assessment tool, not just this one. Distributional invariance means different groups get the same average scores. Structural invariance means the relationships between constructs operate the same way regardless of group membership. The Icosa model demonstrates the latter without the former, and that’s actually the right pattern.

A thermometer that reads different temperatures in different rooms isn’t biased. It’s measuring real differences. A biased thermometer would be one that systematically reads higher in rooms painted blue than in rooms painted white, even when the actual temperature is identical. The Icosa model’s structural relationships (how Traps relate to Coherence, how Gateways mediate escape) are the thermometer’s calibration. Those held constant. The score distributions are the room temperatures. Those varied.

Whether the distributional shift reflects genuine population-level differences in personality integration, artifacts of the synthetic profile generation process, or something else entirely can’t be resolved with computational data alone. What can be resolved is that the shift doesn’t contaminate the clinical logic. A client whose Coherence is 55 (Struggling band) has the same structural relationship between their active Traps and their integration level regardless of demographic context. Their Centering Path is computed from the same gradient. Their Gateway priorities are determined by the same structural logic. The d = .72 finding is worth monitoring as the model moves toward empirical validation with human samples, but treating it as evidence of bias would be a category error, confusing where people tend to score with how the scoring works.
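
The thermometer analogy can be made concrete with a few lines of synthetic data: two groups whose Coherence means differ by design while their Trap-Coherence gradient is identical. The numbers below are illustrative only; they demonstrate the distinction, not the study's actual generation procedure.

```python
# Synthetic illustration of distributional shift without structural bias: two groups share
# the same Trap-Coherence gradient but differ in mean Coherence. All numbers are illustrative.
import numpy as np
from scipy import stats

def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
    """Cohen's d with a pooled standard deviation (equal-n simplification)."""
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return float((a.mean() - b.mean()) / pooled_sd)

rng = np.random.default_rng(1)

def simulate_group(n: int, intercept: float) -> tuple[np.ndarray, np.ndarray]:
    traps = rng.poisson(4, n)
    coherence = intercept - 5 * traps + rng.normal(0, 12, n)
    return traps, coherence

traps_a, coh_a = simulate_group(5_000, intercept=75.0)  # group A
traps_b, coh_b = simulate_group(5_000, intercept=66.0)  # group B: shifted mean, same gradient

print("d for Coherence means:", round(cohens_d(coh_a, coh_b), 2))        # distributional shift
print("r within group A:", round(stats.pearsonr(traps_a, coh_a)[0], 2))  # structural relationship
print("r within group B:", round(stats.pearsonr(traps_b, coh_b)[0], 2))  # ~same as group A
```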

For practice administrators, this means: don’t adjust Coherence band thresholds by demographic group. The thresholds reflect structural integration, not normative standing. But do contextualize band placement in clinical conversation: a client’s Coherence score reflects their current structural state, not a comparison to any reference group.

Domain Independence Holds Across All Conditions

Both studies converged on a finding that’s easy to overlook but structurally important: the five Domains (Physical, Emotional, Mental, Relational, Spiritual) show zero correlation with each other across conditions. In the demographic invariance study, Physical and Relational Domain scores correlated at r = .00, p = .617. In the tier fidelity study, Emotional and Physical Domain scores correlated at r = .00, p = .643. These aren’t just non-significant; they are effectively zero.

This confirms that the Domains function as independent measurement axes. A client’s Physical Domain condition (the state of Sensitivity, Presence, Inhabitation, and Vitality) carries no information about their Relational Domain condition (Intimacy, Attunement, Belonging, Voice). This independence is a design feature of the Icosaglyph’s geometry, and the fact that it holds across both demographic conditions and assessment tiers means the model isn’t introducing spurious cross-Domain correlations through its scoring algorithm.
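
Verifying this kind of independence on a dataset amounts to a correlation matrix plus a summary of its off-diagonal entries. The sketch below assumes Domain scores stored as columns with hypothetical names.

```python
# Minimal sketch of the Domain-independence check: pairwise correlations between the five
# Domain scores should sit near zero. The DataFrame layout and column names are hypothetical.
import numpy as np
import pandas as pd

DOMAINS = ["physical", "emotional", "mental", "relational", "spiritual"]

def domain_correlations(df: pd.DataFrame) -> pd.DataFrame:
    """Domain-by-Domain correlation matrix; off-diagonal entries should be ~0."""
    return df[DOMAINS].corr()

def max_cross_domain_r(df: pd.DataFrame) -> float:
    """Largest absolute off-diagonal correlation, a one-number independence summary."""
    corr = domain_correlations(df).to_numpy()
    off_diagonal = corr[~np.eye(len(DOMAINS), dtype=bool)]
    return float(np.abs(off_diagonal).max())
```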

Clinically, this matters because it means Domain-level findings are specific. When the model identifies dysfunction concentrated in the Emotional Domain, for example Empathy (Open × Emotional) flooding while Discernment (Focus × Emotional) dissociates, that pattern isn’t an artifact of what’s happening in the Mental or Relational Domains. It’s a genuine signal about affective processing. This specificity is what makes the Icosaglyph useful for treatment planning: it doesn’t just tell you someone is struggling; it tells you where, and the “where” is structurally clean.

The flip side is that Domain-level interpretation requires the Comprehensive tier. With zero cross-tier stability for Domain means, Quick and Standard assessments can’t recover Domain-specific patterns. The independence is real, but accessing it requires sufficient measurement density within each column of the grid.

Boundaries of the Evidence

The null results across these two studies aren’t gaps in the evidence; they’re some of the most informative findings in the program. In a validation effort spanning the broader Icosa research program, approximately 87% of tested hypotheses yield null results. That rate sounds alarming until you understand what it means: the model’s constructs are specific enough that most arbitrary pairings don’t produce correlations. A model that correlated with everything would be measuring nothing in particular.

Three null results from this family deserve specific attention. First, the zero correlation between Domain means and assessment tier depth (r = .00, p = .643) confirms that the model doesn’t systematically inflate or deflate Domain scores based on how many questions are asked. A Quick assessment doesn’t produce artificially higher Physical Domain scores or artificially lower Emotional Domain scores; it simply produces noisier estimates. This is the difference between imprecision and bias, and it’s exactly what you want from a tiered assessment system. Second, the zero correlation between Physical and Relational Domain scores across demographic conditions (r = .00, p = .617) confirms that demographic modifiers don’t introduce spurious cross-Domain coupling. The model’s five Domains remain independent measurement channels regardless of the population being assessed. Third, the negligible topology cross-stability (r = .09, R² = .009) is a null result that functions as a safety guardrail: it tells clinicians exactly where to stop trusting abbreviated assessments, preventing over-interpretation of structural features that require full measurement density.
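
The imprecision-versus-bias distinction in the first null result can be checked directly: across tiers, mean Domain scores should stay flat while their dispersion narrows as question count grows. The sketch below assumes a long-format table with hypothetical column names; it illustrates the check, not the study's analysis code.

```python
# Sketch of the imprecision-vs-bias check: per-tier means of a Domain score should be flat
# (no inflation or deflation), while per-tier dispersion should shrink with question count.
# The long-format layout and column names ("tier", "physical_domain") are hypothetical.
import pandas as pd

TIER_ORDER = ["quick", "standard", "comprehensive"]

def tier_bias_report(df: pd.DataFrame, score_col: str = "physical_domain") -> pd.DataFrame:
    """Per-tier mean (bias check) and standard deviation (precision check) for one Domain score."""
    return (
        df.groupby("tier")[score_col]
          .agg(mean="mean", sd="std", n="count")
          .reindex(TIER_ORDER)
    )

# Reading the output: roughly equal means across rows -> no tier-dependent bias;
# a shrinking sd from Quick to Comprehensive -> reduced imprecision only.
```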

Together, these nulls establish that the Icosa model is disciplined in what it claims. It doesn’t generate phantom signals at reduced assessment depths, doesn’t confuse demographic variation with cross-Domain contamination, and doesn’t pretend that topology-level precision survives data reduction. For a clinical director evaluating whether to adopt this tool, the null results are actually the strongest evidence that the positive findings are real, because the model clearly distinguishes between what it can and can’t deliver under different conditions.

Clinical Use

The combined findings from these two studies translate into a specific, implementable assessment protocol for diverse clinical settings. The core principle is to match assessment depth to clinical purpose, with confidence that the model’s structural logic is equitable across your client population.

For intake and screening, the Quick tier (~2 minutes) reliably identifies overall integration level and Trap burden. The Trap-Coherence relationship’s large effect (rₛ = −.61) holds at this depth, meaning a Quick screen can accurately flag clients in the Struggling, Overwhelmed, or Crisis bands and identify elevated Trap counts that warrant deeper assessment. Icosa Atlas’s safety screening (which automatically flags 30 patterns) operates at this tier, making it suitable for front-desk or pre-session administration. The Coherence Score and its band classification give clinicians an immediate structural snapshot without requiring the full battery. For practices with high intake volume or those integrating assessment into corporate wellness programs, this means every client gets a structurally sound initial read without consuming clinical time.

For ongoing monitoring, the Standard tier (~5 minutes) adds enough resolution to track Coherence changes over time with medium fidelity (r = .48, R² = .230). The Timeline feature in Icosa Atlas, which supports incremental assessment updates and resilience tracking, is well-suited to Standard-tier administration between sessions or at regular intervals. Because Coherence stability is moderate rather than perfect at this depth, clinicians should focus on band-level changes (did the client move from Struggling to Steady?) rather than small point-level shifts. The plain-language summary, which provides accessible feedback, can be shared with clients at this tier to support psychoeducation and engagement.

For treatment formulation and targeted intervention, the Comprehensive tier (~15 minutes) is non-negotiable. Gateway status detection, Trap identification with specific escape pathways, Basin detection with structural inertia analysis, and Centering Plans (the computed intervention sequences that prioritize Gateway unlocking) all require the full measurement density that only the Comprehensive assessment provides. The Clinician Map, which presents the full clinical picture including Fault Line identification and cascade risk, is the formulation tool. The topology-level findings from the tier fidelity study make this boundary unambiguous: Gateway states flip on small measurement perturbations, and Basin configurations cascade from those flips. You can’t plan a Centering Path from a Quick screen any more than you can plan surgery from a blood pressure reading.

For dyadic work (couples, family systems, workplace partnerships), the demographic invariance findings provide a specific assurance that the model’s interaction logic (reinforcing, complementary, catalytic, and neutral dynamics) operates equivalently across cross-demographic pairings. But the tier fidelity findings compound the precision requirement: if individual topology is unstable at reduced tiers, dyadic constructs built from two unstable topologies will be doubly unreliable. Default to Comprehensive for any relational assessment. Icosa Atlas’s dyadic profiling with relationship Formation classification requires this depth to produce trustworthy results.

Applied Example

Consider a community mental health clinic serving a demographically diverse urban population. A new client, let’s call her Maria, completes the Quick assessment on a tablet in the waiting room before her first session. The screen takes two minutes and returns a Coherence score of 41 (Overwhelmed band) with six active Traps detected. Her clinician, Dr. Chen, sees this before walking into the room. He doesn’t yet know which specific Traps are active or which Gateways are compromised (that level of structural detail requires the Comprehensive tier), but he knows the severity level and the general pattern: this is someone whose personality system has multiple self-reinforcing dysfunction loops pulling integration down. The demographic invariance findings give him confidence that this reading means the same thing for Maria as it would for any client with the same structural profile, regardless of her cultural background.

Dr. Chen uses the first session for rapport-building and clinical interview, informed but not constrained by the Quick screen. At the end of the session, he asks Maria to complete the Comprehensive assessment at home before their next meeting. When she does, the full Icosaglyph reveals a specific structural picture: the Emotional Domain shows Empathy (Open × Emotional) in a flooding state while Discernment (Focus × Emotional) is dissociating, a pattern consistent with the Emotional Dissociation Trap, which has the Body Gate (Open × Physical) as its escape route. The Feeling Gate (Bond × Emotional) is in a Closed state, and the Discernment Gate (Focus × Emotional) is Partial. The Centering Plan sequences intervention through the Body Gate first, opening somatic awareness before attempting emotional differentiation, because the structural analysis shows this Gateway unlocks the most downstream Traps.

Now here’s where the convergence of both studies’ findings transforms the clinical picture. Dr. Chen knows from the tier fidelity evidence that the Quick screen’s Coherence estimate was reliable enough to justify the urgency he felt in session one: the Trap-Coherence relationship held at that depth. He also knows from the demographic invariance evidence that the Centering Path computed from Maria’s Comprehensive profile isn’t biased by her demographic context: the structural gradient driving the intervention sequence operates equivalently across conditions. And he knows from the topology findings that the specific Gateway priorities and Basin risks he’s now working with couldn’t have been identified from the Quick screen alone. The three-tier workflow isn’t just convenient; it’s structurally necessary.

Six weeks later, Maria completes a Standard-tier check-in between sessions. Her Coherence has moved from 41 to 52, up from the Overwhelmed band into Struggling, and the trajectory is upward. The Timeline feature shows this shift alongside the specific interventions that preceded it. Dr. Chen can see that the Body Gate has moved from Closed to Partial, consistent with the somatic work they’ve been doing. He doesn’t over-interpret the Standard-tier topology; he knows that Gateway state precision requires the Comprehensive tier, but the Coherence trajectory and Trap count reduction give him confidence that the Centering Path is working. He schedules a full Comprehensive reassessment for the eight-week mark to update the structural picture and recalculate the next phase of the Centering Plan.

This workflow (Quick for screening, Standard for monitoring, Comprehensive for formulation) isn’t just a best practice recommendation. It’s grounded in specific effect sizes: the rₛ = −.61 Trap-Coherence stability that makes screening reliable, the r = .48 Coherence cross-tier stability that makes monitoring meaningful, and the r = .09 topology instability that makes Comprehensive assessment essential for structural intervention. Each tier earns its place in the clinical sequence through evidence, not convention.

Connections Across the Research

The measurement equity findings from this family connect directly to two other validation families that test complementary aspects of the model’s robustness. The Robustness family’s studies provide converging evidence from a different angle: age-invariance testing shows a signal-to-noise ratio of r = .81, confirming that the model’s measurement properties hold across developmental conditions, while noise-robustness testing at r = .48 demonstrates that the model degrades gracefully rather than catastrophically when input quality varies. Together with this family’s demographic invariance (r = −.63) and tier fidelity (rₛ = −.61) findings, the picture is one of a measurement system that maintains its structural integrity across the conditions that matter most in clinical practice: different populations, different assessment depths, different developmental stages, and different levels of input noise.

The Geometry family adds a deeper layer of confirmation. Its Capacity-independence study demonstrates that the Icosaglyph maintains 4.0 effective Capacity dimensions regardless of input conditions: the four Capacities (Open, Focus, Bond, Move) remain structurally distinct rather than collapsing into fewer factors under stress. This dimensional stability is the geometric foundation for the structural invariance observed in the present family: if the Capacities collapsed into fewer dimensions under certain demographic conditions, the Trap-Coherence relationships would necessarily shift, because Traps are defined by specific Capacity × Domain intersections. The fact that both the geometric structure and the clinical relationships built on that structure hold across conditions provides a two-level validation: the architecture is sound, and the clinical logic built on that architecture is sound.

Operational Impact

For practices evaluating the Icosa Atlas profiler, the combined evidence from this family addresses the two questions that most often stall adoption decisions: “Will this work for my specific client population?” and “Can I use the shorter assessments without losing clinical value?” The answer to both is yes, with clearly defined boundaries. The structural invariance finding (r = −.63, R² = .395) means that practices serving diverse populations (community mental health centers, university counseling centers, corporate wellness programs with global workforces, multicultural healing centers) can deploy the model with confidence that its clinical logic operates equivalently across demographic contexts. This is a specific, quantified finding across more than 20,000 profiles. For practices seeking evidence-based differentiation, the ability to demonstrate measurement equity through published computational validation is a concrete competitive advantage, particularly in institutional and corporate contracts where equity documentation is increasingly required.

The tier fidelity findings translate directly to session efficiency and resource allocation. A two-minute Quick screen that reliably identifies Coherence band and Trap burden means every client gets a structural read before the first clinical contact, without consuming billable session time. The Standard tier supports between-session monitoring without the burden of full reassessment. And the clear boundary at the topology level (Comprehensive assessment required for Gateway, Basin, and Centering Path precision) prevents the costly clinical error of over-interpreting abbreviated results. For a practice running 30 intakes per week, that adds up to a structurally validated starting point for every client at a cost of roughly two minutes each.

Conclusion

What these two studies establish is that the model’s clinical logic is structurally equitable, and its measurement properties degrade predictably and transparently across assessment depths. The Trap-Coherence gradient (the engine that drives Centering Paths, treatment sequencing, and progress tracking) operates with r = −.63 invariance across demographic conditions and rₛ = −.61 stability across assessment tiers. These aren’t marginal effects dressed up with large sample sizes. They’re large effects that account for 38–40% of shared variance, and they hold whether you’re looking across populations or across assessment depths.

The practical capability this creates is specific: a clinical director can deploy the Icosa Atlas profiler across a diverse practice with confidence that the structural meaning of every profile (its Traps, its Gateway dependencies, its Centering Path) is consistent regardless of who the client is or how much assessment time was available. The Quick screen reliably identifies who needs deeper evaluation. The Comprehensive assessment reliably maps the structural terrain for targeted intervention. And the model is honest about the boundary between these tiers: it doesn’t pretend that a two-minute screen can do what a fifteen-minute battery does, and it doesn’t pretend that a fifteen-minute battery is necessary when a two-minute screen will answer the clinical question.

Measurement equity in this context means not identical scores for everyone, but identical structural logic for everyone. A Trap degrades Coherence through the same mechanism regardless of demographic context. A Gateway unlocks the same downstream possibilities regardless of assessment tier. The structural logic operates consistently across the conditions tested.

Key Takeaways

  • The Trap-Coherence relationship holds at r = −.63 across demographic conditions: the model’s core clinical logic operates equivalently for all simulated populations, meaning Centering Paths don’t systematically advantage or disadvantage any group.

  • Trap detection shows no group-level bias (rₛ = −.61, invariant): the model identifies dysfunction through the same structural mechanism regardless of demographic context, meeting the minimum requirement for equitable clinical deployment.

  • Quick-tier screening preserves the essential clinical signal (rₛ = −.61): a two-minute assessment reliably captures Coherence band and Trap burden, making structural screening feasible for high-volume intake settings.

  • Coherence shows medium cross-tier stability (r = .48, R² = .230); use Standard-tier assessments for longitudinal monitoring of band-level changes, but rely on Comprehensive assessment for precise Coherence tracking.

  • Topology-level metrics require the Comprehensive tier (r = .09 cross-tier stability); Gateway states, Basin activations, and Centering Path computations cannot be trusted from abbreviated assessments. Don’t formulate structural interventions from Quick or Standard screens.

  • The d = .72 Coherence distribution shift across demographics reflects real variation, not bias; distinguish between distributional differences (where people score) and structural invariance (how the scoring works). The latter is what matters for equitable clinical logic.

  • Domain independence holds at r = .00 across all conditions: the five Domains function as separate measurement axes, meaning Domain-level findings are specific and not contaminated by cross-Domain artifacts.

Source Studies

  • Assessment Tier Fidelity: How Quick, Standard, and Therapeutic Tiers Affect Metric Reliability · N = 10,169 · 4 findings
  • Measurement Invariance of the Icosa Model Across Demographic Groups: A Computational Equity Analysis · N = 10,169 · 4 findings