QQ COGNITION PILOT · WANG–BUSEMEYER 2013 EMPIRICAL DATA · IBM FEZ EXECUTION

QPC-QQ-v3: Polycontextural Reproduction of the Wang–Busemeyer Question-Order Effect

A discriminating K-ablation against published empirical Clinton–Gore joint distributions, executed end-to-end on ibm_fez. The first QPC pilot to produce a quantitative architectural claim with hardware-archived statistical significance.

Reader's guide

Start here for the narrative, the K-ablation table, and the bootstrap discrimination result. Follow the chain below if you audit outcomes via job IDs.

  1. This report: executive summary covering the task, data contract, simulator pre-flight, IBM Fez K-ablation (16 qubits, 18 jobs), bootstrap discrimination at p<0.0005, and interpretation boundary.
  2. Machine-readable evidence: the QQ v3 evidence bundle, a set of JSON files that the scripts write to the QQ workspace folder on the machine that runs them (not hosted here). Publish copies with your submission bundle if a reviewer requires downloads.
  3. Pilot iteration history: v1 (initial run, K=1 control too degenerate), v2 (K=1 control fixed; QQ-residual metric did not discriminate within quantum), v3 (this report — divergence-based metric, simulator pre-flight verified before hardware).

Task and goal

The task asks whether QPC's polycontextural architecture — multiple coexisting contextures with their own quantum-logical states — reproduces the empirical structure of contextual cognitive data better than a faithful non-polycontextural control of equal quantum resources. The empirical target is the Clinton–Gore question-order experiment of Wang & Busemeyer (2013), in which a 1997 Gallup poll of 1,002 respondents found a robust, replicable order effect: the joint answer distribution differs depending on whether the Clinton question is asked first or second. Wang & Busemeyer prove that this empirical structure cannot arise from any single Kolmogorov probability space (no Bayesian or Markov model satisfies the QQ equality the data satisfies). Classical impossibility is therefore established by theorem and empirical replication; this pilot does not re-prove it.
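The QQ equality referred to above says that the probability of giving different answers to the two questions is the same in both question orders, even when the individual joint probabilities shift with order. A minimal sketch of the residual follows; the function name `qq_residual`, the dictionary layout, and the numbers are illustrative assumptions, not values from Table 1 of the paper.

```python
def qq_residual(p_ab, p_ba):
    """QQ residual for two yes/no questions A and B.

    p_ab[(a, b)]: probability of answer a to A, then b to B (A asked first).
    p_ba[(b, a)]: probability of answer b to B, then a to A (B asked first).
    The QQ equality holds when the returned value is zero.
    """
    differ_ab = p_ab[("y", "n")] + p_ab[("n", "y")]
    differ_ba = p_ba[("y", "n")] + p_ba[("n", "y")]
    return differ_ab - differ_ba


# Hypothetical joint distributions (NOT the published Clinton-Gore data).
p_ab = {("y", "y"): 0.50, ("y", "n"): 0.10, ("n", "y"): 0.20, ("n", "n"): 0.20}
p_ba = {("y", "y"): 0.45, ("y", "n"): 0.20, ("n", "y"): 0.10, ("n", "n"): 0.25}

q = qq_residual(p_ab, p_ba)

# Order effect on "yes to A": A asked first vs. A asked second.
yes_a_first = p_ab[("y", "y")] + p_ab[("y", "n")]
yes_a_second = p_ba[("y", "y")] + p_ba[("n", "y")]
order_effect = yes_a_first - yes_a_second
```

In this toy example the marginal for A moves with question order while the residual `q` stays at zero, which is the pattern the QQ equality describes.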

What this pilot does claim:

Data and computation contract

Empirical target: Wang & Busemeyer 2013, Topics in Cognitive Science 5(4):689–710, Table 1, "Consistency" column. The poll was conducted Sept 6–7, 1997; half of 1,002 respondents answered the Clinton honesty question first, the other half answered the Gore question first.

Contexture-to-qubit mapping is proprietary. Public evidence: K-ablation metrics and IBM job IDs.

Simulator pre-flight

Unlike v1 and v2, v3 was tested for K-ablation discrimination on a noiseless simulator before submission to hardware. The pre-flight produced TV(K=1)=0.296 and TV(K=4)=0.256, a bootstrap difference of +0.040 with 95% CI [+0.029, +0.050], and a one-sided p-value reported as 0.0000 to four decimal places. This established that the architectural signal exists at the noiseless level, a necessary condition for it to be visible on hardware. Hardware execution proceeded only after the pre-flight passed.

IBM hardware pilot (ibm_fez)

Run archived as QQ v3 evidence bundle. IBM Quantum Platform instance routed as open-instance; SamplerV2 primitive on Heron R2 device. QPC noise reducer enabled — 3 runs per circuit, counts averaged across runs.
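The only aggregation step this report describes for the noise reducer is averaging counts over the 3 runs of each circuit. A minimal sketch of that averaging, assuming each run is available as a bitstring-to-count dictionary; the helper name `average_counts` and the counts are hypothetical, and any further processing inside the QPC noise reducer is not covered here.

```python
from collections import Counter


def average_counts(runs):
    """Average bitstring counts over repeated runs of the same circuit.

    Outcomes missing from a run are treated as zero counts for that run.
    """
    total = Counter()
    for counts in runs:
        total.update(counts)
    return {bits: n / len(runs) for bits, n in total.items()}


# Hypothetical counts for one circuit, 4096 shots per run (not archived data).
runs = [
    {"00": 2100, "11": 1996},
    {"00": 2000, "11": 2096},
    {"00": 2050, "01": 10, "11": 2036},
]
averaged = average_counts(runs)
```

Averaging keeps the per-circuit total at the shot budget, so the averaged counts can be normalised into a probability distribution in the usual way.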

Architecture: 16q / 4c (depth 8–16)
K=4 TV-mean to empirical: 0.251
K=1 − K=4 TV gap: +0.0505
Bootstrap one-sided: p < 0.0005
Shots per circuit: 4096
Sampler jobs (3 K × 2 orders × 3 runs): 18
Total wall-clock: 2,155 s
Run timestamp: 2026-05-08 (UTC 22:34)

K-ablation: divergence to empirical Clinton–Gore joints

K | TV(AB) | TV(BA) | TVmean | KLmean
1 (faithful non-polycontextural control) | 0.2815 | 0.3216 | 0.3015 | 0.4317
2 (intermediate transjunctional structure) | 0.2672 | 0.2491 | 0.2582 | 0.3296
4 (full polycontextural) | 0.2479 | 0.2541 | 0.2510 | 0.3115

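Lower TV and KL values mean the model distribution is closer to the empirical Wang–Busemeyer joint distribution. Both averaged metrics (TVmean and KLmean) improve monotonically from K=1 to K=2 to K=4; the BA-order TV alone is slightly lower at K=2 (0.2491) than at K=4 (0.2541). With parameters held fixed across K, the model fits the empirical data more closely as polycontextural blocking is added.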
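For readers who want to recompute the divergences, here is a minimal sketch of total-variation distance and KL divergence between two four-outcome joint distributions. The KL direction (empirical relative to model), the natural logarithm, and the `eps` floor are assumptions, since the report does not state them; the distributions are placeholders, not the published data.

```python
import math


def total_variation(p, q):
    """Total-variation distance: half the L1 distance between p and q."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; q is floored at eps to avoid division by zero."""
    return sum(
        pk * math.log(pk / max(q.get(k, 0.0), eps))
        for k, pk in p.items()
        if pk > 0
    )


# Hypothetical distributions over (answer 1, answer 2); not from Table 1.
empirical = {"yy": 0.50, "yn": 0.10, "ny": 0.20, "nn": 0.20}
model = {"yy": 0.25, "yn": 0.25, "ny": 0.25, "nn": 0.25}

tv_value = total_variation(empirical, model)
kl_value = kl_divergence(empirical, model)
```

The TVmean column in the table is consistent with averaging the AB-order and BA-order values, for example (0.2479 + 0.2541) / 2 = 0.2510 at K=4.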

Bootstrap discrimination (n = 2000)

Each replicate resamples per-circuit counts from the multinomial defined by the observed shots, recomputes both TVmean values, and records the K=1 − K=4 difference.

Metric | Mean K=1 | Mean K=4 | Mean diff | 95% CI of diff | One-sided p | Significant @95%?
Total-variation distance | 0.3015 | 0.2510 | +0.0505 | [+0.0359, +0.0655] | 0.0000 | YES
KL divergence | 0.4319 | 0.3119 | +0.1200 | [+0.0888, +0.1520] | 0.0000 | YES

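A one-sided p-value of zero from 2000 bootstrap replicates means that every resample showed K=4 fitting the empirical Clinton–Gore joints strictly better than K=1. With 2000 replicates the smallest nonzero p-value that can be observed is 1/2000 = 0.0005, so the result is reported as p < 0.0005 rather than as exactly zero. The 95% confidence intervals on the difference exclude zero on both metrics.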
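The resampling procedure described in this section can be sketched with the standard library alone. This is an illustration under stated assumptions, not the pilot's script: the function names, the percentile confidence interval, the four-outcome count dictionaries, and all numbers are hypothetical, and the example uses 200 replicates only to keep it quick.

```python
import random
from collections import Counter


def tv(p, q):
    """Total-variation distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def resample(counts, rng):
    """Draw one multinomial replicate from observed counts; return frequencies."""
    outcomes = list(counts)
    shots = sum(counts.values())
    draws = rng.choices(outcomes, weights=[counts[o] for o in outcomes], k=shots)
    tally = Counter(draws)
    return {o: tally[o] / shots for o in outcomes}


def tv_mean(model_counts, empirical, rng):
    """Average TV to the empirical joints over the AB and BA question orders."""
    return sum(
        tv(resample(model_counts[order], rng), empirical[order])
        for order in ("AB", "BA")
    ) / 2


def bootstrap_gap(k1_counts, k4_counts, empirical, n_boot=2000, seed=0):
    """Bootstrap the K=1 minus K=4 TV-mean difference."""
    rng = random.Random(seed)
    diffs = sorted(
        tv_mean(k1_counts, empirical, rng) - tv_mean(k4_counts, empirical, rng)
        for _ in range(n_boot)
    )
    mean_diff = sum(diffs) / n_boot
    ci = (diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1])
    p_one_sided = sum(d <= 0 for d in diffs) / n_boot
    return mean_diff, ci, p_one_sided


# Hypothetical inputs (NOT the archived hardware counts or published joints).
empirical = {
    "AB": {"yy": 0.50, "yn": 0.10, "ny": 0.20, "nn": 0.20},
    "BA": {"yy": 0.45, "yn": 0.20, "ny": 0.10, "nn": 0.25},
}
k1 = {
    "AB": {"yy": 1024, "yn": 1024, "ny": 1024, "nn": 1024},
    "BA": {"yy": 1024, "yn": 1024, "ny": 1024, "nn": 1024},
}
k4 = {
    "AB": {"yy": 1638, "yn": 614, "ny": 922, "nn": 922},
    "BA": {"yy": 1638, "yn": 922, "ny": 614, "nn": 922},
}

mean_diff, ci, p_one_sided = bootstrap_gap(k1, k4, empirical, n_boot=200)
```

The one-sided p-value here is the fraction of replicates in which the K=1 minus K=4 difference is zero or negative, so its resolution is 1 / `n_boot`.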

Hardware archive

Field | Value
Backend | ibm_fez (Heron R2)
Mode | SamplerV2 on open-instance Runtime; proprietary readout
QPC noise reducer | Enabled ([internal flag]); 3 runs per circuit aggregated; matrix readout mitigation applicable at 16 qubits
First / last job ID | d7v5p2jack5s73bf13jg (K=1 AB run 1) … d7v5q2nmrars73d7prsg (K=4 BA run 3)
Full job list | 18 IDs in ibm_job_ids_all

The headline result

On ibm_fez quantum hardware, the QPC polycontextural architecture (K=4) reproduces the empirical Wang–Busemeyer Clinton–Gore joint-distribution shape strictly better than a faithful non-polycontextural control (K=1) of equal qubit count, depth, and shot budget — bootstrap-significant at p<0.0005 on both total-variation and KL-divergence metrics, with parameters fit only from order-blind marginals.

This is the first QPC pilot that produces a quantitative architectural claim grounded in a controlled comparison against published empirical human data, with hardware-archived statistical significance, on a problem class where the classical limit is theorem-level rather than computational.

Interpretation boundary

What this pilot does not claim, and what readers should not infer from it:

Pilot iteration history

Three iterations were required to produce a defensible result. We document the trajectory because the methodological lessons are part of the evidence.

Data and source artifacts

Empirical target data: Wang & Busemeyer 2013, Topics in Cognitive Science 5(4):689–710, Table 1. PDF available at https://jbusemey.pages.iu.edu/quantum/QuestOrdEff.pdf.

Numbers in this page come from:
