"The correlation between human brain volume and intelligence is r = 0.4"

neuroscience · generated 2026-03-28 · v0.10.0

⬡ Verified by Proof Engine — an open-source tool that proves claims using cited sources and executable code. No LLM trust required.

Key Findings

SC1 (unconditional overall estimate) — FAILS: Two independent large-scale meta-analyses, Pietschnig et al. (2015) and Nave et al. (2022), both report r = 0.24 for the unconditional brain volume–IQ correlation. The deviation from 0.40 is 0.16, well outside the ±0.05 tolerance.
SC2 (conditional estimate: healthy adults, high-quality tests) — PASSES: Wikipedia's Neuroscience and intelligence article (citing Gignac & Bates 2017, Intelligence) states r ≈ 0.40 when high-quality IQ tests are used in healthy adult samples.
The r = 0.4 figure is conditionally correct: It holds under optimal measurement conditions, not as a general population-wide average.
Publication bias inflates, not deflates: Both major meta-analyses found evidence that published studies overreport high correlations. The true unconditional r is likely ≤ 0.24, not 0.40.

Claim Interpretation

Natural language claim: "The correlation between human brain volume and intelligence is r = 0.4"

Formal interpretation:

Field	Value
Subject	Pearson r between total in-vivo brain volume (MRI) and intelligence (IQ/g-factor)
Property	Meta-analytic correlation coefficient
Threshold	0.40
Tolerance	±0.05 (i.e., 0.35 ≤ r ≤ 0.45)

Operator rationale: The claim specifies a single point value (r = 0.4). A tolerance of ±0.05 is applied, as meta-analytic estimates are reported to two decimal places and carry estimation uncertainty. This is a generous interpretation — a narrower tolerance (±0.02) would still fail SC1 and still pass SC2.

Two sub-claims:
- SC1: The unconditional overall meta-analytic estimate = 0.40 (±0.05)
- SC2: The conditional estimate for healthy adults using high-quality intelligence tests = 0.40 (±0.05)
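The formal interpretation above can be sketched as a small data structure. This is a minimal illustration, not the Proof Engine's own code: `FormalClaim` and `within_tolerance` are hypothetical names, and the r values are the ones reported in the evidence tables of this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormalClaim:
    """Hypothetical container for the claim's threshold and tolerance."""
    threshold: float
    tolerance: float

    def within_tolerance(self, r: float) -> bool:
        # True when |r - threshold| <= tolerance, i.e. r lies inside the accepted band.
        return abs(r - self.threshold) <= self.tolerance


claim = FormalClaim(threshold=0.40, tolerance=0.05)
strict = FormalClaim(threshold=0.40, tolerance=0.02)  # narrower band from the operator rationale

sc1_r = 0.24  # unconditional estimate (B1, B2)
sc2_r = 0.40  # conditional estimate (B3)

for label, c in (("±0.05", claim), ("±0.02", strict)):
    print(label, "SC1:", c.within_tolerance(sc1_r), "SC2:", c.within_tolerance(sc2_r))
```

Under both tolerances the outcome is the same: SC1 falls outside the band and SC2 falls inside it, which is the point the operator rationale makes about the narrower ±0.02 setting.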

Evidence Summary

ID	Fact	Verified
B1	Pietschnig et al. (2015): overall r = .24, 88 studies, 8,000+ subjects	Yes (live)
B2	Nave et al. (2022): overall r = 0.24, 86 studies, N = 26,000+	Partial (50% fragment match; data value 0.24 confirmed live)
B3	Wikipedia: r ≈ 0.4 for healthy adults using high-quality tests	Yes (live)
A1	SC1-A deviation: |0.24 − 0.40| = 0.1600	Computed
A2	SC1-B deviation: |0.24 − 0.40| = 0.1600	Computed
A3	SC2 deviation: |0.40 − 0.40| = 0.0000	Computed
A4	Cross-check: Pietschnig 2015 r vs PMC 2022 r — both 0.24	Computed (agreement)

Linked Sources

Source	ID	Verified
Pietschnig et al. (2015), Neuroscience & Biobehavioral Reviews — PubMed	B1	Yes
Nave et al. (2022), Royal Society Open Science — PMC	B2	Partial
Wikipedia — Neuroscience and intelligence	B3	Yes
SC1-A: |r_Pietschnig - 0.40|	A1	Computed
SC1-B: |r_PMC2022 - 0.40|	A2	Computed
SC2: |r_conditional - 0.40|	A3	Computed
Cross-check: Pietschnig 2015 vs PMC 2022 overall r agreement	A4	Computed

Proof Logic

SC1: Unconditional overall meta-analytic estimate

The two largest systematic meta-analyses both find the same result:

Pietschnig et al. (2015) (B1): Based on 88 studies with over 8,000 subjects, the overall weighted correlation is r = .24 (R² = .06). This generalises across age groups, IQ domains, and sex. The authors note evidence of publication bias inflating earlier estimates.
Nave et al. (2022) (B2): The largest meta-analysis to date (86 studies, N = 26,000+, 454 effect sizes). Most reasonable meta-analytic specifications yield r-values in the mid-0.20s. Their primary result is r = 0.24, with the extreme range being 0.10–0.37 depending on specification choices. Three-quarters of all reasonable specifications do not exceed r = 0.26.

Both sources independently converge on r = 0.24 (cross-check: |0.24 − 0.24| = 0.0 ≤ 0.01, A4). The deviation from the claimed 0.40 is 0.16 — more than three times the ±0.05 tolerance. SC1 fails.
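The SC1 reasoning combines a source-agreement check with a deviation check. The sketch below reproduces that arithmetic with plain Python; the constant names are illustrative assumptions, and it does not use the tool's own `compare()` or `cross_check()` helpers, whose signatures are not shown in this report.

```python
R_PIETSCHNIG = 0.24  # B1
R_NAVE = 0.24        # B2
THRESHOLD = 0.40
TOLERANCE = 0.05
AGREEMENT_TOLERANCE = 0.01  # cross-check tolerance used for A4

# Cross-check: do the two sources agree with each other?
sources_agree = abs(R_PIETSCHNIG - R_NAVE) <= AGREEMENT_TOLERANCE

# Deviation of each source from the claimed value.
deviations = {
    "SC1-A": abs(R_PIETSCHNIG - THRESHOLD),
    "SC1-B": abs(R_NAVE - THRESHOLD),
}

# SC1 holds only if the worst-case deviation stays inside the tolerance.
max_deviation = max(deviations.values())
sc1_passes = max_deviation <= TOLERANCE
ratio = max_deviation / TOLERANCE

print(f"agree={sources_agree}, max deviation={max_deviation:.4f}, "
      f"ratio={ratio:.1f}x tolerance, SC1 passes={sc1_passes}")
```

The ratio of roughly 3.2 is what "more than three times the ±0.05 tolerance" refers to.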

SC2: Conditional estimate (healthy adults, high-quality tests)

Wikipedia (B3), summarising Gignac & Bates (2017, Intelligence), states: "In healthy adults, the correlation of total brain volume and IQ is approximately 0.4 when high-quality tests are used." The underlying paper found corrected correlations of .23 (fair quality tests), .32 (good quality), and .39 (excellent quality), concluding the association "is arguably best characterised as r ≈ .40." The deviation from 0.40 is 0.00. SC2 passes.

This conditional result is consistent with broader meta-analytic patterns: the Nave et al. (2022) analysis also notes "the strongest effects observed for more g-loaded tests and in healthy samples" (B2).
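As a supplementary look, the three quality-specific correlations quoted above from Gignac & Bates (2017) can be run through the same ±0.05 band. This is not part of the proof's own computation, which uses the Wikipedia figure of 0.4 for SC2; the variable names are illustrative.

```python
THRESHOLD = 0.40
TOLERANCE = 0.05

# Corrected correlations by test quality, as quoted in the SC2 paragraph above.
r_by_quality = {"fair": 0.23, "good": 0.32, "excellent": 0.39}

for quality, r in r_by_quality.items():
    deviation = abs(r - THRESHOLD)
    verdict = "within" if deviation <= TOLERANCE else "outside"
    print(f"{quality:>9}: r = {r:.2f}, deviation = {deviation:.2f} ({verdict} ±{TOLERANCE})")

# Quality tiers whose estimate falls inside the accepted band.
in_band = [q for q, r in r_by_quality.items() if abs(r - THRESHOLD) <= TOLERANCE]
print(in_band)
```

Only the excellent-quality estimate (.39) lands inside the band, which matches the conditional framing of SC2: the 0.40 figure depends on test quality.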

Conclusion

Verdict: PARTIALLY VERIFIED

SC1 (unconditional r = 0.40): DISPROVED. Two independent meta-analyses, the larger covering more than 26,000 subjects, place the unconditional correlation at r ≈ 0.24, not 0.40. The result is robust to publication bias corrections, which, if anything, push the true value lower.
SC2 (conditional r ≈ 0.40): PROVED. When the analysis is restricted to healthy adult samples assessed with high-quality (g-loaded) intelligence tests, the correlation rises to approximately r = 0.40. This is supported by the peer-reviewed Gignac & Bates (2017) meta-analysis.

Summary for practical use: Citing "r = 0.4" without qualification overstates the general brain–IQ correlation. The unconditional average is approximately r = 0.24. The figure r ≈ 0.40 applies specifically under optimal conditions. Textbooks and popular science that cite r = 0.4 as a universal value are simplifying — the number is conditionally correct but misleading as a blanket statement.
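One way to see why the distinction matters in practice is to compare the variance explained at each value. For a simple bivariate correlation, the share of variance explained is r squared; the helper name below is an illustrative assumption.

```python
def variance_explained(r: float) -> float:
    """Return r squared, the share of variance accounted for by a linear relationship."""
    return r * r


unconditional = variance_explained(0.24)  # rounds to the R² = .06 reported in B1
conditional = variance_explained(0.40)

print(f"r = 0.24 -> R² = {unconditional:.4f} ({unconditional:.0%} of variance)")
print(f"r = 0.40 -> R² = {conditional:.4f} ({conditional:.0%} of variance)")
print(f"ratio: {conditional / unconditional:.1f}x")
```

Quoting 0.40 instead of 0.24 therefore implies nearly three times as much shared variance between brain volume and IQ.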

Note: B2 (Nave et al. 2022, PMC) achieved only partial (fragment) citation verification. However, the key data value (r = 0.24) was independently confirmed live on the page, and B1 (Pietschnig 2015) provides full independent verification of the same r = 0.24 result.

Counter-Evidence Search

Does any major unconditional meta-analysis report r = 0.40? No. Three principal meta-analyses were reviewed: McDaniel (2005) found r = 0.33 overall (37 samples, n = 1,530); Pietschnig et al. (2015) found r = .24 (88 studies, 8,000+ subjects); Nave et al. (2022) found r = 0.24 (86 studies, N = 26,000+). Gignac & Bates (2017) report r ≈ 0.40 only as a conditional estimate for excellent-quality tests, not as an unconditional overall average.

Could publication bias be deflating estimates below 0.40? No — publication bias works in the opposite direction. Pietschnig et al. (2015) found that "strong and positive correlation coefficients have been reported frequently in the literature whilst small and non-significant associations appear to have been often omitted from reports." Nave et al. (2022) similarly found estimates were "somewhat inflated due to selective reporting." After publication bias correction, estimates remain around r = 0.24. The true unconditional r is likely at or below 0.24.

Is the Wikipedia source for SC2 credible? Yes. The source (Gignac & Bates 2017) is published in Intelligence, a peer-reviewed Elsevier journal. Its finding that measurement quality moderates the brain–IQ correlation is consistent with the broad meta-analytic literature, and the direction of the effect (better tests → higher correlations) is theoretically well-motivated.

Audit Trail

Citation Verification

2/3 citations unflagged. 1 flagged for review:

B2 (partial): Nave et al. (2022), Royal Society Open Science (PMC). 50% word match.

Original audit log

B1: Pietschnig et al. (2015), PubMed
- Status: verified
- Method: full_quote
- Fetch mode: live
- Impact: Primary evidence for SC1. Full quote verified.

B2: Nave et al. (2022), PMC
- Status: partial (fragment match, 50% word coverage)
- Method: fragment
- Fetch mode: live
- Impact: Corroborating evidence for SC1. Partial quote verification, but the key data value (r = 0.24) was independently confirmed live via verify_data_values (found: true). The partial match likely reflects minor HTML/whitespace formatting differences in the full-text PMC article vs. the quote extracted from the abstract. The numerical conclusion is independently supported by B1 (full verification, same r = 0.24).

B3: Wikipedia, Neuroscience and intelligence
- Status: verified
- Method: full_quote
- Fetch mode: live
- Impact: Primary evidence for SC2. Full quote verified. Wikipedia cites Gignac & Bates (2017, Intelligence) for this figure.
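The 50% word coverage reported for B2 can be pictured with a simple bag-of-words comparison. The sketch below is an assumption about what such a metric might look like, not the tool's actual fragment-matching algorithm; `word_coverage` and `tokenize` are hypothetical names, and the strings are toy examples rather than the real quote or page text.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words and decimal numbers."""
    return re.findall(r"\d*\.\d+|\w+", text.lower())


def word_coverage(quote: str, page_text: str) -> float:
    """Fraction of the quote's words that also appear somewhere in the page text."""
    quote_words = tokenize(quote)
    if not quote_words:
        return 0.0
    page_words = set(tokenize(page_text))
    matched = sum(1 for word in quote_words if word in page_words)
    return matched / len(quote_words)


# Toy strings for illustration only.
quote = "brain size and measured intelligence scores"
page = "Brain size and IQ."
print(f"{word_coverage(quote, page):.0%} of quote words found on page")
```

Under this simplified metric, the trace entry "15/30 quote words matched" would correspond to a coverage of 0.5. Word order is ignored here, so a real verifier would likely need something stricter.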

Computation Traces

[✓] pietschnig_2015: Full quote verified for pietschnig_2015 (source: tier 5/government)
[~] pmc_2022: Only 15/30 quote words matched for pmc_2022 — partial verification only (source: tier 5/government)
[✓] wiki_conditional: Full quote verified for wiki_conditional (source: tier 3/reference)
[✓] B1.r_overall: '.24' found on page [live]
[✓] B2.r_overall: '0.24' found on page [live]
[✓] B3.r_conditional: '0.4' found on page [live]
  B1_r_overall: Parsed '.24' -> 0.24 (source text: '.24')
  B2_r_overall: Parsed '0.24' -> 0.24
  B3_r_conditional: Parsed '0.4' -> 0.4
  [✓] B1: extracted .24 from quote
  [✓] B2: extracted 0.24 from quote
  [✓] B3: extracted 0.4 from quote
  SC1 cross-check: Pietschnig 2015 vs PMC 2022 overall r: 0.24 vs 0.24, diff=0.0, tolerance=0.01 -> AGREE

  SC1-A: |r_Pietschnig(0.24) - threshold(0.40)| = 0.1600
  SC1-A: Pietschnig r within ±0.05 of 0.40: 0.16000000000000003 <= 0.05 = False

  SC1-B: |r_PMC2022(0.24) - threshold(0.40)| = 0.1600
  SC1-B: PMC 2022 r within ±0.05 of 0.40: 0.16000000000000003 <= 0.05 = False
  SC1: max unconditional r deviation within ±0.05 of 0.40: 0.16000000000000003 <= 0.05 = False

  SC2:   |r_conditional(0.40) - threshold(0.40)| = 0.0000
  SC2: conditional r within ±0.05 of 0.40: 0.0 <= 0.05 = True
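The value 0.16000000000000003 in the traces is a binary floating-point artefact rather than a data issue. The sketch below shows the effect and one way to avoid it with the standard-library `decimal` module; it is an illustration, not a description of how the tool computes deviations.

```python
from decimal import Decimal

# Binary floats cannot represent 0.24 or 0.40 exactly, so the
# difference carries a tiny rounding error.
float_deviation = abs(0.24 - 0.40)
print(repr(float_deviation))     # the trace reports 0.16000000000000003
print(f"{float_deviation:.4f}")  # rounded to four places for display

# Decimal arithmetic on the source strings stays exact.
exact_deviation = abs(Decimal("0.24") - Decimal("0.40"))
print(exact_deviation)

# Either way the verdict is unchanged: the deviation exceeds the tolerance.
print(float_deviation > 0.05, exact_deviation > Decimal("0.05"))
```

The rounding error is many orders of magnitude smaller than the 0.05 tolerance, so it cannot flip any of the comparisons in this proof.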

Hardening Checklist

Rule	Status	Notes
Rule 1: Every empirical value parsed from quote text	✓	All r values parsed via `parse_number_from_quote` from `data_values` strings
Rule 2: Every citation URL fetched and quote checked	✓	B1 full, B2 partial (50%; data value independently confirmed), B3 full
Rule 3: System time for date-dependent logic	N/A	No date-dependent computations
Rule 4: Claim interpretation explicit with operator rationale	✓	`CLAIM_FORMAL` with `operator_note`, tolerance documented
Rule 5: Adversarial checks searched for counter-evidence	✓	3 checks: unconditional r = 0.40 search, publication bias direction, SC2 credibility
Rule 6: Cross-checks from independent sources	✓	Pietschnig 2015 and PMC 2022 independently report r = 0.24; agreement confirmed
Rule 7: No hard-coded constants or unsafe formulas	✓	All comparisons use `compare()`; `cross_check()` for source agreement

Source Credibility Assessment

Fact ID	Domain	Type	Tier	Note
B1	nih.gov	government	5	PubMed — U.S. National Institutes of Health
B2	nih.gov	government	5	PMC — U.S. National Institutes of Health
B3	wikipedia.org	reference	3	Established reference source; SC2 conclusion backed by Gignac & Bates (2017, peer-reviewed)

No sources have tier ≤ 2.

Linked Sources

Fact ID	Domain	Source URL
B1	nih.gov	https://pubmed.ncbi.nlm.nih.gov/26449760/
B2	nih.gov	https://pmc.ncbi.nlm.nih.gov/articles/PMC9096623/
B3	wikipedia.org	https://en.wikipedia.org/wiki/Neuroscience_and_intelligence

Extraction Records

ID	Value	Found in Quote	Quote Snippet	Extraction Method
B1	0.24	Yes	"…brain volume and IQ (r=.24, R(2)=.06)…"	`parse_number_from_quote(".24", r"([.\d]+)", "B1_r_overall")` → float(".24") = 0.24
B2	0.24	Yes	"Brain size and IQ associations yielded r = 0.24…"	`parse_number_from_quote("0.24", r"([.\d]+)", "B2_r_overall")` → float("0.24") = 0.24
B3	0.4	Yes	"…approximately 0.4 when high-quality tests are used."	`parse_number_from_quote("0.4", r"([.\d]+)", "B3_r_conditional")` → float("0.4") = 0.4

All values parsed programmatically from data_values strings derived from page content; none hand-typed.
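The extraction step recorded above can be approximated as follows. `parse_number` is a hypothetical stand-in written for this illustration; it mirrors the argument order and regex shown in the Extraction Records but is only a guess at the behaviour of the tool's `parse_number_from_quote`.

```python
import re


def parse_number(text: str, pattern: str, label: str) -> float:
    """Hypothetical stand-in: apply the regex and convert capture group 1 to float."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"{label}: no number found in {text!r}")
    value = float(match.group(1))
    print(f"{label}: Parsed {text!r} -> {value}")
    return value


PATTERN = r"([.\d]+)"  # the pattern shown in the Extraction Records

b1 = parse_number(".24", PATTERN, "B1_r_overall")
b2 = parse_number("0.24", PATTERN, "B2_r_overall")
b3 = parse_number("0.4", PATTERN, "B3_r_conditional")
```

Applied to a full sentence, this pattern would capture the first run of digits or dots it meets (a publication year, for example), which is presumably why the records show it applied to short `data_values` strings rather than to whole quotes.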

