"The correlation between human brain volume and intelligence is r = 0.4"
Key Findings
- SC1 (unconditional overall estimate) — FAILS: Two independent large-scale meta-analyses, Pietschnig et al. (2015) and Nave et al. (2022), both report r = 0.24 for the unconditional brain volume–IQ correlation. The deviation from 0.40 is 0.16, well outside the ±0.05 tolerance.
- SC2 (conditional estimate: healthy adults, high-quality tests) — PASSES: Wikipedia's Neuroscience and intelligence article (citing Gignac & Bates 2017, Intelligence) states r ≈ 0.40 when high-quality IQ tests are used in healthy adult samples.
- The r = 0.4 figure is conditionally correct: It holds under optimal measurement conditions, not as a general population-wide average.
- Publication bias inflates, not deflates: Both major meta-analyses found evidence that published studies overreport high correlations. The true unconditional r is likely ≤ 0.24, not 0.40.
Claim Interpretation
Natural language claim: "The correlation between human brain volume and intelligence is r = 0.4"
Formal interpretation:
| Field | Value |
|---|---|
| Subject | Pearson r between total in-vivo brain volume (MRI) and intelligence (IQ/g-factor) |
| Property | Meta-analytic correlation coefficient |
| Threshold | 0.40 |
| Tolerance | ±0.05 (i.e., 0.35 ≤ r ≤ 0.45) |
Operator rationale: The claim specifies a single point value (r = 0.4). A tolerance of ±0.05 is applied, as meta-analytic estimates are reported to two decimal places and carry estimation uncertainty. This is a generous interpretation — a narrower tolerance (±0.02) would still fail SC1 and still pass SC2.
Two sub-claims:
- SC1: The unconditional overall meta-analytic estimate = 0.40 (±0.05)
- SC2: The conditional estimate for healthy adults using high-quality intelligence tests = 0.40 (±0.05)
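The interpretation above reduces to an interval check around the threshold. The following is a minimal Python sketch, assuming a hypothetical helper `within_tolerance` (it is not the report's own `compare()` function, whose implementation is not shown). The two input values are the estimates reported in the evidence table.

```python
def within_tolerance(r: float, threshold: float = 0.40, tol: float = 0.05) -> bool:
    """Return True when r lies inside [threshold - tol, threshold + tol]."""
    # Round the deviation to 4 places so binary floating-point noise
    # (e.g. 0.16000000000000003) cannot flip a borderline comparison.
    return round(abs(r - threshold), 4) <= tol


# Values from the evidence table: SC1 uses 0.24, SC2 uses 0.40.
estimates = {"SC1": 0.24, "SC2": 0.40}

# Check the documented tolerance and the narrower one named in the rationale.
for tol in (0.05, 0.02):
    for name, r in estimates.items():
        print(f"{name} at ±{tol}: {within_tolerance(r, tol=tol)}")
```

Under either tolerance the sketch reports SC1 as outside the band and SC2 as inside it, which is the point the operator rationale makes about the narrower ±0.02 setting.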
Evidence Summary
| ID | Fact | Verified |
|---|---|---|
| B1 | Pietschnig et al. (2015): overall r = .24, 88 studies, 8,000+ subjects | Yes (live) |
| B2 | Nave et al. (2022): overall r = 0.24, 86 studies, N = 26,000+ | Partial (50% fragment match; data value 0.24 confirmed live) |
| B3 | Wikipedia: r ≈ 0.4 for healthy adults using high-quality tests | Yes (live) |
| A1 | SC1-A deviation: |0.24 − 0.40| = 0.1600 | Computed |
| A2 | SC1-B deviation: |0.24 − 0.40| = 0.1600 | Computed |
| A3 | SC2 deviation: |0.40 − 0.40| = 0.0000 | Computed |
| A4 | Cross-check: Pietschnig 2015 r vs PMC 2022 r — both 0.24 | Computed (agreement) |
Linked Sources
| Source | ID | Verified |
|---|---|---|
| Pietschnig et al. (2015), Neuroscience & Biobehavioral Reviews — PubMed | B1 | Yes |
| Nave et al. (2022), Royal Society Open Science — PMC | B2 | Partial |
| Wikipedia — Neuroscience and intelligence | B3 | Yes |
| SC1-A: |r_Pietschnig - 0.40| | A1 | Computed |
| SC1-B: |r_PMC2022 - 0.40| | A2 | Computed |
| SC2: |r_conditional - 0.40| | A3 | Computed |
| Cross-check: Pietschnig 2015 vs PMC 2022 overall r agreement | A4 | Computed |
Proof Logic
SC1: Unconditional overall meta-analytic estimate
The two largest systematic meta-analyses both find the same result:
- Pietschnig et al. (2015) (B1): Based on 88 studies with over 8,000 subjects, the overall weighted correlation is r = .24 (R² = .06). This generalises across age groups, IQ domains, and sex. The authors note evidence of publication bias inflating earlier estimates.
- Nave et al. (2022) (B2): The largest meta-analysis to date (86 studies, N = 26,000+, 454 effect sizes). Most reasonable meta-analytic specifications yield r-values in the mid-0.20s. Their primary result is r = 0.24, with the extreme range being 0.10–0.37 depending on specification choices. Three-quarters of all reasonable specifications do not exceed r = 0.26.
Both sources independently converge on r = 0.24 (cross-check: |0.24 − 0.24| = 0.0 ≤ 0.01, A4). The deviation from the claimed 0.40 is 0.16 — more than three times the ±0.05 tolerance. SC1 fails.
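The SC1 arithmetic can be reproduced in a few lines. This is an illustrative sketch, not the report's `cross_check()` or `compare()` code: the function name `sources_agree` and the constant names are assumptions, and only the numeric inputs come from B1, B2, and the stated tolerances.

```python
R_PIETSCHNIG = 0.24   # B1, overall r
R_NAVE = 0.24         # B2, overall r
CLAIMED = 0.40        # value asserted by the claim
TOLERANCE = 0.05      # tolerance applied to the claim
AGREEMENT_TOL = 0.01  # cross-check tolerance quoted for A4


def sources_agree(a: float, b: float, tol: float = AGREEMENT_TOL) -> bool:
    """Two independent estimates agree when they differ by at most tol."""
    return abs(a - b) <= tol


agree = sources_agree(R_PIETSCHNIG, R_NAVE)
deviation = abs(R_PIETSCHNIG - CLAIMED)
ratio = deviation / TOLERANCE            # how many tolerances away from 0.40
variance_explained = R_PIETSCHNIG ** 2   # R² implied by r = 0.24

print(f"sources agree: {agree}")
print(f"deviation from claim: {deviation:.2f}")
print(f"deviation / tolerance: {ratio:.1f}")
print(f"R^2: {variance_explained:.2f}")
```

The ratio of roughly 3.2 is what "more than three times the tolerance" refers to, and 0.24 squared rounds to the R² = .06 reported by Pietschnig et al.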
SC2: Conditional estimate (healthy adults, high-quality tests)
Wikipedia (B3), summarising Gignac & Bates (2017, Intelligence), states: "In healthy adults, the correlation of total brain volume and IQ is approximately 0.4 when high-quality tests are used." The underlying paper found corrected correlations of .23 (fair quality tests), .32 (good quality), and .39 (excellent quality), concluding the association "is arguably best characterised as r ≈ .40." The deviation from 0.40 is 0.00. SC2 passes.
This conditional result is consistent with broader meta-analytic patterns: the Nave et al. (2022) analysis also notes "the strongest effects observed for more g-loaded tests and in healthy samples" (B2).
Conclusion
Verdict: PARTIALLY VERIFIED
- SC1 (unconditional r = 0.40): DISPROVED. The best-established meta-analytic consensus, from two independent studies covering N > 26,000 subjects, places the unconditional correlation at r ≈ 0.24, not 0.40. This is robust to publication bias corrections (which, if anything, push the true value lower).
- SC2 (conditional r ≈ 0.40): PROVED. When the analysis is restricted to healthy adult samples assessed with high-quality (g-loaded) intelligence tests, the correlation rises to approximately r = 0.40. This is supported by the peer-reviewed Gignac & Bates (2017) meta-analysis.
Summary for practical use: Citing "r = 0.4" without qualification overstates the general brain–IQ correlation. The unconditional average is approximately r = 0.24. The figure r ≈ 0.40 applies specifically under optimal conditions. Textbooks and popular science that cite r = 0.4 as a universal value are simplifying — the number is conditionally correct but misleading as a blanket statement.
Note: B2 (Nave et al. 2022, PMC) achieved only partial (fragment) citation verification. However, the key data value (r = 0.24) was independently confirmed live on the page, and B1 (Pietschnig 2015) provides full independent verification of the same r = 0.24 result.
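How the two sub-claim outcomes combine into the overall verdict can be sketched as follows. The aggregation rule is an assumption made for illustration: this report documents only the mixed case (one sub-claim disproved, one proved, giving PARTIALLY VERIFIED), so the function name `overall_verdict` and the labels returned for the all-pass and all-fail cases are hypothetical.

```python
def overall_verdict(sub_claims: dict[str, bool]) -> str:
    """Combine per-sub-claim results into a single verdict label."""
    results = list(sub_claims.values())
    if not results:
        raise ValueError("at least one sub-claim is required")
    if all(results):
        return "VERIFIED"        # hypothetical label, not shown in this report
    if not any(results):
        return "DISPROVED"       # hypothetical use at the overall level
    return "PARTIALLY VERIFIED"  # the mixed case documented in this report


# SC1 failed and SC2 passed.
print(overall_verdict({"SC1": False, "SC2": True}))
```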
Counter-Evidence Search
Does any major unconditional meta-analysis report r = 0.40? No. Three principal meta-analyses were reviewed: McDaniel (2005) found r = 0.33 overall (37 samples, n = 1,530); Pietschnig et al. (2015) found r = .24 (88 studies, 8,000+ subjects); Nave et al. (2022) found r = 0.24 (86 studies, N = 26,000+). Gignac & Bates (2017) report r ≈ 0.40 only as a conditional estimate for excellent-quality tests, not as an unconditional overall average.
Could publication bias be deflating estimates below 0.40? No — publication bias works in the opposite direction. Pietschnig et al. (2015) found that "strong and positive correlation coefficients have been reported frequently in the literature whilst small and non-significant associations appear to have been often omitted from reports." Nave et al. (2022) similarly found estimates were "somewhat inflated due to selective reporting." After publication bias correction, estimates remain around r = 0.24. The true unconditional r is likely at or below 0.24.
Is the Wikipedia source for SC2 credible? Yes. The source (Gignac & Bates 2017) is published in Intelligence, a peer-reviewed Elsevier journal. Its finding that measurement quality moderates the brain–IQ correlation is consistent with the broad meta-analytic literature, and the direction of the effect (better tests → higher correlations) is theoretically well-motivated.
Audit Trail
2/3 citations unflagged. 1 flagged for review:
- B2: 50% word match
Original audit log
B1 — Pietschnig et al. (2015), PubMed
- Status: verified
- Method: full_quote
- Fetch mode: live
- Impact: Primary evidence for SC1. Full quote verified.
B2 — Nave et al. (2022), PMC
- Status: partial (fragment match, 50% word coverage)
- Method: fragment
- Fetch mode: live
- Impact: Corroborating evidence for SC1. Partial quote verification, but the key data value (r = 0.24) was independently confirmed live via verify_data_values (found: true). The partial match likely reflects minor HTML/whitespace formatting differences in the full-text PMC article vs. the quote extracted from the abstract. The numerical conclusion is independently supported by B1 (full verification, same r = 0.24).
B3 — Wikipedia — Neuroscience and intelligence
- Status: verified
- Method: full_quote
- Fetch mode: live
- Impact: Primary evidence for SC2. Full quote verified. Wikipedia cites Gignac & Bates (2017, Intelligence) for this figure.
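The "50% word coverage" reported for B2 is the share of quote words that were found on the fetched page. The verifier's actual matching algorithm is not documented here, so the sketch below assumes a simple case-insensitive token lookup. The function `word_coverage` is hypothetical, and the page string is invented purely to produce a 50% match; it is not text from the PMC article.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lower-case the text and keep runs of letters, digits, and dots."""
    return re.findall(r"[a-z0-9.]+", text.lower())


def word_coverage(quote: str, page_text: str) -> float:
    """Fraction of quote tokens that also occur somewhere in page_text."""
    quote_words = tokenize(quote)
    if not quote_words:
        return 0.0
    page_words = set(tokenize(page_text))
    matched = sum(1 for word in quote_words if word in page_words)
    return matched / len(quote_words)


# Quote fragment as listed for B2; the page text is an invented stand-in.
quote = "Brain size and IQ associations yielded r = 0.24"
page = "The link between brain volume and intelligence was r = 0.24 overall"
print(f"{word_coverage(quote, page):.0%}")
```

A coverage score of this kind is sensitive to formatting differences between an abstract and the full text, which is consistent with the explanation given for the partial match on B2.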
[✓] pietschnig_2015: Full quote verified for pietschnig_2015 (source: tier 5/government)
[~] pmc_2022: Only 15/30 quote words matched for pmc_2022 — partial verification only (source: tier 5/government)
[✓] wiki_conditional: Full quote verified for wiki_conditional (source: tier 3/reference)
[✓] B1.r_overall: '.24' found on page [live]
[✓] B2.r_overall: '0.24' found on page [live]
[✓] B3.r_conditional: '0.4' found on page [live]
B1_r_overall: Parsed '.24' -> 0.24 (source text: '.24')
B2_r_overall: Parsed '0.24' -> 0.24
B3_r_conditional: Parsed '0.4' -> 0.4
[✓] B1: extracted .24 from quote
[✓] B2: extracted 0.24 from quote
[✓] B3: extracted 0.4 from quote
SC1 cross-check: Pietschnig 2015 vs PMC 2022 overall r: 0.24 vs 0.24, diff=0.0, tolerance=0.01 -> AGREE
SC1-A: |r_Pietschnig(0.24) - threshold(0.40)| = 0.1600
SC1-A: Pietschnig r within ±0.05 of 0.40: 0.16000000000000003 <= 0.05 = False
SC1-B: |r_PMC2022(0.24) - threshold(0.40)| = 0.1600
SC1-B: PMC 2022 r within ±0.05 of 0.40: 0.16000000000000003 <= 0.05 = False
SC1: max unconditional r deviation within ±0.05 of 0.40: 0.16000000000000003 <= 0.05 = False
SC2: |r_conditional(0.40) - threshold(0.40)| = 0.0000
SC2: conditional r within ±0.05 of 0.40: 0.0 <= 0.05 = True
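The value 0.16000000000000003 in the log is ordinary binary floating-point rounding, not a discrepancy in the data. One way to avoid it, shown here as a sketch and not as the report's own `compare()` implementation, is to do the arithmetic with Python's `decimal` module, building each value from the same strings that were parsed from the sources. The names `THRESHOLD`, `TOLERANCE`, and `deviation` are assumptions.

```python
from decimal import Decimal

THRESHOLD = Decimal("0.40")
TOLERANCE = Decimal("0.05")


def deviation(r_text: str) -> Decimal:
    """Exact absolute deviation of a parsed r value from the threshold."""
    return abs(Decimal(r_text) - THRESHOLD)


# The strings are the data values listed for B1, B2, and B3.
for label, r_text in [("SC1-A", ".24"), ("SC1-B", "0.24"), ("SC2", "0.4")]:
    dev = deviation(r_text)
    print(f"{label}: deviation={dev} within_tolerance={dev <= TOLERANCE}")
```

The pass/fail outcome is the same as in the log; only the displayed deviation changes, from 0.16000000000000003 to an exact 0.16.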
| Rule | Status | Notes |
|---|---|---|
| Rule 1: Every empirical value parsed from quote text | ✓ | All r values parsed via parse_number_from_quote from data_values strings |
| Rule 2: Every citation URL fetched and quote checked | ✓ | B1 full, B2 partial (50%; data value independently confirmed), B3 full |
| Rule 3: System time for date-dependent logic | N/A | No date-dependent computations |
| Rule 4: Claim interpretation explicit with operator rationale | ✓ | CLAIM_FORMAL with operator_note, tolerance documented |
| Rule 5: Adversarial checks searched for counter-evidence | ✓ | 3 checks: unconditional r = 0.40 search, publication bias direction, SC2 credibility |
| Rule 6: Cross-checks from independent sources | ✓ | Pietschnig 2015 and PMC 2022 independently report r = 0.24; agreement confirmed |
| Rule 7: No hard-coded constants or unsafe formulas | ✓ | All comparisons use compare(); cross_check() for source agreement |
| Fact ID | Domain | Type | Tier | Note |
|---|---|---|---|---|
| B1 | nih.gov | government | 5 | PubMed — U.S. National Institutes of Health |
| B2 | nih.gov | government | 5 | PMC — U.S. National Institutes of Health |
| B3 | wikipedia.org | reference | 3 | Established reference source; SC2 conclusion backed by Gignac & Bates (2017, peer-reviewed) |
No sources have tier ≤ 2.
Linked Sources
| Fact ID | Domain | Source URL |
|---|---|---|
| B1 | nih.gov | https://pubmed.ncbi.nlm.nih.gov/26449760/ |
| B2 | nih.gov | https://pmc.ncbi.nlm.nih.gov/articles/PMC9096623/ |
| B3 | wikipedia.org | https://en.wikipedia.org/wiki/Neuroscience_and_intelligence |
| ID | Value | Found in Quote | Quote Snippet | Extraction Method |
|---|---|---|---|---|
| B1 | 0.24 | Yes | "…brain volume and IQ (r=.24, R(2)=.06)…" | parse_number_from_quote(".24", r"([.\d]+)", "B1_r_overall") → float(".24") = 0.24 |
| B2 | 0.24 | Yes | "Brain size and IQ associations yielded r = 0.24…" | parse_number_from_quote("0.24", r"([.\d]+)", "B2_r_overall") → float("0.24") = 0.24 |
| B3 | 0.4 | Yes | "…approximately 0.4 when high-quality tests are used." | parse_number_from_quote("0.4", r"([.\d]+)", "B3_r_conditional") → float("0.4") = 0.4 |
All values parsed programmatically from data_values strings derived from page content; none hand-typed.
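The extraction step listed in the table can be approximated as below. The name `parse_number_from_quote`, its three arguments (text, regex, label), and the regex itself are taken from the table; the function body is an assumed reconstruction, since the actual helper's source is not shown.

```python
import re


def parse_number_from_quote(text: str, pattern: str, label: str) -> float:
    """Extract the first regex capture group from text and convert it to float."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"{label}: no number found in {text!r}")
    raw = match.group(1)
    # float() accepts a leading dot, so ".24" parses to 0.24.
    # A capture such as "." alone would raise ValueError here.
    value = float(raw)
    print(f"{label}: Parsed {raw!r} -> {value}")
    return value


PATTERN = r"([.\d]+)"
r_b1 = parse_number_from_quote(".24", PATTERN, "B1_r_overall")
r_b2 = parse_number_from_quote("0.24", PATTERN, "B2_r_overall")
r_b3 = parse_number_from_quote("0.4", PATTERN, "B3_r_conditional")
```

Parsing from the source string instead of typing the number by hand keeps each value traceable to the page it came from, which is the property Rule 1 in the audit table checks.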