""Math washing" a spreadsheet (presenting empirical observations as universal theorems) is valid scientific practice."
Key Findings
- 3 out of 3 independent authoritative sources — Encyclopaedia Britannica, Stanford Encyclopedia of Philosophy, and the Catalog of Bias — confirm that presenting empirical observations as universal theorems (without falsifiability, hypothesis testing, and replication) violates core standards of scientific validity. (A1)
- Popper's falsifiability criterion (Britannica, B1): a theory is only genuinely scientific if it is possible to establish that it is false — ruling out patterns that were never exposed to disconfirmation.
- The scientific method (Stanford Encyclopedia, B2): valid science requires systematic reasoning that goes beyond what observation alone can establish — not just cataloging data.
- Data-dredging (Catalog of Bias, B3): presenting the results of unplanned statistical tests as if they were a prespecified analysis is a recognized methodological distortion that generates false positives.
- All three adversarial hypotheses — Baconian inductivism, Exploratory Data Analysis, and domain-limited empiricism — were searched and found not to support the claim.
Claim Interpretation
Natural language claim: "Math washing" a spreadsheet (presenting empirical observations as universal theorems) is valid scientific practice.
Formal interpretation:
| Field | Value |
|---|---|
| Subject | Math washing (presenting empirical spreadsheet observations as universal theorems) |
| Property | Constitutes valid scientific practice |
| Operator | ≥ |
| Threshold | 3 independent authoritative sources confirming this practice violates scientific standards |
| Proof direction | Disprove |
Operator rationale: "Valid scientific practice" is interpreted as methodology meeting the standards recognized by the scientific community: specifically, the hypothetico-deductive model requiring hypothesis formation, falsifiability, and controlled testing. The disproof threshold requires 3 independent authoritative sources confirming that presenting empirical observations alone as universal theorems violates these standards. "Universal theorem" is interpreted in the strict sense: a claim that holds without exception for all instances, not merely a statistical regularity or empirical generalization. A threshold of 3 ensures broad expert consensus rather than relying on a single contrary voice.
Source: proof.py JSON summary
evidence summary
| ID | Fact | Verified |
|---|---|---|
| B1 | Britannica: Popper's falsifiability criterion — scientific validity requires falsifiability | Yes |
| B2 | Stanford Encyclopedia of Philosophy: scientific method requires reasoning beyond observation | Yes |
| B3 | Catalog of Bias: data-dredging is a recognized methodological distortion in science | Yes |
| A1 | Count of authoritative sources confirming math-washing is not valid scientific practice | Computed: 3 sources confirmed (threshold: 3) |
Source: proof.py JSON summary
Linked Sources
| Source | ID | Verified |
|---|---|---|
| Encyclopaedia Britannica — criterion of falsifiability | B1 | Yes |
| Stanford Encyclopedia of Philosophy — scientific method | B2 | Yes |
| Catalog of Bias — data-dredging bias | B3 | Yes |
| Count of authoritative sources confirming math-washing is not valid scientific practice | A1 | Computed |
Proof Logic
The claim asserts that "math washing" — taking patterns observed in a spreadsheet and presenting them as universal theorems — constitutes valid scientific practice.
To disprove this, the proof establishes that this methodology violates the foundational standards of scientific validity as articulated by three independent authoritative sources:
Falsifiability failure (B1): Encyclopaedia Britannica states that "a theory is genuinely scientific only if it is possible in principle to establish that it is false." Patterns extracted from a spreadsheet and declared universal theorems have typically not been subjected to attempts at falsification. A claim derived solely by inspecting what a dataset shows has not been tested for what it forbids — it cannot predict what observations would contradict it. This directly fails Popper's demarcation criterion for science.
Insufficiency of observation alone (B2): The Stanford Encyclopedia of Philosophy states that "scientific method requires a logic as a system of reasoning for properly arranging, but also inferring beyond, what is known by observation." Observation is necessary but not sufficient. Valid scientific inference requires the superstructure of hypothesis formation, theoretical grounding, and systematic testing — all absent from raw pattern presentation.
Data-dredging distortion (B3): The Catalog of Bias defines data-dredging as "a distortion that arises from presenting the results of unplanned statistical tests as if they were a fully prespecified course of analyses." Math washing is structurally identical to this distortion: patterns identified post-hoc in a dataset are reframed as if they were predicted theorems. This practice generates false positives and is explicitly cataloged as a methodological error.
Together, these three sources cover the three main failure modes of math washing: it lacks falsifiability (B1), it over-extends what observation can establish (B2), and it misrepresents the analytical process as theory-driven when it is pattern-driven (B3). All three citations were fully verified on their live source pages (A1: 3/3 confirmed ≥ threshold of 3).
Source: author analysis
Conclusion
Verdict: DISPROVED
Three independent authoritative sources — Encyclopaedia Britannica (B1), Stanford Encyclopedia of Philosophy (B2), and the Catalog of Bias (B3) — all confirmed (3/3, threshold ≥ 3) that presenting empirical observations as universal theorems without falsifiability testing, hypothesis formation, and independent replication violates the foundational standards of scientific validity. All three citations were fully verified on live source pages.
No adversarial search found a scientific tradition, framework, or domain that endorses direct pattern-to-universal-theorem inference from spreadsheet data. The practice of math washing fails the scientific method on at least three independent grounds: it lacks falsifiability (Popper), it over-extends what observation alone can establish (SEP), and it constitutes the recognized methodological distortion of data-dredging (Catalog of Bias).
Note: Citation B3 (catalogofbias.org) is classified as credibility tier 2 (unclassified domain). This source is published by a project affiliated with the University of Oxford's Centre for Evidence-Based Medicine (CEBM). The conclusion does not depend solely on B3 — it is independently supported by the fully verified tier-3 (B1) and tier-4 (B2) sources.
Generated by proof-engine v1.0.0 on 2026-03-28.
counter-evidence search
Three adversarial hypotheses were investigated before writing this proof:
1. Is there a scientific tradition that endorses inductive generalization from data as universal law? Baconian inductivism (Francis Bacon's model) is the strongest historical candidate. However, even Bacon's framework requires systematic observation, replication, and elimination of observer bias. Naive inductivism has been largely discredited in philosophy of science (Popper 1934; Hempel 1965). No form of inductivism endorses labeling data patterns as universal theorems (a term implying deductive necessity) rather than empirical generalizations.
2. Does Exploratory Data Analysis (EDA) validate presenting spreadsheet patterns as scientific findings? Tukey's EDA framework (1977) is an explicitly hypothesis-generating practice, not hypothesis-confirming. EDA is designed to produce candidate hypotheses for subsequent rigorous testing — not universal theorems. The EDA literature itself draws this distinction, supporting the disproof.
3. Could math washing be valid in limited domains (actuarial science, empirical economics, physics phenomenology)? Empirical economics explicitly distinguishes "stylized facts" (data regularities) from theorems. Kaldor (1961) introduced "stylized facts" precisely because observed patterns do NOT constitute universal theorems without theoretical grounding. In physics, empirical regularities like Kepler's laws were only accepted as scientific law after derivation from deeper theoretical principles (Newton's mechanics). No domain endorses raw pattern-to-theorem promotion.
None of these adversarial checks found evidence that breaks or qualifies the disproof.
Source: author analysis
audit trail
All 3 citations verified.
Original audit log
B1 — Encyclopaedia Britannica: - Status: verified - Method: full_quote - Coverage: N/A (full match) - Fetch mode: live
B2 — Stanford Encyclopedia of Philosophy: - Status: verified - Method: full_quote - Coverage: N/A (full match) - Fetch mode: live
B3 — Catalog of Bias: - Status: verified - Method: full_quote - Coverage: N/A (full match) - Fetch mode: live
Source: proof.py JSON summary
Confirmed sources: 3 / 3
verified source count vs disproof threshold: 3 >= 3 = True
Source: proof.py inline output (execution trace)
| Rule | Status | Notes |
|---|---|---|
| Rule 1: No hand-typed extracted values | N/A — qualitative proof, no numeric extraction | Citation verification status is machine-computed |
| Rule 2: Citations verified by fetching | PASS — all 3 citations verified live | verify_all_citations() run against live URLs |
| Rule 3: System time for date logic | N/A — no time-dependent computation | date.today() imported but not used for claims |
| Rule 4: Explicit claim interpretation | PASS | CLAIM_FORMAL with operator_note present |
| Rule 5: Adversarial checks independent | PASS | 3 adversarial hypotheses searched independently |
| Rule 6: Cross-checks from independent sources | PASS | 3 sources from different institutions and traditions |
| Rule 7: No hard-coded constants/formulas | PASS | compare() used; no inline formulas |
| validate_proof.py | PASS (14/15, 1 warning fixed) | Warning about missing else branch was resolved |
Source: proof.py inline output (execution trace); author analysis
Generated by proof-engine v1.0.0 on 2026-03-28.
| Fact ID | Domain | Type | Tier | Note |
|---|---|---|---|---|
| B1 | britannica.com | reference | 3 | Established reference source |
| B2 | stanford.edu | academic | 4 | Academic domain (.edu) |
| B3 | catalogofbias.org | unknown | 2 | Unclassified domain — verify source authority manually |
Note on B3 (Tier 2): catalogofbias.org is the online home of the Catalog of Bias project, affiliated with the University of Oxford's Centre for Evidence-Based Medicine (CEBM). The domain is unclassified by the automated credibility system, but the project is an established academic resource in evidence-based medicine. The conclusion does not depend solely on B3 — B1 (Tier 3) and B2 (Tier 4) independently support the disproof.
Source: proof.py JSON summary; tier-2 note is author analysis
Linked Sources
| Fact ID | Domain | Source URL |
|---|---|---|
| B1 | britannica.com | https://www.britannica.com/topic/criterion-of-falsifiability |
| B2 | stanford.edu | https://plato.stanford.edu/entries/scientific-method/ |
| B3 | catalogofbias.org | https://catalogofbias.org/biases/data-dredging-bias/ |
For this qualitative proof, extractions record citation verification status rather than numeric values:
| ID | Value (status) | Countable (verified/partial) | Quote snippet |
|---|---|---|---|
| B1 | verified | Yes | "a theory is genuinely scientific only if it is possible in principle to establis..." |
| B2 | verified | Yes | "In addition to careful observation, then, scientific method requires a logic as ..." |
| B3 | verified | Yes | "A distortion that arises from presenting the results of unplanned statistical te..." |
Source: proof.py JSON summary
Linked Sources
| ID | Source URL |
|---|---|
| B1 | https://www.britannica.com/topic/criterion-of-falsifiability |
| B2 | https://plato.stanford.edu/entries/scientific-method/ |
| B3 | https://catalogofbias.org/biases/data-dredging-bias/ |
found this useful? ★ star on github