Statistical results overview
Prepared: 2026-02-10
Updated: 2026-02-22

Purpose
- Provide one reader-facing, auditable summary of the main numeric findings produced by the Evidence Lab scripts.
- Put effect sizes, ordering, and robustness checks in one place so interpretation is not spread across many files.

Data scope and modeling setup
- Unit of analysis: school-level aggregate rows (not student-level microdata).
- Weighting: participant-weighted unless noted otherwise.
- Main outcome for most analyses: Percent Proficient.
- Baseline continuity predictors: adult BA+ rate, income, attendance.
- Companion 2024-25 predictors: adult BA+ rate, Students Experiencing Poverty, attendance.
- Exploratory predictors: overall spending per student, classroom spending per student, median class size.
- Typical row filter: "Total Population" student-group rows, with "All Grades" rows dropped when grade-specific rows exist.

How to read coefficients in this report
- Correlation r: raw association only (no controls).
- Standardized beta: relative predictor strength after controlling for other variables in the same model.
- R^2: share of weighted variance explained by the model.
- Delta R^2: improvement in fit after adding terms.

1) Joint SES + attendance models (core result)
Source: report_income_education_attendance_joint_model.py
Report: income_education_attendance_joint_model_report.txt

Results by subject (standardized betas from Percent Proficient ~ income + education + attendance):
- English (ELA): income 0.025, education 0.463, attendance 0.299, R^2=0.418
- Math: income 0.046, education 0.433, attendance 0.413, R^2=0.517
- Science: income 0.006, education 0.458, attendance 0.252, R^2=0.349

What this means:
- Education and attendance carry most of the controlled signal.
- Income remains correlated in bivariate views, but in joint models its unique contribution is small in these statewide runs.
- The education-over-income ordering is large and consistent across subjects.
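Illustrative sketch: how a participant-weighted standardized beta and a weighted R^2 of the kind reported for the joint models can be computed. This is a minimal standard-library example under stated assumptions, not the code in report_income_education_attendance_joint_model.py. The function names and the six-school toy data are hypothetical, and the toy outcome is built as an exact linear function so the result is easy to check.

```python
import math


def weighted_mean(values, weights):
    """Weighted arithmetic mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_zscores(values, weights):
    """Center and scale a column using its weighted mean and weighted SD."""
    mean = weighted_mean(values, weights)
    variance = weighted_mean([(v - mean) ** 2 for v in values], weights)
    sd = math.sqrt(variance)
    return [(v - mean) / sd for v in values]


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def standardized_wls(outcome, predictors, weights):
    """Return (betas, r_squared) for a weighted regression on z-scored columns.

    Every column has weighted mean zero after z-scoring, so no intercept is needed.
    """
    zy = weighted_zscores(outcome, weights)
    zx = [weighted_zscores(col, weights) for col in predictors]
    k = len(zx)
    # Weighted normal equations: (Z' W Z) beta = Z' W zy
    xtx = [[sum(w * a * b for w, a, b in zip(weights, zx[i], zx[j]))
            for j in range(k)] for i in range(k)]
    xty = [sum(w * a * b for w, a, b in zip(weights, zx[i], zy))
           for i in range(k)]
    betas = solve_linear(xtx, xty)
    fitted = [sum(betas[j] * zx[j][i] for j in range(k))
              for i in range(len(zy))]
    ss_res = sum(w * (obs - fit) ** 2
                 for w, obs, fit in zip(weights, zy, fitted))
    ss_tot = sum(w * obs ** 2 for w, obs in zip(weights, zy))
    return betas, 1.0 - ss_res / ss_tot


# Toy data, invented for illustration only (six hypothetical schools).
income = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
education = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
attendance = [5.0, 3.0, 6.0, 2.0, 4.0, 1.0]
participants = [10, 20, 30, 40, 50, 60]
# Outcome built as an exact linear function with no attendance term.
proficient = [3.0 + i + 2.0 * e for i, e in zip(income, education)]

betas, r_squared = standardized_wls(
    proficient, [income, education, attendance], participants)
```

Because the toy outcome is exactly linear in income and education, the weighted R^2 comes out at 1 and the attendance beta at zero (up to floating-point error). With real school-level rows the same steps would yield betas and R^2 values of the kind listed by subject.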

1b) Poverty-aware reassessment (2024-25 companion result)
Source: docs/income_poverty_reassessment_note_2026-02-20.txt
Companion notes:
- docs/why_poverty_outpredicts_income_explainer_2026-02-21.txt
- docs/ba_signal_in_high_poverty_summary_2026-02-21.txt

Cross-validated R^2 (non-charter/non-virtual):
- ELA:
  - BA+ + Attendance + Income: 0.5190
  - BA+ + Attendance + Poverty: 0.6408
  - BA+ + Attendance + Income + Poverty: 0.6508
- Math:
  - BA+ + Attendance + Income: 0.6429
  - BA+ + Attendance + Poverty: 0.6705
  - BA+ + Attendance + Income + Poverty: 0.6726
- Science:
  - BA+ + Attendance + Income: 0.3966
  - BA+ + Attendance + Poverty: 0.5077
  - BA+ + Attendance + Income + Poverty: 0.5195

Interpretation:
- In 2024-25 models, Students Experiencing Poverty adds meaningful independent signal.
- Income still contributes context and modest incremental value in some subject/spec combinations.
- Practical reading: keep income as community context, but include poverty-aware specs for current-year explanatory comparisons.
- Note on comparability: these values use the median-household-income continuity spec; per-capita-income variants (reported in the poverty explainer note) show slightly different baseline R^2 values.

2) Interaction checks (non-additive structure)
Source: report_income_education_attendance_interactions.py
Report: income_education_attendance_interaction_report.txt

Base model vs interaction model:
- ELA: R^2 rises from 0.418 to 0.433 (Delta 0.015)
- Math: R^2 rises from 0.517 to 0.548 (Delta 0.031)
- Science: R^2 rises from 0.349 to 0.380 (Delta 0.031)

The largest interaction term is usually education x attendance.

Interpretation:
- The association between one predictor and performance depends on the level of another predictor.
- Purely additive narratives miss some structure in the data.
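Illustrative sketch: the Delta R^2 figures for the interaction checks can be recomputed from the reported base and interaction R^2 values, and the "depends on the level of another predictor" reading can be made concrete with a simple-slope calculation. The R^2 pairs below are the reported ones. The two coefficients used for the simple slopes are HYPOTHETICAL, chosen only to show the mechanics; they are not taken from income_education_attendance_interaction_report.txt. Function names are likewise assumptions.

```python
# Weighted R^2 values reported for the interaction checks:
# (base model, interaction model).
reported_r2 = {
    "ELA": (0.418, 0.433),
    "Math": (0.517, 0.548),
    "Science": (0.349, 0.380),
}


def delta_r2(base, with_interactions):
    """Gain in explained variance from adding interaction terms."""
    return with_interactions - base


def simple_slope(beta_main, beta_interaction, moderator_z):
    """Slope of one predictor at a chosen level of the moderator.

    For y = b1*x + b2*m + b3*x*m, the slope of x is b1 + b3*m.
    """
    return beta_main + beta_interaction * moderator_z


deltas = {subject: round(delta_r2(*pair), 3)
          for subject, pair in reported_r2.items()}

# HYPOTHETICAL standardized coefficients, for illustration only.
beta_education = 0.40
beta_education_x_attendance = 0.10

# Education slope one SD below and one SD above mean attendance.
slope_low_attendance = simple_slope(
    beta_education, beta_education_x_attendance, -1.0)
slope_high_attendance = simple_slope(
    beta_education, beta_education_x_attendance, 1.0)
```

With a positive education x attendance coefficient, the education slope is steeper at high attendance than at low attendance. That is the sense in which a purely additive model misses structure.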

3) Stability and historical robustness
Sources:
- report_income_education_split_stability.py
- report_income_education_joint_model.py (using 2018-2019 dataset options)
- report_income_education_attendance_joint_model.py (using 2018-2019 dataset options)
- report_income_education_attendance_interactions.py (using 2018-2019 dataset options)
Reports:
- income_education_split_stability_report.txt
- income_education_joint_model_report_2018_2019.txt
- income_education_attendance_joint_model_report_math_2018_2019.txt
- income_education_attendance_interaction_report_math_2018_2019.txt

Key findings:
- Split-sample checks repeatedly preserve education > income ordering.
- Pre-pandemic (2018-2019 Math, era-appropriate ACS) preserves the same ordering.
- Magnitudes move with scope restrictions (for example, grade/school-level filters), but ordering remains stable.

Interpretation:
- Current-year findings are unlikely to be a one-off artifact of one sample slice.

4) Outcome-level heterogeneity (Math, level-specific checks)
Source: report_income_education_joint_model.py
Report example: income_education_joint_model_report_math_level4.txt

Percent Level 4 example:
- Weighted r: income 0.501, education 0.625
- Standardized betas: income 0.133, education 0.533
- R^2 (income + education): 0.399

Interpretation:
- Education signal can strengthen for top-end outcomes, not just overall proficiency.

5) Spending and class-size exploratory models
Sources:
- report_spending_classsize_effects.py
- report_spending_classsize_ridge.py
Reports:
- spending_class_size_effects_report.txt
- spending_class_size_ridge_report.txt
- spending_class_size_findings_memo.txt

Observed pattern:
- Class size: weak and often near-zero contribution after controls.
- Spending: detectable in some specifications, but weaker and less stable than education/attendance.
- Spending variables are strongly collinear, requiring regularization and cautious interpretation.
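Illustrative sketch: why collinear spending variables call for regularization. The example fits two nearly collinear z-scored predictors with and without a ridge penalty, using a closed-form 2x2 solve. It is unweighted for brevity and is not the implementation in report_spending_classsize_ridge.py. The function names, the penalty value, and the eight-school toy data are hypothetical.

```python
import math


def zscores(values):
    """Center and scale a column (population SD, unweighted)."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


def ridge_two_predictors(y, x1, x2, penalty):
    """Ridge coefficients for two z-scored predictors.

    Solves (X'X + penalty * I) beta = X'y in closed form.
    penalty = 0 gives ordinary least squares.
    """
    zy, z1, z2 = zscores(y), zscores(x1), zscores(x2)
    a11 = sum(a * a for a in z1) + penalty
    a22 = sum(b * b for b in z2) + penalty
    a12 = sum(a * b for a, b in zip(z1, z2))
    b1 = sum(a * t for a, t in zip(z1, zy))
    b2 = sum(b * t for b, t in zip(z2, zy))
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def norm(coefficients):
    """Euclidean length of a coefficient vector."""
    return math.sqrt(sum(c * c for c in coefficients))


# Toy data, invented for illustration only (eight hypothetical schools).
# Classroom spending tracks overall spending almost exactly.
overall_spending = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
classroom_spending = [6.1, 6.4, 7.2, 7.7, 8.5, 8.8, 9.7, 10.1]
proficient = [40.0, 43.0, 41.0, 46.0, 44.0, 49.0, 47.0, 52.0]

ols_betas = ridge_two_predictors(
    proficient, overall_spending, classroom_spending, penalty=0.0)
ridge_betas = ridge_two_predictors(
    proficient, overall_spending, classroom_spending, penalty=2.0)
```

For any positive penalty the ridge coefficient vector is no longer than the unpenalized one, which damps the large offsetting coefficients that near-collinear predictors tend to produce. The shrinkage also biases coefficients toward zero, so ridge estimates are better read for ordering and stability than as precise magnitudes.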

Interpretation:
- Current school-level cross-sections do not support a strong, clean statewide class-size signal.
- Spending likely has context-specific effects, but broad aggregate signal is modest in this setup.

6) What appears strongest vs weakest right now
Strongest recurring signals:
- Adult education level
- Attendance
- School-population poverty (2024-25 ODE field)

Secondary/variable signal:
- Income after controls

Weakest statewide signal in current cross-sectional models:
- Median class size
- Some spending specifications

Cautions and limits
- Associational evidence only; not causal identification.
- School-level aggregation can hide within-school heterogeneity.
- Collinearity can destabilize coefficient signs and magnitudes.
- Best practice is to rely on ordering + stability + consistency across multiple model families.

Primary artifacts for replication
- scripts/report_income_education_joint_model.py
- scripts/report_income_education_split_stability.py
- scripts/report_income_education_attendance_joint_model.py
- scripts/report_income_education_attendance_interactions.py
- scripts/report_spending_classsize_ridge.py
- reports/ses_explorations_summary_report.txt
- reports/income_education_attendance_joint_model_report.txt
- reports/income_education_attendance_interaction_report.txt
- reports/spending_class_size_findings_memo.txt