Oregon School Assessment

Statistical results overview

Prepared: 2026-02-10 | Updated: 2026-02-22

Purpose
- Provide one reader-facing, auditable summary of the main numeric findings produced by the Evidence Lab scripts.
- Put effect sizes, ordering, and robustness checks in one place so interpretation is not spread across many files.

Data scope and modeling setup
- Unit of analysis: school-level aggregate rows (not student-level microdata).
- Weighting: participant-weighted unless noted otherwise.
- Main outcome for most analyses: Percent Proficient.
- Baseline continuity predictors: adult BA+ rate, income, attendance.
- Companion 2024-25 predictors: adult BA+ rate, Students Experiencing Poverty, attendance.
- Exploratory predictors: overall spending per student, classroom spending per student, median class size.
- Typical row filter: "Total Population" student-group rows, with "All Grades" rows dropped when grade-specific rows exist.
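
The typical row filter above can be sketched as follows. This is a minimal illustration, not the Evidence Lab implementation: the field names (school, student_group, grade, pct_proficient) and the sample rows are hypothetical, and the sketch assumes the "grade-specific rows exist" check is made per school.

```python
# Hypothetical rows and field names; the real export may be labeled differently.
rows = [
    {"school": "A", "student_group": "Total Population", "grade": "All Grades", "pct_proficient": 48.0},
    {"school": "A", "student_group": "Total Population", "grade": "Grade 3", "pct_proficient": 51.0},
    {"school": "A", "student_group": "Total Population", "grade": "Grade 4", "pct_proficient": 45.0},
    {"school": "A", "student_group": "Students with Disabilities", "grade": "Grade 3", "pct_proficient": 20.0},
    {"school": "B", "student_group": "Total Population", "grade": "All Grades", "pct_proficient": 62.0},
]


def apply_typical_filter(rows):
    """Keep Total Population rows; drop All Grades rows where grade-specific rows exist."""
    total = [r for r in rows if r["student_group"] == "Total Population"]
    # Schools that report at least one grade-specific Total Population row.
    schools_with_grades = {r["school"] for r in total if r["grade"] != "All Grades"}
    return [
        r for r in total
        if not (r["grade"] == "All Grades" and r["school"] in schools_with_grades)
    ]


filtered = apply_typical_filter(rows)
for r in filtered:
    print(r["school"], r["grade"], r["pct_proficient"])
```

School A keeps only its grade-specific rows, while school B keeps its "All Grades" row because it has nothing more specific. This avoids counting the same students twice in a participant-weighted model.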

How to read coefficients in this report
- Correlation r: raw association only (no controls).
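- Standardized beta: expected change in the outcome, in standard-deviation units, per one-standard-deviation change in a predictor, holding the other variables in the same model constant. Use it to compare relative predictor strength.
- R^2: share of participant-weighted outcome variance explained by the model.
- Delta R^2: increase in R^2 when terms are added to a base model.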
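
To make these definitions concrete, here is a minimal sketch of how weighted r, standardized betas, and weighted R^2 can be computed with NumPy. It is an assumption about the general approach, not a copy of the report scripts: the function names are illustrative, the data are synthetic, and standardization uses weighted means and weighted standard deviations.

```python
import numpy as np


def wstd(x, w):
    """Center and scale x using its weighted mean and weighted (population) SD."""
    mean = np.average(x, weights=w)
    sd = np.sqrt(np.average((x - mean) ** 2, weights=w))
    return (x - mean) / sd


def weighted_r(x, y, w):
    """Weighted Pearson correlation (raw association, no controls)."""
    return float(np.average(wstd(x, w) * wstd(y, w), weights=w))


def standardized_fit(X, y, w):
    """Weighted least squares on standardized variables; returns (betas, R^2)."""
    Z = np.column_stack([wstd(col, w) for col in X.T])
    zy = wstd(y, w)
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns WLS into ordinary least squares
    betas, *_ = np.linalg.lstsq(Z * sw[:, None], zy * sw, rcond=None)
    resid = zy - Z @ betas
    r2 = 1.0 - np.average(resid ** 2, weights=w)  # weighted variance of zy is 1
    return betas, float(r2)


# Synthetic, illustrative data only (not the Oregon data).
rng = np.random.default_rng(0)
n = 300
education = rng.normal(size=n)
income = 0.7 * education + 0.7 * rng.normal(size=n)  # correlated with education
attendance = rng.normal(size=n)
outcome = 0.5 * education + 0.3 * attendance + 0.8 * rng.normal(size=n)
participants = rng.integers(20, 500, size=n).astype(float)

X = np.column_stack([income, education, attendance])
betas, r2 = standardized_fit(X, outcome, participants)
r_income = weighted_r(income, outcome, participants)

print("weighted r (income):", round(r_income, 3))
print("standardized betas (income, education, attendance):", np.round(betas, 3))
print("R^2:", round(r2, 3))
```

In this synthetic setup income has no direct effect on the outcome, yet it still shows a nonzero raw correlation because it is correlated with education. That is the gap between r and standardized beta that the definitions above describe.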

1) Joint SES + attendance models (core result)
Source: report_income_education_attendance_joint_model.py
Report: income_education_attendance_joint_model_report.txt

Results by subject (standardized betas from Percent Proficient ~ income + education + attendance):
- English (ELA): income 0.025, education 0.463, attendance 0.299, R^2=0.418
- Math: income 0.046, education 0.433, attendance 0.413, R^2=0.517
- Science: income 0.006, education 0.458, attendance 0.252, R^2=0.349

What this means:
- Education and attendance carry most of the controlled signal.
- Income remains correlated with proficiency in bivariate views, but its unique contribution in the joint models is small in these statewide runs.
- The education-over-income ordering is large and consistent across subjects: education betas range from 0.433 to 0.463, income betas from 0.006 to 0.046.

1b) Poverty-aware reassessment (2024-25 companion result)
Source: docs/income_poverty_reassessment_note_2026-02-20.txt
Companion notes:
- docs/why_poverty_outpredicts_income_explainer_2026-02-21.txt
- docs/ba_signal_in_high_poverty_summary_2026-02-21.txt

Cross-validated R^2 (non-charter/non-virtual):
- ELA:
  - BA+ + Attendance + Income: 0.5190
  - BA+ + Attendance + Poverty: 0.6408
  - BA+ + Attendance + Income + Poverty: 0.6508
- Math:
  - BA+ + Attendance + Income: 0.6429
  - BA+ + Attendance + Poverty: 0.6705
  - BA+ + Attendance + Income + Poverty: 0.6726
- Science:
  - BA+ + Attendance + Income: 0.3966
  - BA+ + Attendance + Poverty: 0.5077
  - BA+ + Attendance + Income + Poverty: 0.5195

Interpretation:
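- In 2024-25 models, Students Experiencing Poverty adds meaningful independent signal: replacing income with poverty raises cross-validated R^2 by about 0.12 (ELA), 0.03 (Math), and 0.11 (Science).
- Income still contributes context and modest incremental value: adding it to the poverty spec raises cross-validated R^2 by about 0.010 (ELA), 0.002 (Math), and 0.012 (Science).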
- Practical reading: keep income as community context, but include poverty-aware specs for current-year explanatory comparisons.
- Note on comparability: these values use the median-household-income continuity spec; per-capita-income variants (reported in the poverty explainer note) show slightly different baseline R^2 values.
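
The spec-to-spec differences can be recomputed directly from the cross-validated R^2 values listed above. The short script below is only arithmetic on those reported numbers; the dictionary keys are illustrative labels, not names used by the source scripts.

```python
# Cross-validated R^2 values copied from the list above (non-charter/non-virtual).
cv_r2 = {
    "ELA": {"income": 0.5190, "poverty": 0.6408, "income_and_poverty": 0.6508},
    "Math": {"income": 0.6429, "poverty": 0.6705, "income_and_poverty": 0.6726},
    "Science": {"income": 0.3966, "poverty": 0.5077, "income_and_poverty": 0.5195},
}

gains = {}
for subject, r2 in cv_r2.items():
    gains[subject] = {
        # Gain from swapping income for poverty (BA+ and attendance held in both specs).
        "swap_income_for_poverty": round(r2["poverty"] - r2["income"], 4),
        # Gain from adding income on top of the poverty spec.
        "add_income_to_poverty": round(r2["income_and_poverty"] - r2["poverty"], 4),
    }

for subject, g in gains.items():
    print(subject, g)
```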

2) Interaction checks (non-additive structure)
Source: report_income_education_attendance_interactions.py
Report: income_education_attendance_interaction_report.txt

Base model vs interaction model:
- ELA: R^2 rises from 0.418 to 0.433 (Delta 0.015)
- Math: R^2 rises from 0.517 to 0.548 (Delta 0.031)
- Science: R^2 rises from 0.349 to 0.380 (Delta 0.031)

The largest interaction term is usually education x attendance.

Interpretation:
- The association between one predictor and performance depends on the level of another predictor.
- Purely additive narratives miss some structure in the data.
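
A base-versus-interaction comparison of this kind can be sketched as below. This is an illustration under stated assumptions, not the logic of report_income_education_attendance_interactions.py: the data are synthetic, only one interaction term (education x attendance) is added, and the predictors are centered before multiplying, which is a common convention that the source script may or may not follow.

```python
import numpy as np


def weighted_r2(X, y, w):
    """In-sample R^2 from weighted least squares with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    sw = np.sqrt(w)  # row scaling by sqrt(w) implements the weights
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    resid = y - A @ coef
    ybar = np.average(y, weights=w)
    return float(1.0 - np.sum(w * resid ** 2) / np.sum(w * (y - ybar) ** 2))


# Synthetic, illustrative data with a built-in education x attendance effect.
rng = np.random.default_rng(1)
n = 400
income, education, attendance = rng.normal(size=(3, n))
outcome = (0.4 * education + 0.3 * attendance
           + 0.2 * education * attendance + rng.normal(size=n))
w = rng.integers(20, 500, size=n).astype(float)

base = np.column_stack([income, education, attendance])

# Center before multiplying so the product term overlaps less with the main effects.
edu_c = education - np.average(education, weights=w)
att_c = attendance - np.average(attendance, weights=w)
with_interaction = np.column_stack([base, edu_c * att_c])

r2_base = weighted_r2(base, outcome, w)
r2_int = weighted_r2(with_interaction, outcome, w)
delta_r2 = r2_int - r2_base

print("base R^2:", round(r2_base, 3))
print("interaction R^2:", round(r2_int, 3))
print("Delta R^2:", round(delta_r2, 3))
```

Because the interaction model nests the base model, in-sample R^2 cannot fall when the term is added. The size of Delta R^2, not its sign, is what carries information.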

3) Stability and historical robustness
Sources:
- report_income_education_split_stability.py
- report_income_education_joint_model.py (using 2018-2019 dataset options)
- report_income_education_attendance_joint_model.py (using 2018-2019 dataset options)
- report_income_education_attendance_interactions.py (using 2018-2019 dataset options)
Reports:
- income_education_split_stability_report.txt
- income_education_joint_model_report_2018_2019.txt
- income_education_attendance_joint_model_report_math_2018_2019.txt
- income_education_attendance_interaction_report_math_2018_2019.txt

Key findings:
- Split-sample checks repeatedly preserve the education > income ordering.
- The pre-pandemic run (2018-2019 Math, era-appropriate ACS) preserves the same ordering.
- Magnitudes move with scope restrictions (for example, grade/school-level filters), but ordering remains stable.

Interpretation:
- Current-year findings are unlikely to be an artifact of a single sample slice or a single year.
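
One way to run a split-sample ordering check is sketched below. It is a hedged illustration rather than the method in report_income_education_split_stability.py: the data are synthetic, the splits are random halves, and the number of splits (50) is an arbitrary choice for the example.

```python
import numpy as np


def standardized_betas(X, y, w):
    """Weighted least squares on weighted-standardized variables."""
    def wstd(v):
        m = np.average(v, weights=w)
        return (v - m) / np.sqrt(np.average((v - m) ** 2, weights=w))

    Z = np.column_stack([wstd(col) for col in X.T])
    sw = np.sqrt(w)
    betas, *_ = np.linalg.lstsq(Z * sw[:, None], wstd(y) * sw, rcond=None)
    return betas


# Synthetic, illustrative data: education matters more than income by construction.
rng = np.random.default_rng(2)
n = 2000
education = rng.normal(size=n)
income = 0.7 * education + 0.7 * rng.normal(size=n)
outcome = 0.5 * education + 0.05 * income + 0.8 * rng.normal(size=n)
w = rng.integers(20, 500, size=n).astype(float)
X = np.column_stack([income, education])

n_splits = 50
preserved = 0
for _ in range(n_splits):
    idx = rng.permutation(n)
    halves = (idx[: n // 2], idx[n // 2:])
    # The ordering counts as preserved only if it holds in both halves.
    ok = True
    for half in halves:
        b_income, b_education = standardized_betas(X[half], outcome[half], w[half])
        ok = ok and bool(b_education > b_income)
    preserved += int(ok)

share = preserved / n_splits
print("share of splits preserving education > income:", share)
```

A share near 1.0 indicates the ordering does not depend on which schools land in which half. A share well below 1.0 would suggest the ordering is fragile.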

4) Outcome-level heterogeneity (Math, level-specific checks)
Source: report_income_education_joint_model.py
Report example: income_education_joint_model_report_math_level4.txt

Percent Level 4 example:
- Weighted r: income 0.501, education 0.625
- Standardized betas: income 0.133, education 0.533
- R^2 (income + education): 0.399

Interpretation:
- Education signal can strengthen for top-end outcomes, not just overall proficiency.

5) Spending and class-size exploratory models
Sources:
- report_spending_classsize_effects.py
- report_spending_classsize_ridge.py
Reports:
- spending_class_size_effects_report.txt
- spending_class_size_ridge_report.txt
- spending_class_size_findings_memo.txt

Observed pattern:
- Class size: weak and often near-zero contribution after controls.
- Spending: detectable in some specifications, but weaker and less stable than education/attendance.
- The spending variables (overall and classroom spending per student) are strongly collinear, so the ridge models apply regularization and individual coefficients need cautious interpretation.
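
The effect of regularization on collinear predictors can be illustrated with a closed-form ridge fit. This is a sketch under assumptions, not the contents of report_spending_classsize_ridge.py: the data are synthetic, the penalty value (0.1) is arbitrary, and the function name is hypothetical.

```python
import numpy as np


def wstd(v, w):
    """Center and scale v using its weighted mean and weighted SD."""
    m = np.average(v, weights=w)
    return (v - m) / np.sqrt(np.average((v - m) ** 2, weights=w))


def weighted_ridge(X, y, w, lam):
    """Closed-form ridge on weighted-standardized variables; lam=0 gives plain WLS."""
    Z = np.column_stack([wstd(col, w) for col in X.T])
    zy = wstd(y, w)
    wn = w / w.sum()                   # normalized weights
    gram = Z.T @ (Z * wn[:, None])     # weighted correlation matrix of predictors
    rhs = Z.T @ (wn * zy)              # weighted correlations with the outcome
    return np.linalg.solve(gram + lam * np.eye(Z.shape[1]), rhs)


# Synthetic, illustrative data: two nearly collinear spending measures.
rng = np.random.default_rng(3)
n = 300
overall = rng.normal(size=n)
classroom = overall + 0.1 * rng.normal(size=n)  # almost a copy of overall spending
class_size = rng.normal(size=n)
outcome = 0.2 * overall + rng.normal(size=n)
w = rng.integers(20, 500, size=n).astype(float)
X = np.column_stack([overall, classroom, class_size])

ols = weighted_ridge(X, outcome, w, lam=0.0)
ridge = weighted_ridge(X, outcome, w, lam=0.1)

print("unpenalized betas (overall, classroom, class size):", np.round(ols, 3))
print("ridge betas       (overall, classroom, class size):", np.round(ridge, 3))
```

With near-duplicate predictors, the unpenalized fit can split one underlying signal into large offsetting coefficients. The ridge penalty shrinks the coefficient vector and tends to share the signal across the collinear pair, which is why signs and magnitudes of individual spending terms should not be read in isolation.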

Interpretation:
- The current school-level cross-sections do not support a strong, clean statewide class-size signal.
- Spending may have context-specific effects, but the broad aggregate signal is modest in this setup.

6) What appears strongest vs weakest right now
Strongest recurring signals:
- Adult education level
- Attendance
- School-population poverty (2024-25 ODE field)

Secondary/variable signal:
- Income after controls

Weakest statewide signal in current cross-sectional models:
- Median class size
- Some spending specifications

Cautions and limits
- Associational evidence only; not causal identification.
- School-level aggregation can hide within-school heterogeneity.
- Collinearity can destabilize coefficient signs and magnitudes.
- Rely on ordering, stability, and consistency across multiple model families rather than on any single coefficient.

Primary artifacts for replication
- scripts/report_income_education_joint_model.py
- scripts/report_income_education_split_stability.py
- scripts/report_income_education_attendance_joint_model.py
- scripts/report_income_education_attendance_interactions.py
- scripts/report_spending_classsize_ridge.py
- reports/ses_explorations_summary_report.txt
- reports/income_education_attendance_joint_model_report.txt
- reports/income_education_attendance_interaction_report.txt
- reports/spending_class_size_findings_memo.txt