fixed leakage

2026-03-28 12:19:09 -04:00
parent 321fe78105
commit 1530c02961
24 changed files with 8224 additions and 1086 deletions
--- a/data/05_predictive_modelling/model_summary.txt
+++ b/data/05_predictive_modelling/model_summary.txt
@@ -3,10 +3,10 @@

 | Model | MAE | RMSE | R² | Within ±1 | Within ±2 | Exact V | Within ±1 V |
 |-------|-----|------|----|-----------|-----------|---------|-------------|
-| Linear Regression | 1.467 | 1.882 | 0.782 | 42.6% | 73.3% | 34.9% | 79.4% |
-| Ridge Regression | 1.467 | 1.882 | 0.782 | 42.6% | 73.3% | 34.9% | 79.4% |
-| Lasso Regression | 1.475 | 1.891 | 0.780 | 42.2% | 73.0% | 34.6% | 79.3% |
-| Random Forest (Tuned) | 1.325 | 1.718 | 0.818 | 47.0% | 77.7% | 38.6% | 83.0% |
+| Linear Regression | 2.191 | 2.742 | 0.537 | 28.1% | 53.1% | 23.9% | 61.3% |
+| Ridge Regression | 2.191 | 2.742 | 0.537 | 28.1% | 53.1% | 23.9% | 61.3% |
+| Lasso Regression | 2.192 | 2.741 | 0.538 | 27.9% | 53.1% | 23.8% | 61.3% |
+| Random Forest (Tuned) | 1.788 | 2.293 | 0.676 | 36.1% | 64.3% | 30.2% | 70.8% |

 ### Key Findings

@@ -15,8 +15,8 @@
   - Linear models remain useful baselines but leave clear nonlinear signal unexplained.

 2. **Fine-grained difficulty prediction is meaningfully harder than grouped grade prediction.**
-   - On the held-out test set, the best model is within ±1 fine-grained difficulty score 47.0% of the time.
-   - The same model is within ±1 grouped V-grade 83.0% of the time.
+   - On the held-out test set, the best model is within ±1 fine-grained difficulty score 36.1% of the time.
+   - The same model is within ±1 grouped V-grade 70.8% of the time.

 3. **This gap is expected and informative.**
   - Small numeric errors often stay inside the same or adjacent V-grade buckets.
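
Reviewer note: the table columns above could be recomputed with something like the following sketch. It is not taken from this repository. The function name `regression_summary` is hypothetical, and it assumes "Within ±k" means the raw absolute error is at most k (no rounding of predictions). The V-grade columns are omitted because the mapping from fine-grained score to V-grade is not shown in this diff.

```python
import math


def regression_summary(y_true, y_pred):
    """Return MAE, RMSE, R^2 and within-tolerance rates for paired values.

    Hypothetical helper: assumes "within ±k" is |prediction - truth| <= k.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Mean absolute error and root mean squared error.
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    # Share of predictions landing within a fixed absolute tolerance.
    within_1 = sum(1 for e in errors if abs(e) <= 1) / n
    within_2 = sum(1 for e in errors if abs(e) <= 2) / n

    return {
        "mae": mae,
        "rmse": rmse,
        "r2": r2,
        "within_1": within_1,
        "within_2": within_2,
    }


if __name__ == "__main__":
    # Toy values for illustration only; not data from this project.
    summary = regression_summary([10, 12, 14, 16], [10.5, 12, 17, 15])
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

Running the same function on the held-out test set before and after a leakage fix would be one way to confirm that a drop in the reported metrics comes from the changed features or split rather than from a change in how the metrics are computed.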