
MAG, MAG, MAG…

I ended up being sick all weekend and couldn’t get much done on MAG. 🙁

What to change for MAG

  • Target: y_mag = (close_1100[T+2] - close_1100[T+1]) / close_1100[T+1] (a 1-day forward return; scale as bps if you prefer).
  • Preprocess: winsorize y at 1/99 pct to tame tails; optionally standardize.
  • Model: XGBRegressor (not XGBClassifier). Start with a Huber-style loss: XGBoost has no exact Huber objective, but objective='reg:pseudohubererror' (Pseudo-Huber) is the closest built-in. A practical alternative is to keep objective='reg:squarederror' and rely on the winsorized (robust) y. If you want uncertainty bands, add quantile heads: two extra regressors with objective='reg:quantileerror' and quantile_alpha=0.15 / 0.85 on newer XGBoost (2.0+). If that's unavailable, train with pinball loss using LightGBM (objective='quantile') or scikit-learn's GradientBoostingRegressor(loss='quantile').
  • Weights: optionally down-weight high-vol days with sample_weight = 1 / (rolling_vol + eps).
  • Eval: MAE (in bps), median AE, and Spearman corr with realized return. Don’t use accuracy.
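
The Eval bullet is easy to pin down in code. Here's a rough sketch of the three metrics, plus a pinball loss for scoring the quantile heads if you add them. The function names (mag_metrics, pinball_loss) are mine and purely illustrative, and I'm assuming returns are stored as plain fractions (0.01 = 1% = 100 bps). Spearman is computed as the Pearson correlation of the ranks so it only needs pandas.

```python
import numpy as np
import pandas as pd


def mag_metrics(y_true, y_pred):
    """MAE and median absolute error in bps, plus Spearman rank correlation.

    Assumes both inputs are fractional returns (0.0001 == 1 bp).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err_bps = np.abs(y_true - y_pred) * 1e4
    # Spearman = Pearson correlation of the ranks (ties get average ranks)
    spearman = pd.Series(y_true).rank().corr(pd.Series(y_pred).rank())
    return {
        "mae_bps": float(abs_err_bps.mean()),
        "median_ae_bps": float(np.median(abs_err_bps)),
        "spearman": float(spearman),
    }


def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a prediction of quantile q in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # under-predictions cost q per unit, over-predictions cost (1 - q) per unit
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))
```

Score these on a held-out, time-ordered slice rather than the training rows, otherwise the numbers will flatter the model.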

Some solid default params:

import xgboost as xgb

xgb.XGBRegressor(
    n_estimators=200,
    learning_rate=0.08,
    max_depth=5,
    min_child_weight=5,
    subsample=0.9,
    colsample_bytree=0.9,
    reg_lambda=1.0,
    reg_alpha=0.0,
    tree_method='hist',
    max_bin=256,
    n_jobs=-1,
    random_state=42,
    objective='reg:squarederror'
)

Minimal MAG forecaster (drop-in shape)

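import numpy as np
import pandas as pd
import xgboost as xgb


class MagForecaster:
    def __init__(self):
        self.model = None
        self.feature_columns = None
        self.data = None  # set by train(); used for forecast alignment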

    def _make_target(self, df):
        # signed return from T+1 to T+2 on close_1100, aligned to row T
        close = df['close_1100'].astype(float)
        t1 = close.shift(-1)
        t2 = close.shift(-2)
        # zero or missing prices yield NaN; so do the last two rows
        y = (t2 - t1) / t1.where(t1 != 0)
        return y.to_numpy()

    def train(self, df):
        self.feature_columns = [c for c in df.columns if c.startswith((
            'close_','volume_','ema_','rsi_','macd_','bb_','obv_','pbf_',
            'vwap','price_vs_vwap','volume_imbalance','intraday_realized_vol',
            'vol_regime_change','open_gap','open_volume_spike',
            'volume_concentration','volume_acceleration',
            'uptick_','high_rejection','low_rejection',
            'volume_price_divergence','divergence_strength',
            'cumulative_delta','delta_acceleration',
            'range_position','range_size','range_expansion',
            'momentum_acceleration','momentum_consistency',
            'volume_price_confirmation','volume_surge_momentum',
            'price_extension','consecutive_up','consecutive_down','exhaustion_score',
            'daily_return','failed_breakout','failed_breakdown',
            'recent_range_position','new_3d_high','new_3d_low'
        )) or c in ['intraday_momentum','mfi','day_of_week','is_monday','is_friday','month','is_month_end']]

        y = self._make_target(df)
        mask = ~np.isnan(y)
        if mask.sum() < 30:
            raise ValueError("Not enough samples for magnitude regression")

        # winsorize target at the 1st/99th percentiles
        y_valid = y[mask]
        lo, hi = np.quantile(y_valid, [0.01, 0.99])
        y_train = np.clip(y_valid, lo, hi)
        X_train = df.loc[mask, self.feature_columns].astype(np.float32)

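        # optional inverse-vol weights; skipped if the column is absent or empty
        w = None
        if 'intraday_realized_vol' in df.columns:
            vol = df['intraday_realized_vol'].astype(float).ffill()
            # treat zero/negative vol as missing so those rows don't get a huge weight
            vol = vol.where(vol > 0)
            if vol.notna().any():
                w = 1.0 / (vol.fillna(vol.median()).to_numpy()[mask] + 1e-6)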

        self.model = xgb.XGBRegressor(
            n_estimators=200, learning_rate=0.08, max_depth=5,
            min_child_weight=5, subsample=0.9, colsample_bytree=0.9,
            reg_lambda=1.0, reg_alpha=0.0, tree_method='hist', max_bin=256,
            n_jobs=-1, random_state=42, objective='reg:squarederror'
        )
        self.model.fit(X_train, y_train, sample_weight=w)

        # keep data for forecast alignment
        self.data = df.copy()
        return {'trained': True}

    def forecast(self, df=None, preserve_history=False):
        if self.model is None or self.feature_columns is None:
            raise ValueError("Model not fitted. Call train() first.")
        # use the frame passed in; fall back to the training frame if none is given
        data = self.data if df is None else df
        # the last two rows have no realized target yet (it needs T+1 and T+2)
        X_last2 = data[self.feature_columns].iloc[-2:].astype(np.float32)
        preds = self.model.predict(X_last2)
        # write back into a copy for inspection
        out = data.copy()
        out.loc[out.index[-2:], 'predicted_mag'] = preds
        return out if preserve_history else out.iloc[-2:]

Variants worth trying (often help MAG)

  • Two-stage: classify sign (your current Forecaster), then regress |return| and reapply sign.
  • Quantile heads: predict P15/P50/P85 to get an interval; use P50 for point, band for risk.
  • Heteroskedastic modeling: second regressor for absolute error (uncertainty), use it to size positions.
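
For the two-stage variant, the recombination step is the only part that isn't obvious, so here's a minimal sketch of it. Everything here is an assumption on my part: combine_sign_and_magnitude is a made-up helper, p_up stands for whatever "probability of an up move" the sign classifier emits, and abs_return_pred is the output of a regressor trained on |return|. The soft option uses (2p - 1) * magnitude, which is the expected signed return only if magnitude is independent of direction.

```python
import numpy as np


def combine_sign_and_magnitude(p_up, abs_return_pred, threshold=0.5, soft=False):
    """Reapply a predicted direction to a predicted absolute return.

    p_up            : probability of a positive return from the sign classifier
    abs_return_pred : prediction from a regressor trained on |return|
    soft            : if True, weight by (2 * p_up - 1) instead of a hard +/-1
    """
    p_up = np.asarray(p_up, dtype=float)
    # a regressor fit on |return| can still emit small negatives; clip to zero
    magnitude = np.clip(np.asarray(abs_return_pred, dtype=float), 0.0, None)
    if soft:
        # expected signed return, assuming magnitude is independent of sign
        return (2.0 * p_up - 1.0) * magnitude
    sign = np.where(p_up >= threshold, 1.0, -1.0)
    return sign * magnitude
```

The hard version keeps the full magnitude even when the classifier is barely above 50%, so the soft version is probably the safer default if the output feeds position sizing.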

I just don’t have enough time to tie it all together in between all the coughing and sneezing and blech’ing.


© 2025 Phanes' Canon

The Personal Blog of Chris Punches