MAG, MAG, MAG...

I ended up being sick all weekend and couldn’t get much done on MAG. 🙁

What to change for MAG

Target: y_mag = (close_1100[T+2] - close_1100[T+1]) / close_1100[T+1] (a 1-day forward return; scale as bps if you prefer).
Preprocess: winsorize y at 1/99 pct to tame tails; optionally standardize.
Model: XGBRegressor (not Classifier). Start with a robust loss: XGBoost does not expose a true Huber loss, but objective='reg:pseudohubererror' is the built-in Huber-style option; the practical alternative is to keep reg:squarederror and rely on the winsorized y. If you want uncertainty bands, add quantile heads (two extra regressors with objective='reg:quantileerror' and quantile_alpha=0.15/0.85 on XGBoost 2.0+; if unavailable, use LightGBM with objective='quantile' or scikit-learn's GradientBoostingRegressor(loss='quantile'), both of which minimize pinball loss).
Weights: optionally down-weight high-vol days with sample_weight = 1 / (rolling_vol + eps).
Eval: MAE (in bps), median AE, and Spearman corr with realized return. Don’t use accuracy.
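The eval line above can be sketched in plain NumPy. This is a minimal sketch, not a definitive implementation: the function names (rank_average, mag_metrics, pinball_loss) are my own placeholders, returns are assumed to be fractions (0.01 = 1%), and Spearman is computed as the Pearson correlation of average ranks. The pinball loss is only relevant if you add the quantile heads.

```python
import numpy as np

def rank_average(x):
    """Ranks starting at 1; tied values share their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for v in np.unique(x):
        tied = x == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def mag_metrics(y_true, y_pred):
    """MAE and median absolute error in bps, plus Spearman rank correlation."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err_bps = np.abs(y_pred - y_true) * 1e4  # fraction -> basis points
    spearman = np.corrcoef(rank_average(y_true), rank_average(y_pred))[0, 1]
    return {
        "mae_bps": float(abs_err_bps.mean()),
        "median_ae_bps": float(np.median(abs_err_bps)),
        "spearman": float(spearman),
    }

def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a forecast of quantile q in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))
```

Call mag_metrics(y_true, y_pred) on held-out days only; computing it on the training rows mostly measures how well the trees memorized them.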

Some solid default params:

xgb.XGBRegressor(
    n_estimators=200,
    learning_rate=0.08,
    max_depth=5,
    min_child_weight=5,
    subsample=0.9,
    colsample_bytree=0.9,
    reg_lambda=1.0,
    reg_alpha=0.0,
    tree_method='hist',
    max_bin=256,
    n_jobs=-1,
    random_state=42,
    objective='reg:squarederror'
)

Minimal MAG forecaster (drop-in shape)

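import numpy as np
import pandas as pd
import xgboost as xgb

class MagForecaster:
    def __init__(self):
        self.model = None
        self.feature_columns = None
        self.data = None  # training frame, kept for forecast alignment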

    def _make_target(self, df):
        # signed return from T+1 to T+2, aligned to row T (vectorized)
        close = df['close_1100'].astype(float)
        t1 = close.shift(-1)
        t2 = close.shift(-2)
        # a zero or missing T+1 close yields NaN; so do the last two rows
        y = (t2 - t1) / t1.where(t1 != 0)
        return y.to_numpy()

    def train(self, df):
        self.feature_columns = [c for c in df.columns if c.startswith((
            'close_','volume_','ema_','rsi_','macd_','bb_','obv_','pbf_',
            'vwap','price_vs_vwap','volume_imbalance','intraday_realized_vol',
            'vol_regime_change','open_gap','open_volume_spike',
            'volume_concentration','volume_acceleration',
            'uptick_','high_rejection','low_rejection',
            'volume_price_divergence','divergence_strength',
            'cumulative_delta','delta_acceleration',
            'range_position','range_size','range_expansion',
            'momentum_acceleration','momentum_consistency',
            'volume_price_confirmation','volume_surge_momentum',
            'price_extension','consecutive_up','consecutive_down','exhaustion_score',
            'daily_return','failed_breakout','failed_breakdown',
            'recent_range_position','new_3d_high','new_3d_low'
        )) or c in ['intraday_momentum','mfi','day_of_week','is_monday','is_friday','month','is_month_end']]

        y = self._make_target(df)
        mask = ~np.isnan(y)
        if mask.sum() < 30:
            raise ValueError("Not enough samples for magnitude regression")

        # winsorize target (1/99 pct)
        y_fit = pd.Series(y[mask])
        lo, hi = y_fit.quantile([0.01, 0.99])
        y_train = y_fit.clip(lower=lo, upper=hi).to_numpy()
        X_train = df.loc[mask, self.feature_columns].astype(np.float32)

        # optional inverse-vol weights; missing vol falls back to the median so
        # those days get a typical weight instead of a huge 1/eps one
        if 'intraday_realized_vol' in df.columns:
            vol = df['intraday_realized_vol'].astype(float).ffill()
            vol = vol.fillna(vol.median()).fillna(1.0)
        else:
            vol = pd.Series(1.0, index=df.index)
        w = 1.0 / (vol.loc[mask].to_numpy() + 1e-6)

        self.model = xgb.XGBRegressor(
            n_estimators=200, learning_rate=0.08, max_depth=5,
            min_child_weight=5, subsample=0.9, colsample_bytree=0.9,
            reg_lambda=1.0, reg_alpha=0.0, tree_method='hist', max_bin=256,
            n_jobs=-1, random_state=42, objective='reg:squarederror'
        )
        self.model.fit(X_train, y_train, sample_weight=w)

        # keep data for forecast alignment
        self.data = df.copy()
        return {'trained': True}

    def forecast(self, df=None, preserve_history=False):
        if self.model is None or self.feature_columns is None:
            raise ValueError("Model not fitted. Call train() first.")
        # score the frame passed in; fall back to the training frame if none is given
        data = self.data if df is None else df
        # the last two rows are the ones with no realized target yet
        X_last2 = data[self.feature_columns].iloc[-2:].astype(np.float32)
        preds = self.model.predict(X_last2)
        # write back into a copy for inspection
        out = data.copy()
        out.loc[out.index[-2:], 'predicted_mag'] = preds
        return out if preserve_history else out.iloc[-2:]

Variants worth trying (often help MAG)

Two-stage: classify sign (your current Forecaster), then regress |return| and reapply sign.
Quantile heads: predict P15/P50/P85 to get an interval; use P50 for point, band for risk.
Heteroskedastic modeling: second regressor for absolute error (uncertainty), use it to size positions.
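For the two-stage variant, the recombination step is the only new piece, so here is a rough sketch of it. Everything here is an assumption on my part: combine_two_stage is a hypothetical helper, p_up is taken to be the classifier's probability of an up move, abs_ret is the regressor's prediction of |return|, and the soft option (scaling by 2p - 1) is one possible way to shrink low-conviction calls, not something the current Forecaster does.

```python
import numpy as np

def combine_two_stage(p_up, abs_ret, soft=False):
    """Reapply a predicted sign to a predicted |return|.

    p_up    : probability of an up move from the sign classifier (assumed)
    abs_ret : predicted absolute return from the magnitude regressor (assumed)
    soft    : if True, scale by (2p - 1) instead of a hard +1/-1 sign
    """
    p_up = np.asarray(p_up, dtype=float)
    # a regressor can emit small negatives; a magnitude must be >= 0
    magnitude = np.clip(np.asarray(abs_ret, dtype=float), 0.0, None)
    if soft:
        direction = 2.0 * p_up - 1.0  # in [-1, 1], near 0 when unsure
    else:
        direction = np.where(p_up >= 0.5, 1.0, -1.0)
    return direction * magnitude
```

The hard version keeps the full magnitude even on a 51/49 call, which is why the soft version may be the safer default if the signed forecast feeds position sizing.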

I just don’t have enough time to tie it all together in between all the coughing and sneezing and blech’ing.
