
Surprise! Expectations.

Regarding Tetsuo, it’s now time for the most important part of this sweep of building: calculating surprise from expectations data, and framing that data so it can be used as:

  1. Features alongside the raw price/volume data.
  2. Rank weighting after forecast.

This would provide an estimated 5-9% boost in directional accuracy and magnitude forecasting accuracy.

The expectations data comes from the vendor’s EARNINGS_ESTIMATES function returns, which provide EPS and revenue estimates. These are used to compute earnings surprise once actuals arrive or are forecasted.

The EARNINGS function provides actual EPS and revenue over time; these are the other half of the puzzle. Together, these indicators serve as regression features in the training dataset.

Actually I’ll just break my current understanding of the whole thing down:

EARNINGS_ESTIMATES (consensus forecasts)

  • Gives baseline expectations (EPS, revenue).
  • Feature: numerical value of consensus.
  • Used to compute earnings surprise once actuals arrive.

EARNINGS (reported actuals)

  • Provides actual EPS and revenue.
  • Compute earnings surprise = Actual − Estimate (absolute and %).
  • Surprise features:
    • Raw difference
    • Scaled difference (e.g., surprise ÷ estimate or ÷ price)
    • Direction (positive vs negative)
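
A minimal sketch of that surprise computation, assuming the EARNINGS actuals and EARNINGS_ESTIMATES consensus values have already been joined into one pandas frame; the column names here are placeholders, not the real schema:

```python
import pandas as pd

def add_surprise_features(df: pd.DataFrame) -> pd.DataFrame:
    """Derive earnings-surprise features from actual vs. consensus EPS.

    Assumes placeholder columns 'eps_actual', 'eps_estimate', and 'close'.
    """
    out = df.copy()
    # Raw difference: actual minus consensus.
    out["surprise_raw"] = out["eps_actual"] - out["eps_estimate"]
    # Scaled differences: relative to the estimate and to the share price.
    out["surprise_pct_of_estimate"] = out["surprise_raw"] / out["eps_estimate"].abs()
    out["surprise_pct_of_price"] = out["surprise_raw"] / out["close"]
    # Direction: +1 for a beat, -1 for a miss, 0 for in-line.
    out["surprise_direction"] = out["surprise_raw"].apply(
        lambda x: 1 if x > 0 else (-1 if x < 0 else 0)
    )
    return out
```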

EARNINGS_CALL_TRANSCRIPT sentiments

  • NLP sentiment from management remarks and Q&A.
  • Features:
    • Overall sentiment score (−1 to 1)
    • Labels (Bullish / Bearish / Neutral)
    • Section-specific scores (prepared vs Q&A if provided)
  • Usage:
    • Add as features to both classifier and regressor.
    • Apply as a weight in ranking after model output (bias magnitude forecasts upward if sentiment strongly positive).
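
A tiny sketch of shaping those sentiment outputs into feature columns; the 0.2 cutoff for the Bullish/Bearish labels is an arbitrary placeholder, not a value from anywhere:

```python
from typing import Optional

def sentiment_features(overall: float,
                       prepared: Optional[float] = None,
                       qa: Optional[float] = None) -> dict:
    """Map transcript sentiment scores (-1 to 1) onto feature columns."""
    # Placeholder thresholds for the label buckets.
    label = "Bullish" if overall > 0.2 else "Bearish" if overall < -0.2 else "Neutral"
    return {
        "sent_overall": overall,
        "sent_label": label,
        # Section-specific scores, when the vendor splits prepared remarks and Q&A.
        "sent_prepared": prepared,
        "sent_qa": qa,
    }
```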

Integration Into Models

  • Directional classifier (LightGBM):
    • Input: OHLCV features + earnings surprise + transcript sentiment.
    • Surprise provides the core event signal, sentiment refines direction probability.
  • Magnitude regressor (LightGBM):
    • Input: OHLCV features + surprise size + sentiment.
    • Surprise size drives expected %Δ, sentiment fine-tunes ranking among positives.
  • Ranking:
    • Keep only positive-direction signals.
    • Rank by predicted magnitude.
    • Weight rank by sentiment (e.g., multiply magnitude by (1 + sentiment score)).
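
To make that concrete, here’s a rough sketch of how the two LightGBM models and the ranking step could hang together. The feature columns, label names, and thresholds are all placeholders; this is not Tetsuo’s actual pipeline, just the shape of it:

```python
import lightgbm as lgb
import pandas as pd

# Hypothetical feature frame: OHLCV-derived columns plus the surprise and
# sentiment features sketched earlier. Column names are placeholders.
FEATURES = ["ret_5d", "vol_20d", "surprise_pct_of_estimate",
            "surprise_direction", "sent_overall"]

def fit_models(train: pd.DataFrame):
    """Directional classifier + magnitude regressor, both LightGBM."""
    clf = lgb.LGBMClassifier(n_estimators=300)      # P(direction is up)
    clf.fit(train[FEATURES], train["label_up"])
    reg = lgb.LGBMRegressor(n_estimators=300)       # expected %delta
    reg.fit(train[FEATURES], train["pct_change_fwd"])
    return clf, reg

def rank_candidates(clf, reg, today: pd.DataFrame, top_k: int = 5) -> pd.DataFrame:
    """Keep positive-direction signals, rank by predicted magnitude,
    and weight that magnitude by transcript sentiment."""
    out = today.copy()
    out["p_up"] = clf.predict_proba(out[FEATURES])[:, 1]
    out["pred_pct_change"] = reg.predict(out[FEATURES])
    picks = out[out["p_up"] > 0.5].copy()            # positive direction only
    picks["score"] = picks["pred_pct_change"] * (1.0 + picks["sent_overall"].fillna(0.0))
    return picks.sort_values("score", ascending=False).head(top_k)
```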

This way: estimates + actuals → surprise = strongest predictive driver; transcript sentiment = secondary boost; news sentiment = external confirmation for rank weighting.

I’d be expecting a 15-20 point boost on precision@K for the top 5 and slightly less with greater variance for the top 10. These pieces /can/ do that based on everything I’ve learned, if implemented properly (whatever that means).

In any case, the projected performance improvement would put ROI far, far above goal, so it’s worth building, and it’s a lot more informed than previous Tetsuo versions.

Next Steps

In terms of next steps, the earnings call transcript sentiments can be an afterthought, really; but there’s a new component needed that pulls actual and expected EPS and revenue for surprise calculation, a new generated data source that will have a huge impact on bot accuracy.

Integration into the models is later. I need the data first.

Now, one concern I’ve got is when to pull the data, as we’re already spending about 45-75 minutes pulling data after market closure with the current rate limits. I guess we have all night to perform the calculation as long as the winners are selected before 11 the next trading day. I guess I won’t know how much of a problem the expectation/actual/surprise data operations will be until I start building it. It’s next up.

Truthfully this is a part I’ve been eager to get to for a while and is one of the reasons that prompted the rebuild to a distributed setup, as it provides just a gigantic boost to accuracy.

~~~

Edit:

~~~

So, after a break I came back to this and did some actual verification. I’ve been doing some exploration with GPT5 on this and it just lies through its fucking teeth constantly, to the point that almost anything it says later becomes invalidated. I have new understandings:

  1. EPS estimates are quarterly or annual, and so are actual EPS releases, which means surprise versus expectations can’t be calculated on a rolling basis, only forecasted. While some estimates get revised every day across many symbols, you can’t use them as a rolling data point in time-series forecasts with a fixed two-point horizon. You could potentially model shifts in estimates to indicate momentum, but that won’t plot a market response/reaction movement and it’ll be almost all noise.
  2. Quarterly and annual EPS estimate/expectation surprises are still perfectly valid on the earnings announcement day. About 20 symbols a day on the NYSE have one.
  3. That precludes it from being used as anything but event data; it can’t serve as a pulse-data regression feature.
  4. So, this data is still useful, but not in the way I described. It’ll still greatly impact precision @5 and @10 and overall return. Symbols on earnings day get a positive or negative rank weight/boost based on surprise direction and magnitude.
  5. NEWS_SENTIMENT is reportedly less reliable than initially thought but still more reliable than only price and volume. Also event data for rank weighting/boosting.
  6. It may even be that we restrict to “only positive signals, only earnings dates, only positive surprises”, which will dramatically reduce the ranking pool but should give a very high ratio of increases. Only God knows what the numbers will look like, so there is testing to do once all the pieces are in place.
  7. Ex-dividend dates need to be accounted for.
  8. To be clear, the magnitude of the forecasted %delta will be the ranking order, over a subset of symbols with earnings days during the holding period, weighted by expectations surprise in the ranking and, to a lesser degree, by sentiment strength (rough sketch after this list).
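
As a rough sketch of that revised, event-only ranking: the subset filter, the surprise weight, and a smaller sentiment weight. The weight values and column names are illustrative guesses (the strict positive-surprise filter from point 6 is included here), not tested parameters:

```python
import pandas as pd

def event_rank(preds: pd.DataFrame,
               surprise_weight: float = 1.0,
               sentiment_weight: float = 0.25,
               top_k: int = 5) -> pd.DataFrame:
    """Rank only symbols with an earnings date during the holding period,
    ordered by forecasted %delta, weighted by surprise and (less so) sentiment.
    Columns 'has_earnings_in_hold', 'p_up', 'pred_pct_change',
    'surprise_pct_of_estimate', and 'sent_overall' are placeholders."""
    pool = preds[preds["has_earnings_in_hold"]].copy()
    # Strict variant: only positive signals with positive surprises.
    pool = pool[(pool["p_up"] > 0.5) & (pool["surprise_pct_of_estimate"] > 0)]
    pool["score"] = pool["pred_pct_change"] * (
        1.0
        + surprise_weight * pool["surprise_pct_of_estimate"]
        + sentiment_weight * pool["sent_overall"].fillna(0.0)
    )
    return pool.sort_values("score", ascending=False).head(top_k)
```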

So this next component will need to pull EPS estimates for all symbols daily, refreshing and deduplicating by symbol identifier. It will also need to pull actual EPS records for that day and for the whole trading range (150 days, static, doesn’t need to be updated or refreshed). With that, future actual EPS over the holding period can be forecasted, using actual revenue as a regressor feature and the consensus revenue estimate to plug the future hole in the regressor row, so that surprise can be calculated for each symbol with an earnings date during the hold period. Blech.
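
The part of that I can sketch already is the regressor-frame construction for forecasting the future actual EPS. This assumes a static 150-day actuals frame and a dict of consensus values from EARNINGS_ESTIMATES; every column and key name here is a placeholder, not the real schema:

```python
import pandas as pd

def build_eps_regressor_frame(actuals: pd.DataFrame, consensus: dict) -> pd.DataFrame:
    """Append the future row whose actual EPS is the forecast target.

    'actuals' holds the static trailing history with placeholder columns
    'report_date', 'eps_actual', 'revenue_actual'; 'consensus' carries the
    forward EPS/revenue estimates pulled daily and deduplicated by symbol.
    """
    frame = actuals.sort_values("report_date").copy()
    future = pd.DataFrame([{
        "report_date": consensus["next_report_date"],
        "eps_actual": pd.NA,                              # to be forecasted
        # Consensus revenue plugs the future hole in the revenue regressor.
        "revenue_actual": consensus["revenue_estimate"],
        "eps_estimate": consensus["eps_estimate"],        # kept for the surprise calc
    }])
    return pd.concat([frame, future], ignore_index=True)
```

Once that future actual EPS is forecasted, surprise for the hold period is just the forecasted actual minus the consensus estimate, fed into the rank weighting above.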

I’m not real thrilled about this, tbh. I’ve been lied to repeatedly about how surprise calculations get integrated as forecast weights while trying to speed up my learning on this, and it’s created more problems than it’s solved. At this point I’m convinced ChatGPT is designed to sabotage anything you tell it to do.
