Historical analyst EPS and revenue estimates, paired to their earnings announcements, along with the upsets for that window and the EPS estimates for the next forecast day, are now automatically refreshed each trading day at exp.silogroup.org
Code quality could be better but this was a unique solution, so, the first 3 versions are going to suck with me anyway.
This was a BITCH to work out and I don’t even know how to explain to someone reading this the way in which it was or why this data is even important for ranked forecasts.
So, to understand this, you have to first know that every publicly traded company operates on what is called a “FISCAL YEAR”. You’ve probably heard people in the accounting office use that term. You’re probably more familiar with “the quarter” or “the quarterly report”. This is the “fiscal quarter”.
What most people probably don’t know is that every company that’s traded declares their own “fiscal year ending” date, which is the last day of their fiscal year. So far, so good. Simple enough.
Well, when regular folks talk about “Q1” or “Q2” or “Q3” or “Q4”, they are almost always talking about “January-March”, “April-June”, “July-September”, “October-December” for the quarter date ranges. So most humans think of quarters as:
- 2025/Q1: 2025-01-01 → 2025-03-31
- 2025/Q2: 2025-04-01 → 2025-06-30
- 2025/Q3: 2025-07-01 → 2025-09-30
- 2025/Q4: 2025-10-01 → 2025-12-31
Not in corporate finance. That’s too easy. For publicly traded companies, their quarters are not based on the calendar year, they’re based on the FYE date, or “fiscal year ending” date. Sometimes that’s December 31st and everybody wins. The relevance of this boring piece of information is that each quarter, traded companies are required to report their earnings for investors to review and buy/sell decisions are made based on these reports and other data. These earnings figures have a significant impact on the price of the company on the stock market. So Q1 has a report, Q2, Q3, Q4 and that 4th one is an annual report.
Sometimes, though, that FYE date, it’s fucking January 31st. Sometimes it’s February 28th. They can pick pretty much any date they want for an FYE. June. October. Any date they want.
So let’s say they pick January 31st for their FYE date. They’re allowed, great. Well, that means after almost all your business for the year being in 2025, your earnings announcement 2025-04-30 is for fucking 2024Q1 instead of 2025Q2. That’s right, your whole financial year is behind a year on a different cadence than the entire rest of the sane world. You’d end up with, for January 31st FYE:
- 2024Q1: 2024-02-01 → 2024-04-30
- 2024Q2: 2024-05-01 → 2024-07-31
- 2024Q3: 2024-08-01 → 2024-10-31
- 2024Q4: 2024-11-01 → 2025-01-31
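Here's a minimal sketch of that "divide the fiscal year by 4" rule. The function names are made up for illustration, it assumes the FYE always lands on the last day of a month, and it labels the fiscal year by the calendar year it starts in, the same way the list above does. It is not the code EXP actually runs.

```python
import calendar
from datetime import date


def add_months(year, month, n):
    """Return (year, month) shifted forward by n months."""
    idx = year * 12 + (month - 1) + n
    return idx // 12, idx % 12 + 1


def fiscal_quarters(start_year, fye_month):
    """Split a fiscal year into four 3-month quarters.

    Assumption: the fiscal year ends on the last day of fye_month and
    is labeled by the calendar year in which it starts.
    """
    first_month = fye_month % 12 + 1  # month right after the FYE month
    quarters = []
    for q in range(4):
        sy, sm = add_months(start_year, first_month, 3 * q)
        ey, em = add_months(sy, sm, 2)
        last_day = calendar.monthrange(ey, em)[1]
        quarters.append((f"{start_year}Q{q + 1}", date(sy, sm, 1), date(ey, em, last_day)))
    return quarters


for label, start, end in fiscal_quarters(2024, fye_month=1):
    print(f"{label}: {start} -> {end}")
```

Calling `fiscal_quarters(2025, fye_month=12)` gives back the plain calendar quarters, which is the "everybody wins" case.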
I wish that were the end of the madness. It’s even weirder than that.
You see, those earnings reports tell investors buying stock what the company earnings are, and this weighs heavily on the price of the stock and the investor interest in buying shares. So, they rely on analysts to estimate the “earnings per share” value, a related figure that I won’t go into here other than to say it all ties together in the investor math to influence the price of the stock.
These analysts submit a revenue and EPS “estimate” to various financial journals and feeds and investors read those estimates and do their own reckoning.
Those estimates also come with a date: That date is the “last date of the financial quarter the estimate corresponds to”. So you’d have to know the company’s unique quarterly date ranges to align the estimate to the company’s actual announcement of their earnings for the historical data to even be analyzed.
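One simple way to picture that alignment, as a sketch only: take the estimate's quarter-end date and pair it with the first announcement that comes after it. The function name and the sample announcement dates are hypothetical, and the "first announcement after the period end" rule is an assumption made for illustration, not necessarily how EXP pairs them.

```python
from bisect import bisect_right
from datetime import date


def announcement_for(period_end, announcement_dates):
    """Return the first announcement strictly after period_end, or None.

    announcement_dates must be sorted ascending.
    """
    i = bisect_right(announcement_dates, period_end)
    return announcement_dates[i] if i < len(announcement_dates) else None


# Hypothetical announcement dates for a January 31st FYE company.
announcements = [date(2024, 5, 29), date(2024, 8, 28), date(2024, 12, 4)]
print(announcement_for(date(2024, 4, 30), announcements))  # 2024-05-29
```

An estimate whose period ends after the last known announcement comes back as `None`, which is the case where the announcement hasn't happened yet.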
It gets even weirder.
Above, I used “divide FYE by 4” to get the quarterly date ranges. That’s not universal either. There are many quarterly date range schedules these companies use that are all very similar but add enough wiggle room that a global rule is not really viable unless you’re ok with some messy action. CHWLY is a good example. They don’t divide FYE by 4, they do “the previous sunday up to the division by 4” so you have to have window padding on the date ranges.
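A rough sketch of what window padding can look like. Both helpers are hypothetical names, the 7-day default is an arbitrary choice, and `previous_sunday` is just one reading of the "previous Sunday" schedule described above.

```python
from datetime import date, timedelta


def previous_sunday(d):
    """Return the Sunday on or before d (weekday(): Monday=0 ... Sunday=6)."""
    return d - timedelta(days=(d.weekday() + 1) % 7)


def match_quarter(period_end, quarters, pad_days=7):
    """Return the label of the quarter whose nominal end is closest to
    period_end, provided it is within pad_days; otherwise None.

    quarters is a list of (label, start, end) tuples.
    """
    best = None
    for label, _start, end in quarters:
        gap = abs((period_end - end).days)
        if gap <= pad_days and (best is None or gap < best[0]):
            best = (gap, label)
    return best[1] if best else None


quarters = [
    ("2024Q1", date(2024, 2, 1), date(2024, 4, 30)),
    ("2024Q2", date(2024, 5, 1), date(2024, 7, 31)),
]
actual_end = previous_sunday(date(2024, 4, 30))  # 2024-04-28
print(match_quarter(actual_end, quarters))       # 2024Q1
```

The padding trades precision for coverage: too small and the Sunday-style schedules fall through, too large and a date could match the wrong quarter.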
So, I mean, I’m Chris Punches, I made it work, but, no wonder people go crazy studying the stock market numbers, these numbers are made up by practically supervillains. This is hamburglar shit that needs standardization. I’m just saying, I don’t want to be stuck in an elevator with someone who thinks the fiscal year should end in January, because that’s someone who is planning on the world to end.
So, my eyes are bleeding from aligning EPS estimates to historical earnings announcements, and from using “tomorrow’s schedule of earnings announcements” to fetch tomorrow’s EPS estimates, all to feed an important part of this puzzle: surprises.
I talked a little about surprises in my last post, but, essentially, if the EPS estimate for a stock’s earnings announcement comes in under the actual reported EPS, the stock typically goes up that day. If it’s over, the stock typically goes down that day. That’s what this whole EXP dataset exists for: predicting that for all the symbols that announce earnings that day.
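In code, the surprise itself is just the gap between the reported number and the estimate. A minimal sketch, with a made-up function name and made-up labels ("beat", "miss", "inline"):

```python
def eps_surprise(actual, estimate):
    """Classify an EPS surprise.

    Returns (direction, difference, fraction_of_estimate). The fraction
    is None when the estimate is zero, since it is undefined there.
    """
    diff = actual - estimate
    if diff > 0:
        direction = "beat"    # estimate was under the actual
    elif diff < 0:
        direction = "miss"    # estimate was over the actual
    else:
        direction = "inline"
    pct = diff / abs(estimate) if estimate != 0 else None
    return direction, diff, pct


print(eps_surprise(1.25, 1.00))  # ('beat', 0.25, 0.25)
```

Dividing by `abs(estimate)` keeps the sign of the fraction meaningful when the estimate is a loss per share.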
Unfortunately, historical values do not indicate future values for this. But there’s something that does, which is in the same data: revisions. More specifically, revision momentum in the lead-up trail to the announcement. Analysts get itchy pants as the earnings announcement day approaches for a stock they’ve been watching, and they update their estimates and submit those to the journals that are tracking their analysis. They will come across new data that revises their estimate figures. So sometimes an event will happen or they’ll see something in the numbers that makes them more or less likely to think the stock will go up or down, and they’ll update that estimate and submit it. The frequency with which analysts do that (keep in mind many analysts submit for many stocks) correlates with “surprises”, which, again, is when the stock goes up or down on that trading day because of drift between the actual EPS reported in the earnings announcement and the EPS estimated by the analysts. So, I’m going to use “revision momentum” as a signal along with some other heuristics.
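I haven't pinned down a formula for revision momentum here, so treat the following as one possible sketch and nothing more: count the revisions in the window right before the announcement and compare that to the window before it. The function name, the 14-day window, and the "recent minus prior" definition are all assumptions for illustration.

```python
from datetime import date, timedelta


def revision_momentum(revision_dates, announcement, window_days=14):
    """Revisions in the last window before the announcement, minus
    revisions in the window before that. Positive means analysts are
    revising more often as the announcement approaches.
    """
    recent_start = announcement - timedelta(days=window_days)
    prior_start = recent_start - timedelta(days=window_days)
    recent = sum(1 for d in revision_dates if recent_start <= d < announcement)
    prior = sum(1 for d in revision_dates if prior_start <= d < recent_start)
    return recent - prior


revisions = [date(2025, 4, 5), date(2025, 4, 20), date(2025, 4, 25), date(2025, 4, 30)]
print(revision_momentum(revisions, date(2025, 5, 1)))  # 2
```

This version only looks at how often estimates change, which matches the frequency idea above; the direction of each revision would be a separate input.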
Eyeball crossing shit getting these dates to align. In any case, this is just the dataset to do the forecasting when I get to that part.
At this point I’ve got everything I need to get the new forecast engine built. I have:
- TDM (tdm.silogroup.org), which daily grabs about 150 days of price and volume data at 1-minute increments leading up to the closing of that trading day (it runs as soon as the market closes).
- SAP (sap.silogroup.org), which grabs daily sentiment data from various financial journals/feeds that investors use to pick their stocks. This data is already processed from raw article to “stock symbol sentiment”, so it will serve as a time-series, momentum-based forecast element once the forecast engine is built.
- EXP (exp.silogroup.org), which does what this article is about.
There are still some issues. SAP, for example, doesn’t send out email reports when it runs because I never added it in, and EXP’s email reports need a little maturity polishing. And while all 3 components are made to operate independently on different servers with state files and last run time for orchestration, there either needs to be an orchestration component that watches these files for brokering actions or these need to be hardcoded into a pipeline that watches each other’s state files and last run timestamp. I like the idea of a broker component because it’ll let me do testing without conflicts, but it would be simpler to have them just read each other in a fixed pipeline.
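To make the broker idea a bit more concrete, here's a sketch of the check it would need to do: read each component's state, and report which ones haven't run for the current trading day. The JSON layout with a `last_run` ISO timestamp is an assumption, since the real state file format isn't described here, and all the function names are hypothetical.

```python
import json
from datetime import date, datetime
from pathlib import Path


def parse_last_run(text):
    """Return the date of the last run from a state file's text, or None
    if the text is missing, malformed, or has no 'last_run' field.
    Assumed format: {"last_run": "<ISO 8601 timestamp>"}.
    """
    try:
        return datetime.fromisoformat(json.loads(text)["last_run"]).date()
    except (ValueError, KeyError, TypeError):
        return None


def read_state(path):
    """Read a state file, returning an empty string if it can't be read."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


def pending_components(state_texts, trading_day):
    """Names of components that have not completed a run on trading_day."""
    return sorted(
        name for name, text in state_texts.items()
        if parse_last_run(text) != trading_day
    )


states = {
    "TDM": '{"last_run": "2025-04-30T16:05:00"}',
    "SAP": '{"last_run": "2025-04-29T18:00:00"}',
    "EXP": "",
}
print(pending_components(states, date(2025, 4, 30)))  # ['EXP', 'SAP']
```

The same check works for either design: a broker can poll it in one place, or each stage of a fixed pipeline can call it on the stage it depends on.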
TDM will sometimes fail on some symbols and needs to be run again. I believe this is an issue with the data provider or some CDN they might be using. It’s not a huge issue, as it’ll throw out incomplete data to prevent bad forecasts for whatever engine is pulling the data from it.
Once those issues are worked out I’ll be able to start building:
- The new price/volume forecast engine. Most of this is already built from previous versions of tetsuo though this version will be a full rewrite.
- The Sentiment analyzer. This will largely be used for rank weighting of the forecasts from the engine but is likely to involve some momentum forecasting in the process of that.
- The surprise forecaster. Between like 2 and 22 common stock symbols on the NYSE every day have an earnings announcement. We’ll want to use this to give strong rank priority to symbols whose price is forecasted to go up and also are going to have a surprise based on EPS revision momentum.
This is a lot of work but it’s a much cleaner approach than previous efforts.