AI Model Watch
Best Unreleased Frontier Model
Preview / unreleased watch
A separate live watch page for preview, alpha, beta, internal, research, prototype, and other unreleased model rows. Treat it as a roadmap signal, not as a production recommendation.
Updated 2026-06-09
Static fallback
The live dossier loads when benchmark data is available
This page is designed to make sense even if the public feed is unavailable, rate-limited, or not yet served through server-side rendering (SSR).
Why unreleased needs its own page
Preview models can distort a public buying decision. They may not be broadly available, may change before release, and may have pricing or rate limits that do not match production access.
What the signal is good for
Use unreleased benchmark rows to see where the frontier is moving: coding ceilings, agentic scores, long-context claims, reasoning strength, and likely pressure on the next released models.
What not to assume
Do not assume a preview row is available, stable, priced, safe, or identical to the eventual released model. Compare it against the released #1 and #2 rows before making a plan.
Use it well
How to turn a winner into a decision
The live winner is the beginning of evaluation, not the end.
Use previews as horizon signals
A preview leader can show where the market is moving, but it should not replace released-model evaluation until access and terms are real.
Compare against released #1 and #2
A preview score matters only if it would change a decision you would otherwise make using the best released models.
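The comparison above can be sketched as a small helper. This is a minimal illustration, not the page's actual logic: the row fields `model`, `score`, and `unreleased` are hypothetical names, and a single numeric score per row is an assumption.

```python
from typing import Optional


def preview_margins(rows: list[dict]) -> Optional[dict]:
    """Compare the top unreleased row against the released #1 and #2 rows.

    Each row is assumed (hypothetically) to look like:
    {"model": str, "score": float, "unreleased": bool}
    """
    by_score = lambda r: r["score"]
    released = sorted((r for r in rows if not r["unreleased"]), key=by_score, reverse=True)
    previews = sorted((r for r in rows if r["unreleased"]), key=by_score, reverse=True)

    # Without a preview row and two released rows there is nothing to compare.
    if not previews or len(released) < 2:
        return None

    leader, first, second = previews[0], released[0], released[1]
    return {
        "preview": leader["model"],
        "margin_vs_released_1": leader["score"] - first["score"],
        "margin_vs_released_2": leader["score"] - second["score"],
    }
```

A small or negative margin suggests the preview row does not change the decision; a large one is a reason to watch for the release, not to deploy the preview.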
Wait for production constraints
Availability, pricing, rate limits, safety behavior, and model identity can change before a preview becomes a released model.
FAQ
Quick answers
What counts as unreleased here?
Rows with preview, alpha, beta, experimental, internal, research, prototype, or similar labels are treated as unreleased. The detection uses the model name and source-type fields from the public feed.
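As a rough sketch of how such label detection might work, the function below checks both fields for the labels listed above. The field names `model` and `sourceType` are assumptions for illustration, and whole-word matching is a design choice made here, not a confirmed detail of the feed or of this site's code.

```python
import re

# Labels listed in the answer above; "similar labels" are not covered by this sketch.
UNRELEASED_LABELS = (
    "preview", "alpha", "beta", "experimental",
    "internal", "research", "prototype",
)

# Match a label only when it is not embedded in a longer word,
# so "alphabet" does not count as "alpha".
_LABEL_RE = re.compile(
    r"(?<![a-z])(?:" + "|".join(UNRELEASED_LABELS) + r")(?![a-z])",
    re.IGNORECASE,
)


def is_unreleased(row: dict) -> bool:
    """Return True if the model name or source type carries an unreleased label.

    `model` and `sourceType` are hypothetical field names.
    """
    name = row.get("model") or ""
    source_type = row.get("sourceType") or ""
    return bool(_LABEL_RE.search(name) or _LABEL_RE.search(source_type))
```

For example, `is_unreleased({"model": "Example-Pro-Preview", "sourceType": "provider"})` returns `True`, while a row named `Example-Pro` with source type `provider` returns `False`.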
Can an unreleased model be the real best model?
It can be the highest-scoring row in the feed, but that is different from being the best model to buy or deploy. Availability, terms, safety behavior, limits, and pricing can change before public release.
Why keep this separate from best frontier model today?
The main page should answer which released model to inspect first. This page answers which preview model is setting the visible frontier and what that might imply for upcoming releases.
Should teams test preview models?
Yes, when they have legitimate access and a clear reason. Test previews against the same local tasks as released models, but do not build a production dependency on a preview-only result.
Sources
- BenchLM public leaderboard endpoint
- BenchLM public pricing endpoint
- Models.dev model database
- Best released frontier model page
- LLM model benchmarks guide
- LMArena Leaderboard
- SWE-bench repository
- Terminal-Bench 2.1
Third-party data note: live rows come from public benchmark and pricing feeds, not internal Dreamers testing. On this page, only rows labeled preview, alpha, beta, experimental, internal, research, or prototype are eligible to appear as leaders; released rows are covered on the best released frontier model page.