Weather Report

What models we're running today, how they're configured, and what role each one plays in the factory.

Why do we publish The Weather Report?

The Weather Report started out as a casual internal summary of how each provider and model was performing on our most important use cases. We update it frequently and have found it essential to our process. The report is not an eval or benchmark: it's a consensus experience report reflecting what's actually working for us right now.

Today's Models

| Use                    | Models (by preference)       | Parameters | Notes                             |
|------------------------|------------------------------|------------|-----------------------------------|
| CS/Math Hard Problems  | gpt-5.3-codex                | default    |                                   |
| Image comprehension    | gemini-3-flash-preview       | default    |                                   |
| Frontend Aesthetics    | opus-4.6                     | default    |                                   |
| Frontend Architecture  | gpt-5.3-codex                | default    |                                   |
| Architectural Critique | gpt-5.2                      | extra high |                                   |
| Sprint Planning        | consensus(opus-4.6, gpt-5.2) | high       |                                   |
| Devops Tasks           | opus-4.6                     | default    |                                   |
| QA Orchestration       | opus-4.6                     | default    |                                   |
| Security review        | gpt-5.3-codex                | high       |                                   |
| Bulk classification    | Any                          | default    | Go up cost and strength as needed |
| Bulk MapReduce         | Any                          | default    | Go up cost and strength as needed |

The consensus operator refers to an LLM merge of the points from independent plans.
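As a rough illustration of the consensus operator, here is a minimal Python sketch. It is an assumption about the general shape, not our actual implementation: the `Model` type, the `consensus` function, the merge prompt wording, and the choice of which model performs the merge are all hypothetical, and the stub lambdas stand in for real provider calls.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for a model call: takes a prompt, returns text.
Model = Callable[[str], str]


def consensus(task: str, planners: Sequence[Model], merger: Model) -> str:
    """Draft one plan per planner independently, then merge them with an LLM."""
    # Each planner sees only the task, never another planner's draft.
    plans = [plan(task) for plan in planners]

    # Build a single merge prompt that lists every independent plan.
    sections = "\n\n".join(
        f"Plan {i}:\n{text}" for i, text in enumerate(plans, start=1)
    )
    merge_prompt = (
        f"Task: {task}\n\n{sections}\n\n"
        "Merge the points from these plans into one plan."
    )
    # The merger is itself a model call (assumed; the prompt is illustrative).
    return merger(merge_prompt)


if __name__ == "__main__":
    # Stub models so the sketch runs without any provider SDK.
    opus = lambda prompt: "1. Ship the login page"
    gpt = lambda prompt: "1. Fix flaky tests"
    echo_merger = lambda prompt: prompt  # a real merger would be an LLM call
    print(consensus("Plan the next sprint", [opus, gpt], echo_merger))
```

Keeping the planners isolated from each other is what makes the plans independent; only the merge step sees all of them together.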

Log

February 6th, 2026

New models this week. We're very happy with gpt-5.3-codex. No problems with Opus 4.6 so far.