My personal, honest ranking of the major AI models. I use most of these daily - these opinions are earned, not borrowed.
Last updated: February 2026
Opus 4.6 · Sonnet 4.6 · Haiku 4.5
My go-to for writing and coding. Claude Code is unmatched for software engineering - this entire site was built with it. Sonnet 4.6 now beats last gen's Opus for 70% of developers at a third the price. Thoughtful, precise, and reliable.
claude.ai
GPT-5.2 · GPT-5.3-Codex · o3-pro
A close second. GPT-5 is now the default and it's genuinely impressive - 80% fewer hallucinations than o3 and automatic reasoning routing. GPT-5.3-Codex is excellent for coding. Memory across conversations is still best-in-class. The ecosystem is massive.
chatgpt.com
Gemini 3.1 Pro · Gemini 3 Flash · Gemini 3 Deep Think
My pick for brainstorming and image generation. When I need to riff on ideas or generate visuals, Gemini is where I go. Gemini 3 Flash is now the default and offers PhD-level reasoning at speed. The Google ecosystem integration is a plus.
gemini.google.com
Llama 4 Maverick · Llama 4 Scout · Behemoth (training)
The open-source champion. Llama 4 went mixture-of-experts and natively multimodal - Scout fits on a single H100, and Maverick beats GPT-4o on most benchmarks. Behemoth (2T params) is still training. Essential for the ecosystem.
llama.meta.com
Mistral Large 3 · Devstral 2 · Codestral · Magistral
Europe's answer to the AI race. Large 3 is a 675B MoE monster at $0.50/M tokens - absurd value. Devstral 2 and Codestral punch well above their weight for coding. Magistral added chain-of-thought reasoning. Lean and efficient.
mistral.ai
Sonar · Sonar Pro · Sonar Reasoning Pro · Deep Research
Not a traditional model, but the best AI-powered search experience. Sonar now runs at 1,200 tokens/sec on Cerebras hardware. Reasoning Pro and Deep Research modes are excellent. Ranks 1-4 in LM Arena Search. My go-to for research.
perplexity.ai
Grok 4.20 Beta · Grok 4.1 · Grok 3
xAI ships fast - four major versions in a year. Grok 4.20 just dropped with multi-agent collaboration and rapid learning. But SpaceX had to acquire xAI, Musk claims 10% AGI odds for Grok 5, and the NSFW differentiation strategy remains... a choice. $300/month for the Heavy tier.
x.ai
The AI models above don't just live in chatbots. They're quietly powering the products you already use every day. Here's who's running what under the hood - and one very notable absence.
The deepest OpenAI integration. GPT-5.2 powers M365 Copilot, GPT-5.3-Codex runs GitHub Copilot. Routes between models per task - quick model for simple queries, deeper reasoning for complex ones.
Claude handles the heavy thinking; Amazon's Nova models take the simpler tasks. Routed via Amazon Bedrock - "we pick the model that's right for the job." Amazon's $8B investment in Anthropic at work.
Gemini powers voice commands and cross-app actions. Samsung's in-house Gauss models handle on-device processing. Their TVs add Copilot and Perplexity into the mix too.
Meta eating their own cooking. Llama powers the AI assistant across all Meta platforms - 3+ billion potential users. The largest real-world deployment of an open-source model.
The old Google Assistant is being phased out in favor of Gemini across Android and Pixel devices. Gemini Live handles real-time voice conversations natively.
The elephant in the room. Apple's AI overhaul has been delayed since WWDC 2024, their AI chief was replaced, and they officially partnered with Google in January 2026 to use Gemini as the foundation for next-gen Siri. Targeting "spring 2026." We'll see.
One thing most people don't realize: every major AI model has a measurable political lean. Multiple peer-reviewed studies have mapped these models on political compass-style charts. Here's what the research says.
| Model | Political Lean | What the Research Found |
|---|---|---|
| ChatGPT | Left-Leaning | Consistently the furthest left across multiple studies. OpenAI's own evaluation found emotionally charged liberal prompts exert the largest pull on objectivity. GPT-5 shows improvement over GPT-4o. |
| Claude | Most Centrist | Earlier studies found it liberal-leaning; by 2025, Promptfoo measured it as the most centrist model at 0.646 (0.5 = true center). Anthropic actively publishes its political even-handedness methodology. |
| Gemini | Moderate Left | Stanford study found users perceived it as the least slanted overall. Measured further left than Claude but more moderate than ChatGPT. Generally centrist on social issues. |
| Llama | Right-Leaning (Relative) | The 2023 ACL award-winning paper found it was the most right-wing authoritarian of the 14 models tested. An outlier in the open-source space. |
| Perplexity | Libertarian-Right | The IEEE study found it exhibited a "libertarian capitalistic stance" - more conservative than its peers. An interesting position for a search-focused product. |
| Grok | Chaotic | Despite xAI's "less woke" marketing, studies found the highest extremism rate at 67.9% - wild swings between far-left and far-right. Promptfoo called it "designed to be contrarian rather than ideological." Even Pew's quiz placed it as an "establishment liberal." |
All major AI models lean left on economics (wealth taxes, minimum wage). No study has found a consistently conservative AI among industry leaders.
These rankings are entirely my own opinion based on daily use. Your mileage may vary. I have no financial relationship with any of these companies.
Well, except that Claude literally built this page. Make of that what you will.