AI Models Ranking

My personal, honest ranking of the major AI models. I use most of these daily - these opinions are earned, not borrowed.

Last updated: July 2026

Claude

Anthropic · S Tier · Tied #1

Fable 5 · Sonnet 5 · Opus 4.8 · Haiku 4.5

A new tier above Opus: Mythos-class. Fable 5 doubled my pace - two months of production a day - and it contributes ideas I didn't put there, like a senior developer sitting next to me. It also spent 19 days dark under a US export-control order (June 12 to July 1) while Anthropic built a classifier regulators called "extraordinarily strong" - integrity from one angle, infrastructure risk from the other. Sonnet 5 is quietly the best agent value on the market. On capability, Fable 5 is dead even with Sol; I run them against each other now.

claude.ai

ChatGPT

OpenAI · S Tier · Tied #1

GPT-5.6 Sol · Terra · Luna

GPT-5.6 Sol arrived through a door nobody expected: a government-vetted preview on June 26, public July 8 once the White House approved a broad release. Sol is relentless in the way Codex was, with more brain behind it - genuinely even with Fable 5 at half the price. The superapp shipped July 9: ChatGPT and Codex are one desktop app now, and the new ChatGPT Work agent stays with a project for hours and comes back with finished output. Ads for free users still dent trust - which is why this stays a tie, not a win.

chatgpt.com

Gemini

Google DeepMind · A Tier

Gemini 3.5 Flash · Gemini 3.5 Pro · Gemini Omni

My pick for brainstorming, image, and video generation - and this month, the biggest distribution win in consumer AI. At WWDC, Apple confirmed the rebuilt Siri runs on a custom ~1.2T-parameter Gemini model under a multi-year, ~$1B/year deal; it's in the iOS 27 beta right now and ships in September. After five months of slipped windows, it's finally real. 3.5 Flash is still the best speed-to-intelligence ratio on the market and 3.5 Pro is now out of testing. But on long, stateful, build-the-whole-system work it still drifts where Fable and Sol finish.

gemini.google.com

Llama

Meta · A Tier

Llama 4 Maverick · Llama 4 Scout · Muse Spark (closed)

Still the widest open-model deployment on the planet - but the story is shifting. Behemoth is effectively shelved: never released, never formally cancelled, just quietly not mentioned anymore. The tell is Muse Spark, Meta's closed-weight, API-only frontier model - the frontier effort has gone proprietary, and Llama is becoming the legacy open line. For the first time, the open-source crown is genuinely contested, and the contender is Mistral.

llama.meta.com

Mistral

Mistral AI · B Tier

Frontier MoE (early access) · Mistral Small 4 · Mistral Large 3 · Robostral Navigate

The steepest trajectory on this page. A new open-weight frontier model - a "fat but sparse" mixture-of-experts - entered early access this month, with a broader release later this summer; if it lands near the frontier, Mistral becomes the de facto leader of open-source AI. They also shipped Robostral Navigate, a hardware-agnostic robotics model that navigates from a single camera and plain-language prompts, and the valuation is closing in on $23B. Still Europe's default. Watch this space.

mistral.ai

Perplexity

Perplexity AI · B Tier

Perplexity Computer · Personal Computer · Deep Research · Comet Browser

Not a traditional model, but the best AI-powered search experience - and it keeps compounding. Personal Computer is the new one: an always-on agent living on a dedicated Mac mini that monitors triggers and carries work forward around the clock, local files and browsing included. Comet went enterprise with full MDM deployment, Voice Mode landed, and Max subscribers now pick the model driving the browser agent. Still no ads. Still my front door for research.

perplexity.ai

Grok

xAI · Nope

Grok 4.20 Beta · Grok 4.1 · Grok 3

Worse, not better. The July 7 amended complaint added two new plaintiffs - both minors when the images were made - and a new co-defendant in Stability AI. The detail I can't get past: NCMEC found 90% of xAI's CyberTipline reports were unusable by law enforcement because xAI declined to include user information. Not a capability problem - a choice. The model keeps improving on the technical axis, and it keeps not mattering.

x.ai

DeepSeek

DeepSeek (China) · Nope

DeepSeek-V4 Pro · V4 Flash · DeepSeek-R1

The closest thing to good news all year: V4 actually goes official this month after its long preview limbo - 1M-token context across the lineup, MIT license, and a first-of-its-kind peak/off-peak API pricing model (double rates at peak). Credit where due: it shipped, and the engineering-per-dollar story is real again. The trust math is unchanged: CCP censorship baked in at the model layer, distillation accusations still standing. Nope, with an asterisk of technical respect.

chat.deepseek.com
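
For anyone budgeting around DeepSeek's peak/off-peak pricing, here is a rough sketch of what "double rates at peak" does to a bill. The only number taken from the description above is the 2x multiplier. The base rate, the peak window, and the function names are my own placeholders, not DeepSeek's published pricing.

```python
from datetime import time

# Hypothetical numbers for illustration only - not DeepSeek's published rates.
BASE_RATE_PER_MILLION = 1.00   # assumed off-peak price per 1M tokens, in dollars
PEAK_MULTIPLIER = 2.0          # "double rates at peak", per the description above
PEAK_START = time(9, 0)        # assumed start of the peak window
PEAK_END = time(18, 0)         # assumed end of the peak window (exclusive)


def is_peak(t: time) -> bool:
    """Return True if t falls inside the assumed peak window."""
    return PEAK_START <= t < PEAK_END


def request_cost(tokens: int, t: time) -> float:
    """Cost in dollars for a request of `tokens` tokens made at time t."""
    rate = BASE_RATE_PER_MILLION * (PEAK_MULTIPLIER if is_peak(t) else 1.0)
    return tokens / 1_000_000 * rate


# The same 500k-token job, run at 3 a.m. versus noon.
off_peak = request_cost(500_000, time(3, 0))
peak = request_cost(500_000, time(12, 0))
print(f"off-peak: ${off_peak:.2f}, peak: ${peak:.2f}")
```

The practical takeaway, under these assumptions: batch work that can wait is worth scheduling outside the peak window, since the same job costs half as much.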

Who Powers What

The AI models above don't just live in chatbots. They're quietly powering the products you already use every day. Here's who's running what under the hood - and one very notable absence.

Microsoft Copilot

Microsoft 365, Windows, Bing, Edge

OpenAI GPT-5.6 Sol · Terra · Luna

The deepest OpenAI integration. GPT-5.6 became the preferred model across Word, Excel, PowerPoint, and Copilot Chat on July 9, the day after it went public - tuned with OpenAI for knowledge work, routing between tiers per task.

Amazon Alexa+

Echo, Fire TV, Ring, Smart Home

Anthropic Claude · Amazon Nova

Claude handles the heavy thinking; Amazon's Nova models take the simpler tasks. Routed via Amazon Bedrock - "we pick the model that's right for the job." Amazon's $8B investment in Anthropic at work.

Samsung Galaxy AI

Galaxy S series, Fold, Flip, Tablets

Google Gemini · Samsung Gauss

Gemini powers voice commands and cross-app actions. Samsung's in-house Gauss models handle on-device processing. Their TVs add Copilot and Perplexity into the mix too.

Meta AI

WhatsApp, Instagram, Messenger, Facebook

Meta Llama

Meta eating their own cooking. Llama powers the AI assistant across all Meta platforms - 3+ billion potential users. The largest real-world deployment of an open-source model.

Google Assistant

Android, Pixel, Nest, Search

Google Gemini

The old Google Assistant is being phased out in favor of Gemini across Android and Pixel devices. Gemini Live handles real-time voice conversations natively.

Apple Siri

iPhone, iPad, Mac, HomePod, Apple Watch

Google Gemini (Sept 2026) · On-Device Models

Finally real. At WWDC, Apple and Google jointly confirmed the rebuilt Siri: a custom ~1.2T-parameter Gemini model handles the heavy queries, with a three-tier routing system - on-device models for simple requests, Private Cloud Compute in the middle, Gemini for the rest. In iOS 27 beta now, shipping September, ~$1B/year to Google. After every slipped window, it's in beta. One more date to hold.
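
To make the three-tier idea concrete, here is a toy sketch of how a router like that could work. Only the three tier names come from the description above. The complexity score, the thresholds, and the `route` function are hypothetical stand-ins of mine; Apple has not published how requests are actually classified.

```python
from enum import Enum


class Tier(Enum):
    ON_DEVICE = "on-device model"
    PRIVATE_CLOUD = "Private Cloud Compute"
    GEMINI = "custom Gemini model"


# Hypothetical thresholds on a made-up 0.0-1.0 complexity score.
ON_DEVICE_MAX = 0.3
PRIVATE_CLOUD_MAX = 0.7


def route(complexity: float) -> Tier:
    """Pick a tier for a request, given an assumed complexity score."""
    if not 0.0 <= complexity <= 1.0:
        raise ValueError("complexity must be between 0.0 and 1.0")
    if complexity <= ON_DEVICE_MAX:
        return Tier.ON_DEVICE
    if complexity <= PRIVATE_CLOUD_MAX:
        return Tier.PRIVATE_CLOUD
    return Tier.GEMINI


# Example requests with invented scores, simplest to hardest.
examples = [
    ("set a timer", 0.1),
    ("summarize my inbox", 0.5),
    ("plan a two-week trip", 0.9),
]
for label, score in examples:
    print(f"{label!r} -> {route(score).value}")
```

The point of the shape is that the cheapest, most private tier gets first refusal, and a request only escalates when it has to.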

Political Bias in AI Models

One thing most people don't realize: every major AI model has a measurable political lean. Multiple peer-reviewed studies have mapped these models on political compass-style charts. Here's what the research says.

Model	Political Lean	What the Research Found
ChatGPT	Left-Leaning	Consistently the furthest left across multiple studies. OpenAI's own evaluation found emotionally charged liberal prompts exert the largest pull on objectivity. GPT-5 shows improvement over GPT-4o.
Claude	Most Centrist	Earlier studies found liberal-leaning; by 2025, Promptfoo measured it as the most centrist model at 0.646 (0.5 = true center). Anthropic actively publishes their political even-handedness methodology.
Gemini	Moderate Left	Stanford study found users perceived it as the least slanted overall. Measured further left than Claude but more moderate than ChatGPT. Generally centrist on social issues.
Llama	Right-Leaning (Relative)	The 2023 ACL award-winning paper found it was the most right-wing authoritarian of the 14 models tested. An outlier in the open-source space.
Perplexity	Libertarian-Right	The IEEE study found it exhibited a "libertarian capitalistic stance" - more conservative than its peers. An interesting position for a search-focused product.
Grok	Chaotic	Despite xAI's "less woke" marketing, studies found the highest extremism rate at 67.9% - wild swings between far-left and far-right. Promptfoo called it "designed to be contrarian rather than ideological." Even Pew's quiz placed it as an "establishment liberal."
DeepSeek	CCP-Aligned	Not left or right on a Western spectrum — state-aligned. 1,156 documented censored topics including Taiwan, Tiananmen, and Xi Jinping. Responses shift by language: Chinese queries get Party-line answers, English queries get more nuanced takes. Censorship is embedded at the model level, not just the app layer.

All major AI models lean left on economics (wealth taxes, minimum wage). No study has found a consistently conservative AI among industry leaders.

Sources & Further Reading

TrackingAI.org - AI Political Compass Tracker. Interactive scatter plots, regularly updated.
Promptfoo - AI Political Bias Evaluation (2025). Comparative study of Claude, GPT, Gemini, Grok.
Choudhary et al. - Political Bias in AI-Language Models. IEEE, 2024.
Stanford - Perceived Political Bias in Popular AI Models. Stanford Report, 2025.
Anthropic - Measuring Political Bias in Claude. Anthropic Research, 2025.
OpenAI - Defining and Evaluating Political Bias in LLMs. OpenAI Research.
Manhattan Institute - Measuring Political Preferences in AI Systems. 2025.

These rankings are entirely my own opinion based on daily use. Your mileage may vary. I have no financial relationship with any of these companies.

Well, except that Claude literally built this page - and this month, Fable 5 wrote the update that ranks itself. I had Sol check its work. Make of that what you will.