About

LLM Comparison is a web application that allows users to compare responses from leading Large Language Models (LLMs) such as OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude. Users can input prompts, generate responses from various LLMs, and vote on their favorite responses. The app supports multiple versions of each model and includes visualizations of voting data using charts. To ensure fair usage, query limits are enforced for premium models, allowing up to three queries per day.

Models

OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude were selected for this project because they are among the most popular and powerful Large Language Models currently available to the general public. You can read about the capabilities of the models and their different versions below, with descriptions directly from their respective documentation.

Models below with yellow text are premium models. These models are considered premium because querying them via APIs cost fairly more than other models, hence why users are only allowed to query premium models 3 times a day.

Model Descriptions from OpenAI:

ModelCapabilities
GPT-4o miniAffordable and intelligent small model for fast, lightweight tasks
GPT-4oThe fastest and most affordable flagship model
GPT-4 Turbo and GPT-4The previous set of high-intelligence models
GPT-3.5 TurboA fast, inexpensive model for simple tasks

Model Descriptions from Google:

ModelCapabilities
Gemini 1.5 ProComplex reasoning tasks such as code and text generation, text editing, problem solving, data extraction and generation
Gemini 1.5 FlashFast and versatile performance across a diverse variety of tasks
Gemini 1.0 ProNatural language tasks, multi-turn text and code chat, and code generation

Model Descriptions from Anthropic:

ModelCapabilities
Claude 3.5 SonnetMost intelligent model, highest level of intelligence and capability
Claude 3 OpusPowerful model for highly complex tasks, top-level performance, intelligence, fluency, and understanding
Claude 3 SonnetBalance of intelligence and speed, strong utility, balanced for scaled deployments
Claude 3 HaikuFastest and most compact model for near-instant responsiveness, quick and accurate targeted performance