Context Windows for LLMs in MaestroQA
Written by Matt
Updated over 2 weeks ago

Context Windows for LLMs Supported in MaestroQA

| Provider | Model Name | Context Window (tokens) | Max Output Tokens |
|---|---|---|---|
| OpenAI | GPT-4o Mini | 128,000 | 16,384 |
| OpenAI | GPT-4o | 128,000 | 16,384 |
| Anthropic | Claude Haiku | 200,000 | 4,096 |
| Anthropic | Claude Haiku 3.5 | 200,000 | 8,192 |
| Anthropic | Claude Sonnet | 200,000 | 8,192 |
| Meta | Llama 3 (11B) | 128,000 | 2,048 |
| Meta | Llama 3 (70B) | 128,000 | 2,048 |
| Cohere | Cohere Command R | 128,000 | 4,000 |
| Cohere | Cohere Command R Plus | 128,000 | 4,000 |
| Amazon | Amazon Nova Lite | 300,000 | 5,000 |
| Amazon | Amazon Nova Pro | 300,000 | 5,000 |
| Google | Gemini Pro 2.0 (Experimental) | 2,097,152 | 8,192 |
| Google | Gemini Flash 2.0 | 1,048,576 | 8,192 |
| Google | Gemini Flash-Lite 2.0 | 1,048,576 | 8,192 |
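As a quick illustration, the limits above can be encoded in code so a prompt is validated before it is sent to a model. This is a minimal sketch, not part of MaestroQA: the `MODEL_LIMITS` table (a subset of the table above) and the `fits_in_context` helper are hypothetical names, and the sketch assumes output tokens share the context window, which is how most providers count them.

```python
# Hypothetical sketch: encode a few of the limits above and check
# whether a prompt plus a reserved output budget fits in the window.

MODEL_LIMITS = {
    # model name: (context window, max output tokens)
    "GPT-4o Mini": (128_000, 16_384),
    "Claude Sonnet": (200_000, 8_192),
    "Amazon Nova Pro": (300_000, 5_000),
    "Gemini Flash 2.0": (1_048_576, 8_192),
}

def fits_in_context(model: str, prompt_tokens: int, reserved_output: int) -> bool:
    """True if the prompt plus the reserved output budget fits in the window."""
    context_window, max_output = MODEL_LIMITS[model]
    # The output reservation can never exceed the model's own output cap.
    reserved_output = min(reserved_output, max_output)
    return prompt_tokens + reserved_output <= context_window

print(fits_in_context("Claude Sonnet", 190_000, 8_192))   # True: 198,192 <= 200,000
print(fits_in_context("GPT-4o Mini", 125_000, 16_384))    # False: 141,384 > 128,000
```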

What does Context Window mean?

The context window is the model's working memory: everything it can "see" at once, including your prompt, the conversation so far, and any documents you attach.

  • Measured in tokens (roughly 4 characters or ¾ of a word in English)

  • When a conversation exceeds this limit, the earliest content falls out of the window and becomes invisible to the model

  • Real-world impact: a larger context window lets you discuss complex topics, analyze longer documents, or maintain longer conversation threads without the model losing track; the sketch after this list shows how to estimate whether a text will fit
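Because tokens, not characters, are what count against the window, it helps to estimate a text's token count up front. Here is a minimal sketch using the rough 4-characters-per-token rule from the list above; a real tokenizer (such as OpenAI's `tiktoken` library) would give exact counts.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return max(1, len(text) // 4)

# Illustrative transcript; in practice this would be your full conversation.
transcript = "Customer: My order arrived damaged.\nAgent: I'm sorry to hear that!"
tokens = estimate_tokens(transcript)
print(f"~{tokens} tokens")

# Compare against a model's context window (e.g., 128,000 for GPT-4o Mini).
if tokens > 128_000:
    print("Transcript is too long; trim or summarize before sending.")
```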

What does Max Output Tokens mean?

Max output tokens cap how much text a model can generate in a single response:

  • Sets the upper limit on how long each individual reply can be

  • Ranges from 2,048 tokens (about 1-2 pages) to 16,384 tokens (about 8-12 pages) depending on the model

  • Real-world impact: higher limits allow for more comprehensive answers, detailed code explanations, or thorough document analysis without interruption; see the sketch after this list for how the cap is set in practice
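Most provider SDKs expose this cap as a request parameter. Below is a minimal sketch using OpenAI's Python SDK as one example (it assumes the `openai` package is installed and `OPENAI_API_KEY` is set; other providers use their own parameter names, e.g. Anthropic's API takes a required `max_tokens` field).

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize this support ticket..."}],
    max_tokens=1_000,  # cap the reply; cannot exceed the model's 16,384 limit
)

# If the reply was cut off by the cap, finish_reason will be "length".
choice = response.choices[0]
print(choice.finish_reason)
print(choice.message.content)
```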
