Model Selection Reference

Last updated 10 months ago

This reference explains (1) available models, (2) how auto model selection works, and (3) how manual model selection works.

Available Models

Model	Original Provider	GDPR Compliant	Roll Out
GPT-5	OpenAI	Yes	Workspace by workspace (contact AE)
GPT-4o	OpenAI	Yes	Available for everyone
GPT-4o mini	OpenAI	Yes	Available for everyone
Mistral large	Mistral AI	Yes	Available for everyone
Gemini 2.0 Flash	Google	Yes	Available for everyone
Claude 3.7 Sonnet	Anthropic	Yes	Available for everyone
Claude 4 Sonnet (Reasoning)	Anthropic	Yes	Available for everyone
Gemini 2.5 Pro (Reasoning)	Google	Yes	Available for everyone

File Support: All models support all file types (PDF, DOCX, TXT, PPTX). Files are converted to text in an additional MAIA analysis step.

Addition Of New Models

We’re constantly evaluating and testing new base models. We add a new base model to the available models once we can ensure that it is available in a GDPR-compliant way, works well for our customers’ use cases, and can be provided reliably.

Removal: Because MAIA should be accessible to every user, we constantly check whether the available models still meet the bar. We aim to offer only models that work well even if a user selects them by mistake.

Auto Model Selection (Default)

How it works:

Per-query analysis: For every single query, MAIA analyzes the question type before responding
Model selection: A decision system chooses the optimal model based on benchmark data for different task types
Dynamic switching: Models may change between queries within the same chat based on question content

Decision basis: Large benchmark table showing model performance across different task types
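The steps above (analyze the query, look up benchmark data, pick the best model) can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the keyword-based `classify_query`, the task types, and the scores in `BENCHMARKS` are invented and do not reflect MAIA's actual decision system or benchmark table.

```python
from typing import Dict

# Hypothetical benchmark table: task type -> model -> score.
# The numbers are made up for illustration only.
BENCHMARKS: Dict[str, Dict[str, float]] = {
    "coding": {"GPT-4o": 0.82, "Claude 3.7 Sonnet": 0.88, "Mistral large": 0.74},
    "summarization": {"GPT-4o": 0.86, "Claude 3.7 Sonnet": 0.84, "Mistral large": 0.80},
    "general": {"GPT-4o": 0.85, "Claude 3.7 Sonnet": 0.83, "Mistral large": 0.78},
}

# Hypothetical keyword rules standing in for the real question-type analysis.
KEYWORDS = {
    "coding": ("code", "function", "bug", "python"),
    "summarization": ("summarize", "summary", "tl;dr"),
}


def classify_query(query: str) -> str:
    """Return a task type for the query, falling back to 'general'."""
    text = query.lower()
    for task_type, words in KEYWORDS.items():
        if any(word in text for word in words):
            return task_type
    return "general"


def select_model(query: str) -> str:
    """Pick the highest-scoring model for the query's task type."""
    scores = BENCHMARKS[classify_query(query)]
    return max(scores, key=scores.get)


# Selection runs per query, so the model can change within one chat.
for q in ["Fix the bug in this function", "Summarize this report"]:
    print(q, "->", select_model(q))
```

Because the lookup runs once per query, two consecutive questions in the same chat can resolve to different models, which is the dynamic switching described above.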

Transparency: With Auto Model active, the model that was used is shown at the top of each query once the full answer has been generated.

Manual Model Selection

How to enable:

Turn off auto model selection in the chat interface
Select your preferred model from the dropdown
Persistence: Your choice persists across all future queries and new chats

How it works:

Single selection: Choose a model once and it applies to all subsequent queries
Consistent experience: Same model used until you manually change it
Cross-chat persistence: Selected model remains active in new conversations
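One way to picture the persistence behavior is a preference that is stored per user rather than per chat. The sketch below is hypothetical: `ModelPreference`, its fields, and the `auto_select` callback are illustrative names, not part of MAIA.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModelPreference:
    """Hypothetical per-user setting, kept outside any single chat."""

    auto_enabled: bool = True
    manual_model: Optional[str] = None

    def choose_manually(self, model: str) -> None:
        """Turn off auto selection and remember the chosen model."""
        self.auto_enabled = False
        self.manual_model = model

    def resolve(self, query: str, auto_select: Callable[[str], str]) -> str:
        """Return the model for this query under the current setting."""
        if self.auto_enabled or self.manual_model is None:
            return auto_select(query)
        return self.manual_model


pref = ModelPreference()
pref.choose_manually("Mistral large")

# The same preference object serves every chat, so the choice carries over.
for chat in ("chat-1", "chat-2"):
    print(chat, pref.resolve("any question", lambda q: "GPT-4o"))
```

Keeping the setting at the user level instead of the chat level is what makes a single selection apply to new conversations as well.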

Reasoning Models

Gemini 2.5 Pro and Claude 4 Sonnet support so-called reasoning. In this process, the model first “thinks” internally about a request before formulating the actual answer. It uses additional intermediate steps (“thinking tokens”) that help handle complex tasks more effectively. Since the model goes through more steps internally, reasoning answers may take a bit longer than normal ones. That’s normal and means the model is actively “thinking.”

Reasoning can be activated automatically in auto mode depending on the model, or it can be selected manually. Each user currently has 50 reasoning requests per model within a rolling 30-day window.

Once the limit is reached, the model is excluded from auto mode.

The number of thinking tokens a model can use for its “thoughts” depends on the provider:

Claude 4 Sonnet works with a budget of up to 16,384 thinking tokens
Gemini 2.5 Pro works with a budget of up to 32,768 thinking tokens

These tokens are not visible. They run in the background and serve exclusively to improve the quality of answers.
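The rolling 30-day quota described in this section can be sketched as a per-user, per-model log of request timestamps. This is a minimal illustration under stated assumptions: `ReasoningQuota` and its methods are hypothetical, the in-memory storage is a stand-in, and how requests beyond the limit are handled outside auto mode is not stated in this reference (the sketch simply declines to record them).

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta
from typing import Deque, Dict, Tuple

REASONING_LIMIT = 50         # requests per user and model (from the text)
WINDOW = timedelta(days=30)  # rolling window (from the text)


class ReasoningQuota:
    """Hypothetical in-memory tracker for reasoning requests."""

    def __init__(self) -> None:
        self._log: Dict[Tuple[str, str], Deque[datetime]] = defaultdict(deque)

    def _prune(self, key: Tuple[str, str], now: datetime) -> Deque[datetime]:
        """Drop timestamps that are 30 days old or older."""
        log = self._log[key]
        while log and now - log[0] >= WINDOW:
            log.popleft()
        return log

    def remaining(self, user: str, model: str, now: datetime) -> int:
        """Reasoning requests still available inside the window."""
        return REASONING_LIMIT - len(self._prune((user, model), now))

    def record(self, user: str, model: str, now: datetime) -> bool:
        """Record a request if quota remains; return whether it was accepted."""
        if self.remaining(user, model, now) <= 0:
            return False
        self._log[(user, model)].append(now)
        return True

    def eligible_for_auto(self, user: str, model: str, now: datetime) -> bool:
        """A model with no remaining quota is left out of auto mode."""
        return self.remaining(user, model, now) > 0


quota = ReasoningQuota()
start = datetime(2025, 1, 1)
for i in range(REASONING_LIMIT):
    quota.record("alice", "Gemini 2.5 Pro", start + timedelta(minutes=i))

later = start + timedelta(days=1)
print(quota.eligible_for_auto("alice", "Gemini 2.5 Pro", later))
print(quota.eligible_for_auto("alice", "Claude 4 Sonnet", later))
```

Because the window is rolling rather than calendar-based, quota comes back gradually as individual requests age past 30 days, and each model is counted separately.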