General Settings

Basic application settings and behavior.

Tool Model

The Tool Model is a dedicated model used for background AI operations, separate from your main chat model. This allows you to use a fast, cost-effective model for auxiliary tasks while using more capable (and expensive) models for conversation.

What Tool Model Does

Tool Model is used for:

  • Thread Title Generation: Automatically generating titles for new chat threads
  • Tool Selection: Analyzing which tools to use for a given task
  • Parameter Extraction: Parsing user requests into tool parameters
  • Memory Operations: Processing memory storage and retrieval (unless overridden in Memory Settings)
  • Background Tasks: Handling auxiliary AI operations
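
All of these are short, structured requests rather than open-ended conversation. As an illustration, a thread-title request routed to the Tool Model might look like the sketch below, which assumes an OpenAI-compatible chat completions endpoint; the model name, prompt, and generateThreadTitle helper are hypothetical, not Alma's actual implementation.

```ts
// Illustrative sketch only: routing thread-title generation to the Tool Model.
// Assumes an OpenAI-compatible /v1/chat/completions endpoint; the model name,
// prompt, and generateThreadTitle helper are hypothetical, not Alma's code.
async function generateThreadTitle(firstMessage: string, apiKey: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      // A small, fast Tool Model handles this; the main chat model is never involved.
      model: "gpt-4o-mini",
      messages: [
        {
          role: "system",
          content: "Summarize the user's message as a short thread title (max 8 words). Reply with the title only.",
        },
        { role: "user", content: firstMessage },
      ],
      max_tokens: 20,
      temperature: 0.2,
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content.trim();
}
```

Because both the prompt and the expected output are tiny, a small model finishes requests like this in well under a second.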

Why Use a Separate Tool Model

| Benefit | Description |
| --- | --- |
| Cost Savings | Tool operations are frequent but simple, so using a cheaper model reduces costs significantly |
| Speed | Smaller models respond faster, making tool operations feel instant |
| Efficiency | Your main model focuses on conversation while the tool model handles the mechanics |

Why Both Speed and Quality Matter

A good Tool Model must balance speed and generation quality—you can't sacrifice one for the other.

Why Speed Matters

Tool operations happen constantly during your conversation:

  • Every new chat needs a title generated
  • Every message may trigger tool selection analysis
  • Every tool call requires parameter extraction

If the Tool Model is slow (3+ seconds per operation), these micro-delays accumulate and make the entire experience feel sluggish. Users expect title generation and tool operations to feel instant—ideally completing in under 1 second.

Why Quality Still Matters

Despite needing speed, the Tool Model's tasks require genuine intelligence:

| Task | What Can Go Wrong with Low Quality |
| --- | --- |
| Thread Titles | Generic titles like "Chat about code" instead of meaningful summaries like "Debugging React useEffect infinite loop" |
| Tool Selection | Calling the wrong tool, or failing to recognize that a tool should be used at all |
| Parameter Extraction | Misinterpreting "search for files modified last week" as a search for the literal text "last week" |
| Memory Operations | Storing irrelevant information or failing to retrieve relevant context |

A model that's fast but inaccurate will cause frustrating errors throughout your workflow. A model that's accurate but slow will make the app feel unresponsive.
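
To make the parameter-extraction failure concrete, here is a hedged sketch of the kind of structured output a Tool Model is asked to produce. The `FileSearchParams` shape and the example values are hypothetical, purely for illustration.

```ts
// Hypothetical shape of the parameters a Tool Model is asked to extract from
// "search for files modified last week". All names and values are illustrative.
interface FileSearchParams {
  query: string;           // free-text part of the search, if any
  modifiedAfter?: string;  // ISO date: start of the requested range
  modifiedBefore?: string; // ISO date: end of the requested range
}

// What a capable Tool Model should produce: the relative phrase "last week"
// is resolved into a concrete date range (example dates shown).
const goodExtraction: FileSearchParams = {
  query: "",
  modifiedAfter: "2025-01-06",
  modifiedBefore: "2025-01-12",
};

// A common low-quality failure: "last week" leaks into the query string,
// so the search looks for files containing the literal words "last week".
const badExtraction: FileSearchParams = {
  query: "last week",
};
```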

The Sweet Spot

The recommended models (like gpt-4o-mini, claude-haiku-4-5, gemini-2.0-flash) hit the sweet spot: they're small enough to respond in under 1 second, but capable enough to handle tool operations accurately.

Choosing a Tool Model

Recommended models:

| Provider | Recommended | Why |
| --- | --- | --- |
| OpenAI | gpt-4o-mini | Fast, excellent tool support |
| Anthropic | claude-haiku-4-5 | Very fast, good quality |
| Google | gemini-2.0-flash, gemini-1.5-flash | Extremely fast |
| DeepSeek | deepseek-chat | Fast and cost-effective |

Avoid Reasoning Models

Never use reasoning/thinking models as the Tool Model:

  • OpenAI o1, o3 series
  • Anthropic models with extended thinking enabled
  • Any model designed for deep reasoning

These models are optimized for complex problem-solving and take significantly longer to respond (10-60+ seconds), making them completely unsuitable for tool operations that need instant responses.

Auto-Detection

If you don't configure a Tool Model, Alma automatically selects one based on your enabled providers:

  1. Checks enabled providers in priority order (OpenAI → Anthropic → Google → others)
  2. Selects the fastest recommended model available
  3. Falls back to your chat model if no suitable tool model is found
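
The steps above amount to a simple priority-ordered fallback. A minimal sketch of that logic is shown below; the preference table, provider names, and function signature are hypothetical stand-ins, not Alma's source code.

```ts
// Illustrative fallback logic matching the steps above. The preference table,
// provider names, and function signature are hypothetical, not Alma's source.
const TOOL_MODEL_PREFERENCES: Record<string, string[]> = {
  openai: ["gpt-4o-mini"],
  anthropic: ["claude-haiku-4-5"],
  google: ["gemini-2.0-flash", "gemini-1.5-flash"],
  deepseek: ["deepseek-chat"],
};

function autoDetectToolModel(
  enabledProviders: string[],   // providers the user has enabled
  availableModels: Set<string>, // models those providers actually expose
  chatModel: string             // main chat model, used as the last resort
): string {
  // 1. Check enabled providers in priority order.
  const priority = ["openai", "anthropic", "google", "deepseek"];
  for (const provider of priority) {
    if (!enabledProviders.includes(provider)) continue;
    // 2. Pick the first recommended model that is actually available.
    for (const model of TOOL_MODEL_PREFERENCES[provider] ?? []) {
      if (availableModels.has(model)) return model;
    }
  }
  // 3. Fall back to the chat model if nothing suitable was found.
  return chatModel;
}
```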

Testing Your Tool Model

Use the Test button next to the model selector to verify performance:

| Result | Response Time | Meaning |
| --- | --- | --- |
| ✅ Good | < 2.5s | Optimal for tool operations |
| ⚠️ Slow | 2.5s - 5s | Usable but may feel sluggish |
| ❌ Unusable | > 5s | Too slow for responsive tool use |
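
The test amounts to timing one small request and classifying the elapsed time against the thresholds above. The sketch below is illustrative, not the app's actual test, and `sendTestPrompt` is a hypothetical stand-in for whatever request the test issues.

```ts
// Classify a candidate Tool Model's responsiveness using the thresholds above.
type ToolModelTestResult = "good" | "slow" | "unusable";

function classifyLatency(elapsedMs: number): ToolModelTestResult {
  if (elapsedMs < 2500) return "good";  // ✅ optimal for tool operations
  if (elapsedMs <= 5000) return "slow"; // ⚠️ usable but may feel sluggish
  return "unusable";                    // ❌ too slow for responsive tool use
}

// Hypothetical harness: time a single small request against the candidate model.
async function testToolModel(sendTestPrompt: () => Promise<void>): Promise<ToolModelTestResult> {
  const start = performance.now();
  await sendTestPrompt(); // e.g. ask the model to title a one-line message
  return classifyLatency(performance.now() - start);
}
```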

Language

Choose the interface language:

  • English (en)
  • Chinese (zh)
  • Japanese (ja)

Changes take effect immediately.

Startup

Auto-Start

Launch Alma automatically when you log in to your computer.

Start Minimized

Start in the system tray instead of opening a window.

Window Behavior

Minimize to Tray

When minimizing the window, minimize to system tray instead of taskbar.

Close to Tray

When clicking the close button, minimize to tray instead of quitting the application.

TIP

These options are useful for keeping Alma running in the background for quick access via the Quick Chat shortcut.

Quick Chat

Hide on Blur

When the Quick Chat window loses focus, automatically hide it.

When enabled, the Quick Chat window will close when you click outside of it. When disabled, it stays visible until you explicitly close it.