
AI Providers

Alma supports multiple AI providers, giving you flexibility in choosing models and managing costs.

Supported Providers

Provider       Models                         Streaming  Tools   Vision
OpenAI         GPT-4o, GPT-4, GPT-3.5         ✓          ✓       ✓
Anthropic      Claude 3.5, Claude 3           ✓          ✓       ✓
Google Gemini  Gemini Pro, Gemini Flash       ✓          ✓       ✓
DeepSeek       DeepSeek Chat, DeepSeek Coder  ✓          ✓       ✗
Azure OpenAI   All Azure-hosted models        ✓          ✓       ✓
OpenRouter     100+ models                    ✓          ✓       Varies
Custom         Any OpenAI-compatible API      ✓          Varies  Varies
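
The Custom row covers any service that speaks the OpenAI-compatible API. As a rough illustration, with a hypothetical base URL and model ID (and the key read from an environment variable rather than hard-coded), such an endpoint needs to accept a chat completions request like this:

```python
import os

import requests

# Assumptions: BASE_URL and the model ID below are placeholders for your
# provider's actual values.
BASE_URL = "https://api.example.com/v1"
API_KEY = os.environ["PROVIDER_API_KEY"]

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "my-model",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```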

Adding a Provider

  1. Open Settings (Cmd+, / Ctrl+,)
  2. Navigate to Providers
  3. Click Add Provider
  4. Select the provider type
  5. Enter your API key and configuration
  6. Click Save

Provider Configuration

Each provider has these common settings (a combined example follows the list):

  • Name - A display name for the provider
  • API Key - Your authentication key
  • Base URL (optional) - Custom endpoint URL
  • Enabled - Toggle to enable/disable the provider
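
Taken together, a provider entry might look like the sketch below. The field names are illustrative only, not Alma's actual storage schema, and Alma encrypts stored keys rather than keeping them in plain text:

```python
# Illustrative shape of a single provider entry (hypothetical field names).
provider = {
    "name": "OpenAI (personal)",              # display name shown in the UI
    "api_key": "sk-...",                      # your authentication key
    "base_url": "https://api.openai.com/v1",  # optional custom endpoint
    "enabled": True,                          # toggle without deleting the entry
}
```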

Managing Models

After adding a provider, Alma automatically fetches available models. You can:

  • Fetch Models - Click to refresh the model list from the provider (sketched below)
  • Enable/Disable Models - Toggle which models appear in the model selector
  • Add Custom Models - Manually add model IDs not in the fetched list
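
For providers that follow the OpenAI convention, fetching models amounts to a GET on the models endpoint. The sketch below shows the general shape; the base URL and environment variable are assumptions:

```python
import os

import requests

BASE_URL = "https://api.openai.com/v1"  # assumption: your provider's base URL
API_KEY = os.environ["OPENAI_API_KEY"]  # assumption: key kept in the environment

resp = requests.get(
    f"{BASE_URL}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
# Model IDs that would populate the model selector.
print([m["id"] for m in resp.json()["data"]])
```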

Multiple Providers

You can configure multiple providers of the same type (e.g., two OpenAI accounts with different API keys). Each will appear separately in the model selector.
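
For example, two same-type entries might be configured like this (names and environment variables are hypothetical):

```python
import os

# Hypothetical: two OpenAI accounts side by side, each with its own key.
# Each shows up as a separate entry in the model selector.
providers = [
    {"name": "OpenAI (work)", "api_key": os.environ["OPENAI_KEY_WORK"], "enabled": True},
    {"name": "OpenAI (personal)", "api_key": os.environ["OPENAI_KEY_PERSONAL"], "enabled": True},
]
```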

Testing Providers

After configuration, test your provider:

  1. Click the Test button in provider settings
  2. Alma sends a simple request to verify the connection (roughly like the sketch below)
  3. If successful, you'll see a confirmation message
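
The exact request Alma sends is not specified here; a typical connectivity check against an OpenAI-compatible endpoint asks for a single token and looks for a successful response, roughly like this sketch:

```python
import os

import requests

def check_provider(base_url: str, api_key: str, model: str) -> bool:
    """Hypothetical check: request one token and confirm a 2xx response."""
    try:
        resp = requests.post(
            f"{base_url}/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json={
                "model": model,
                "messages": [{"role": "user", "content": "ping"}],
                "max_tokens": 1,
            },
            timeout=10,
        )
        return resp.ok
    except requests.RequestException:
        return False

print(check_provider("https://api.openai.com/v1",
                     os.environ["OPENAI_API_KEY"], "gpt-4o"))
```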

Best Practices

API Key Security

  • Alma stores API keys locally and encrypts them
  • Never share your API keys
  • Use separate keys for development and production

Cost Management

  • Use different models for different tasks (e.g., GPT-3.5 for simple queries)
  • Monitor usage through provider dashboards
  • Consider OpenRouter for pay-per-use pricing across providers

Performance

  • Provider endpoints closer to your location typically have lower latency
  • Some models support prompt caching, which speeds up repeated or long-context requests
  • Use streaming for better perceived performance (see the sketch below)
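
As a sketch of that last point, consuming an OpenAI-style server-sent-events stream and printing tokens as they arrive looks roughly like this (base URL and model are assumptions):

```python
import json
import os

import requests

BASE_URL = "https://api.openai.com/v1"
API_KEY = os.environ["OPENAI_API_KEY"]

with requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Tell me a short joke."}],
        "stream": True,  # server sends incremental "data:" chunks
    },
    stream=True,
    timeout=60,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line.startswith(b"data: "):
            continue  # skip blank keep-alive lines
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        print(delta.get("content", ""), end="", flush=True)
print()
```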