AI Backends

A backend is a connection to an AI provider. It tells Magec where to send requests for text generation, embeddings, speech-to-text, or text-to-speech. You can have as many backends as you want — one per provider, or several pointing at different models or instances of the same provider.

Every agent references a backend for its LLM. Optionally, an agent can also reference backends for voice (STT and TTS), and the memory system references one for embeddings. This means you can mix providers freely: one agent can use OpenAI for its brain and a local Ollama for embeddings, while another uses Anthropic for reasoning and the same OpenAI backend for voice.

[Screenshot: Admin UI — Backends]

Backend types

Click + New Backend to create one. All types share the same dialog:

[Screenshot: Admin UI — New Backend dialog]

OpenAI (openai)

Works with OpenAI’s API and any service that implements the same protocol. This includes Ollama, LM Studio, vLLM, LocalAI, and many others. If a service exposes /v1/chat/completions, this is the type to use.

Field    Required   Description
name     Yes        Display name (e.g., “OpenAI Production”, “Local Ollama”)
url      No         API base URL. Defaults to https://api.openai.com/v1. For Ollama, use http://ollama:11434/v1.
apiKey   No         API key. Required for OpenAI. Not needed for local services without auth.

Common configurations:

# OpenAI Cloud
Name: OpenAI
URL:  (leave empty — uses default)
Key:  sk-...

# Local Ollama
Name: Ollama Local
URL:  http://ollama:11434/v1
Key:  (leave empty)

# Local Parakeet (STT)
Name: Parakeet
URL:  http://parakeet:8888
Key:  (leave empty)

# OpenAI Edge TTS (local TTS)
Name: Edge TTS
URL:  http://tts:5050
Key:  (leave empty)

The openai type is the most versatile because the OpenAI API has become a de facto standard. Most local inference servers implement it, so you can run a fully local stack without any code changes.
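
Because the protocol is standard, you can sanity-check any candidate service before adding it as a backend. A minimal sketch in Python (illustrative only, not part of Magec), reusing the local Ollama setup above and assuming a pulled qwen3:8b model:

import requests

# Any OpenAI-compatible server accepts this request shape.
resp = requests.post(
    "http://ollama:11434/v1/chat/completions",
    json={
        "model": "qwen3:8b",
        "messages": [{"role": "user", "content": "Say hello"}],
    },
    timeout=60,
)
resp.raise_for_status()
# The response shape is standard too: choices[0].message.content.
print(resp.json()["choices"][0]["message"]["content"])

Against OpenAI’s cloud API, the same request only needs an Authorization: Bearer sk-... header; the request body and response shape are identical.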

Anthropic (anthropic)

For Anthropic’s Claude models. Uses the official Anthropic API protocol, which is different from OpenAI’s.

Field    Required   Description
name     Yes        Display name
apiKey   Yes        Anthropic API key (starts with sk-ant-)

Anthropic doesn’t offer STT, TTS, or embedding APIs, so this backend type is used only for LLM inference. For voice and embeddings, add a separate openai-type backend pointing at a local service.
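
The protocol difference is easy to see in a raw request. For illustration only (Magec handles this internally), here is the same “say hello” call against Anthropic’s Messages API, with its distinct headers, required max_tokens, and different response shape:

import requests

resp = requests.post(
    "https://api.anthropic.com/v1/messages",
    headers={
        "x-api-key": "sk-ant-...",          # key goes in a header, not a Bearer token
        "anthropic-version": "2023-06-01",  # required version header
    },
    json={
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 256,  # required, unlike OpenAI's API
        "messages": [{"role": "user", "content": "Say hello"}],
    },
    timeout=60,
)
resp.raise_for_status()
# The reply is a list of content blocks rather than "choices".
print(resp.json()["content"][0]["text"])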

Google Gemini (gemini)

For Google’s Gemini models. Uses the official Google GenAI SDK.

Field    Required   Description
name     Yes        Display name
apiKey   Yes        Google API key (starts with AIza)

Like Anthropic, Gemini is LLM-only in Magec. Use a separate backend for STT, TTS, and embeddings.
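
The wire format differs again here. For illustration only (Magec talks to Gemini through the SDK, not raw REST), a minimal sketch of the equivalent call against Google’s public REST endpoint, assuming a gemini-2.0-flash model:

import requests

resp = requests.post(
    "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent",
    params={"key": "AIza..."},  # the API key travels as a query parameter
    json={"contents": [{"parts": [{"text": "Say hello"}]}]},
    timeout=60,
)
resp.raise_for_status()
# Gemini nests the reply under candidates → content → parts.
print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])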

What a backend can power

A single backend connection can serve multiple roles depending on the provider’s capabilities:

Role         Used for                               Who references it   Example models
LLM          Text generation, reasoning, tool use   Agents (required)   gpt-4.1, claude-sonnet-4-20250514, qwen3:8b
Embeddings   Semantic search for long-term memory   Memory providers    text-embedding-3-small, nomic-embed-text
STT          Speech-to-text (Whisper-compatible)    Agents (optional)   whisper-1, nvidia/parakeet-ctc-0.6b-rnnt
TTS          Text-to-speech                         Agents (optional)   tts-1, tts-1-hd

Not every backend supports every role. OpenAI supports all four. Ollama supports LLM and embeddings. Anthropic and Gemini support only LLM. For the roles a provider doesn’t cover, you add a different backend.
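
The non-LLM roles ride on the same OpenAI-style endpoints. As a sketch (again illustrative, outside Magec), an embeddings request against the local Ollama backend from the examples above, assuming the nomic-embed-text model has been pulled:

import requests

resp = requests.post(
    "http://ollama:11434/v1/embeddings",
    json={"model": "nomic-embed-text", "input": "What did we discuss yesterday?"},
    timeout=60,
)
resp.raise_for_status()
vector = resp.json()["data"][0]["embedding"]
print(len(vector))  # embedding dimensionality (768 for nomic-embed-text)

STT backends are called the same way over the Whisper-style /v1/audio/transcriptions endpoint (audio sent as multipart form data), and TTS over /v1/audio/speech.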

Creating a backend

In the Admin UI, go to Backends and click + New Backend. Choose the type, fill in the connection details, and save. The backend is available immediately; no restart is needed.

You can then reference this backend in:

  • Agent → LLM — for text generation
  • Agent → Transcription (STT) — for speech-to-text
  • Agent → TTS — for text-to-speech
  • Memory Provider → Embedding — for semantic search in long-term memory

Mixing backends

One of Magec’s strengths is that each agent picks its own backend independently. In a single flow, you could have:

  • Agent A using GPT-4 (OpenAI) for complex reasoning
  • Agent B using Qwen 3 8B (local Ollama) for fast, simple tasks
  • Agent C using Claude (Anthropic) for careful analysis
  • All sharing a local Parakeet backend for speech-to-text

This lets you optimize for cost, speed, and capability at the individual agent level rather than committing your entire platform to one provider.