
AI Services

AI Butler is model-agnostic and provider-agnostic: you bring the keys, and AI Butler handles the plumbing. This page covers the optional AI services beyond the core text model.

Speech-to-text (STT) is used for voice messages on Telegram, Discord, Slack, WhatsApp, and the web chat microphone.

| Provider | Notes |
|----------|-------|
| whisper | Default. Whisper.cpp via local binary when present, otherwise Whisper API. |
| stub | No-op, useful for testing without audio |
configurations:
  voice:
    stt_provider: whisper
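The fallback described for the `whisper` provider (local whisper.cpp binary when present, otherwise the Whisper API) can be pictured with a short Python sketch. This is not AI Butler's actual code: the function name `pick_stt_backend` and the candidate binary names are assumptions made purely for illustration.

```python
import shutil

# Candidate executable names for a whisper.cpp build. These are assumptions:
# AI Butler may look for a different name or for a configured path.
CANDIDATE_BINARIES = ("whisper-cli", "whisper-cpp")


def pick_stt_backend(candidates=CANDIDATE_BINARIES, which=shutil.which):
    """Return ("local", path) if a whisper.cpp binary is on PATH, else ("api", None)."""
    for name in candidates:
        path = which(name)  # shutil.which returns None when the name is not found
        if path:
            return ("local", path)
    return ("api", None)


if __name__ == "__main__":
    backend, path = pick_stt_backend()
    print(f"STT backend: {backend}" + (f" ({path})" if path else ""))
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is enough.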

Check current status:

aibutler voice status
aibutler voice providers

Text-to-speech (TTS) is used for voice replies on channels that support voice messages (Telegram, Discord, Slack, WhatsApp).

| Provider | Notes |
|----------|-------|
| stub | Default. No-op (no audio output) |
| piper | Fully local CPU-only TTS via the Piper binary |
configurations:
  voice:
    tts_provider: piper
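To make the difference between the two TTS providers concrete, here is a minimal Python sketch of how a `stub` versus `piper` choice might be dispatched. It assumes the standalone Piper CLI, which reads text from stdin and accepts `--model` and `--output_file`; the function names and the model and output paths are hypothetical, and AI Butler's own invocation may differ.

```python
import subprocess


def build_tts_command(provider, model_path, output_path):
    """Return the argv list for a TTS provider, or None for the no-op stub."""
    if provider == "stub":
        return None  # the stub produces no audio at all
    if provider == "piper":
        # Piper reads the text to speak from stdin and writes a WAV file.
        return ["piper", "--model", model_path, "--output_file", output_path]
    raise ValueError(f"unknown tts_provider: {provider}")


def synthesize(provider, text, model_path, output_path):
    """Run the provider; return the output path, or None if nothing was produced."""
    cmd = build_tts_command(provider, model_path, output_path)
    if cmd is None:
        return None
    subprocess.run(cmd, input=text.encode("utf-8"), check=True)
    return output_path
```

With `stub` selected, `synthesize` returns `None` without starting any process, which matches the "no audio output" behaviour in the table.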

Image understanding is handled by the primary model if it’s vision-capable (Claude 3+, GPT-4o, Gemini 1.5+, LLaVA via Ollama). No extra configuration — just send an image in any channel.

Image generation is tool-based and intended for creative tasks.

| Provider | Tool | Vault Key |
|----------|------|-----------|
| DALL-E 3 | image.generate | openai_api_key |
| Stable Diffusion | image.generate | stability_api_key |
| Flux | image.generate | replicate_api_key |

AI Butler integrates with design platforms to generate branded assets.

| Provider | Tools | Vault Key |
|----------|-------|-----------|
| Canva | design.canva_create, design.canva_get | canva_api_key |
| Figma | design.figma_read, design.figma_comment | figma_api_key |

3D generation covers text-to-3D and image-to-3D for creative and smart-home projects.

| Provider | Tools | Vault Key |
|----------|-------|-----------|
| Meshy | threed.meshy_text_to_3d | meshy_api_key |
| Tripo | threed.tripo_text_to_3d | tripo_api_key |
| Luma | threed.luma_genie | luma_api_key |

Every AI service follows the same pattern: tools are always registered, but they return "configure API key" errors until you store the credential in the vault:

aibutler vault set canva_api_key YOUR_KEY

This lets you enable services one at a time without restarting or touching config files.
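The "always registered, gated by the vault" pattern can be sketched in a few lines of Python. This is a toy model under stated assumptions, not AI Butler's implementation: the `Vault` and `ToolRegistry` classes, the result dictionaries, and the exact error wording are all hypothetical. Only the tool name `design.canva_create` and the key `canva_api_key` come from this page.

```python
class Vault:
    """In-memory stand-in for the credential vault (hypothetical)."""

    def __init__(self):
        self._secrets = {}

    def set(self, key, value):
        self._secrets[key] = value

    def get(self, key):
        return self._secrets.get(key)


class ToolRegistry:
    """Registers tools unconditionally and checks the vault on each call."""

    def __init__(self, vault):
        self.vault = vault
        self.tools = {}

    def register(self, name, vault_key, handler):
        # Registration never consults the vault, so the tool is always listed.
        self.tools[name] = (vault_key, handler)

    def call(self, name, **kwargs):
        vault_key, handler = self.tools[name]
        api_key = self.vault.get(vault_key)  # looked up on every call
        if not api_key:
            return {"ok": False, "error": f"configure API key: {vault_key}"}
        return {"ok": True, "result": handler(api_key, **kwargs)}


vault = Vault()
registry = ToolRegistry(vault)
registry.register(
    "design.canva_create",
    "canva_api_key",
    lambda api_key, title: f"created design '{title}'",
)

before = registry.call("design.canva_create", title="Logo")
vault.set("canva_api_key", "YOUR_KEY")  # plays the role of `aibutler vault set`
after = registry.call("design.canva_create", title="Logo")
print(before)
print(after)
```

Because the key is read at call time rather than at registration time, storing it takes effect on the very next call. That is one plausible way to get the no-restart behaviour described above.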

For a fully-local deployment with zero API keys:

  • Text model — Ollama (Llama 3.3, Mistral, Qwen, etc.)
  • STT — whisper.cpp or Ollama
  • TTS — Piper
  • Embeddings — Ollama
  • Vision — Ollama with LLaVA

Start the Ollama-based stack with Docker Compose:

docker compose -f docker-compose.ollama.yml up -d
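Once the stack is up, it can be useful to confirm that the models you plan to use are actually present. The Python sketch below parses the response of Ollama's `GET /api/tags` endpoint, which lists locally available models. It assumes Ollama is reachable on its default port 11434 (whether `docker-compose.ollama.yml` publishes that port is an assumption), and the helper names are hypothetical.

```python
import json
import urllib.request

# Ollama's default address; assumed to be exposed by the compose file.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_payload):
    """Extract base model names (without the ':tag' suffix) from an /api/tags response."""
    return {m["name"].split(":")[0] for m in tags_payload.get("models", [])}


def missing_models(tags_payload, wanted):
    """Return the wanted models that are not available locally, sorted by name."""
    return sorted(set(wanted) - model_names(tags_payload))


def fetch_tags(url=OLLAMA_TAGS_URL):
    """Fetch and decode the model list from a running Ollama instance."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

Against a running instance, `missing_models(fetch_tags(), ["llama3.3", "llava"])` returns the models that still need to be pulled; the model list here is only an example, so substitute the ones you configured.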

See Choose Your AI for a comparison of providers.