Model integrations

The models Steward integrates through Phoeniqs MAAS.

Steward uses Phoeniqs Model-as-a-Service for language-model inference, document processing, embeddings, reranking, and transcription. For client-facing chat and drafting work, Steward uses guardrailed model aliases where Phoeniqs publishes them.

Guardrailed aliases run on the same Swiss infrastructure and inside the same Phoeniqs MAAS boundary as the other aliases. The -GRC suffix indicates that the gateway applies content-safety and sensitive-data filtering to that request.

Client-facing generation

These are the production defaults for user-facing chat, drafting, analysis, and vision-assisted turns. The -GRC aliases run the Phoeniqs gateway checks before the model call. If the gateway blocks a guarded request, Steward does not retry that same content on a raw chat fallback model.

Purpose	Model alias	Posture	Notes
General chat and drafting	`inference-deepseek-v32-GRC`	Guardrailed	Default for user-facing assistant turns and written work.
Analytical chat	`inference-deepseek-v32-GRC`	Guardrailed	Default for chart-capable and analysis-heavy chat routes unless overridden.
Reasoning turns	`inference-gpt-oss-120b-GRC`	Guardrailed	Used for scenario, risk, comparison, and recommendation questions.
Vision chat	`inference-qwen3-vl-235b-GRC`	Guardrailed	Used when a chat turn includes supported image attachments.
Chat fallback	`inference-deepseek-v32-GRC`	Guardrailed	Fallback stays on a GRC alias so a blocked guarded request is not retried on a raw chat model.
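
The routing and fallback rule in this table can be sketched roughly as follows. This is a minimal illustration, not Steward's actual implementation. The aliases are copied from the table. The purpose keys, the `GatewayBlocked` and `ModelUnavailable` exceptions, and the `call_model` callable are hypothetical names introduced here. The sketch also assumes the most conservative reading of the rule: a gateway block is surfaced to the caller and not retried at all.

```python
from typing import Callable

# Aliases copied from the table above; the purpose keys are hypothetical.
ROUTES = {
    "general": "inference-deepseek-v32-GRC",
    "analytical": "inference-deepseek-v32-GRC",
    "reasoning": "inference-gpt-oss-120b-GRC",
    "vision": "inference-qwen3-vl-235b-GRC",
}
CHAT_FALLBACK = "inference-deepseek-v32-GRC"


class GatewayBlocked(Exception):
    """Hypothetical error: the gateway refused the request content."""


class ModelUnavailable(Exception):
    """Hypothetical error: the alias could not serve the request."""


def is_guardrailed(alias: str) -> bool:
    """An alias counts as guardrailed when it carries the -GRC suffix."""
    return alias.endswith("-GRC")


def complete(purpose: str, prompt: str,
             call_model: Callable[[str, str], str]) -> str:
    """Call the primary alias for a purpose, falling back only on outages."""
    primary = ROUTES[purpose]
    try:
        return call_model(primary, prompt)
    except GatewayBlocked:
        # Assumption: a block is final, so the same content is never
        # replayed against another alias, guarded or raw.
        raise
    except ModelUnavailable:
        # Availability fallback only, and only onto a guardrailed alias.
        if not is_guardrailed(CHAT_FALLBACK):
            raise RuntimeError("chat fallback must be a -GRC alias")
        return call_model(CHAT_FALLBACK, prompt)
```

Keeping the suffix check next to the fallback call means that a later configuration change to a raw fallback alias would fail loudly instead of silently weakening the posture.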

Specialized processing

Some model integrations are not chat-completion aliases and do not currently have published GRC variants in the committed Phoeniqs catalog. Steward still routes them through Phoeniqs MAAS inside the same Swiss processing boundary and logs only sanitized operational metadata.

Purpose	Model alias	Posture	Notes
Structured extraction	`inference-qwen3-8b`	Task-specific	Used for JSON-style extraction where deterministic local fallbacks also exist.
Document OCR	`inference-deepseek-ocr`	Task-specific	Used to extract text from scanned or image-heavy documents.
Document parsing	`inference-miner-u25`	Task-specific	Used for document layout and parsing support.
Speech transcription	`inference-whisper-large-v3`	Task-specific	Used for meeting audio transcription when enabled.
Embeddings	`inference-bge-m3`	Retrieval	Used to index and retrieve workspace context.
Reranking	`inference-bge-reranker`	Retrieval	Used to improve retrieval ordering before answer generation.
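
As a rough illustration of the table above, the specialized integrations can be held in a small registry keyed by task. The aliases and postures come from the table. The task keys, the `Integration` type, and the fields of the log record are assumptions made for this sketch; the document does not specify which operational metadata Steward logs.

```python
from typing import NamedTuple


class Integration(NamedTuple):
    alias: str
    posture: str


# Aliases and postures copied from the table; task keys are hypothetical.
SPECIALIZED = {
    "structured_extraction": Integration("inference-qwen3-8b", "Task-specific"),
    "document_ocr": Integration("inference-deepseek-ocr", "Task-specific"),
    "document_parsing": Integration("inference-miner-u25", "Task-specific"),
    "speech_transcription": Integration("inference-whisper-large-v3", "Task-specific"),
    "embeddings": Integration("inference-bge-m3", "Retrieval"),
    "reranking": Integration("inference-bge-reranker", "Retrieval"),
}


def operational_metadata(task: str, latency_ms: int, status: str) -> dict:
    """Build a log record that carries no document, audio, or prompt content.

    The function never receives the payload, so it cannot leak it.
    The field names here are illustrative only.
    """
    integration = SPECIALIZED[task]
    return {
        "task": task,
        "alias": integration.alias,
        "posture": integration.posture,
        "latency_ms": latency_ms,
        "status": status,
    }
```

For example, `operational_metadata("embeddings", 12, "ok")` yields a record naming `inference-bge-m3` with posture `Retrieval` and no workspace content.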

Evaluation-only models

Steward keeps an admin benchmark path for internal evaluation. That path may compare guarded and raw aliases to measure behavior, latency, and answer quality. It is not the default posture for client-facing work.

Current benchmark sweep: `inference-deepseek-v32-GRC`, `inference-llama4-maverick-GRC`, `inference-qwen3-vl-235b-GRC`, `inference-gemma4-31b-GRC`, `inference-apertus-70b-GRC`, `inference-glm-51-754b`, `inference-glm45-air-110b`, `inference-gpt-oss-120b-GRC`, `inference-llama4-scout-17b`.
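
To see which entries in the sweep are guarded and which are raw, the list can be split on the -GRC suffix. This is a small sketch that assumes the suffix is the only marker of posture; the aliases are copied from the sweep above, and the variable names are illustrative.

```python
# Aliases copied from the current benchmark sweep.
SWEEP = [
    "inference-deepseek-v32-GRC",
    "inference-llama4-maverick-GRC",
    "inference-qwen3-vl-235b-GRC",
    "inference-gemma4-31b-GRC",
    "inference-apertus-70b-GRC",
    "inference-glm-51-754b",
    "inference-glm45-air-110b",
    "inference-gpt-oss-120b-GRC",
    "inference-llama4-scout-17b",
]

# Split on the suffix, preserving sweep order within each group.
guarded = [alias for alias in SWEEP if alias.endswith("-GRC")]
raw = [alias for alias in SWEEP if not alias.endswith("-GRC")]

print(f"{len(guarded)} guarded, {len(raw)} raw")  # prints "6 guarded, 3 raw"
```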

The model publisher does not receive Steward prompts or outputs. Phoeniqs MAAS serves the models inside the Phoeniqs environment; for GRC aliases, the Phoeniqs gateway is part of that same processing path and may block or mask content before inference.