
Bring Your Own LLM

The RunWhen platform supports multiple pathways for securely integrating with Enterprise-managed Large Language Models (LLMs). These integrations are designed to give customers full control over their data, security posture, and access governance.

  • Due to cost, latency, and favorable enterprise data-security terms, we strongly recommend a Microsoft-hosted endpoint: Azure OpenAI or Azure AI Foundry Models. For teams primarily on AWS or GCP, our team can assist in provisioning a single-service Azure account for this purpose.
  • Azure AI Foundry Models supports both Microsoft and partner model families (for example, Anthropic Claude Opus deployments) while keeping authentication and governance in Azure.
  • If Azure is organizationally difficult, RunWhen can host a single-tenant Azure subscription with a secured endpoint and make it available to your team. Enterprise data does not transit RunWhen systems.
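For a Service Principal using Microsoft Entra ID (keyless) access, an inference call carries a bearer token instead of a static API key. The sketch below shows the shape of such a request against an Azure OpenAI chat-completions endpoint; the resource name, deployment name, API version, and token value are placeholders, not values from this document.

```python
# Sketch of a keyless (Entra ID) request to an Azure OpenAI
# chat-completions endpoint. All names below are placeholders.
from urllib.parse import urlencode

AZURE_ENDPOINT = "https://my-resource.openai.azure.com"  # hypothetical resource
DEPLOYMENT = "gpt-4o"                                    # hypothetical deployment name
API_VERSION = "2024-06-01"

def build_chat_request(bearer_token: str) -> tuple[str, dict]:
    """Return the URL and headers for a chat-completions call.

    With Entra ID auth, the Service Principal exchanges its credentials
    for a bearer token scoped to Cognitive Services; no static API key
    is sent on the wire.
    """
    url = (
        f"{AZURE_ENDPOINT}/openai/deployments/{DEPLOYMENT}"
        f"/chat/completions?{urlencode({'api-version': API_VERSION})}"
    )
    headers = {
        "Authorization": f"Bearer {bearer_token}",
        "Content-Type": "application/json",
    }
    return url, headers

url, headers = build_chat_request("<entra-id-token>")
```

In practice the token is obtained via an Azure identity library rather than pasted in; the point here is that authorization is a short-lived bearer token tied to the Service Principal's role assignment.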

Alternative LLM Options

RunWhen functionality has been validated with the models below; latency, quality, and pricing may vary.

  • Azure AI Foundry Models endpoint – Microsoft and partner models (for example, Anthropic Claude Opus) with customer-managed credentials.
  • OpenAI endpoint – OpenAI-hosted API (various GPT models) with OpenAI credentials.
  • OpenAI endpoint via LiteLLM proxy – OpenAI-hosted API (various GPT models) with deployment-specific credentials.
  • Google Cloud Vertex AI with Gemini – using customer-managed service accounts with restricted IAM roles.
  • Google Cloud Vertex AI with Llama4 – using customer-managed service accounts with restricted IAM roles.
  • Amazon Bedrock with Llama4 – on the roadmap; not yet validated.
  • On-prem Llama4 via vLLM serving or a LiteLLM proxy – self-hosted Llama4 served by vLLM or fronted by LiteLLM, with customer-issued credentials.

Each integration pattern is tailored to enterprise requirements around compliance, data isolation, and access auditing.
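For the LiteLLM proxy patterns above, routing is typically expressed in the proxy's YAML configuration. The fragment below is a hypothetical example mapping a public model name to an Azure OpenAI deployment; the model names, endpoint, and environment-variable name are placeholders.

```yaml
# Hypothetical LiteLLM proxy config (placeholder names and endpoint).
model_list:
  - model_name: gpt-4o                       # name clients request
    litellm_params:
      model: azure/my-gpt4o-deployment       # provider/deployment to route to
      api_base: https://my-resource.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY      # read from the proxy's environment
      api_version: "2024-06-01"
```

Keeping credentials in the proxy's environment (rather than in client applications) is what enables the deployment-specific credentials mentioned above.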

Configuration Guides

Step-by-step setup instructions are provided for each integration.

Azure Service Principal Role Requirements

For Microsoft Entra ID (keyless) access with a Service Principal, assign roles at the resource scope where possible (not subscription-wide) and use least-privilege.

| Endpoint type | Minimum role for inference access | Scope | Notes |
| --- | --- | --- | --- |
| Azure OpenAI | Cognitive Services OpenAI User | Azure OpenAI resource | Required for Service Principal inference API calls with Entra ID. |
| Azure AI Foundry Models | Cognitive Services User | Foundry/AI Services resource | Required for Service Principal inference API calls to Foundry model endpoints (including partner models such as Anthropic Claude Opus). |

Optional elevated roles (only if your team manages deployments and not just inference):

  • Azure OpenAI deployment management: Cognitive Services OpenAI Contributor
  • Quota visibility: Cognitive Services Usages Reader (subscription scope in Azure)
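A role assignment at resource scope can be made with the Azure CLI. The command below is a sketch for the Azure OpenAI case; the Service Principal app ID, subscription ID, resource group, and resource name are all placeholders.

```shell
# Assign the minimum inference role to a Service Principal at the
# Azure OpenAI resource scope (not subscription-wide).
# All IDs and names below are placeholders.
az role assignment create \
  --assignee "<service-principal-app-id>" \
  --role "Cognitive Services OpenAI User" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.CognitiveServices/accounts/<aoai-resource-name>"
```

For Azure AI Foundry Models, the same command applies with the role "Cognitive Services User" and the Foundry/AI Services resource in the scope path.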

Support and Guidance

Every enterprise environment is unique. To ensure the integration meets your organization’s compliance and security needs, please reach out to the RunWhen Support Team for:

  • Assistance with selecting the right integration model.
  • Guidance on Service Principal or IAM role setup (including Azure OpenAI and Azure AI Foundry role assignments).
  • Network and IP allowlisting requirements.
  • End-to-end validation and troubleshooting.

Contact: support@runwhen.com