
Common User Journeys

This page describes step-by-step workflows for the most common scenarios you'll encounter when using the RunWhen platform. Each journey lists the steps a user follows to accomplish one goal.


Journey 1: Investigating an Unhealthy Service

Scenario: You receive a Slack alert or notice that a service is degraded. You need to find the root cause quickly.

Steps:

  1. Open Workspace Chat in your workspace

  2. Ask about the problem: "Why is checkoutservice unhealthy in production?"

  3. Review the Assistant's findings:

    • It searches for existing Issues related to checkoutservice

    • It runs diagnostic tasks (pod health, logs, events, recent changes)

    • It presents a structured analysis with severity and evidence

  4. Drill deeper: "What do the logs show?" or "What changed recently?"

  5. Get a fix: "How do I fix this?" — the Assistant recommends specific remediation

  6. Apply the fix: Follow the remediation steps (rollback, config change, resource adjustment)

  7. Verify: "Is checkoutservice healthy now?" — confirm the fix worked

Time: 5 to 15 minutes, depending on complexity

Relevant pages: Workspace Chat, Issues and Triage
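
The questions in steps 2 through 7 follow a repeatable pattern. As a minimal sketch, assuming you want to keep them as reusable text for a runbook, the snippet below assembles the same sequence for any service. The function name `investigation_prompts` and its parameters are hypothetical. The snippet only builds strings and does not call any RunWhen API.

```python
def investigation_prompts(service: str, environment: str) -> list[str]:
    """Return the Journey 1 questions in the order the steps ask them."""
    return [
        f"Why is {service} unhealthy in {environment}?",  # step 2: state the problem
        "What do the logs show?",                          # step 4: drill deeper
        "What changed recently?",                          # step 4: drill deeper
        "How do I fix this?",                              # step 5: ask for remediation
        f"Is {service} healthy now?",                      # step 7: verify the fix
    ]


for prompt in investigation_prompts("checkoutservice", "production"):
    print(prompt)
```

Each printed line can be pasted into Workspace Chat in turn, once the previous answer has been reviewed.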


Journey 2: Responding to a PagerDuty / Monitoring Alert

Scenario: An alert fires from your monitoring system. RunWhen can investigate before you even look at it.

With Workflows Configured (Automated)

  1. Alert fires in PagerDuty / Opsgenie / Prometheus

  2. Webhook triggers a RunWhen Workflow

  3. The Workflow starts a RunSession with an appropriate Assistant (e.g., Cautious Cathy)

  4. The Assistant runs relevant diagnostic tasks automatically

  5. Results are posted to Slack or the ticketing system

  6. When the on-call engineer opens the ticket, diagnostic context is already there
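
Step 2 depends on the monitoring system sending a webhook. As a rough sketch of how alert context might be turned into an investigation question, the snippet below reads a Prometheus Alertmanager webhook payload. The fields `alerts`, `status`, `labels`, and `annotations` come from Alertmanager's webhook format. The function name `alert_to_questions` and the label names `service` and `namespace` are assumptions for illustration, and nothing here represents the RunWhen Workflow API.

```python
from typing import Any


def alert_to_questions(payload: dict[str, Any]) -> list[str]:
    """Build one investigation question per firing alert in the payload."""
    questions = []
    for alert in payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue  # resolved alerts need no investigation
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        name = labels.get("alertname", "unknown alert")
        # `service` and `namespace` are assumed label names; adjust to your alert rules.
        service = labels.get("service", "unknown service")
        namespace = labels.get("namespace", "unknown namespace")
        question = f"Alert {name}: why is {service} unhealthy in {namespace}?"
        summary = annotations.get("summary", "")
        if summary:
            question += f" Summary: {summary}"
        questions.append(question)
    return questions


example_payload = {
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "HighErrorRate", "service": "frontend", "namespace": "prod"},
            "annotations": {"summary": "5xx rate above 5%"},
        },
        {"status": "resolved", "labels": {"alertname": "PodRestart"}},
    ],
}

for q in alert_to_questions(example_payload):
    print(q)
```

Skipping resolved alerts keeps the Assistant from investigating problems that have already cleared.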

Without Workflows (Manual)

  1. Alert fires

  2. Engineer opens Workspace Chat

  3. Pastes the alert context: "PagerDuty alert: high error rate on frontend service in prod"

  4. The Assistant investigates and returns findings

  5. Engineer reviews and acts on recommendations

Relevant pages: Workspace Chat, Workspace Studio (Workflows)


Journey 3: Developer Self-Service Troubleshooting

Scenario: A developer's deployment isn't working in their dev/test namespace. They don't know kubectl and don't want to wait for the platform team.

Steps:

  1. Developer logs into app.beta.runwhen.com

  2. Opens their team's workspace

  3. Types: "My deployment isn't starting in the dev namespace"

  4. The Assistant:

    • Checks pod status and events

    • Identifies resource quota issues, image pull errors, or config problems

    • Explains the issue in plain language

  5. Developer follows the fix: "Change the memory request to 256Mi" or "Update the image tag"

  6. No SRE escalation needed

Time: 5 minutes

Why this matters: Platform teams field these questions constantly. RunWhen gives developers direct access to the same diagnostic capability, freeing the platform team for higher-value work.


Journey 4: Onboarding to a New Workspace

Scenario: You've just been added to a RunWhen workspace and want to understand what's being monitored and what issues exist.

Steps:

  1. Open the workspace and review the Issues list (the default landing screen)

    • Note the total issue count and which SLXs have findings

    • Expand a few issues to see what's being detected

  2. Switch to Workspace Chat and ask a broad question:

    • "What's the overall health of this environment?"

    • "Show me what's wrong across all namespaces"

  3. Explore Workspace Studio to understand the configuration:

    • Tasks tab — what platforms and SLXs are configured

    • Assistants tab — which assistants are available and their access levels

    • Rules and Commands — any custom automation in place

  4. Try a tutorial from the Live Demos page if the Sandbox workspace is available

  5. Read the Learn section for background on how the platform works

Relevant pages: Issues and Triage, Workspace Studio, Engineering Assistants, Learn


Journey 5: Setting Up Automated Monitoring

Scenario: You want RunWhen to continuously monitor a namespace and alert you when issues are found.

Steps:

  1. Open Workspace Studio > Tasks tab

  2. Add an SLX for the resource you want to monitor (e.g., a Kubernetes namespace)

  3. Configure the SLX with:

    • Health check tasks (SLIs) — define what "healthy" looks like

    • Troubleshooting tasks (TaskSets) — what to investigate when health degrades

    • Alerting thresholds (SLOs) — when to raise an alert

  4. Set up a Workflow (Workflows tab) to notify Slack or PagerDuty when issues are detected

  5. Configure an Assistant (e.g., Cautious Cathy) to automatically investigate new issues via webhook

  6. Issues are now detected and investigated automatically, with results in Slack

Relevant pages: Workspace Studio, Engineering Assistants
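
The relationship between the three parts of an SLX in step 3 can be illustrated with a small model. This is a sketch under stated assumptions, not the RunWhen schema: the class `SLXSketch`, its field names, and the idea that the health check produces a score between 0.0 and 1.0 that is compared to a single threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SLXSketch:
    """Hypothetical model of an SLX; not the RunWhen schema."""
    name: str
    slo_threshold: float  # assumed: alert when the health score drops below this
    troubleshooting_tasks: list[str] = field(default_factory=list)

    def evaluate(self, health_score: float) -> dict:
        """Compare an SLI reading (assumed range 0.0 to 1.0) to the SLO threshold."""
        breached = health_score < self.slo_threshold
        return {
            "slx": self.name,
            "health_score": health_score,
            "alert": breached,
            # Troubleshooting tasks are queued only when the threshold is breached.
            "tasks_to_run": list(self.troubleshooting_tasks) if breached else [],
        }


slx = SLXSketch(
    name="dev-namespace-health",
    slo_threshold=0.9,
    troubleshooting_tasks=["check pod status", "fetch recent events"],
)
print(slx.evaluate(0.95))  # healthy reading: no alert, no tasks
print(slx.evaluate(0.5))   # degraded reading: alert and tasks queued
```

The point of the model is the ordering: the health check produces a reading, the threshold decides whether to alert, and only then do the troubleshooting tasks run.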


Journey 6: Reviewing Past Incidents

Scenario: You want to review what happened during a previous incident for a post-mortem or to share with a colleague.

Steps:

  1. Open Workspace Chat

  2. Browse previous chat sessions (RunSessions) from the sidebar

  3. Each session shows the full investigation timeline:

    • Original question or alert trigger

    • Tasks that were run

    • Results and analysis

    • Remediation steps taken

  4. Share the session URL with colleagues or export it for documentation

Relevant pages: Workspace Chat


Quick Reference: What to Ask

| Goal | Example Prompt |
| --- | --- |
| Check overall health | "What's unhealthy in [namespace]?" |
| Investigate specific service | "Why is [service] crashing/slow/failing?" |
| Find recent changes | "What changed in [namespace] recently?" |
| Get remediation steps | "How do I fix this?" |
| Check resource issues | "Are there resource quota problems in [namespace]?" |
| Compare environments | "Compare the configuration between dev and test" |
| Investigate logs | "What do the logs say for [service]?" |
| Broad sweep | "Show me what's wrong across all namespaces" |
| Specific issue | "Tell me about the segmentation fault in checkoutservice" |
| Status check | "Is [service] healthy now?" |