Why Four Modes Beat One

Yeti AI Chat ships with four modes. General is the everyday default. Guru is the deep architecture mode. Thinking is the step-by-step reasoning mode. Fast is the lowest-latency mode. The reason for four rather than one is that ServiceNow work spans a wide range of cognitive loads. A one-line GlideRecord question wants a different model behaviour than a multi-table architecture review.

Picking the right mode matters for three reasons: it changes the depth of the answer, it changes the latency, and it changes the token cost. Picking the wrong one costs you on at least one of those: the answer is too shallow, too slow, or more expensive than it needed to be. This article gives a practical heuristic for each mode.

SnowCoder is independently benchmarked at 60% more accurate than generic ChatGPT or Claude across 120-plus ServiceNow benchmarks. That accuracy is shared across every mode, but each mode is tuned for a different shape of question.

General Mode: The Everyday Default

General mode is what most ServiceNow developers should leave Yeti AI Chat parked on for most of the day. It is fast enough for interactive use, deep enough to write a non-trivial Business Rule, and cheap enough to use without thinking about token spend.

General mode is the right pick for routine tasks: Business Rules, Client Scripts, UI Policies, simple Flow Designer actions, and quick GlideRecord queries against the connected instance. It is also the right pick for explaining an error message or talking through a stack trace.

Prompt example: "Write an onChange Client Script for the incident table that hides the resolution code field until state is Resolved."

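function onChange(control, oldValue, newValue, isLoading) {
    // No early return on isLoading: the field must also be hidden
    // when the form first loads in a non-Resolved state.
    var isResolved = (newValue === '6'); // newValue is a string; 6 = Resolved
    g_form.setDisplay('close_code', isResolved);
}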

If you are reaching for the mode selector to ask a routine question, you are probably overthinking it. Leave the dial on General.

Guru Mode: Deep Architecture

Guru mode is the right pick when the question is genuinely about architecture, not implementation. Should this be a custom application or a scoped extension? How do we structure a CMDB-driven impact map? What is the right pattern for routing tickets across multiple tenants on the same instance? These are questions where the answer matters more than the latency.

Guru mode reasons more thoroughly, considers more edge cases, and is more willing to push back when the framing of the question is wrong. It is also the right mode to pair with the CMA Architect hat (described in our four hats post) for design conversations.

Prompt example: "We have a custom application with 200,000 records growing at 50,000 per month. The main list view is taking 12 seconds to render. Walk me through where the time is being spent and what the right structural fix looks like."

In Guru mode, the response will not jump to "add an index". It will reason about query patterns, list view configuration, related list eager loading, dictionary defaults, choice list cardinality, ACL evaluation cost, and the trade-off between database indexes and a denormalised summary table. The response will be longer and slower. For an architecture decision that lives in the codebase for years, the trade is obvious.
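
For a sense of scale, the growth figures in the prompt above can be projected forward. This is a minimal sketch that assumes linear growth, since the prompt gives only a flat monthly figure; the function name projectRecordCount is hypothetical.

```javascript
// Hypothetical helper: projects table size under an assumed linear growth rate.
// Figures come from the example prompt (200,000 records, 50,000 new per month).
function projectRecordCount(currentCount, monthlyGrowth, months) {
    return currentCount + monthlyGrowth * months;
}

var inOneYear = projectRecordCount(200000, 50000, 12);
console.log(inOneYear); // 800000
```

On those assumptions the table quadruples within a year, which is why a structural fix is worth more here than a one-off tweak.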

Thinking Mode: Show the Working

Thinking mode exposes the reasoning chain. The output is structured so you can read each step the model took to get to the answer. It is the right pick for two situations:

Debugging strange behaviour: When a Business Rule fires when it should not, or a Flow Designer flow takes a branch you did not expect, Thinking mode walks through the platform's evaluation order step by step.
Justifying a decision: When you need a paper trail for an architecture choice (compliance, audit, customer review), Thinking mode produces the trail.

Prompt example: "Walk me through, step by step, why this Business Rule on incident is running twice on insert."

Thinking mode is slower than General but more transparent. The benefit is that you can spot a flaw in the reasoning before you act on the conclusion.

Fast Mode: Lowest Latency

Fast mode is the right pick when latency matters more than depth. It is the mode to reach for when you are in the middle of a working session and you need a syntax reminder, a quick clarification, or a one-line answer.

Prompt example: "What is the GlideRecord method to update without firing Business Rules?"

gr.setWorkflow(false); // skip Business Rules and other script engines for this operation
gr.update();

Fast mode is not the mode for architecture decisions or security reviews. It is the mode for keeping your flow state intact when you would otherwise alt-tab to the docs site.
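
In context, the two-line answer above usually sits inside a query loop. The following is a minimal sketch of a server-side script, not a definitive implementation. The function name updateQuietly and its parameters are hypothetical, and GlideRecord is the platform global, so the function only runs inside a ServiceNow instance.

```javascript
// Sketch: update every matching record without firing Business Rules.
// Assumes a server-side ServiceNow context where GlideRecord is a global.
// updateQuietly and its parameter names are hypothetical.
function updateQuietly(table, encodedQuery, field, value) {
    var gr = new GlideRecord(table);
    gr.addEncodedQuery(encodedQuery);
    gr.query();

    var updated = 0;
    while (gr.next()) {
        gr.setValue(field, value);
        gr.setWorkflow(false); // suppress Business Rules for this update
        gr.update();
        updated++;
    }
    return updated; // number of records updated
}

// Example call (u_migrated is a hypothetical custom field):
// updateQuietly('incident', 'active=false', 'u_migrated', true);
```

Because setWorkflow(false) suppresses every Business Rule on the table, including ones you may be relying on, it is worth testing the encoded query on a sub-production instance first.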

A Practical Decision Heuristic

A short rule of thumb that holds up well in practice:

If the task is interactive and routine, use General.
If the task is an architecture or design decision, use Guru.
If the task requires explaining or justifying the reasoning, use Thinking.
If the task needs a one-line answer right now, use Fast.
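
The rule of thumb above can be written down as a small function. This is an illustrative sketch only: pickMode and the trait names are hypothetical and not part of any Yeti AI Chat API, and the order of the checks (which mode wins when a task has more than one trait) is an assumption.

```javascript
// Hypothetical helper that encodes the four-line rule of thumb.
// The order of the checks is an assumption, not something the product defines.
function pickMode(task) {
    if (task.isArchitecture) {
        return 'Guru';      // architecture or design decision
    }
    if (task.needsJustification) {
        return 'Thinking';  // reasoning must be explained or justified
    }
    if (task.needsInstantAnswer) {
        return 'Fast';      // one-line answer right now
    }
    return 'General';       // interactive and routine: the default
}

console.log(pickMode({ isArchitecture: true }));     // Guru
console.log(pickMode({ needsInstantAnswer: true })); // Fast
console.log(pickMode({}));                           // General
```

General is the fall-through case on purpose, matching the advice to leave the dial there unless the task clearly calls for something else.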

It is also fine to switch mid-conversation. A common pattern is to start in General, jump to Guru for a hard design question, then drop back to General to implement. Yeti AI Chat preserves context across the switch.

Modes Across SnowCoder Tiers

All four Yeti AI modes are available across every SnowCoder tier (Standard, Enterprise, and Enterprise+). The MCP integration is also available on every tier. The Yeti Build Agent and the MSP Agents are Enterprise-tier features. The Yeti Build Agent has its own benchmark, the 291-story build benchmark, and uses 42 artifact classes via the ServiceNow Fluent SDK to generate code that fits inside the platform's upgrade path.

If you are wondering which tier you need, the simplest cut is this: if you only need conversational AI, Standard is enough. If you need autonomous building or MSP-grade estate management, you need Enterprise.

General, Guru, Thinking, Fast: Picking the Right Yeti AI Mode for the Job