The Complete Guide to Feature Flags: Patterns, Scenarios, and Best Practices

Supa Developer · 8 min read

Feature flags started as a workaround: a way to merge half-finished code without it reaching users. Over time they became a deliberate engineering practice, a clean separation between deploying code and releasing features.

This guide skips the basics and goes straight to the patterns that matter in practice. Each one is illustrated with a realistic scenario so you can see not just how the pattern works, but when and why you'd reach for it.

If you're new to feature flags, start with "What are feature flags?", then come back here.


The fundamental principle

Every feature flag pattern comes down to one idea: the code lives in production before the feature is visible.

const showNewCheckout = await client.getFeature('new-checkout', { userId: user.id })

if (showNewCheckout) {
  return <NewCheckout />
}
return <LegacyCheckout />

The new checkout has been merged, tested, and deployed. Whether a user sees it depends entirely on what the flag evaluates to at runtime, and you control that from a dashboard without touching the codebase.

That separation is what makes every pattern below possible.


Scenario 1: Gradual rollout (shipping a redesigned checkout)

The situation: Your team rebuilt the checkout flow. It works in staging. QA signed off. But checkout is the most critical path in your app, and a broken experience costs real money. You want to roll it out to real users, but not all at once.

The pattern: Percentage rollout. Start at 5%, let it run for a day or two, check your error rates and conversion metrics, then expand.

// The flag is configured in Supaship as a 5% rollout.
// Supaship hashes the userId to consistently assign users to the same bucket.
const useNewCheckout = await client.getFeature('checkout.redesign.v2', {
  userId: user.id,
})

return useNewCheckout ? <NewCheckoutFlow /> : <LegacyCheckoutFlow />

How it plays out: Day 1 at 5%: error rate looks normal, conversion is flat. Day 2 you bump to 25%. Day 4 you're at 75% and conversion is slightly up. You ship to 100% and remove the flag at the end of the sprint.

At every step you have a real escape hatch: drop the rollout back to 0% instantly if a metric looks wrong. No deployment, no rollback script, no incident.

Key detail: The userId in the context ensures consistency. The same user always gets the same experience. Without this, users get different UIs on different page loads, which is confusing and corrupts any metrics you're tracking.
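Supaship handles this bucketing for you, but the idea is easy to sketch. Here is a minimal illustration of deterministic percentage bucketing; the FNV-1a hash and the `flagKey:userId` salt are illustrative choices, not Supaship's actual algorithm:

```typescript
// Sketch of deterministic percentage bucketing. Hashing "flagKey:userId"
// means each flag buckets users independently, and the same user always
// lands in the same bucket for a given flag.
function bucketOf(flagKey: string, userId: string): number {
  let hash = 0x811c9dc5 // FNV-1a offset basis
  const input = `${flagKey}:${userId}`
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) // FNV-1a prime, 32-bit multiply
  }
  return (hash >>> 0) % 100 // stable bucket in [0, 100)
}

function inRollout(flagKey: string, userId: string, percent: number): boolean {
  return bucketOf(flagKey, userId) < percent
}
```

Because the bucket depends only on the flag key and the user id, raising the rollout from 5% to 25% keeps every user who was already in the 5% enabled, and only adds new ones.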


Scenario 2: Kill switch (a third-party integration fails)

The situation: Your app sends emails through Mailgun. Mailgun has an outage. Every call to their API times out, which is cascading into slow page loads and 500s on your registration flow.

The pattern: Kill switch. A flag that wraps the integration. When you flip it off, the code falls back gracefully instead of hammering a broken endpoint.

const emailEnabled = await client.getFeature('integrations.mailgun.enabled', {
  userId: user.id,
})

if (emailEnabled) {
  await mailgun.send({ to: user.email, subject: 'Welcome!' })
} else {
  // Queue it for later; show the user a "you'll receive a confirmation shortly" message
  await queue.push({ type: 'welcome-email', userId: user.id })
}

How it plays out: You're in the middle of an incident. You open the Supaship dashboard, flip integrations.mailgun.enabled to off. Within seconds, new registrations stop hammering Mailgun and instead queue the email. The 500s stop. You fix the root cause and flip the flag back on when Mailgun recovers. The queued emails drain normally.

Key detail: Kill switches are the only type of flag that should potentially live forever. Unlike rollout flags (which get cleaned up after the rollout), a kill switch wrapping a risky third-party integration is genuinely permanent infrastructure. Document it as such.
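A complementary habit: even with a kill switch, there's a window between the outage starting and someone flipping the flag. Wrapping the call in a timeout caps the damage during that window. A generic sketch; the `withTimeout` helper and the time budget are our illustration, not a Supaship API:

```typescript
// Bound how long a third-party call can block a request. If the provider
// hangs, we fail fast instead of letting timeouts cascade into 500s.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms}ms`)),
      ms,
    )
    promise.then(
      value => { clearTimeout(timer); resolve(value) },
      err => { clearTimeout(timer); reject(err) },
    )
  })
}
```

In the snippet above, the send call would become something like `await withTimeout(mailgun.send({ to: user.email, subject: 'Welcome!' }), 2000)`, with the catch path queueing the email just like the flag's off branch does.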


Scenario 3: Beta program (early access for power users)

The situation: You built a new analytics dashboard. It's stable enough for enthusiastic users but not ready for general availability. You want to give your most engaged customers early access and collect feedback before the full launch.

The pattern: Targeted flag evaluated against user attributes. Only users in the beta-testers segment see the feature.

const betaDashboard = await client.getFeature('analytics.dashboard.beta', {
  userId: user.id,
  email: user.email,
  segment: user.segment,   // 'beta-testers' | 'standard' | 'enterprise'
  plan: user.plan,
})

return betaDashboard ? <BetaDashboard /> : <LegacyDashboard />

In the Supaship dashboard, you configure the flag to return true only when segment === 'beta-testers'. Adding or removing users from the beta is a dashboard operation: no code change, no deployment, no engineering involvement.

How it plays out: You email your beta list. They log in and see the new dashboard. You collect feedback for two weeks, ship a few improvements, then graduate to a 20% rollout for everyone before going to 100%.

Key detail: This pattern also works for internal QA ("only show this to @yourcompany.com email addresses") and for sales-led access ("enable this for the Acme Corp enterprise deal before it closes").


Scenario 4: Plan gating (premium features)

The situation: You're adding an AI writing assistant to your product. It's expensive to run and you want it only on the Pro and Enterprise plans.

The pattern: Targeting rule evaluated against the user's plan. The gate is controlled by the flag, not hardcoded in the component.

const aiAssistant = await client.getFeature('features.ai-assistant', {
  userId: user.id,
  plan: user.plan,        // 'free' | 'pro' | 'enterprise'
})

return (
  <EditorToolbar>
    {aiAssistant && <AIAssistantButton />}
    <FormatButton />
    <InsertButton />
  </EditorToolbar>
)

In the flag configuration, the rule is: return true when plan is pro or enterprise.

How it plays out: When marketing decides to run a promotion that gives free users access to the AI assistant for a week, you update one targeting rule in the dashboard. No code change, no deployment. When the promotion ends, you revert the rule. This is why hardcoding plan checks in component logic is a mistake: flag-based gating gives marketing and product the ability to change these rules without engineering involvement.


Scenario 5: Canary release (a risky infrastructure change)

The situation: You're switching your database connection pool from one driver to another. The new driver handles connection limits better under load, but you've only validated it in staging. You want to expose it to a small fraction of production traffic before flipping it entirely.

The pattern: Server-side flag evaluated per request, controlling which code path handles the database connection.

// In your Node.js request handler or middleware
const useNewDbDriver = await featureClient.getFeature('infra.db-driver.v2', {
  userId: req.user?.id ?? req.ip, // anonymous traffic can use IP as the bucket key
})

const db = useNewDbDriver ? newConnectionPool : legacyConnectionPool

How it plays out: 5% of requests hit the new driver. You watch query latency and connection error rates in your APM. After 24 hours with no regression, you raise it to 50%, then 100% over the following two days. The legacy driver code stays around for a week after the full rollout, then gets deleted in a clean-up PR.

Key detail: Infrastructure-level canary releases are where feature flags genuinely replace what some teams do with blue/green deployment or traffic splitting at the load balancer. Flag-based canaries are cheaper to set up, easier to adjust, and don't require infrastructure changes to tune the traffic split.
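For infrastructure flags in particular, the flag evaluation itself must not become a new failure point on the request path. A small sketch of fail-safe evaluation that defaults to the known-good path when the flag service is unreachable; the helper name is ours:

```typescript
// If flag evaluation fails (network blip, flag service outage), fall back
// to a caller-supplied default instead of erroring the request. For a
// canary, the safe default is the legacy path.
async function getFeatureSafe(
  evaluate: () => Promise<boolean>,
  fallback: boolean,
): Promise<boolean> {
  try {
    return await evaluate()
  } catch {
    return fallback // a flag outage should never take down the request path
  }
}
```

Called as `getFeatureSafe(() => featureClient.getFeature('infra.db-driver.v2', ctx), false)`, a flag-service outage simply routes all traffic to the legacy driver.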


Scenario 6: Dark launch (running new code silently)

The situation: You're migrating to a new search index. Before showing the new results to users, you want to run both the old and new search in parallel, compare their results, log discrepancies, and build confidence that the new index is equivalent, with no user-visible change.

The pattern: Dark launch. The flag controls whether the new search runs and logs, but the UI always shows the old results.

const shadowSearch = await client.getFeature('search.new-index.shadow', {
  userId: user.id,
})

const legacyResults = await legacySearch.query(query)

if (shadowSearch) {
  // Run in the background; don't await or surface errors to the user
  newSearch
    .query(query)
    .then(newResults => compareAndLog(legacyResults, newResults))
    .catch(err => logger.warn('shadow search failed', { err }))
}

return legacyResults // User always sees legacy results during the shadow phase

How it plays out: You run the shadow mode for a week. Your comparison logs show 99.4% result overlap. The 0.6% discrepancy turns out to be stale index data in a specific category, which you fix. The following week you switch from dark launch to actual rollout with confidence you couldn't have gotten from any amount of staging testing.
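The `compareAndLog` step in the snippet above can start as simple as an id-level overlap metric. A sketch, assuming search results carry a stable document id; the result shape and the Jaccard-style metric are illustrative:

```typescript
interface SearchResult {
  id: string
}

// Jaccard overlap between two result sets: shared ids divided by the
// union of ids, in [0, 1]. 1 means identical result sets.
function resultOverlap(legacy: SearchResult[], fresh: SearchResult[]): number {
  if (legacy.length === 0 && fresh.length === 0) return 1
  const legacyIds = new Set(legacy.map(r => r.id))
  const shared = fresh.filter(r => legacyIds.has(r.id)).length
  const union = new Set([...legacy, ...fresh].map(r => r.id)).size
  return shared / union
}
```

Logging this number per query, plus the ids that differ, is what lets you find patterns like the stale-category problem described below rather than a vague "results differ sometimes".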


Choosing the right pattern

| Scenario | Pattern | Flag type |
| --- | --- | --- |
| Releasing a risky feature to real users | Gradual rollout | Percentage |
| Protecting against third-party outages | Kill switch | Boolean, permanent |
| Collecting feedback from engaged users | Beta / early access | Segment targeting |
| Restricting features to paid plans | Plan gating | Attribute targeting |
| Validating infrastructure changes | Canary release | Percentage (server-side) |
| Testing new backend logic without risk | Dark launch | Percentage (silent) |

Most flags start as percentage rollouts. Targeting rules (segment, plan, email domain) are layered in when you need more precision. Kill switches are the only flags that don't have a planned end date.


One pattern to avoid: flag sprawl

Every pattern above involves creating a flag. The risk is ending up with dozens of flags that were "temporary" and are now dead code. A few habits prevent this:

  • Set a cleanup date when you create the flag. Most flags should live for days or weeks, not months.
  • Name flags to communicate intent. checkout.redesign.v2 is obviously a rollout flag with a lifecycle. new-thing-2 is not.
  • Delete the flag and the old code path together. A flag at 100% with no plans to ever turn it off is just an if (true) block.

For flags that are genuinely permanent (kill switches, plan gates), document that explicitly so no one cleans them up by accident.
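One way to make cleanup dates and documented permanence enforceable rather than aspirational: keep a small flag manifest in the repo and fail CI when a flag is past due. The manifest idea and field names below are our suggestion, not a Supaship feature:

```typescript
// One entry per flag, checked into the repo next to the code that uses it.
interface FlagEntry {
  key: string
  removeBy?: string   // ISO date; set this when you create the flag
  permanent?: boolean // kill switches, plan gates: documented as permanent
}

// Returns the keys of flags whose cleanup date has passed. Wire this into
// CI so an overdue flag fails the build instead of quietly accumulating.
function overdueFlags(manifest: FlagEntry[], today: Date): string[] {
  return manifest
    .filter(f => !f.permanent && f.removeBy !== undefined)
    .filter(f => new Date(f.removeBy!) < today)
    .map(f => f.key)
}
```

The `permanent` field doubles as the explicit documentation this section asks for: a reviewer can see at a glance which flags are infrastructure and which are overdue rollout debris.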


Getting started

All six patterns above are supported by Supaship out of the box: percentage rollouts, attribute targeting, segments, and instant dashboard control with no redeploy required. The free tier covers 1M events/month across unlimited projects; the Pro plan is $30/month for your entire workspace.

Sign up free → · 1M events/month, unlimited flags, no credit card required.


Feedback

Got thoughts on this?

We're constantly learning how developers actually use these tools. Ideas, use cases, integration requests — every bit of feedback makes the platform better for everyone.

Thanks for being part of the journey — Supaship