Feedback loop and tuning best practices

This article provides guidance for building an effective labeling and tuning practice. For an overview of how the feedback loop works and why each stage matters, see Why feedback and tuning matter.

1. Start with a clear objective

Before starting a labeling or tuning program, define what you want to improve. A clear objective helps determine which action types should be labeled first, which tuning controls to adjust, and which KPIs should be monitored. Examples include:

  • Reducing account takeover at login
  • Preventing fraudulent new account creation
  • Improving fraud prevention at transaction time
  • Reducing friction for legitimate users
  • Improving investigator efficiency

Label the action types you want to improve

The most useful labels are the ones tied directly to the business decision you want to optimize:

  • If the priority is detecting account takeover earlier, focus on login actions
  • If the priority is transaction fraud, focus on transaction actions
  • If the priority is new account opening protection, focus on registration actions

Prioritize labeling confirmed fraud

If your team's resources are limited, begin with confirmed fraud. Labeling these events typically provides the highest value for tuning, especially during the early stages of deployment. As the program matures, expand to include high-confidence suspected fraud, confirmed legitimate outcomes, and broader coverage of deny and challenge decisions.

Measure legitimate friction, not only fraud capture

An effective feedback loop should improve both fraud prevention and customer experience. That means tracking not only what fraud was caught, but also which legitimate users were challenged or denied, whether trusted users are being recognized effectively, and whether investigation teams are reviewing the right cases. Confirmed legitimate labels are especially useful for improving trust recognition and reducing false positives.
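
As a rough illustration of tracking friction alongside fraud capture, the sketch below computes three KPIs from a list of reviewed actions. The record fields (`recommendation`, `label`), their string values, and the metric definitions are assumptions made for this example; they are not the Fraud Prevention schema or its reporting formulas.

```python
from dataclasses import dataclass


@dataclass
class ReviewedAction:
    recommendation: str  # assumed values: "allow", "challenge", "deny"
    label: str           # assumed values: "confirmed_fraud", "confirmed_legit"


def feedback_kpis(actions):
    """Compute fraud capture and legitimate friction from labeled actions."""
    stopped = {"challenge", "deny"}
    fraud = [a for a in actions if a.label == "confirmed_fraud"]
    legit = [a for a in actions if a.label == "confirmed_legit"]

    caught = sum(a.recommendation in stopped for a in fraud)
    friction = sum(a.recommendation in stopped for a in legit)
    challenged = sum(a.recommendation == "challenge" for a in actions)

    return {
        # Share of confirmed fraud that was challenged or denied.
        "fraud_capture_rate": caught / len(fraud) if fraud else None,
        # Share of confirmed legitimate actions that met friction.
        "legit_friction_rate": friction / len(legit) if legit else None,
        # Share of all reviewed actions that were challenged.
        "challenge_rate": challenged / len(actions) if actions else None,
    }
```

Reporting the friction rate next to the capture rate makes it visible when a tuning change improves one at the expense of the other.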

Recommended practice

Align on your primary fraud use case, the relevant decision points, and the success metrics before expanding the feedback program.

Example

A financial institution wants to reduce account takeover at login. They define:

  • Primary objective: Reduce account takeover at login
  • Decision point: Login events
  • Success metrics: Fraud capture at login, challenge rate, false positives on legitimate users

Based on this, they prioritize labeling login actions (rather than transactions), focus on confirmed fraud and reviewed challenge/deny outcomes at login, and tune thresholds and challenge recommendations specifically for login flows. This focused approach improves early detection without unnecessarily increasing friction in other parts of the user journey.

2. Label outcomes as early as possible, once reliable

The earlier high-confidence feedback is provided, the more value it creates. At the same time, labels should only be submitted once there is enough confidence in the outcome.

Examples of reliable confirmation include:

  • Manual investigation results
  • Confirmed chargebacks
  • Verified findings from another fraud or risk system
  • Legitimate cases confidently cleared by investigation

Confirm fraud without requiring realized loss

A fraud outcome can be valid even if no monetary loss occurred—for example, fraud may have been prevented before funds were lost, or abuse may have been confirmed through investigation before a transaction completed. This is especially important in prevention-focused flows such as login, new account opening, verification, and early journey enforcement. Base fraud confirmation on confidence in the conclusion, not only on whether a realized loss occurred.

Validate early denies with alternative evidence

When Fraud Prevention denies an action early on, the fraud may never fully unfold, making direct validation more difficult. In these cases, organizations may need to rely on shadow mode or non-blocking evaluation during early rollout, retrospective analyst review, related downstream evidence, or portfolio-level performance analysis.

Recommended practice

Submit labels as soon as they are reliable enough to support tuning and performance analysis, and use a realistic validation framework for each stage of the user journey.

3. Use structured labeling data

To make feedback more actionable, labels should include as much useful structure as possible. Where applicable, include:

This helps separate different fraud patterns and makes future analysis more meaningful.

Use label types consistently

Fraud Prevention supports multiple label types, each serving a different purpose. Use them intentionally:

  • Confirmed fraud for high-confidence fraud outcomes
  • Suspected fraud when fraud is likely but not yet fully confirmed
  • Confirmed legit when the case has been investigated and cleared
  • Undetermined when the outcome remains inconclusive

Consistent usage improves the quality and interpretability of the feedback loop. Define a clear internal labeling policy so teams apply labels consistently over time.
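
One way to keep usage consistent is to encode the internal labeling policy as a small function that every team or integration calls. The sketch below is a minimal example of such a policy. The four label types come from the list above, but the string values, the function name, and the two inputs (`investigation_complete`, `fraud_indicated`) are hypothetical and should be replaced with whatever your own policy defines.

```python
from enum import Enum
from typing import Optional


class LabelType(Enum):
    CONFIRMED_FRAUD = "confirmed_fraud"
    SUSPECTED_FRAUD = "suspected_fraud"
    CONFIRMED_LEGIT = "confirmed_legit"
    UNDETERMINED = "undetermined"


def choose_label(investigation_complete: bool,
                 fraud_indicated: Optional[bool]) -> LabelType:
    """Apply an example policy; fraud_indicated is None when evidence is inconclusive."""
    if fraud_indicated is None:
        return LabelType.UNDETERMINED
    if investigation_complete:
        # A finished investigation yields a confirmed outcome either way.
        return (LabelType.CONFIRMED_FRAUD if fraud_indicated
                else LabelType.CONFIRMED_LEGIT)
    # Investigation still open: only fraud can be flagged as suspected;
    # an uncleared case is not yet "confirmed legit".
    return (LabelType.SUSPECTED_FRAUD if fraud_indicated
            else LabelType.UNDETERMINED)
```

Centralizing the decision this way means a policy change is made once rather than re-explained to each reviewer.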

Choose the right subject level

Fraud Prevention supports labeling several types of subjects, including individual actions, correlated journeys, users, verification sessions, campaigns, and fraud rings. In many cases, the best starting points are:

  • Action-level labeling when feedback applies to a specific event
  • Correlation-level labeling when fraud spans multiple linked actions in the same journey or session

Use the narrowest subject level that accurately represents the fraud outcome while preserving the necessary context.

Correlate fraud across the user journey when possible

Fraud often spans more than one event. A single fraud case may involve multiple connected actions, devices, or sessions. Where possible, connect feedback to the broader journey using relevant identifiers such as the correlation ID or the claimed user ID. When fraud affects multiple related actions, use correlated labeling instead of labeling only the final event.
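
To show which events a correlated label would cover, the sketch below groups exported actions by correlation ID and returns every action in the same journey as a known fraudulent one. The dictionary keys (`action_id`, `correlation_id`) and the function name are assumptions for this example, not field names taken from the product.

```python
from collections import defaultdict


def journey_for_action(actions, fraudulent_action_id):
    """Return (correlation_id, action_ids) for the journey containing the fraud."""
    by_correlation = defaultdict(list)
    correlation_of = {}
    for action in actions:
        by_correlation[action["correlation_id"]].append(action["action_id"])
        correlation_of[action["action_id"]] = action["correlation_id"]

    # Raises KeyError if the action is unknown, which surfaces bad input early.
    correlation_id = correlation_of[fraudulent_action_id]
    return correlation_id, by_correlation[correlation_id]
```

For example, if a fraudulent transaction shares a correlation ID with an earlier login and a password reset, all three actions are returned, which is the scope a correlation-level label is meant to capture.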

Use reliable validation sources

The source of the label helps explain how the outcome was determined. Common sources include manual review, customer complaints, chargebacks, and other vendors or fraud systems. Tracking the source improves transparency and supports future analysis.

Recommended practice

Structure labels deliberately so they reflect both the outcome and the fraud scenario, and use the source field consistently so the feedback loop reflects how each outcome was validated.

4. Tune detection sensitivity based on observed patterns

Detection sensitivity lets you adjust how much weight each risk factor carries in the risk score calculation. Use it to align the system's scoring with your actual threat landscape.

Adjust sensitivity to business needs

Each factor can be set to one of five levels:

  • Ignore: Removes the factor from score calculation entirely.
  • Low: Reduces the factor's contribution to the score.
  • Default: Uses the system's baseline weighting.
  • High: Amplifies the factor's contribution, making it more likely to trigger a Challenge or Deny.
  • Deny: Automatically blocks any action where this factor is detected, regardless of other signals.
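
The toy model below illustrates how the five levels differ in effect. It is only a mental model: the additive score, the base weights, and the multipliers are invented for this example, and this article does not describe how Fraud Prevention actually computes its risk score.

```python
from enum import Enum


class Sensitivity(Enum):
    IGNORE = "ignore"
    LOW = "low"
    DEFAULT = "default"
    HIGH = "high"
    DENY = "deny"


# Illustrative multipliers only; not the product's real weighting.
MULTIPLIER = {
    Sensitivity.IGNORE: 0.0,
    Sensitivity.LOW: 0.5,
    Sensitivity.DEFAULT: 1.0,
    Sensitivity.HIGH: 2.0,
}


def score_action(detected_factors, base_weights, sensitivity):
    """Return (score, forced_deny) for the factors detected on one action."""
    score = 0.0
    for factor in detected_factors:
        level = sensitivity.get(factor, Sensitivity.DEFAULT)
        if level is Sensitivity.DENY:
            # Deny blocks the action regardless of any other signal.
            return None, True
        score += base_weights[factor] * MULTIPLIER[level]
    return score, False
```

In this model, Ignore and Low reduce a factor's influence, High amplifies it, and Deny bypasses scoring altogether, which is why Deny is best reserved for signals you are certain about.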

Sensitivity adjustments are most effective when:

  • A specific risk factor generates too many false positives (e.g., VPN usage is common among your legitimate users, so consider lowering the sensitivity of the Is VPN factor).
  • A new attack pattern emerges that requires immediate response (e.g., a surge in bot activity—set the Bot factor to Deny to block all bot-flagged actions until a proper strategy is defined).
  • Certain factors are not relevant to your environment or business (e.g., corporate users are using virtual machines by policy—set the Virtual machine factor to Ignore).

Apply sensitivity changes safely

  • Use Preview mode first: Save sensitivity configurations as a preview policy and evaluate their impact on the Recommendations dashboard before deploying to production. This avoids unintended disruption.
  • Adjust incrementally: Change one or two factors at a time so you can isolate the effect of each adjustment on your recommendation distribution.
  • Review after labeling milestones: When a significant batch of labels has been submitted, revisit sensitivity settings to see whether the system's baseline has shifted enough to warrant recalibration.
  • Combine with rules for layered control: Sensitivity adjusts the score broadly; rules override the recommendation for specific conditions. Use both together—for example, increase sensitivity for public Wi-Fi signals generally, but also create a rule that denies transactions from public Wi-Fi combined with a new device.

Recommended practice

Adjust sensitivity incrementally, always validate in Preview mode before production, and revisit settings as your label volume grows and fraud patterns shift.
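
If you export recommendations for the same traffic under the production policy and a preview policy, a comparison like the sketch below can complement the Recommendations dashboard when isolating the effect of a single adjustment. The function name and the plain list-of-strings input are assumptions for this example; the export mechanism itself is not covered here.

```python
from collections import Counter


def recommendation_shift(production, preview):
    """Return the change in each recommendation's share (preview minus production)."""
    def shares(recommendations):
        counts = Counter(recommendations)
        total = len(recommendations)
        return {rec: counts[rec] / total for rec in counts}

    prod, prev = shares(production), shares(preview)
    return {
        rec: round(prev.get(rec, 0.0) - prod.get(rec, 0.0), 4)
        for rec in sorted(set(prod) | set(prev))
    }


production = ["allow"] * 8 + ["challenge"] * 2
preview = ["allow"] * 6 + ["challenge"] * 3 + ["deny"]
print(recommendation_shift(production, preview))
# {'allow': -0.2, 'challenge': 0.1, 'deny': 0.1}
```

Changing one or two factors per preview keeps a shift like this attributable to a specific adjustment.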

5. Limit rules to business logic

Rules let you define fixed recommendation outcomes when specific conditions are met, overriding the system's ML-based recommendation. Unlike labels and detection sensitivity, rules do not train or refine the recommendation engine—they apply a static override.

For adjusting how the system learns and scores risk over time, labels are a more effective mechanism. Labels refine the ML models so that future recommendations improve organically, whereas rules simply force a fixed outcome for matching conditions.

Choose the right scenario for rules

Rules are best suited for two scenarios:

  • Mitigating immediate threats: When a new attack vector is identified, a rule can be deployed immediately to block it (e.g., deny all actions from IP ranges confirmed as malicious). Once the threat is contained and enough labels have been submitted, the ML models adapt on their own and the rule can be retired.
  • Permanent business policies: For strict, well-defined conditions that should always produce the same outcome regardless of risk score—such as allowlisting corporate devices or always challenging transactions above a certain threshold.

Apply rules effectively

  • Start in Preview mode: Every new rule should be tested in Preview mode first. Preview mode simulates the rule's effect against live traffic without impacting production decisions, so you can evaluate its reach before going live.
  • Keep rules focused and minimal: Each rule should address a single, well-defined scenario. Overly broad rules can override the system's ML intelligence in ways that increase false positives or reduce fraud capture.
  • Manage rule priority carefully: Rules are evaluated top to bottom, and only the first match applies. Place the most critical rules (e.g., deny rules for confirmed threats) higher in the priority order, and more permissive rules (e.g., trust rules for known-good devices) lower.
  • Audit rules regularly: As fraud patterns evolve and your label history grows, rules created during early deployment may become redundant or counterproductive. Review active rules during your periodic tuning reviews and disable or remove rules that the ML models now handle effectively on their own.
  • Coordinate rules with sensitivity settings: If you increase the sensitivity of a factor to High, you may no longer need a rule that denies actions based on the same factor. Avoid duplicating logic across both controls.

Recommended practice

Reserve rules for immediate threat mitigation and permanent business policies. For long-term improvement of fraud detection accuracy, rely on labels to train the recommendation engine.
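
The priority guidance in this section follows from first-match evaluation, which can be sketched as follows. The `Rule` class, the action fields, and the example rules are hypothetical; only the top-to-bottom, first-match behavior is taken from the text above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]
    recommendation: str


def apply_rules(rules, action, ml_recommendation):
    """Evaluate rules top to bottom; the first match overrides the ML result."""
    for rule in rules:
        if rule.matches(action):
            return rule.recommendation, rule.name
    # No rule matched, so the ML-based recommendation stands.
    return ml_recommendation, None


# Deny rules for confirmed threats are listed before permissive trust rules.
rules = [
    Rule("deny-malicious-ip", lambda a: a.get("ip") in {"203.0.113.7"}, "deny"),
    Rule("trust-corporate-device", lambda a: a.get("corporate_device", False), "allow"),
]
```

With this ordering, a corporate device connecting from a confirmed malicious IP is denied. If the two rules were swapped, the trust rule would match first and the deny rule would never be evaluated for that action.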

6. Distinguish between initial tuning and ongoing optimization

Tuning has two distinct stages: initial tuning and ongoing optimization. Beyond these stages, revisit tuning whenever the environment changes in a meaningful way, such as new fraud patterns, performance degradation, new channels or use cases, changes in enforcement strategy, or shifts in business priorities.

Initial tuning

Initial tuning is the early phase of aligning Fraud Prevention to a new environment, channel, or use case. Typical goals include:

  • Establishing a performance baseline
  • Understanding the organization's fraud patterns
  • Reviewing early recommendation quality
  • Aligning on operational processes
  • Calibrating detection sensitivity and creating initial rules

This phase may take place during a proof of concept, proof of value, or early production rollout. During initial tuning, lean on rules and sensitivity adjustments for quick wins while you build up the label volume that improves ML accuracy over time.

Ongoing optimization

Ongoing optimization is the continuous improvement phase once a baseline is in place. Typical goals include:

  • Adapting to new fraud patterns
  • Improving precision and fraud capture over time
  • Monitoring performance trends
  • Revisiting strategy when business conditions change
  • Retiring rules that are no longer needed as the ML models mature

Recommended practice

Treat initial tuning as a setup phase and ongoing optimization as a continuous operating process. Labeling and tuning are long-term capabilities, not one-time setup tasks.

7. Establish a regular review cadence

Establish a review schedule. A practical approach may include:

  • Continuous submission where labeling is automated
  • Weekly operational reviews to assess recently confirmed fraud, review deny and challenge performance, detect changes in fraud patterns, and evaluate whether rules or sensitivity settings need adjustment
  • Periodic business-level performance reviews to align on tuning decisions, review rule effectiveness, and maintain process consistency

Regular reviews help teams detect changes in fraud patterns, align on tuning decisions, and maintain process consistency.

Recommended practice

Choose a review cadence that your organization can sustain consistently over time.

8. Automate labeling wherever possible

Manual labeling in the Admin Portal is handy during the initial integration phase, but it becomes difficult to scale. As your program matures, move toward integrating feedback directly into fraud workflows through the Send label API.

Examples of systems to integrate with:

  • Case management systems
  • Investigation tools
  • Chargeback workflows
  • Complaint resolution systems
  • Other fraud and risk platforms

Recommended practice

Incorporate API-driven labeling into operational workflows to improve speed, consistency, and scale.
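
Most of the integration work is translating another system's outcomes into labels. The sketch below shows that mapping step for a case management system. Everything in it is hypothetical: the disposition names, the payload field names, and the function name are placeholders and do not reflect the Send label API's request schema, so consult the API reference for the actual format before sending anything.

```python
# Hypothetical mapping from case dispositions to label types.
DISPOSITION_TO_LABEL = {
    "fraud_confirmed": "confirmed_fraud",
    "fraud_likely": "suspected_fraud",
    "cleared": "confirmed_legit",
}


def build_label_record(case):
    """Translate a case into a label record, or None if it is not ready to label."""
    label = DISPOSITION_TO_LABEL.get(case["disposition"])
    if label is None:
        # For example, the case is still open: wait until the outcome is reliable.
        return None
    return {
        # Placeholder field names, not the real API schema.
        "subject": {"type": "action", "id": case["action_id"]},
        "label": label,
        "source": case.get("source", "manual_review"),
    }
```

Returning `None` for unmapped dispositions keeps unreliable outcomes out of the feedback loop, in line with the guidance to submit labels only once they are reliable.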

No two organizations have the same fraud team, tooling, or operating model. The right approach depends on your organization's maturity. Start with a process your team can sustain, then expand it over time.

Minimum viable model

A good starting model includes:

  • Defining the primary fraud use case
  • Labeling confirmed fraud first
  • Using action-level or correlation-level feedback
  • Using manual review as the main source
  • Labeling a manageable sample if full coverage is not feasible
  • Adjusting detection sensitivity for the most impactful risk factors
  • Creating a small set of rules for known threat patterns, tested in Preview mode first
  • Establishing a recurring review cadence
  • Aligning with Mosaic on early tuning decisions

Mature model

A more advanced model includes:

  • Automating label submission via the Send label API
  • Correlating feedback across journeys and sessions
  • Including confirmed legitimate outcomes
  • Classifying by fraud scenario and source
  • Maintaining a curated rule set that complements ML-driven recommendations, with regular audits to retire redundant rules
  • Coordinating sensitivity settings with rules and label trends
  • Integrating with existing fraud operations
  • Revisiting tuning based on KPI movement