Mistral Moderation 2411

Warning: mistral-moderation-2411 is deprecated. Migrate to mistral-moderation-2603 and update any moderation_llm_v1 guardrail configs to moderation_llm_v2.

Model

mistral-moderation-2411 has been superseded by mistral-moderation-2603, which introduces updated policy categories (Dangerous, Criminal, Jailbreaking).

Policy Categories

| Category | Description |
| --- | --- |
| Sexual | Material that explicitly depicts, describes, or promotes sexual activities, nudity, or sexual services. |
| Hate and Discrimination | Content expressing prejudice or hostility against individuals or groups based on protected characteristics. |
| Violence and Threats | Content that describes, glorifies, incites, or threatens physical violence against individuals or groups. |
| Dangerous and Criminal Content | Content that promotes illegal activities or extremely hazardous behaviors. (Legacy: replaced by separate Dangerous and Criminal categories in mistral-moderation-2603.) |
| Self-Harm | Content that promotes or encourages deliberate self-injury, suicide, or eating disorders. |
| Health | Content that contains or tries to elicit detailed or tailored medical advice. |
| Financial | Content that contains or tries to elicit detailed or tailored financial advice. |
| Law | Content that contains or tries to elicit detailed or tailored legal advice. |
| PII | Content that requests or shares personal identifying information. |

Custom Guardrails (moderation_llm_v1)

The moderation_llm_v1 guardrail config is backed by mistral-moderation-2411. It is deprecated; use moderation_llm_v2 instead.

{
  "block_on_error": true,
  "moderation_llm_v1": {
    "custom_category_thresholds": {
      "sexual": 0.1,
      "selfharm": 0.1
    },
    "ignore_other_categories": false,
    "action": "block"
  }
}
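
Because the migration amounts to moving configs from the moderation_llm_v1 key to moderation_llm_v2, a small helper can do the rename in bulk. The sketch below is illustrative only: the function name migrate_guardrail_config is hypothetical, and it assumes the v2 block accepts the same inner fields (custom_category_thresholds, ignore_other_categories, action) as v1, which this page does not confirm. Category names may also differ in v2, since mistral-moderation-2603 splits the legacy Dangerous and Criminal Content category, so review thresholds by hand after renaming.

```python
import copy
import json

OLD_KEY = "moderation_llm_v1"
NEW_KEY = "moderation_llm_v2"


def migrate_guardrail_config(config: dict) -> dict:
    """Return a copy of `config` with the v1 guardrail key renamed to v2.

    Assumption: the v2 block takes the same inner fields as v1. Verify this
    against the moderation_llm_v2 documentation before deploying.
    """
    migrated = copy.deepcopy(config)  # leave the caller's dict untouched
    if OLD_KEY in migrated:
        if NEW_KEY in migrated:
            raise ValueError(f"config already contains {NEW_KEY}")
        migrated[NEW_KEY] = migrated.pop(OLD_KEY)
    return migrated


# The legacy config shown above, expressed as a Python dict.
legacy = {
    "block_on_error": True,
    "moderation_llm_v1": {
        "custom_category_thresholds": {"sexual": 0.1, "selfharm": 0.1},
        "ignore_other_categories": False,
        "action": "block",
    },
}

migrated = migrate_guardrail_config(legacy)
print(json.dumps(migrated, indent=2))
```

The helper copies the input rather than mutating it, so the original config remains available for comparison or rollback.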

A blocked request returns HTTP 403 with the following response body:

{
  "error": {
    "message": "Content blocked by guardrail",
    "status": 403
  },
  "guardrails": {
    "results": {
      "moderation_llm_v1": {
        "model_name": "mistral-moderation-2411",
        "decisions": {
          "sexual": { "threshold": 0.1, "score": 0.3, "violated": true },
          "selfharm": { "threshold": 0.1, "score": 0.05, "violated": false }
        },
        "violated": true,
        "action": "block"
      }
    }
  }
}
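
A client that receives this response may want to report which categories triggered the block. The following is a minimal sketch that assumes only the response shape shown above; the function name violated_categories is hypothetical and not part of any Mistral SDK.

```python
import json

# Response body copied from the blocked-request example above.
BLOCKED_BODY = """
{
  "error": {"message": "Content blocked by guardrail", "status": 403},
  "guardrails": {
    "results": {
      "moderation_llm_v1": {
        "model_name": "mistral-moderation-2411",
        "decisions": {
          "sexual": {"threshold": 0.1, "score": 0.3, "violated": true},
          "selfharm": {"threshold": 0.1, "score": 0.05, "violated": false}
        },
        "violated": true,
        "action": "block"
      }
    }
  }
}
"""


def violated_categories(body: dict) -> dict:
    """Map each guardrail name to a sorted list of its violated categories."""
    results = body.get("guardrails", {}).get("results", {})
    violations = {}
    for guardrail, result in results.items():
        decisions = result.get("decisions", {})
        violations[guardrail] = sorted(
            category
            for category, decision in decisions.items()
            if decision.get("violated")
        )
    return violations


body = json.loads(BLOCKED_BODY)
print(violated_categories(body))  # {'moderation_llm_v1': ['sexual']}
```

In the sample, sexual scores 0.3 against a threshold of 0.1 and is flagged, while selfharm scores 0.05 and is not. The sketch reads the violated flag directly rather than recomputing it from score and threshold, because this page does not state how a score exactly equal to the threshold is treated.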