Create chat moderation - Inworld AI Documentation

curl -X POST 'https://api.inworld.ai/v1/chat/moderations' \
  -H "Authorization: Bearer $INWORLD_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello world!"},
      {"role": "assistant", "content": "Hi there!"},
      {"role": "user", "content": "Tell me something"}
    ]
  }'

{
  "id": "modr-8a1241048da1455f92f1e6242a3a3576",
  "model": "inworld/moderation-latest",
  "result": {
    "flagged": false,
    "categories": {
      "sexual": false,
      "sexual/minors": false,
      "harassment": false,
      "harassment/threatening": false,
      "hate": false,
      "hate/threatening": false,
      "illicit": false,
      "illicit/violent": false,
      "self-harm": false,
      "self-harm/intent": false,
      "self-harm/instructions": false,
      "violence": false,
      "violence/graphic": false
    },
    "category_scores": {
      "sexual": 0,
      "sexual/minors": 0,
      "harassment": 0,
      "harassment/threatening": 0,
      "hate": 0,
      "hate/threatening": 0,
      "illicit": 0,
      "illicit/violent": 0,
      "self-harm": 0,
      "self-harm/intent": 0,
      "self-harm/instructions": 0,
      "violence": 0,
      "violence/graphic": 0
    },
    "category_applied_input_types": {
      "sexual": [
        "text"
      ],
      "sexual/minors": [
        "text"
      ],
      "harassment": [
        "text"
      ],
      "harassment/threatening": [
        "text"
      ],
      "hate": [
        "text"
      ],
      "hate/threatening": [
        "text"
      ],
      "illicit": [
        "text"
      ],
      "illicit/violent": [
        "text"
      ],
      "self-harm": [
        "text"
      ],
      "self-harm/intent": [
        "text"
      ],
      "self-harm/instructions": [
        "text"
      ],
      "violence": [
        "text"
      ],
      "violence/graphic": [
        "text"
      ]
    },
    "ailuminate": {
      "safety": "safe",
      "categories": {
        "violent_crimes": false,
        "sex_related_crimes": false,
        "child_sexual_exploitation": false,
        "suicide_self_harm": false,
        "indiscriminate_weapons": false,
        "intellectual_property": false,
        "defamation": false,
        "non_violent_crimes": false,
        "hate": false,
        "specialized_advice": false,
        "privacy": false,
        "sexual_content": false
      },
      "extensions": {
        "politically_sensitive": false,
        "unethical_acts": false,
        "jailbreak": false
      },
      "refusal": false
    }
  }
}

POST /v1/chat/moderations

Classifies the messages in a conversation for harmful content. Unlike /v1/moderations, this endpoint accepts chat messages and supports a scope parameter that controls which messages are evaluated. It is not compatible with the OpenAI SDK; use /v1/moderations if you need SDK compatibility.

Authorizations

Authorization (string, header, required)

Your authentication credentials. Pass your API key as a Bearer token: Bearer $INWORLD_API_KEY.

Body

application/json

messages (object[], required)

Array of chat messages to classify.

scope (default: last)

Which messages to classify. "last" (default) classifies only the final message. "all" classifies every message. A positive integer N classifies the last N messages. Including more messages increases response latency.

Available options: all, last

model (string, default: inworld/moderation-latest)

The moderation model to use.
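
The body fields above can be assembled as in the following minimal Python sketch, which uses only the standard library. The helper names (`validate_scope`, `build_chat_moderation_request`) are hypothetical and not part of any Inworld SDK. Sending an integer scope as a JSON number is an assumption; the page does not specify the encoding.

```python
import json
import os
import urllib.request

CHAT_MODERATION_URL = "https://api.inworld.ai/v1/chat/moderations"
DEFAULT_MODEL = "inworld/moderation-latest"


def validate_scope(scope):
    """Accept "last", "all", or a positive integer N, as the scope field describes."""
    if scope in ("last", "all"):
        return scope
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(scope, int) and not isinstance(scope, bool) and scope > 0:
        return scope
    raise ValueError('scope must be "last", "all", or a positive integer')


def build_chat_moderation_request(messages, scope="last", model=DEFAULT_MODEL, api_key=None):
    """Build (but do not send) a POST request for the chat moderation endpoint."""
    if api_key is None:
        # Same environment variable name as in the curl example.
        api_key = os.environ.get("INWORLD_API_KEY", "")
    payload = {
        "messages": messages,
        # Assumption: an integer scope is serialized as a JSON number.
        "scope": validate_scope(scope),
        "model": model,
    }
    return urllib.request.Request(
        CHAT_MODERATION_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


messages = [
    {"role": "user", "content": "Hello world!"},
    {"role": "assistant", "content": "Hi there!"},
    {"role": "user", "content": "Tell me something"},
]
# Classify the last two messages instead of only the final one.
request = build_chat_moderation_request(messages, scope=2, api_key="example-key")
```

To actually send the request, pass it to `urllib.request.urlopen(request)` and decode the body with `json.load`. The builder is kept separate from the network call so the payload can be inspected first.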

Response

A successful response.

id (string)

Unique identifier for the moderation request.

model (string)

The model used for classification.

result (object)

A single aggregated moderation result for the conversation.
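
As a rough illustration of consuming the response, the Python sketch below reduces the `result` object to a few fields. The function name `summarize_moderation` and the shape of the summary are hypothetical. The field names (`flagged`, `categories`, `category_scores`, `ailuminate.safety`) are taken from the example response on this page, and the sample input is a trimmed copy of that example.

```python
def summarize_moderation(response: dict) -> dict:
    """Collapse a chat moderation response into a small summary dict."""
    result = response.get("result", {})
    categories = result.get("categories", {})
    scores = result.get("category_scores", {})
    ailuminate = result.get("ailuminate", {})
    return {
        "flagged": bool(result.get("flagged", False)),
        # Names of the categories whose boolean flag is true.
        "flagged_categories": sorted(name for name, hit in categories.items() if hit),
        # Largest per-category score, or 0 when no scores are present.
        "highest_score": max(scores.values(), default=0),
        "safety": ailuminate.get("safety"),
    }


# Trimmed copy of the example response shown on this page.
sample_response = {
    "id": "modr-8a1241048da1455f92f1e6242a3a3576",
    "model": "inworld/moderation-latest",
    "result": {
        "flagged": False,
        "categories": {"harassment": False, "violence": False},
        "category_scores": {"harassment": 0, "violence": 0},
        "ailuminate": {"safety": "safe", "refusal": False},
    },
}

summary = summarize_moderation(sample_response)
print(summary)
```

The `.get` lookups with defaults are a defensive choice, so the helper does not raise if a field is absent.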
