Inappropriate message flagging

What types of messages get flagged, where to see all the flagged messages in Flint, and who is notified by email and when.

Written by Lulu Gao
Updated over 2 weeks ago

When people send inappropriate or harmful messages within a workspace, Flint flags them. This guide explains which messages get flagged, where to review them, and who is notified.

What types of messages get flagged in Flint?

Flint's categorization of flagged messages is based on OpenAI's content moderation rules. The different types of flagged messages are described in the table below.

| Category | Description |
| --- | --- |
| harassment | Content that expresses, incites, or promotes harassing language towards any target. |
| harassment/threatening | Harassment content that also includes violence or serious harm towards any target. |
| hate | Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. Hateful content aimed at non-protected groups (e.g., chess players) is harassment. |
| hate/threatening | Hateful content that also includes violence or serious harm towards the targeted group based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. |
| illicit | Content that gives advice or instruction on how to commit illicit acts. A phrase like "how to shoplift" would fit this category. |
| illicit/violent | The same types of content flagged by the "illicit" category, but also includes references to violence or procuring a weapon. |
| self-harm | Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders. |
| self-harm/intent | Content where the speaker expresses that they are engaging or intend to engage in acts of self-harm, such as suicide, cutting, and eating disorders. |
| self-harm/instructions | Content that encourages performing acts of self-harm, such as suicide, cutting, and eating disorders, or that gives instructions or advice on how to commit such acts. |
| sexual | Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness). |
| sexual/minors | Sexual content that includes an individual who is under 18 years old. |
| violence | Content that depicts death, violence, or physical injury. |
| violence/graphic | Content that depicts death, violence, or physical injury in graphic detail. |
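For readers who want to reason about these categories programmatically, here is a minimal sketch. It assumes a moderation result can be represented as a mapping from category name to a true/false flag; how Flint itself processes moderation output is not described in this guide, and the names `CATEGORIES`, `flagged_categories`, and `parent_category` are hypothetical helpers used only for illustration.

```python
# Category names copied from the table above.
CATEGORIES = frozenset({
    "harassment", "harassment/threatening",
    "hate", "hate/threatening",
    "illicit", "illicit/violent",
    "self-harm", "self-harm/intent", "self-harm/instructions",
    "sexual", "sexual/minors",
    "violence", "violence/graphic",
})


def flagged_categories(category_flags):
    """Return the sorted names of known categories whose flag is truthy.

    `category_flags` is an assumed input shape: {category name: bool}.
    Names that are not in the table are ignored.
    """
    return sorted(
        name for name, hit in category_flags.items()
        if hit and name in CATEGORIES
    )


def parent_category(name):
    """Return the top-level category, e.g. 'violence' for 'violence/graphic'."""
    return name.split("/", 1)[0]
```

A sub-category such as "self-harm/intent" shares its prefix with the broader "self-harm" category, so grouping by `parent_category` gives a coarser view of the same flags.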

Where can I see the flagged messages within my Flint workspace/group?

If you are an owner of the workspace or group, you'll be able to see all the flagged messages within the workspace analytics and group analytics, respectively.

Seeing flagged messages in a workspace

How to do it:

  1. Navigate to "Analytics" on the left side of your homepage

  2. View "Inappropriate messages" on the right side of the page

  3. Click on any message to open the student chat that was flagged

Seeing flagged messages in a group

Similar to the workspace analytics, groups have group analytics available to group owners. Here, group owners can see flagged messages sent within activities belonging to that group.

How to do it:

  1. Navigate to the group you want to view flagged messages in

  2. Click on "View analytics" at the top of the page

  3. See "Inappropriate messages" on the right side of your page

  4. Click on any of the flagged messages to view the student chat

Who will be emailed about flagged messages in Flint, and when?

Messages flagged under any of the categories above trigger email notifications. Moderators are the designated staff who receive these emails.

By default, the moderator of a workspace is the creator of the workspace and the moderator of a group is the creator of the group. Both the workspace moderator(s) and group moderator(s) will be emailed if an inappropriate message is detected in a session within the group. If the session happens outside a group (e.g. with "Talk to Flint"), the notification will only go to the workspace moderator(s).
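The routing rule above can be sketched in a few lines. This is only an illustration of the rule as described, not Flint's implementation: the function name `notification_recipients` and its parameters are hypothetical, and removing duplicate recipients (for someone who moderates both the workspace and the group) is an assumption.

```python
def notification_recipients(workspace_moderators, group_moderators=None):
    """Return the moderators to email for one flagged message.

    workspace_moderators: moderators of the workspace (always notified).
    group_moderators: moderators of the group the session belongs to, or
        None when the session happens outside a group (e.g. "Talk to Flint").
    """
    recipients = list(workspace_moderators)
    if group_moderators is not None:
        for moderator in group_moderators:
            # Assumption: a person who moderates both gets one email, not two.
            if moderator not in recipients:
                recipients.append(moderator)
    return recipients
```

For a session inside a group, both lists are combined; for a session outside any group, passing no group moderators returns only the workspace moderators.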

Some schools add other administrators or counseling staff as moderators. Here's a guide on how to add or change moderators through the workspace or group settings.

These emails are sent immediately after a student sends a message flagged under any of the categories above. If a student deletes the session containing the message, moderators will still be notified, but the session cannot be recovered.

Below is an example of what the notification email will look like. The email will come from [email protected] and have the header "Review Required: Message Flagged for *moderation category*".
