When someone sends an inappropriate or harmful message within a workspace, Flint flags it. This guide covers which messages get flagged, where owners can see them, and who is emailed about them.
What types of messages get flagged in Flint?
Flint's categorization of flagged messages is based on OpenAI's content moderation rules. The different types of flagged messages are described in the table below.
| Category | Description |
| --- | --- |
| harassment | Content that expresses, incites, or promotes harassing language towards any target. |
| harassment/threatening | Harassment content that also includes violence or serious harm towards any target. |
| hate | Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. Hateful content aimed at non-protected groups (e.g., chess players) is harassment. |
| hate/threatening | Hateful content that also includes violence or serious harm towards the targeted group based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. |
| illicit | Content that gives advice or instruction on how to commit illicit acts. A phrase like "how to shoplift" would fit this category. |
| illicit/violent | The same types of content flagged by the "illicit" category, but also includes references to violence or procuring a weapon. |
| self-harm | Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders. |
| self-harm/intent | Content where the speaker expresses that they are engaging or intend to engage in acts of self-harm, such as suicide, cutting, and eating disorders. |
| self-harm/instructions | Content that encourages performing acts of self-harm, such as suicide, cutting, and eating disorders, or that gives instructions or advice on how to commit such acts. |
| sexual | Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness). |
| sexual/minors | Sexual content that includes an individual who is under 18 years old. |
| violence | Content that depicts death, violence, or physical injury. |
| violence/graphic | Content that depicts death, violence, or physical injury in graphic detail. |
Where can I see the flagged messages within my Flint workspace/group?
If you are an owner of the workspace or group, you'll be able to see all the flagged messages within the workspace analytics and group analytics, respectively.
Seeing flagged messages in a workspace
You can access all the flagged messages for a workspace by clicking on "Analytics" in the top left of your sidebar. These include messages sent in groups, in "Talk to Flint", and in any one-off activities on people's homepages.
Once you're on the analytics page, you'll see all the flagged messages in the rightmost column. You can click "View all" to open all the flagged messages.
You can also click on an individual message to see the full session and context in which the message was sent.
Seeing flagged messages in a group
Similar to the workspace analytics, the group analytics will show flagged messages sent within activities belonging to that group. To see these, click on "View analytics" located in the three-dot menu at the top of the group page.
Within the group analytics, flagged messages appear in the bottom right. As in the workspace analytics, you can click on any message to see the session where it was sent.
Who will be emailed about flagged messages in Flint, and when?
Only messages indicating self-harm will trigger email notifications. Self-harm moderators are the people who will receive these emails.
By default, the moderator of a workspace is the creator of the workspace and the moderator of a group is the creator of the group. Both the workspace moderator(s) and group moderator(s) will be emailed if a self-harm message is detected in a session within the group. If the session happens outside a group (e.g. with "Talk to Flint"), the notification will only go to the workspace moderator(s).
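The routing rule above can be pictured with a small sketch. This is only an illustration of the behavior described in this guide, not Flint's actual implementation: the function names, the email addresses, and the choice to treat all three self-harm categories ("self-harm", "self-harm/intent", "self-harm/instructions") as email triggers are assumptions made for the example.

```python
# Illustrative sketch only; names and addresses are hypothetical, not Flint's API.

def is_self_harm(category: str) -> bool:
    """Assumption: the base category and its sub-categories all count."""
    return category == "self-harm" or category.startswith("self-harm/")


def notification_recipients(categories, workspace_moderators, group_moderators=None):
    """Return the moderators to email for one flagged message.

    categories: the flag categories assigned to the message.
    workspace_moderators: moderators of the workspace.
    group_moderators: moderators of the group, or None when the session
        happened outside a group (e.g. in "Talk to Flint").
    """
    # Only self-harm flags trigger an email; every other category is
    # visible in analytics but sends nothing.
    if not any(is_self_harm(c) for c in categories):
        return []

    # Workspace moderators are always notified; group moderators are added
    # only when the session belongs to a group.
    ordered = list(workspace_moderators) + list(group_moderators or [])

    # Remove duplicates (one person may moderate both) while keeping order.
    return list(dict.fromkeys(ordered))


# A self-harm message inside a group reaches both sets of moderators.
in_group = notification_recipients(
    ["self-harm/intent"],
    ["owner@school.example"],
    ["teacher@school.example"],
)

# The same message in "Talk to Flint" reaches workspace moderators only.
outside_group = notification_recipients(["self-harm"], ["owner@school.example"])
```

In this sketch, a message flagged only as "violence" returns an empty list, which matches the rule that other categories appear in analytics without triggering an email.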
Some schools have added other administrators or counseling staff as moderators. Here's a guide on how to add or change moderators through the workspace or group settings.
These emails will be sent immediately after a student sends a self-harm-related message. If students delete the session with the message, moderators will still be notified, but the session cannot be recovered. Below is an example of what the notification email will look like. The email will come from [email protected] and have the header "Review Required: Self-Harm Flagged in Flint".