TFGBV Taxonomy
Mitigation Strategy: User-controlled content filters
Last Updated: 6/9/25
Definition: Filters to Empower Users in Managing Content Exposure
Abuse Types: Inappropriate content; Online harassment
Impact Types: Psychological & emotional harm; Sexual harm; Social & political harm
Targets: Public figure; Organization, group, community; Private individual
Responsible Organizations: Digital platform; Non-governmental organization (NGO) / Third-party tool

The information on this page is adapted with permission from Prevention by Design by lead authors Lena Slachmuijlder and Sofia Bonilla.

Provide user-controlled filters to manage and reduce exposure to specific types of content, especially language or topics that may be triggering or unwanted. By giving users the tools to manage their content exposure, platforms empower individuals to create safer personal spaces online. Filters do not just remove harmful content; they let users define the boundaries of their online interactions, creating a protective buffer against TFGBV and other unwanted material.
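As a rough illustration of how such a filter might work, the sketch below screens incoming message requests against a user-defined word list combined with a platform-maintained list. The word lists, function names, and matching rules are assumptions for illustration, not any platform's actual implementation.

```python
import re

# Illustrative sketch of a user-controlled hidden-words filter for message
# requests. Word lists and matching rules are assumed for illustration only.

# Platform-maintained list (e.g. developed with anti-bullying organizations).
PLATFORM_HIDDEN_WORDS = {"exampleslur", "examplethreat"}

def build_filter(user_hidden_words, use_platform_list=True):
    """Combine the user's custom hidden words with the platform list."""
    words = {w.lower() for w in user_hidden_words}
    if use_platform_list:
        words |= PLATFORM_HIDDEN_WORDS
    # Match whole words, case-insensitively.
    return re.compile(
        r"(?<!\w)(" + "|".join(re.escape(w) for w in sorted(words)) + r")(?!\w)",
        re.IGNORECASE,
    )

def should_hide(message_text, pattern):
    """Return True if the message should be routed out of the main inbox."""
    return bool(pattern.search(message_text))

# Example: a user adds their own terms on top of the platform defaults.
user_filter = build_filter({"ugly", "loser"})
print(should_hide("you are such a loser", user_filter))  # True  -> hidden requests
print(should_hide("great post, thanks!", user_filter))   # False -> normal inbox
```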

Content exposure and contact/conduct exposure (from an adversary) are related but distinct, although a solution that addresses one should also address the other. Blocking an individual account has its limits, as the blocked person may attempt to evade the ban by creating a new account.

Examples

  • Instagram’s Hidden Words and Sensitive Content Filters: Hidden Words allows users to define their own set of words to filter out, preventing content they deem unwanted from appearing in their DM requests. Once the feature is activated, Instagram also filters a list of potentially harmful words, emojis, numbers, and phrases developed in conjunction with anti-discrimination and anti-bullying organizations. The Sensitive Content Control allows users to limit how much sensitive content shows up on the Explore page.
  • Private companies such as Block Party, TrollWallAI, Bodyguard, and Textgain provide robust filtering mechanisms and report increased user satisfaction and engagement in safer online spaces. Textgain points out that machine learning models can also refine filters to take context into account.
  • Civil society groups have also developed filtering tools such as the Uli browser extension.
  • Google’s Jigsaw created an open-source tool, Harassment Manager, to help people document and manage toxic language targeted at them on social media.
  • TikTok offers a feature that allows users to reset their 'For You' feed, providing an opportunity to start afresh and tailor content recommendations to their current preferences. The reset displays a new set of popular videos, similar to the experience of a new user, and as users interact with these videos, the platform's algorithm begins to personalize the feed based on the new interactions.
    • Limitation: after a reset, potentially harmful content can resurface on the feed as the algorithm re-personalizes.
    • It also puts the onus of curation on the user rather than on design improvements by the platform.
    • This is closer to "algorithmic choice" than to a user-created content filter.
  • Sensitive Content Alerts: TikTok flags potentially harmful content and nudges users when a search may surface distressing results, giving them the option to view or avoid it based on individual preference.
  • X's Safety Mode helps reduce unwanted interactions by temporarily blocking accounts that use potentially harmful language or repeatedly send uninvited replies and mentions. When activated, the system analyzes Tweets and the relationship between users to assess the risk of negative engagement. If an account is flagged for potentially harmful behavior, it is automatically blocked for seven days, preventing it from following the user who has Safety Mode enabled, viewing their Tweets, or sending them Direct Messages. However, accounts the user follows or frequently engages with won't be autoblocked. Users can review and undo any blocks in their settings at any time (a simplified sketch of this autoblock logic follows the list).
  • Auto-blurring of images in DMs from unknown users prevents unintentional exposure to unwanted content.
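To make the Safety Mode behavior concrete, the following is a minimal sketch of temporary autoblocking under stated assumptions: the seven-day block window and the exemption for followed accounts come from the description above, while the toxicity scoring, threshold, and data structures are illustrative stand-ins rather than X's actual implementation.

```python
from datetime import datetime, timedelta, timezone

# Simplified sketch of Safety Mode-style temporary autoblocking. The seven-day
# window reflects the feature described above; the scoring function, threshold,
# and data structures are illustrative assumptions.

BLOCK_DURATION = timedelta(days=7)
TOXICITY_THRESHOLD = 0.8  # assumed cutoff for "potentially harmful" replies

class SafetyMode:
    def __init__(self, protected_user, following, toxicity_model):
        self.protected_user = protected_user
        self.following = set(following)        # accounts the protected user follows
        self.toxicity_model = toxicity_model   # callable: text -> score in [0, 1]
        self.autoblocks = {}                   # account -> block expiry time

    def review_reply(self, author, text, now=None):
        """Autoblock the author for seven days if the reply looks harmful,
        unless the protected user already follows them."""
        now = now or datetime.now(timezone.utc)
        if author in self.following:
            return False  # followed/frequent contacts are never autoblocked
        if self.toxicity_model(text) >= TOXICITY_THRESHOLD:
            self.autoblocks[author] = now + BLOCK_DURATION
            return True
        return False

    def is_blocked(self, account, now=None):
        now = now or datetime.now(timezone.utc)
        expiry = self.autoblocks.get(account)
        return expiry is not None and now < expiry

    def undo_block(self, account):
        """Users can review and remove any autoblock at any time."""
        self.autoblocks.pop(account, None)

# Example with a toy keyword "model" standing in for a real classifier.
toy_model = lambda text: 1.0 if "idiot" in text.lower() else 0.0
mode = SafetyMode("@target", following={"@friend"}, toxicity_model=toy_model)
mode.review_reply("@stranger", "you idiot")   # autoblocked for seven days
print(mode.is_blocked("@stranger"))           # True
mode.undo_block("@stranger")
print(mode.is_blocked("@stranger"))           # False
```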

References

  • GCF Global. (2019). Digital Media Literacy: How Filter Bubbles Isolate You. GCFGlobal.org. https://edu.gcfglobal.org/en/digital-media-literacy/how-filter-bubbles-isolate-you/1/
  • Slachmuijlder, L., & Bonilla, S. (2025). Prevention by design: A roadmap for tackling TFGBV at the source. https://techandsocialcohesion.org/wp-content/uploads/2025/03/Prevention-by-Design-A-Roadmap-for-Tackling-TFGBV-at-the-Source.pdf

Opportunities

  • Third-party filtering technology could be incorporated directly into platforms rather than requiring engineers to develop new tools from scratch (a sketch of this integration pattern follows).
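As a rough sketch of that integration pattern, the example below wraps a hypothetical third-party moderation client behind a single platform-side filter interface. The client class, its score_toxicity method, and the threshold are invented stand-ins, not any real vendor's API.

```python
from typing import Protocol

# Hypothetical sketch of plugging a third-party filtering service in behind a
# common platform interface instead of building filtering from scratch.

class ContentFilter(Protocol):
    def is_unwanted(self, text: str) -> bool: ...

class ToyThirdPartyClient:
    """Stand-in for an external moderation SDK (assumed, not a real product)."""
    def score_toxicity(self, text: str) -> float:
        return 1.0 if "idiot" in text.lower() else 0.0

class ThirdPartyFilterAdapter:
    """Adapts the (assumed) vendor client to the platform's filter interface,
    so feed and DM code can stay vendor-agnostic."""
    def __init__(self, client, threshold: float = 0.7):
        self.client = client
        self.threshold = threshold  # cutoff chosen by the user or platform

    def is_unwanted(self, text: str) -> bool:
        return self.client.score_toxicity(text) >= self.threshold

def filter_feed(items, content_filter: ContentFilter):
    """Keep only items the active filter does not flag."""
    return [item for item in items if not content_filter.is_unwanted(item)]

# Example usage with the toy client.
adapter = ThirdPartyFilterAdapter(ToyThirdPartyClient())
print(filter_feed(["nice work!", "what an idiot"], adapter))  # ['nice work!']
```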

Limitations

  • The complexity and evolving nature of online content make it difficult for filters to be fully effective.
    • Unintended consequences include over-blocking of legitimate content and the potential for harmful content to slip through.
  • Filter bubbles: Being in a filter bubble means the algorithms have isolated you from information and perspectives you have not already expressed an interest in, so you may miss out on important information (GCF Global, 2019).
  • Filters may work for individual harms but less so for social harms (a framework for analyzing this distinction, e.g. misinformation versus individual toxicity, may be needed).
  • Product and growth teams might be hesitant to implement filters if the content type being blocked affects their metrics (e.g. advertising, where relevant).