Policy Library | 200+ Pre-Built Policies for Trust & Safety Teams

Hate Speech / Hateful Content

Identify and moderate hate speech targeting protected characteristics including race, ethnicity, religion, gender identity, sexual orientation, and disability status

Glorifying and Inciting Violence

Detect and prevent content that glorifies or incites violent acts, terrorism, or dangerous organizations

Graphic Violence and Gore

Identify and filter excessively graphic violent content, gore, and disturbing imagery

Self-Harm and Suicide

Detect content promoting self-harm, suicide, or eating disorders while allowing support resources

Child Safety (CSAM)

Prevent and report child sexual abuse material and grooming behavior

Sexual Content and Adult Material

Moderate explicit sexual content, pornography, and adult material according to platform policies

Harassment and Bullying

Identify patterns of harassment, cyberbullying, doxxing, and targeted abuse

Spam and Inauthentic Behavior

Detect spam, bot networks, coordinated inauthentic behavior, and manipulation campaigns

Misinformation and Deepfakes

Flag false information, manipulated media, and AI-generated deepfakes that could cause harm

Dangerous Challenges and Pranks

Prevent promotion of dangerous viral challenges, harmful pranks, and risky behaviors

Account Takeover (ATO)

Detect credential stuffing, unauthorized account access, and identity theft attempts

Payment Fraud

Identify stolen credit cards, fraudulent transactions, chargebacks, and money laundering patterns

Synthetic Identity Fraud

Detect artificially created identities using combinations of real and fake information

Triangulation Fraud

Identify three-party scams in which fraudsters use stolen payment methods to fulfill orders placed by unsuspecting buyers

Fake Reviews and Rating Manipulation

Detect coordinated fake reviews, review farms, and rating manipulation schemes

Dropshipping

Identify dropshipping operations that violate marketplace policies and mislead customers

Predatory Pyramid Schemes and MLMs

Detect multi-level marketing schemes, pyramid schemes, and predatory business opportunities

Phishing and Social Engineering

Identify phishing attempts, social engineering scams, and credential harvesting

Refund and Return Abuse

Detect patterns of refund fraud, wardrobing, and serial returning

Promo Code and Loyalty Fraud

Prevent abuse of promotional codes, referral programs, and loyalty rewards

Ticket and Event Fraud

Identify scalping, fraudulent tickets, bot purchases, and secondary market violations

Marketplace Scams

Detect non-delivery scams, item-not-as-described fraud, and seller/buyer scams

Sybil Attacks and Multi-Accounting

Identify users creating multiple accounts to circumvent restrictions or game systems

Weapons and Firearms

Enforce weapons, firearms, and ammunition policies including country-specific regulations

Controlled Substances and Drugs

Prevent sale of illegal drugs, prescription medications without authorization, and drug paraphernalia

Tobacco and Vaping Products

Regulate tobacco, e-cigarettes, and vaping product sales according to age and location restrictions

Alcohol Sales Compliance

Ensure alcohol sales comply with age verification, licensing, and shipping restrictions

Gambling and Betting

Enforce gambling, sports betting, and games of chance regulations by jurisdiction

Harmful Medical Claims / Medical Devices

Detect false medical claims, unapproved medical devices, and misleading health products

Endangered Species and Wildlife

Prevent illegal wildlife trade, endangered species products, and CITES violations

Hazardous Materials

Regulate dangerous chemicals, explosives, and hazardous materials according to safety standards

Electronic Equipment Policy

Enforce compliance for electronic devices, including safety certifications, import restrictions, and rules on prohibited electronics

Age-Restricted Content

Enforce age verification for adult content, mature products, and age-gated services

IP and Brand Protection

Protect intellectual property, trademarks, copyrights, and brand rights

Counterfeit Goods

Identify and remove counterfeit products, fake designer goods, and trademark infringement

Sanctions and Embargoes

Enforce international sanctions, trade embargoes, and restricted party screening

Data Privacy (GDPR, CCPA)

Ensure compliance with data protection regulations including GDPR, CCPA, and regional privacy laws

Accessibility Compliance

Monitor compliance with accessibility standards (WCAG, ADA) for inclusive user experiences

Malicious AI Use

Prevent use of AI for creating malware, hacking tools, or automated attacks

AI-Generated Misinformation

Detect AI-generated fake news, deepfakes, and synthetic media designed to deceive

Prompt Injection and Jailbreaking

Identify attempts to manipulate AI systems through prompt injection, jailbreaks, or system bypasses

AI Model Misuse

Prevent unauthorized use of AI models for harmful purposes including bias amplification

Automated Harm at Scale

Detect AI-powered automation of harassment, spam, or coordinated harmful behavior

AI-Assisted Fraud

Identify fraud schemes enhanced by AI including voice cloning scams and automated account creation

Synthetic Identity Creation

Detect AI-generated fake identities, profile photos, and authentication bypass attempts

Card Issuing Fraud

Detect and prevent fraudulent card applications, first-party fraud, bust-out schemes, and abuse of newly issued payment cards

Bot Detection

Identify automated bot traffic, credential stuffing attacks, scraping, and non-human behavior patterns across platforms

Transaction Laundering

Detect merchants processing unauthorized transactions through legitimate merchant accounts, a form of payment fraud where illicit businesses hide behind compliant fronts

Money Laundering (AML)

Identify suspicious transaction patterns, structuring, layering, and integration of illicit funds in compliance with anti-money laundering regulations

Countering the Financing of Terrorism (CFT)

Screen transactions and entities against terrorist financing watchlists, detect patterns consistent with terror funding, and ensure regulatory compliance

Adult Website Compliance

Ensure adult content platforms comply with age verification requirements, consent documentation, performer verification, and regional regulations for adult material

Eating Disorder Promotion

Detect and moderate content that promotes or glorifies eating disorders, dangerous dieting practices, or pro-anorexia/bulimia messaging while allowing recovery support resources

Ad Integrity

Ensure advertisements are truthful, non-deceptive, and compliant with platform policies including claims verification, disclosure requirements, and prohibited ad categories

Brand-Aligned Content

Moderate user-generated and public-facing content to ensure it meets brand safety standards, maintaining PG-13 appropriateness for general audiences and advertiser compatibility

Cybercrime

Detect and prevent content related to hacking services, malware distribution, exploit trading, DDoS-for-hire, and other cyber-criminal activities

AI-Generated Media Detection

Detect AI-generated images, videos, audio, and text including deepfakes, synthetic media, and artificially created content to ensure authenticity and prevent misuse

Recalled Products

Identify and remove listings for products subject to government recalls, safety alerts, and mandatory withdrawals from consumer protection agencies

Product Safety

Enforce product safety standards including certification requirements, safety testing compliance, and removal of hazardous or non-compliant consumer goods

Brand Safety

Ensure content and ads appear in brand-appropriate contexts, preventing association with harmful, controversial, or off-brand material to protect advertiser and platform reputation

SafetyKit's Ready-to-Deploy Policies
