Overview

Introduction to the anonymize.today platform

anonymize.today - Project Overview

Last Updated: January 1, 2025
Classification: PUBLIC

Executive Summary

anonymize.today is a comprehensive, enterprise-grade platform for detecting and anonymizing Personally Identifiable Information (PII) in text. Built on Microsoft Presidio, the platform provides powerful tools for organizations to protect sensitive data, comply with privacy regulations like GDPR, and safely use data for testing and development purposes.

The platform combines advanced natural language processing with intuitive web interfaces, supporting 15 languages and offering flexible anonymization strategies. Whether you're a developer testing applications, a compliance officer ensuring data protection, or a business processing customer data, anonymize.today provides the tools you need to protect privacy while maintaining data utility.


Problem Statement

Organizations face increasing pressure to protect personal data while still using that data for legitimate business purposes. Key challenges include:

  • Regulatory Compliance: GDPR, HIPAA, and other regulations require organizations to protect personally identifiable information
  • Data Breach Risk: Exposed PII can lead to identity theft, financial fraud, and regulatory penalties
  • Development & Testing: Real data is needed for testing, but using actual PII creates security risks
  • Data Sharing: Sharing data with partners or third parties requires anonymization to protect privacy
  • Legacy Data: Organizations often have large volumes of data containing PII that needs to be anonymized

Traditional solutions are often too complex, too expensive, or too inflexible for modern data workflows.


Solution Overview

anonymize.today solves these challenges by providing:

1. Advanced PII Detection

  • 50+ Entity Types: Detects names, emails, phone numbers, credit cards, SSNs, addresses, and more
  • 15 Languages: English, German, Spanish, French, Italian, Portuguese, Dutch, Polish, Russian, Japanese, Chinese, and Korean with full NLP support, plus Arabic, Hindi, and Turkish through transformer-based recognition
  • Custom Entities: Create your own detection patterns with AI assistance
  • High Accuracy: Microsoft Presidio's proven NLP technology ensures reliable detection

2. Flexible Anonymization

  • Multiple Operators: Replace, redact, hash, encrypt, or mask detected PII
  • Reversible Encryption: Encrypt data for later deanonymization when needed
  • Preset Management: Save and reuse anonymization configurations
  • Batch Processing: Process multiple texts efficiently

3. Enterprise Features

  • Token-Based Usage: Fair, transparent pricing based on actual usage
  • Plan Management: Choose the plan that fits your needs (Free, Basic, Pro, Business)
  • API Access: Integrate anonymization into your applications
  • Team Collaboration: Share presets and entities with your organization

4. Security & Compliance

  • ISO 27001 Compliant: Enterprise-grade security standards
  • GDPR Ready: Built-in compliance features
  • Encryption: AES-256-GCM encryption for sensitive data
  • Audit Logging: Complete audit trails for compliance

Key Benefits

Protect Privacy: Anonymize sensitive data before sharing or using in non-production environments
Ensure Compliance: Meet GDPR, HIPAA, and other regulatory requirements
Reduce Risk: Minimize the impact of data breaches by anonymizing sensitive information
Enable Testing: Use realistic data for development and testing without privacy concerns
Save Time: Automated detection and anonymization saves hours of manual work
Scale Easily: Process large volumes of text efficiently with batch processing
Stay Flexible: Multiple anonymization strategies let you choose the right approach for each use case


Use Cases

GDPR Compliance

Organizations processing the personal data of individuals in the EU can use anonymize.today to:

  • Identify PII in existing datasets
  • Anonymize data before sharing with third parties
  • Create anonymized datasets for analytics
  • Demonstrate compliance with data protection regulations

Healthcare (HIPAA)

Healthcare organizations can:

  • Anonymize patient records for research
  • Protect PHI (Protected Health Information) in test environments
  • Share anonymized data with research partners
  • Comply with HIPAA privacy requirements

Financial Services

Financial institutions can:

  • Anonymize customer data for testing
  • Protect credit card numbers and account information
  • Share anonymized transaction data for analysis
  • Comply with PCI DSS and other financial regulations

Software Development

Development teams can:

  • Use realistic test data without privacy concerns
  • Anonymize production data for staging environments
  • Test applications with various PII types
  • Integrate anonymization into CI/CD pipelines via API

Data Analytics

Data analysts can:

  • Anonymize datasets before analysis
  • Share anonymized data with team members
  • Create privacy-preserving datasets for machine learning
  • Maintain data utility while protecting privacy

Technology Highlights

anonymize.today is built on modern, proven technologies:

  • Microsoft Presidio: Industry-leading PII detection and anonymization framework
  • Next.js 14: Modern React framework for fast, responsive web applications
  • PostgreSQL 16: Robust, scalable database for reliable data storage
  • Python 3.12: Powerful backend services with extensive NLP libraries
  • TypeScript: Type-safe development for reliability and maintainability

The platform is designed for:

  • Performance: Fast processing with optimized language model loading
  • Scalability: Handle large volumes of text efficiently
  • Reliability: 99.9% uptime target with robust error handling
  • Security: Enterprise-grade security with encryption and audit logging

Supported Languages

anonymize.today supports 15 languages with intelligent lazy loading:

Full NLP Support (12 languages)

  • English (en)
  • German (de)
  • Spanish (es)
  • French (fr)
  • Italian (it)
  • Portuguese (pt)
  • Dutch (nl)
  • Polish (pl)
  • Russian (ru)
  • Japanese (ja)
  • Chinese (zh)
  • Korean (ko)

Transformer-Only Languages (3 languages)

  • Arabic (ar)
  • Hindi (hi)
  • Turkish (tr)

These languages use English NLP processing with XLM-RoBERTa transformer for entity recognition, ensuring broad language coverage while maintaining performance.
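
The routing and lazy loading described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: `resolve_pipeline`, `load_pipeline`, `get_pipeline`, and `LOAD_LOG` are hypothetical names, and the loader only simulates an expensive model load.

```python
from functools import lru_cache

# Language codes taken from the two lists above.
FULL_NLP = {"en", "de", "es", "fr", "it", "pt", "nl", "pl", "ru", "ja", "zh", "ko"}
TRANSFORMER_ONLY = {"ar", "hi", "tr"}

# Records each simulated model load so the lazy behavior is observable.
LOAD_LOG = []


def resolve_pipeline(lang: str) -> tuple:
    """Return (pipeline language, use_transformer) for a requested language."""
    if lang in FULL_NLP:
        return lang, False
    if lang in TRANSFORMER_ONLY:
        # Transformer-only languages reuse the English pipeline plus a transformer.
        return "en", True
    raise ValueError(f"unsupported language: {lang}")


@lru_cache(maxsize=None)
def load_pipeline(pipeline_lang: str) -> dict:
    """Hypothetical loader: stands in for an expensive model load (assumption)."""
    LOAD_LOG.append(pipeline_lang)
    return {"lang": pipeline_lang}


def get_pipeline(lang: str) -> tuple:
    """Load the pipeline on first use only; later calls hit the cache."""
    pipeline_lang, use_transformer = resolve_pipeline(lang)
    return load_pipeline(pipeline_lang), use_transformer
```

With this shape, requesting German twice triggers one load, and Turkish and Arabic share a single English pipeline.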


Security Highlights

anonymize.today implements comprehensive security measures:

  • Encryption: AES-256-GCM encryption for data at rest, TLS 1.3 for data in transit
  • Authentication: Two-factor authentication (2FA) support, JWT-based sessions
  • Access Control: Role-based access control (Admin, Editor, User) with plan-based feature gating
  • Compliance: ISO 27001:2022 compliant, GDPR-ready with data protection measures
  • Audit Logging: Complete audit trails for all operations
  • Security Headers: HSTS, CSP, X-Frame-Options, and other security headers
  • Bot Protection: Google reCAPTCHA v3 for signup and password reset forms
  • Session Management: Device tracking, GeoIP location, and session revocation

All security measures follow industry best practices and OWASP Top 10 guidelines.


Getting Started

Getting started with anonymize.today is simple:

  1. Create an Account: Sign up for free at anonymize.today
  2. Choose a Plan: Start with the Free plan (300 tokens) or upgrade for more features
  3. Analyze Your First Text: Paste text and see PII detection in action
  4. Anonymize Data: Choose your anonymization strategy and protect sensitive information
  5. Explore Advanced Features: Create custom entities, use AI assistance, and manage presets

For detailed instructions, see the User Guide or Quick Start Guide.


Current Version: 4.5.8
Website: https://anonymize.today

anonymize.today - Features

Last Updated: January 1, 2025
Classification: PUBLIC

Overview

anonymize.today provides comprehensive PII detection and anonymization capabilities with features organized by subscription plan. This document provides a complete feature comparison and detailed descriptions of all available features.


Feature Comparison by Plan

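Feature | Free | Basic | Pro | Business
Monthly Tokens | 300 | 500 | 2,000 | 10,000
Token Cycle | 30 days | 31 days | 31 days | 31 days
Price/Month | €0 | €3 | €9 | €29

Core Features
PII Analyzer | ✅ | ✅ | ✅ | ✅
PII Anonymizer | ✅ | ✅ | ✅ | ✅
Personal Presets | ✅ | ✅ | ✅ | ✅
Personal Entities | ✅ | ✅ | ✅ | ✅

Advanced Features
Batch Processing | ❌ | ✅ (100 items) | ✅ (500 items) | ✅ (5,000 items)
Deanonymization | ❌ | ✅ | ✅ | ✅
Encryption (AES) | ❌ | ✅ | ✅ | ✅
Hashing (SHA-256, SHA-512, MD5) | ✅ | ✅ | ✅ | ✅
Masking | ✅ | ✅ | ✅ | ✅

Sharing & Collaboration
Public Presets | ❌ | ✅ | ✅ | ✅
Public Entities | ❌ | ✅ | ✅ | ✅
Team Sharing | ❌ | ✅ | ✅ | ✅

AI Features
AI Entity Creation | ❌ | ✅ | ✅ | ✅
AI Pattern Generation | ❌ | ✅ | ✅ | ✅
AI Pattern Refinement | ❌ | ✅ | ✅ | ✅

API & Integration
API Access | ❌ | ✅ | ✅ | ✅
API Token Management | ❌ | ✅ | ✅ | ✅

Token Management
Token Top-ups | ❌ | ✅ (+200 for €1) | ❌ | ❌
Token History | ✅ | ✅ | ✅ | ✅
Usage Analytics | ✅ | ✅ | ✅ | ✅

Account Features
Two-Factor Authentication | ✅ | ✅ | ✅ | ✅
Session Management | ✅ | ✅ | ✅ | ✅
Email Change | ✅ | ✅ | ✅ | ✅
Password Reset | ✅ | ✅ | ✅ | ✅
Data Export (GDPR) | ✅ | ✅ | ✅ | ✅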

Detailed Feature Descriptions

Core Features

PII Analyzer

Detect personally identifiable information in text with high accuracy.

Capabilities:

  • 50+ Entity Types: Names, emails, phone numbers, credit cards, SSNs, addresses, IBANs, tax IDs, and more
  • 15 Languages: Full support for major world languages with intelligent lazy loading
  • Confidence Scoring: Each detection includes a confidence score (0.0-1.0)
  • Custom Entities: Use your own entity patterns alongside built-in types
  • Preset Support: Apply saved presets for common detection scenarios
  • Token Cost Estimation: See estimated token cost before analysis

Token Cost: Typically 1-10 tokens depending on text length and entities found

Available on: All plans


PII Anonymizer

Anonymize detected PII using multiple strategies.

Anonymization Operators:

  • Replace: Replace with custom text (e.g., "[REDACTED]")
  • Redact: Remove text completely
  • Hash: SHA-256, SHA-512, or MD5 hash (one-way transformation)
  • Encrypt: AES encryption (reversible with key) - Basic+ plans
  • Mask: Partial masking (e.g., "J*** D**" or "j***@example.com")
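
To make the operator model concrete, here is a minimal sketch of applying per-entity operators to already-detected spans. It is an illustration only, not the platform's code: the `Detection` class, the function names, and the entity labels are assumptions, and only the replace, redact, and hash operators are shown.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Detection:
    entity_type: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def apply_operator(value: str, operator: str, new_value: str = "[REDACTED]") -> str:
    """Transform one detected value with the chosen operator."""
    if operator == "replace":
        return new_value
    if operator == "redact":
        return ""  # remove the text completely
    if operator == "hash":
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    raise ValueError(f"unknown operator: {operator}")


def anonymize(text: str, detections: list, operators: dict) -> str:
    """Apply one operator per entity type; unlisted types fall back to replace."""
    # Work right to left so earlier offsets stay valid after each substitution.
    for d in sorted(detections, key=lambda d: d.start, reverse=True):
        op = operators.get(d.entity_type, "replace")
        text = text[:d.start] + apply_operator(text[d.start:d.end], op) + text[d.end:]
    return text
```

Processing spans from the end of the text backwards avoids recomputing offsets when a replacement changes the text length.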

Features:

  • Operator-specific configuration
  • Preview before applying
  • Support for multiple operators per entity type
  • Encryption key management - Basic+ plans

Token Cost:

  • Apply-only (with analyzer results): 1-5 tokens
  • Full anonymize (without analyzer results): 2-10 tokens

Available on: All plans (Encryption requires Basic+)


Personal Presets

Save and reuse common entity and operator combinations.

Features:

  • Create custom presets with selected entities
  • Configure language and confidence threshold
  • Edit and delete presets (v4.5.7+)
  • Apply presets in Analyzer and Anonymizer
  • Language mismatch warnings

Available on: All plans


Personal Entities

Create custom entity patterns for specific detection needs.

Features:

  • Regex-based pattern matching
  • Multiple patterns per entity
  • Pattern testing with examples
  • Edit and delete entities (v4.5.7+)
  • Confidence scoring per pattern
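
As an illustration of regex-based entities with several patterns and a confidence score per pattern, consider the sketch below. The `Pattern` and `Match` classes, the `find_matches` helper, and the ticket-ID entity are hypothetical and are not taken from the platform.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    regex: str
    score: float  # confidence assigned to every match of this pattern


@dataclass(frozen=True)
class Match:
    entity: str
    text: str
    start: int
    end: int
    score: float


def find_matches(entity: str, patterns: list, text: str, threshold: float = 0.0) -> list:
    """Run every pattern and keep matches whose pattern score meets the threshold."""
    matches = []
    for p in patterns:
        if p.score < threshold:
            continue
        for m in re.finditer(p.regex, text):
            matches.append(Match(entity, m.group(0), m.start(), m.end(), p.score))
    # Order by position, strongest pattern first when spans start together.
    return sorted(matches, key=lambda m: (m.start, -m.score))


# Hypothetical entity: a strict pattern with high confidence plus a loose fallback.
TICKET_PATTERNS = [
    Pattern("strict", r"\bTICKET-\d{6}\b", 0.9),
    Pattern("loose", r"\b[A-Z]{2,6}-\d{3,}\b", 0.4),
]
```

Raising the threshold drops the loose pattern's matches, which is one way a confidence threshold in a preset could trade recall for precision.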

Available on: All plans


Advanced Features

Batch Processing

Process multiple texts efficiently in a single operation.

Features:

  • Process up to 100/500/5,000 items per batch (plan-dependent)
  • Individual results for each text
  • Export results
  • Token-efficient bulk operations

Token Cost: Each text charged separately using analyze formula

Available on: Basic, Pro, Business plans
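
A client or service could enforce the plan-dependent batch limits with a check like the one below. This is a sketch under stated assumptions: the limits come from this document, while the plan keys, function name, and error types are hypothetical.

```python
# Per-plan batch limits from this document; 0 means the feature is unavailable.
BATCH_LIMITS = {"free": 0, "basic": 100, "pro": 500, "business": 5000}


def check_batch(plan: str, items: list) -> int:
    """Validate a batch against the plan limit and return its size."""
    if plan not in BATCH_LIMITS:
        raise KeyError(f"unknown plan: {plan}")
    limit = BATCH_LIMITS[plan]
    if limit == 0:
        raise PermissionError(f"batch processing is not available on the {plan} plan")
    if len(items) > limit:
        raise ValueError(f"batch of {len(items)} exceeds the {plan} limit of {limit}")
    return len(items)
```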


Deanonymization

Reverse encrypted anonymizations to restore original text.

Features:

  • Restore encrypted PII with the same encryption key
  • Key management interface
  • Support for multiple entity types
  • Secure key storage

Requirements:

  • Basic, Pro, or Business plan
  • Same encryption key used for anonymization
  • Key length: 16, 24, or 32 characters
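
The three accepted key lengths correspond to AES-128, AES-192, and AES-256. A minimal validation sketch follows; the function name is hypothetical, and it assumes the key is encoded as UTF-8, so a non-ASCII character counts as more than one byte.

```python
# Key length in bytes mapped to the AES variant it selects.
AES_KEY_BITS = {16: 128, 24: 192, 32: 256}


def aes_key_bits(key: str) -> int:
    """Return the AES key size in bits, or raise if the length is not accepted."""
    raw = key.encode("utf-8")  # assumption: keys are UTF-8 encoded before use
    if len(raw) not in AES_KEY_BITS:
        raise ValueError(
            f"key must be 16, 24, or 32 bytes long, got {len(raw)}"
        )
    return AES_KEY_BITS[len(raw)]
```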

Token Cost: Typically 1-4 tokens

Available on: Basic, Pro, Business plans


Encryption (AES)

Reversible encryption for anonymization with later restoration capability.

Features:

  • AES encryption (128, 192, or 256-bit keys)
  • Key management interface
  • Default keys per entity type
  • Secure key storage (encrypted at rest)

Use Cases:

  • Anonymize data for testing, then restore for production
  • Temporary anonymization with restoration capability
  • Secure data sharing with authorized parties

Available on: Basic, Pro, Business plans


Hashing

One-way transformation for permanent anonymization.

Hash Types:

  • SHA-256 (recommended)
  • SHA-512
  • MD5

Use Cases:

  • Permanent anonymization
  • Data anonymization for analytics
  • Compliance with data retention policies

Available on: All plans


Masking

Partial masking to preserve some information while protecting privacy.

Features:

  • Configurable masking characters
  • Configurable number of characters to mask
  • Preserves partial information for readability

Examples:

  • Names: "John Doe" → "J*** D**"
  • Emails: "john@example.com" → "j***@example.com"
  • Phone: "+1-555-123-4567" → "+1-555-***-4567"
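
The name and email examples above follow a simple rule: keep the first character and mask the rest. The sketch below reproduces that rule. It is illustrative only; the function names and the `keep` and `mask_char` parameters are assumptions, and the phone example uses a different rule that is not covered here.

```python
def mask_words(value: str, keep: int = 1, mask_char: str = "*") -> str:
    """Keep the first `keep` characters of each space-separated word."""
    return " ".join(
        word[:keep] + mask_char * max(len(word) - keep, 0)
        for word in value.split(" ")
    )


def mask_email(value: str, keep: int = 1, mask_char: str = "*") -> str:
    """Mask only the local part of an email address; the domain stays readable."""
    local, sep, domain = value.partition("@")
    if not sep:
        # Not an email address: fall back to word masking.
        return mask_words(value, keep, mask_char)
    return local[:keep] + mask_char * max(len(local) - keep, 0) + "@" + domain


print(mask_words("John Doe"))  # J*** D**
```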

Available on: All plans


Sharing & Collaboration

Public Presets

Share anonymization presets with all users.

Features:

  • Make presets public for community use
  • Discover presets created by others
  • Built-in presets available to all users:
    • General PII Detection (Default)
    • GDPR Compliance
    • HIPAA Medical
    • Financial Services
    • Development & Testing
    • Multi-Language European

Available on: Basic, Pro, Business plans


Public Entities

Share custom entity patterns with all users.

Features:

  • Make entities public for community use
  • Discover entities created by others
  • Quality scoring and ratings

Available on: Basic, Pro, Business plans


Team Sharing

Share presets and entities with your organization.

Features:

  • Share with team members
  • Organization-wide access
  • Collaborative entity and preset management

Available on: Basic, Pro, Business plans


AI Features

AI Entity Creation

AI-powered wizard for creating custom entity patterns.

Features:

  • 4-step wizard:
    1. Describe what you want to detect with examples
    2. AI generates 3 pattern options (Balanced, Simple, Complex)
    3. Test patterns and refine if needed
    4. Review and save
  • Quality metrics (precision, recall, overall score)
  • Pattern refinement based on test results
  • Multi-language support

Token Cost: 50 tokens per AI operation (generation, refinement, validation)

Available on: Basic, Pro, Business plans


API & Integration

API Access

Programmatic access to anonymize.today services.

Features:

  • RESTful API with comprehensive endpoints
  • JWT-based authentication
  • API token management
  • Complete API documentation
  • Rate limiting and error handling

Use Cases:

  • Integrate anonymization into applications
  • Automated data processing
  • CI/CD pipeline integration
  • Custom workflows

Available on: Basic, Pro, Business plans


Token Management

Token Top-ups

Purchase additional tokens when needed.

Features:

  • Purchase +200 tokens for €1
  • Immediate token addition
  • Tokens expire at cycle reset
  • Secure payment processing (PayPal, Stripe)

Available on: Basic plan only


Token History

View detailed token usage and transaction history.

Features:

  • Transaction history
  • Usage by operation type
  • Cycle tracking
  • Cost breakdown

Available on: All plans


Usage Analytics

Track and analyze your token usage.

Features:

  • Usage statistics
  • Cost analysis
  • Operation breakdown
  • Cycle summaries

Available on: All plans


Account Features

Two-Factor Authentication (2FA)

Add an extra layer of security to your account.

Methods:

  • Authenticator app (Google Authenticator, Microsoft Authenticator, etc.)
  • Email verification

Features:

  • Backup codes
  • Default method selection
  • Easy enable/disable

Available on: All plans


Session Management

Monitor and control where you're signed in.

Features:

  • View all active sessions
  • Device information (browser, OS, device type)
  • Location information (country, city from IP)
  • Individual session revocation
  • Bulk session revocation ("Log out everywhere")
  • Automatic revocation on password/email change

Available on: All plans


Email Change

Securely change your account email address.

Features:

  • Re-authentication required
  • Dual notification (old and new email)
  • 24-hour cancellation window
  • Force re-login after change

Available on: All plans


Password Reset

Secure password reset with email verification.

Features:

  • Secure token-based reset (SHA-256 hashing)
  • 1-hour expiry
  • One-time use tokens
  • Password history (prevents reuse of last 3 passwords)
  • All sessions revoked after reset
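
The token properties listed above (only a SHA-256 hash stored, 1-hour expiry, one-time use) can be sketched with the standard library. This is a hypothetical in-memory illustration, not the platform's implementation; the `ResetTokenStore` class and its methods are assumptions.

```python
import hashlib
import secrets
from datetime import datetime, timedelta, timezone

TOKEN_TTL = timedelta(hours=1)  # 1-hour expiry, as stated above


class ResetTokenStore:
    """In-memory stand-in for a token table; stores hashes, never raw tokens."""

    def __init__(self):
        self._tokens = {}  # sha256 hex digest -> (user_id, expires_at)

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id: str, now=None) -> str:
        """Create a random token, store its hash, and return the raw token."""
        now = now or datetime.now(timezone.utc)
        token = secrets.token_urlsafe(32)
        self._tokens[self._digest(token)] = (user_id, now + TOKEN_TTL)
        return token

    def redeem(self, token: str, now=None):
        """Return the user id for a valid token, or None. Tokens work once."""
        now = now or datetime.now(timezone.utc)
        entry = self._tokens.pop(self._digest(token), None)  # pop enforces one-time use
        if entry is None:
            return None
        user_id, expires_at = entry
        return user_id if now <= expires_at else None
```

Storing only the digest means a leaked token table does not reveal usable reset links.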

Available on: All plans


Data Export (GDPR)

Export all your personal data in machine-readable format.

Features:

  • One-click data export
  • JSON format
  • Includes: profile, entities, presets, usage history, token ledger, subscriptions, payments
  • Rate limited: 1 export per hour

Compliance: Implements GDPR Article 20 (Right to Data Portability)

Available on: All plans


Mobile Support

All features are available on mobile devices with a responsive, touch-optimized interface:

  • Mobile Navigation: Hamburger menu with slide-out drawer
  • Touch-Optimized: All buttons sized for easy tapping (44px minimum)
  • Responsive Layout: Content adapts to screen size
  • Full Feature Access: All desktop features available on mobile

Available on: All plans


Language Support

anonymize.today supports 15 languages with intelligent lazy loading:

Full NLP Support (12 languages)

  • English (en)
  • German (de)
  • Spanish (es)
  • French (fr)
  • Italian (it)
  • Portuguese (pt)
  • Dutch (nl)
  • Polish (pl)
  • Russian (ru)
  • Japanese (ja)
  • Chinese (zh)
  • Korean (ko)

Transformer-Only Languages (3 languages)

  • Arabic (ar)
  • Hindi (hi)
  • Turkish (tr)

Available on: All plans



anonymize.today - Pricing & Plans

Last Updated: January 1, 2025
Classification: PUBLIC

Overview

anonymize.today offers flexible pricing plans to meet different needs, from individual users to enterprise organizations. All plans use a token-based system for fair, transparent pricing based on actual usage.


Plan Comparison

Plan | Tokens/Cycle | Cycle | Price/Month | Best For
Free | 300 | 30 days | €0 | Trying out the platform, occasional use
Basic | 500 | 31 days | €3 | Regular users, small businesses, developers
Pro | 2,000 | 31 days | €9 | Power users, frequent processing
Business | 10,000 | 31 days | €29 | Enterprise users, high-volume processing

Plan Details

Free Plan

Price: €0/month
Tokens: 300 tokens per 30-day cycle
Token Top-ups: Not available

Included Features:

  • ✅ PII Analyzer
  • ✅ PII Anonymizer
  • ✅ Personal Presets
  • ✅ Personal Entities
  • ✅ Hashing (SHA-256, SHA-512, MD5)
  • ✅ Masking
  • ✅ Two-Factor Authentication
  • ✅ Session Management
  • ✅ Email Change
  • ✅ Password Reset
  • ✅ Data Export (GDPR)

Limitations:

  • ❌ No Batch Processing
  • ❌ No Deanonymization
  • ❌ No Encryption
  • ❌ No API Access
  • ❌ No Public Presets/Entities
  • ❌ No AI Entity Creation
  • ❌ No Token Top-ups

Best For:

  • Trying out the platform
  • Occasional PII detection and anonymization
  • Personal use
  • Learning the features

Basic Plan

Price: €3/month
Tokens: 500 tokens per 31-day cycle
Token Top-ups: ✅ Available (+200 tokens for €1)

Included Features:

  • ✅ All Free plan features
  • ✅ Batch Processing (up to 100 items)
  • ✅ Deanonymization
  • ✅ Encryption (AES)
  • ✅ API Access
  • ✅ Public Presets
  • ✅ Public Entities
  • ✅ Team Sharing
  • ✅ AI Entity Creation
  • ✅ Token Top-ups

Best For:

  • Regular users processing PII
  • Small businesses
  • Developers integrating anonymization
  • Users needing API access
  • Users who want to share presets/entities

Pro Plan

Price: €9/month
Tokens: 2,000 tokens per 31-day cycle
Token Top-ups: Not available

Included Features:

  • ✅ All Basic plan features
  • ✅ Higher token allocation (2,000 vs 500)
  • ✅ Larger batch sizes (up to 500 items)

Best For:

  • Power users with frequent processing needs
  • Organizations processing larger volumes
  • Users who need higher token limits

Business Plan

Price: €29/month
Tokens: 10,000 tokens per 31-day cycle
Token Top-ups: Not available

Included Features:

  • ✅ All Pro plan features
  • ✅ Maximum token allocation (10,000)
  • ✅ Largest batch sizes (up to 5,000 items)

Best For:

  • Enterprise users
  • High-volume processing
  • Organizations with extensive anonymization needs

Token System

Token Allocation

Tokens are allocated at the start of each billing cycle:

  • Free Plan: 300 tokens every 30 days
  • Basic Plan: 500 tokens every 31 days
  • Pro Plan: 2,000 tokens every 31 days
  • Business Plan: 10,000 tokens every 31 days

Token Costs

All token costs are reduced by 50% from base formulas:

Analyze Operation:

  • Base cost: 2 tokens
  • Plus: 1.0 × text length (in thousands of characters)
  • Plus: 0.2 × number of entity types selected
  • Plus: 0.1 × number of entities found
  • Total reduced by 50%
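
The analyze formula above can be written out as a short calculation. The raw cost follows the listed terms directly. How a fractional cost becomes a whole number of tokens is not stated in this document, so the rounding in `billed_tokens` (round up, minimum 1) is an assumption, and both function names are hypothetical.

```python
import math


def analyze_cost(text_length: int, entity_types_selected: int, entities_found: int) -> float:
    """Raw analyze cost: base formula from the list above, reduced by 50%."""
    base = (
        2                                # base cost
        + 1.0 * (text_length / 1000)     # per thousand characters
        + 0.2 * entity_types_selected    # per entity type selected
        + 0.1 * entities_found           # per entity found
    )
    return base * 0.5


def billed_tokens(raw_cost: float) -> int:
    """Assumption: fractional costs round up, with a minimum charge of 1 token."""
    return max(1, math.ceil(raw_cost))


# 1,500 characters, 5 entity types selected, 4 entities found:
# (2 + 1.5 + 1.0 + 0.4) * 0.5 = 2.45 raw tokens
raw = analyze_cost(1500, 5, 4)
```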

Typical Costs:

  • Short text (100-500 chars): 1-3 tokens
  • Medium text (500-2000 chars): 3-6 tokens
  • Long text (2000+ chars): 6-10 tokens

Anonymize Operation:

  • Apply-only (with analyzer results): 1-5 tokens
  • Full anonymize (without analyzer results): 2-10 tokens

Deanonymize Operation:

  • Typically: 1-4 tokens

AI Operations:

  • AI Entity Creation: 50 tokens (fixed per operation)

Token Top-ups (Basic Plan Only)

Basic plan users can purchase additional tokens:

  • Amount: +200 tokens
  • Price: €1
  • Expiry: Tokens expire at the end of the current billing cycle
  • Payment: PayPal or Stripe

Note: Top-ups are not available on the Pro and Business plans, which include larger token allocations.


Payment Methods

We accept secure payments via:

PayPal

  • PayPal account payments
  • Credit/debit card payments through PayPal
  • Inline checkout with tabbed interface
  • Official PayPal branding and security badges

Stripe

  • Direct credit/debit card payments
  • Accepts Visa, Mastercard, American Express, Discover
  • 3D Secure authentication
  • Secure hosted checkout page

Security:

  • All payments processed via PCI-DSS compliant providers
  • We never store your full card details
  • Industry-standard encryption (TLS 1.3)
  • Secure webhook verification

Subscription Management

Upgrading Your Plan

  1. Navigate to Settings → Billing tab or visit the Pricing Page
  2. Click "Upgrade Plan" or "Start [Plan Name] Plan"
  3. Review plan details and pricing
  4. Select your preferred payment method (PayPal or Stripe)
  5. Complete the secure checkout process
  6. Your plan activates immediately
  7. Tokens are added to your account instantly
  8. You'll receive a confirmation email with invoice

Canceling Your Subscription

  • User-Initiated: Via Settings → Billing tab
  • Effect: Access continues until the end of your current billing period
  • After Cancellation: Automatic downgrade to Free plan at cycle end
  • Refunds: No refunds for partial months

Billing Cycle

  • Free Plan: 30-day cycles
  • Paid Plans: 31-day cycles (monthly)
  • Renewal: Automatic renewal at cycle end
  • Reminders: Email reminder 3 days before renewal
  • Token Reset: Tokens reset at the start of each new cycle
  • Unused Tokens: Not carried over to next cycle

Payment History & Invoices

  • View all past payments in Settings → Billing
  • Download PDF invoices
  • Payment history with transaction details
  • Email confirmation for all payments

Upgrade/Downgrade Policies

Upgrading

  • Immediate Effect: Upgrades take effect immediately
  • Prorating: Not currently available (full month charge)
  • Token Allocation: New plan's token allocation added immediately
  • Cycle Reset: Token cycle resets to new plan's cycle length

Downgrading

  • At Cycle End: Downgrades take effect at the end of current billing period
  • Access: Full access to current plan until cycle end
  • Token Reset: Tokens reset to new plan's allocation at cycle start
  • Feature Access: Feature access changes at cycle start

Enterprise & Custom Plans

For organizations with specific needs:

  • Custom token allocations
  • Extended batch sizes
  • Priority support
  • Custom integrations
  • Dedicated account management

Contact us for enterprise pricing and custom solutions.


Frequently Asked Questions

Q: What happens if I run out of tokens?

A: You can:

  • Wait for your cycle reset (if close to cycle end)
  • Purchase a top-up (Basic plan only: +200 tokens for €1)
  • Upgrade to a higher plan for more tokens

Q: Can I change plans anytime?

A: Yes! You can upgrade or downgrade at any time. Upgrades take effect immediately. Downgrades take effect at the end of your current billing period.

Q: Do unused tokens carry over?

A: No, unused tokens do not carry over to the next cycle. Tokens reset at the start of each new cycle.

Q: What payment methods do you accept?

A: We accept PayPal and Stripe (credit/debit cards: Visa, Mastercard, Amex, Discover).

Q: Is there a free trial?

A: The Free plan is always available with 300 tokens per cycle - no credit card required!

Q: Can I cancel anytime?

A: Yes, you can cancel your subscription at any time. You'll retain access until the end of your current billing period.


Pricing Page: https://anonymize.today/pricing

anonymize.today - Architecture

Last Updated: January 1, 2025
Classification: PUBLIC

Overview

anonymize.today is built on a modern, scalable architecture designed for performance, security, and reliability. This document provides a high-level overview of the system architecture, technology stack, and key components.


System Components

Frontend

  • Framework: Next.js 14 (App Router)
  • Language: TypeScript
  • UI Library: React 18
  • Styling: Tailwind CSS
  • Components: Radix UI, shadcn/ui
  • State Management: React Query, Zustand
  • Authentication: NextAuth.js

Backend Services

  • Language: Python 3.12
  • Framework: Microsoft Presidio
  • API Framework: FastAPI
  • NLP Library: spaCy
  • Services:
    • Analyzer (Port 8011)
    • Anonymizer (Port 8012)
    • Image Redactor (Port 8013)
    • Structured Data (Port 8014)

Database

  • Database: PostgreSQL 16
  • ORM: Prisma
  • Connection: Local connection for security

Infrastructure

  • Web Server: Nginx (reverse proxy)
  • SSL/TLS: Let's Encrypt certificates
  • Process Management: Systemd
  • Operating System: Ubuntu 24.04 LTS

Technology Stack

Frontend Technologies

  • Next.js 14.2.28
  • React 18.2.0
  • TypeScript 5.2.2
  • Tailwind CSS 3.3.3
  • Prisma 6.7.0

Backend Technologies

  • Python 3.12+
  • Microsoft Presidio
  • FastAPI
  • spaCy (NLP)
  • PostgreSQL 16

Infrastructure

  • Nginx
  • Systemd
  • UFW (Firewall)
  • Fail2Ban

System Architecture

┌─────────────────┐
│   Internet      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Nginx (443)   │  ← SSL/TLS Termination
└────────┬────────┘
         │
    ┌────┴────┐
    │         │
    ▼         ▼
┌─────────┐ ┌──────────────────┐
│Frontend │ │  Backend Services│
│(Next.js)│ │  (Presidio)      │
│Port 3000│ │  Ports 8011-8014 │
└────┬────┘ └────────┬─────────┘
     │               │
     └───────┬───────┘
             │
             ▼
      ┌──────────────┐
      │  PostgreSQL  │
      │  Port 5432   │
      └──────────────┘

Data Flow

Analysis Flow

  1. User submits text via web interface
  2. Frontend sends request to Analyzer service
  3. Analyzer processes text with Presidio NLP
  4. Results returned to frontend
  5. Detected entities displayed to user

Anonymization Flow

  1. User selects anonymization operators
  2. Frontend sends text and operators to Anonymizer service
  3. Anonymizer applies operators to detected PII
  4. Anonymized text returned to user
  5. Token cost deducted from user balance
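
The two flows above can be sketched as a pair of HTTP requests. This sketch assumes the Analyzer and Anonymizer services expose the stock Presidio REST endpoints (`/analyze` and `/anonymize`) on the ports listed under Backend Services; the URLs, helper names, and that assumption itself are not confirmed by this document. The requests are only built here, not sent.

```python
import json
from urllib.request import Request

# Assumptions: local service addresses and stock Presidio endpoint paths.
ANALYZER_URL = "http://localhost:8011/analyze"
ANONYMIZER_URL = "http://localhost:8012/anonymize"


def _json_request(url: str, payload: dict) -> Request:
    """Build a JSON POST request without sending it."""
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_analyze_request(text: str, language: str = "en") -> Request:
    """Analysis flow, step 2: send text to the Analyzer service."""
    return _json_request(ANALYZER_URL, {"text": text, "language": language})


def build_anonymize_request(text: str, analyzer_results: list, anonymizers: dict) -> Request:
    """Anonymization flow, step 2: send text, detections, and operators."""
    return _json_request(
        ANONYMIZER_URL,
        {"text": text, "analyzer_results": analyzer_results, "anonymizers": anonymizers},
    )
```

A caller would pass each request to `urllib.request.urlopen` and feed the analyzer's JSON result list into `build_anonymize_request`.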

Integration Points

External Services

  • PayPal: Payment processing and subscriptions
  • Stripe: Payment processing and subscriptions
  • Microsoft 365: Email service for transactional emails
  • Google reCAPTCHA: Bot protection
  • GeoIP Service: Location detection for session tracking

API Access

  • RESTful API for programmatic access
  • JWT-based authentication
  • Rate limiting and error handling
  • Comprehensive API documentation

Scalability Considerations

Performance Optimizations

  • Lazy Loading: Language models loaded on-demand
  • Caching: Model caching for improved performance
  • Efficient Processing: Optimized NLP processing
  • Batch Processing: Efficient bulk operations

Resource Management

  • Memory Limits: Service-specific memory limits
  • CPU Optimization: Efficient resource usage
  • Database Optimization: Indexed queries and efficient schemas

Security Architecture

Security Layers

  1. Network Security: Firewall, DDoS protection
  2. Transport Security: TLS 1.3 encryption
  3. Application Security: Authentication, authorization, input validation
  4. Data Security: Encryption at rest, secure storage
  5. Monitoring: Logging, audit trails, intrusion detection

Authentication Flow

  1. User authenticates via NextAuth.js
  2. JWT token generated and stored in HTTP-only cookie
  3. Token validated on each request
  4. Session management with device tracking
  5. Automatic revocation on security events

Deployment Architecture

Service Deployment

  • Frontend: Deployed on high-speed volume
  • Backend Services: Deployed on high-speed volume
  • Database: Deployed on high-speed volume
  • Nginx: Reverse proxy and SSL termination

High Availability

  • Service Monitoring: Systemd service management
  • Automatic Restart: Service auto-restart on failure
  • Health Checks: Regular health check endpoints
  • Backup System: Automated backup and recovery

Performance Characteristics

Response Times

  • Analysis: < 1 second for typical text
  • Anonymization: < 1 second for typical text
  • API Calls: < 500ms average
  • Page Load: < 2 seconds

Throughput

  • Concurrent Users: Supports multiple concurrent users
  • Batch Processing: Efficient bulk processing
  • API Rate Limits: Appropriate rate limiting


Note: Detailed architecture diagrams are available in the diagrams/ directory.