monolog-gdpr-filter/docs/plugin-development.md
Ismo Vuorinen 8866daaf33 feat: add advanced architecture, documentation, and coverage improvements (#65)
2025-12-22 13:38:18 +02:00

Plugin Development Guide

This guide explains how to create custom plugins for the Monolog GDPR Filter library.

Introduction

Plugins extend the GDPR processor's functionality without modifying core code. Use plugins when you need to:

  • Add custom masking patterns for your domain
  • Transform messages before or after standard masking
  • Enrich context with metadata
  • Integrate with external systems
  • Apply organization-specific compliance rules

When to Use Plugins vs. Configuration

| Scenario | Use Plugin | Use Configuration |
|---|---|---|
| Add regex patterns | Yes (via getPatterns()) | Yes (via constructor) |
| Custom transformation logic | Yes | No |
| Conditional processing | Yes | No |
| Multiple reusable rules | Yes | No |
| Simple field masking | No | Yes |

Quick Start

Create a minimal plugin in three steps:

Step 1: Create the Plugin Class

<?php

namespace App\Logging\Plugins;

use Ivuorinen\MonologGdprFilter\Plugins\AbstractMaskingPlugin;

class MyCompanyPlugin extends AbstractMaskingPlugin
{
    public function getName(): string
    {
        return 'my-company-plugin';
    }

    public function getPatterns(): array
    {
        return [
            '/INTERNAL-\d{6}/' => '[INTERNAL-ID]',  // Internal ID format
            '/EMP-[A-Z]{2}\d{4}/' => '[EMPLOYEE-ID]',  // Employee IDs
        ];
    }
}

Step 2: Register the Plugin

<?php

use Ivuorinen\MonologGdprFilter\Builder\GdprProcessorBuilder;
use App\Logging\Plugins\MyCompanyPlugin;

$processor = GdprProcessorBuilder::create()
    ->withDefaultPatterns()
    ->addPlugin(new MyCompanyPlugin())
    ->buildWithPlugins();

Step 3: Use with Monolog

<?php

use Monolog\Logger;
use Monolog\Handler\StreamHandler;

$logger = new Logger('app');
$logger->pushHandler(new StreamHandler('app.log'));
$logger->pushProcessor($processor);

// Internal IDs and employee IDs are now masked
$logger->info('User INTERNAL-123456 (EMP-AB1234) logged in');
// Output: User [INTERNAL-ID] ([EMPLOYEE-ID]) logged in

Plugin Interface

All plugins must implement MaskingPluginInterface:

interface MaskingPluginInterface
{
    // Identification
    public function getName(): string;

    // Pre-processing hooks (before standard masking)
    public function preProcessContext(array $context): array;
    public function preProcessMessage(string $message): string;

    // Post-processing hooks (after standard masking)
    public function postProcessContext(array $context): array;
    public function postProcessMessage(string $message): string;

    // Configuration contribution
    public function getPatterns(): array;
    public function getFieldPaths(): array;

    // Execution order control
    public function getPriority(): int;
}

Method Reference

| Method | Purpose | When Called |
|---|---|---|
| getName() | Unique identifier for debugging | On registration |
| preProcessContext() | Modify context before masking | Before core masking |
| preProcessMessage() | Modify message before masking | Before core masking |
| postProcessContext() | Modify context after masking | After core masking |
| postProcessMessage() | Modify message after masking | After core masking |
| getPatterns() | Provide regex patterns | During build |
| getFieldPaths() | Provide field paths to mask | During build |
| getPriority() | Control execution order | During sorting |

Abstract Base Class

Extend AbstractMaskingPlugin to avoid implementing unused methods:

<?php

namespace Ivuorinen\MonologGdprFilter\Plugins;

abstract class AbstractMaskingPlugin implements MaskingPluginInterface
{
    public function __construct(protected readonly int $priority = 100)
    {
    }

    // Default implementations return input unchanged
    public function preProcessContext(array $context): array { return $context; }
    public function postProcessContext(array $context): array { return $context; }
    public function preProcessMessage(string $message): string { return $message; }
    public function postProcessMessage(string $message): string { return $message; }
    public function getPatterns(): array { return []; }
    public function getFieldPaths(): array { return []; }
    public function getPriority(): int { return $this->priority; }
}

Benefits

  • Override only the methods you need
  • Default priority of 100 (customizable via constructor)
  • All hooks pass data through unchanged by default

Registration

Register plugins using GdprProcessorBuilder:

<?php

use Ivuorinen\MonologGdprFilter\Builder\GdprProcessorBuilder;
use Ivuorinen\MonologGdprFilter\FieldMaskConfig;

// Single plugin
$processor = GdprProcessorBuilder::create()
    ->addPlugin($plugin)
    ->buildWithPlugins();

// Multiple plugins
$processor = GdprProcessorBuilder::create()
    ->addPlugins([$plugin1, $plugin2, $plugin3])
    ->buildWithPlugins();

// With other configuration
$processor = GdprProcessorBuilder::create()
    ->withDefaultPatterns()
    ->addPattern('/custom/', '[MASKED]')
    ->addFieldPath('secret', FieldMaskConfig::remove())
    ->addPlugin($plugin)
    ->withAuditLogger($auditLogger)
    ->buildWithPlugins();

Return Types

// No plugins: returns GdprProcessor (no wrapper overhead)
$processor = GdprProcessorBuilder::create()
    ->withDefaultPatterns()
    ->buildWithPlugins();  // GdprProcessor

// With plugins: returns PluginAwareProcessor (wraps GdprProcessor)
$processor = GdprProcessorBuilder::create()
    ->addPlugin($plugin)
    ->buildWithPlugins();  // PluginAwareProcessor

Hook Execution Order

Understanding execution order is critical for plugins that interact:

1. preProcessMessage()  - Plugins in priority order (10, 20, 30...)
2. preProcessContext()  - Plugins in priority order (10, 20, 30...)
3. [Core GdprProcessor masking]
4. postProcessMessage() - Plugins in REVERSE order (30, 20, 10...)
5. postProcessContext() - Plugins in REVERSE order (30, 20, 10...)

Why Reverse Order for Post-Processing?

Post-processing runs in reverse to properly "unwrap" transformations:

// Plugin A (priority 10) wraps: "data" -> "[A:data:A]"
// Plugin B (priority 20) wraps: "[A:data:A]" -> "[B:[A:data:A]:B]"

// Post-processing reverse order ensures proper unwrapping:
// Plugin B runs first: "[B:[A:masked:A]:B]" -> "[A:masked:A]"
// Plugin A runs second: "[A:masked:A]" -> "masked"
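
The ordering can be checked with a small standalone simulation. This is a sketch only: `WrapPlugin` and `runPipeline()` are hypothetical stand-ins written for this example, not library classes, and the "core masking" step is reduced to a plain string replacement.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a plugin: only the parts needed to show ordering.
final class WrapPlugin
{
    public function __construct(
        public readonly string $tag,
        public readonly int $priority,
    ) {
    }

    // Pre-processing wraps the message in this plugin's markers.
    public function preProcessMessage(string $message): string
    {
        return "[{$this->tag}:{$message}:{$this->tag}]";
    }

    // Post-processing strips this plugin's markers if they are outermost.
    public function postProcessMessage(string $message): string
    {
        $open = "[{$this->tag}:";
        $close = ":{$this->tag}]";
        if (str_starts_with($message, $open) && str_ends_with($message, $close)) {
            return substr($message, strlen($open), -strlen($close));
        }

        return $message;
    }
}

/**
 * @param list<WrapPlugin> $plugins
 * @param callable(string): string $coreMasking
 */
function runPipeline(array $plugins, string $message, callable $coreMasking): string
{
    // Ascending priority: lower numbers run first in pre-processing.
    usort($plugins, fn (WrapPlugin $a, WrapPlugin $b): int => $a->priority <=> $b->priority);

    foreach ($plugins as $plugin) {
        $message = $plugin->preProcessMessage($message);
    }

    $message = $coreMasking($message);

    // Reverse order: the plugin that wrapped last unwraps first.
    foreach (array_reverse($plugins) as $plugin) {
        $message = $plugin->postProcessMessage($message);
    }

    return $message;
}

$plugins = [new WrapPlugin('B', 20), new WrapPlugin('A', 10)];
$mask = fn (string $m): string => str_replace('data', 'masked', $m);

echo runPipeline($plugins, 'data', $mask), PHP_EOL; // masked
```

If the post-processing loop ran in forward order instead, plugin A would see `[B:...:B]` as the outermost wrapper, leave it alone, and the markers would never be fully removed.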

Priority System

Lower numbers execute earlier in pre-processing and, because post-processing runs in reverse order, later in post-processing:

class HighPriorityPlugin extends AbstractMaskingPlugin
{
    public function __construct()
    {
        parent::__construct(priority: 10);  // Runs early
    }
}

class NormalPriorityPlugin extends AbstractMaskingPlugin
{
    // Default priority: 100
}

class LowPriorityPlugin extends AbstractMaskingPlugin
{
    public function __construct()
    {
        parent::__construct(priority: 200);  // Runs late
    }
}

| Range | Use Case | Example |
|---|---|---|
| 1-50 | Security/validation | Input sanitization |
| 50-100 | Standard processing | Pattern masking |
| 100-150 | Business logic | Domain-specific rules |
| 150-200 | Enrichment | Adding metadata |
| 200+ | Cleanup/finalization | Removing temp fields |

Configuration Contribution

Plugins can contribute patterns and field paths that are merged into the processor:

Adding Patterns

public function getPatterns(): array
{
    return [
        '/ACME-\d{8}/' => '[ACME-ORDER]',
        '/INV-[A-Z]{2}-\d+/' => '[INVOICE]',
    ];
}
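
To make the effect of a pattern map concrete, here is a minimal standalone sketch. `applyPatterns()` is a hypothetical helper written for this example, and the `array_merge()` step is an assumption about how maps from several plugins might be combined; the library's actual merge order and conflict handling are not described in this guide.

```php
<?php

declare(strict_types=1);

/**
 * Apply a pattern => replacement map to a message.
 * Illustrative helper only; not part of the library.
 *
 * @param array<string, string> $patterns
 */
function applyPatterns(array $patterns, string $message): string
{
    // preg_replace() accepts parallel arrays of patterns and replacements.
    $result = preg_replace(array_keys($patterns), array_values($patterns), $message);

    // preg_replace() returns null on a regex error; fail closed in that case
    // rather than returning the unmasked input.
    return $result ?? '[MASKING FAILED]';
}

$orderPatterns = ['/ACME-\d{8}/' => '[ACME-ORDER]'];
$invoicePatterns = ['/INV-[A-Z]{2}-\d+/' => '[INVOICE]'];

// Assumed merge step: with array_merge(), a later map wins when two maps
// define the same regex key.
$merged = array_merge($orderPatterns, $invoicePatterns);

echo applyPatterns($merged, 'Order ACME-12345678 billed as INV-FI-991'), PHP_EOL;
// Order [ACME-ORDER] billed as [INVOICE]
```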

Adding Field Paths

use Ivuorinen\MonologGdprFilter\FieldMaskConfig;

public function getFieldPaths(): array
{
    return [
        // Static replacement
        'api_key' => FieldMaskConfig::replace('[API_KEY]'),

        // Remove field entirely
        'internal.debug' => FieldMaskConfig::remove(),

        // Apply regex to field value
        'user.notes' => FieldMaskConfig::regexMask('/\d{3}-\d{2}-\d{4}/', '[SSN]'),

        // Use processor's global patterns
        'user.bio' => FieldMaskConfig::useProcessorPatterns(),
    ];
}
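
The dotted keys above address nested context arrays: `user.notes` refers to `$context['user']['notes']`. The sketch below shows one way such a path could be resolved. `maskAtPath()` is a hypothetical helper for illustration; the library's own accessor may behave differently, for example for missing paths.

```php
<?php

declare(strict_types=1);

/**
 * Replace the value at a dot-separated path, if the path exists.
 * Hypothetical helper; not part of the library.
 *
 * @param array<string, mixed> $context
 * @return array<string, mixed>
 */
function maskAtPath(array $context, string $path, string $replacement): array
{
    // Split off the first segment; $rest is null when this is the last one.
    [$head, $rest] = array_pad(explode('.', $path, 2), 2, null);

    if (!array_key_exists($head, $context)) {
        return $context; // Path does not exist: leave the context untouched.
    }

    if ($rest === null) {
        $context[$head] = $replacement;

        return $context;
    }

    // Descend only into nested arrays; scalars cannot contain the rest of the path.
    if (is_array($context[$head])) {
        $context[$head] = maskAtPath($context[$head], $rest, $replacement);
    }

    return $context;
}

$context = [
    'api_key' => 'sk-123',
    'user' => ['bio' => 'text', 'notes' => 'SSN 123-45-6789'],
];

$masked = maskAtPath($context, 'api_key', '[API_KEY]');
$masked = maskAtPath($masked, 'user.notes', '[NOTES]');
$masked = maskAtPath($masked, 'internal.debug', '[DEBUG]'); // No such path: no-op

print_r($masked);
```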

Use Cases

Use Case 1: Message Transformation

Transform messages before masking:

class NormalizePlugin extends AbstractMaskingPlugin
{
    public function getName(): string
    {
        return 'normalize-plugin';
    }

    public function preProcessMessage(string $message): string
    {
        // Normalize whitespace before masking
        // preg_replace() returns null on a regex error; fall back to the original message
        return preg_replace('/\s+/', ' ', trim($message)) ?? $message;
    }
}

Use Case 2: Domain-Specific Patterns

Add patterns for your organization:

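use Ivuorinen\MonologGdprFilter\FieldMaskConfig;
use Ivuorinen\MonologGdprFilter\Plugins\AbstractMaskingPlugin;

class HealthcarePlugin extends AbstractMaskingPlugin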
{
    public function getName(): string
    {
        return 'healthcare-plugin';
    }

    public function getPatterns(): array
    {
        return [
            // Medical Record Number
            '/MRN-\d{10}/' => '[MRN]',
            // National Provider Identifier
            '/NPI-\d{10}/' => '[NPI]',
            // DEA Number
            '/DEA-[A-Z]{2}\d{7}/' => '[DEA]',
        ];
    }

    public function getFieldPaths(): array
    {
        return [
            'patient.diagnosis' => FieldMaskConfig::replace('[PHI]'),
            'patient.medications' => FieldMaskConfig::remove(),
        ];
    }
}

Use Case 3: Context Enrichment

Add metadata to context:

class AuditPlugin extends AbstractMaskingPlugin
{
    public function getName(): string
    {
        return 'audit-plugin';
    }

    public function __construct(private readonly string $environment)
    {
        parent::__construct(priority: 150);  // High number: runs early in post-processing (reverse order)
    }

    public function postProcessContext(array $context): array
    {
        $context['_audit'] = [
            'processed_at' => date('c'),
            'environment' => $this->environment,
            'plugin_version' => '1.0.0',
        ];
        return $context;
    }
}

Use Case 4: Conditional Masking

Apply masking based on conditions:

class EnvironmentAwarePlugin extends AbstractMaskingPlugin
{
    public function getName(): string
    {
        return 'environment-aware-plugin';
    }

    public function preProcessContext(array $context): array
    {
        // Only mask in production
        if (getenv('APP_ENV') !== 'production') {
            return $context;
        }

        // Add extra masking for production
        if (isset($context['debug_info'])) {
            $context['debug_info'] = '[REDACTED IN PRODUCTION]';
        }

        return $context;
    }
}
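
One possible variation, not taken from the library, is to inject the environment through the constructor instead of calling getenv() on every log entry. That makes both branches easy to exercise in tests. The class below is a standalone sketch with a hypothetical name; a real plugin would extend AbstractMaskingPlugin and implement getName().

```php
<?php

declare(strict_types=1);

// Standalone sketch: same condition as above, but the environment is injected.
final class EnvironmentAwareRedactor
{
    public function __construct(private readonly string $environment)
    {
    }

    public function preProcessContext(array $context): array
    {
        // Outside production, pass the context through unchanged.
        if ($this->environment !== 'production') {
            return $context;
        }

        if (isset($context['debug_info'])) {
            $context['debug_info'] = '[REDACTED IN PRODUCTION]';
        }

        return $context;
    }
}

$context = ['debug_info' => 'stack trace', 'user' => 'alice'];

$prod = (new EnvironmentAwareRedactor('production'))->preProcessContext($context);
$dev = (new EnvironmentAwareRedactor('development'))->preProcessContext($context);

print_r($prod); // debug_info is redacted
print_r($dev);  // unchanged
```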

Use Case 5: External Integration

Integrate with external services:

class CompliancePlugin extends AbstractMaskingPlugin
{
    public function getName(): string
    {
        return 'compliance-plugin';
    }

    public function __construct(
        private readonly ComplianceService $service
    ) {
        parent::__construct(priority: 50);
    }

    public function postProcessContext(array $context): array
    {
        // Log to compliance system
        $this->service->recordMaskingEvent(
            fields: array_keys($context),
            timestamp: new \DateTimeImmutable()
        );

        return $context;
    }
}

Best Practices

1. Keep Plugins Focused

Each plugin should have a single responsibility:

// Good: Single purpose
class EmailPatternPlugin extends AbstractMaskingPlugin { /* ... */ }
class PhonePatternPlugin extends AbstractMaskingPlugin { /* ... */ }

// Avoid: Multiple unrelated responsibilities
class EverythingPlugin extends AbstractMaskingPlugin { /* ... */ }

2. Use Descriptive Names

Plugin names should be unique and descriptive:

// Good
public function getName(): string
{
    return 'acme-healthcare-hipaa-v2';
}

// Avoid
public function getName(): string
{
    return 'plugin1';
}
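
This guide does not say whether the library rejects duplicate names, so a check at bootstrap may be worthwhile. The sketch below uses a hypothetical helper, `findDuplicateNames()`, that works on plain strings so it runs without the library.

```php
<?php

declare(strict_types=1);

/**
 * Return the names that occur more than once.
 * Hypothetical helper; not part of the library.
 *
 * @param list<string> $names
 * @return list<string>
 */
function findDuplicateNames(array $names): array
{
    // Count occurrences per name, then keep the names seen more than once.
    $counts = array_count_values($names);

    return array_keys(array_filter($counts, fn (int $count): bool => $count > 1));
}

$names = ['acme-healthcare-hipaa-v2', 'acme-audit', 'acme-audit'];

print_r(findDuplicateNames($names)); // contains only 'acme-audit'
```

With real plugin objects, the input list could be built with `array_map(fn ($plugin) => $plugin->getName(), $plugins)`.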

3. Handle Errors Gracefully

Plugins should not throw exceptions that break logging:

public function preProcessContext(array $context): array
{
    try {
        // Risky operation
        $context['processed'] = $this->riskyTransform($context);
    } catch (\Throwable $e) {
        // Log error but don't break logging
        error_log("Plugin error: " . $e->getMessage());
    }

    return $context;  // Always return context
}
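
The same guard can be factored into a reusable wrapper. `runHookSafely()` below is a hypothetical helper, not a library function. Note the trade-off: returning the original context on failure keeps logging alive, but if the failed hook was supposed to mask something, that data passes through unmasked. This is fine for enrichment hooks and worth reconsidering for masking hooks.

```php
<?php

declare(strict_types=1);

/**
 * Run a context hook and return the original context if the hook throws.
 * Hypothetical helper; not part of the library.
 *
 * @param callable(array): array $hook
 */
function runHookSafely(callable $hook, array $context): array
{
    try {
        return $hook($context);
    } catch (\Throwable $e) {
        // Report the failure without interrupting the logging call.
        error_log('Plugin hook failed: ' . $e->getMessage());

        return $context;
    }
}

$ok = runHookSafely(
    fn (array $c): array => $c + ['processed' => true],
    ['user' => 'alice']
);

$failed = runHookSafely(
    function (array $c): array {
        throw new \RuntimeException('boom');
    },
    ['user' => 'alice']
);

print_r($ok);     // user plus processed => true
print_r($failed); // original context, unchanged
```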

4. Document Your Patterns

Add comments explaining pattern purpose:

public function getPatterns(): array
{
    return [
        // ACME internal order numbers: ACME-YYYYMMDD-NNNN
        '/ACME-\d{8}-\d{4}/' => '[ORDER-ID]',

        // Employee badges: EMP followed by 6 digits
        '/EMP\d{6}/' => '[EMPLOYEE]',
    ];
}

5. Test Your Plugins

Create comprehensive tests:

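use PHPUnit\Framework\TestCase;

class MyCompanyPluginTest extends TestCase
{
    public function testPatternMasking(): void
    {
        $plugin = new MyCompanyPlugin();
        $patterns = $plugin->getPatterns();

        // One sample input per pattern: each pattern must match its own sample
        $samples = [
            '/INTERNAL-\d{6}/' => 'INTERNAL-123456',
            '/EMP-[A-Z]{2}\d{4}/' => 'EMP-AB1234',
        ];

        foreach ($samples as $pattern => $sample) {
            $this->assertArrayHasKey($pattern, $patterns);
            $this->assertMatchesRegularExpression($pattern, $sample);
        }
    }

    public function testPreProcessing(): void
    {
        $plugin = new MyCompanyPlugin();
        $context = ['sensitive' => 'value'];

        $result = $plugin->preProcessContext($context);

        // The default hook passes the context through unchanged
        $this->assertArrayHasKey('sensitive', $result);
    }
}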

6. Consider Performance

Avoid expensive operations in hooks that run for every log entry:

// Good: Simple operations
public function preProcessMessage(string $message): string
{
    return trim($message);
}

// Avoid: Heavy operations for every log
public function preProcessMessage(string $message): string
{
    return $this->httpClient->validateMessage($message);  // Slow!
}
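
When a hook does need data that is expensive to obtain, one option is to load it once and reuse it for later log entries. The class below is a standalone sketch under that assumption: `CachedTermRedactor` and its loader closure are hypothetical names, and the closure stands in for a slow file, database, or HTTP lookup.

```php
<?php

declare(strict_types=1);

// Standalone sketch: pays the loading cost once per process, not once per log entry.
final class CachedTermRedactor
{
    /** @var list<string>|null Loaded lazily on first use. */
    private ?array $terms = null;

    public function __construct(private readonly \Closure $loadTerms)
    {
    }

    public function preProcessMessage(string $message): string
    {
        // Load once, then reuse the cached list for every later call.
        $this->terms ??= ($this->loadTerms)();

        return str_replace($this->terms, '[TERM]', $message);
    }
}

$calls = 0;
$redactor = new CachedTermRedactor(function () use (&$calls): array {
    $calls++; // Stands in for the expensive lookup.

    return ['Project Falcon', 'Project Osprey'];
});

echo $redactor->preProcessMessage('Kickoff for Project Falcon'), PHP_EOL; // Kickoff for [TERM]
echo $redactor->preProcessMessage('Budget for Project Osprey'), PHP_EOL;  // Budget for [TERM]
echo "Loader calls: {$calls}", PHP_EOL;                                   // Loader calls: 1
```

The cache lives for the lifetime of the object, so in a long-running worker the list would need some form of refresh if the underlying data can change.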

7. Use Priority Thoughtfully

Consider how your plugin interacts with others:

// Security validation should run early
class SecurityPlugin extends AbstractMaskingPlugin
{
    public function __construct()
    {
        parent::__construct(priority: 10);
    }
}

// Metadata enrichment should run late
class MetadataPlugin extends AbstractMaskingPlugin
{
    public function __construct()
    {
        parent::__construct(priority: 180);
    }
}