feat: add advanced architecture, documentation, and coverage improvements (#65)

* fix(style): resolve PHPCS line-length warnings in source files

* fix(style): resolve PHPCS line-length warnings in test files

* feat(audit): add structured audit logging with ErrorContext and AuditContext

- ErrorContext: standardized error information with sensitive data sanitization
- AuditContext: structured context for audit entries with operation types
- StructuredAuditLogger: enhanced audit logger wrapper with timing support

* feat(recovery): add recovery mechanism for failed masking operations

- FailureMode enum: FAIL_OPEN, FAIL_CLOSED, FAIL_SAFE modes
- RecoveryStrategy interface and RecoveryResult value object
- RetryStrategy: exponential backoff with configurable attempts
- FallbackMaskStrategy: type-aware fallback values

* feat(strategies): add CallbackMaskingStrategy for custom masking logic

- Wraps custom callbacks as MaskingStrategy implementations
- Factory methods: constant(), hash(), partial() for common use cases
- Supports exact match and prefix match for field paths

* docs: add framework integration guides and examples

- symfony-integration.md: Symfony service configuration and Monolog setup
- psr3-decorator.md: PSR-3 logger decorator pattern implementation
- framework-examples.md: CakePHP, CodeIgniter 4, Laminas, Yii2, PSR-15
- docker-development.md: Docker development environment guide

* chore(docker): add Docker development environment

- Dockerfile: PHP 8.2-cli-alpine with Xdebug for coverage
- docker-compose.yml: development services with volume mounts

* feat(demo): add interactive GDPR pattern tester playground

- PatternTester.php: pattern testing utility with strategy support
- index.php: web API endpoint with JSON response handling
- playground.html: interactive web interface for testing patterns

* docs(todo): update with completed medium priority items

- Mark all PHPCS warnings as fixed (81 → 0)
- Document new Audit and Recovery features
- Update test count to 1,068 tests with 2,953 assertions
- Move remaining items to low priority

* feat: add advanced architecture, documentation, and coverage improvements

- Add architecture improvements:
  - ArrayAccessorInterface and DotArrayAccessor for decoupled array access
  - MaskingOrchestrator for single-responsibility masking coordination
  - GdprProcessorBuilder for fluent configuration
  - MaskingPluginInterface and AbstractMaskingPlugin for plugin architecture
  - PluginAwareProcessor for plugin hook execution
  - AuditLoggerFactory for instance-based audit logger creation

- Add advanced features:
  - SerializedDataProcessor for handling print_r/var_export/serialize output
  - KAnonymizer with GeneralizationStrategy for GDPR k-anonymity
  - RetentionPolicy for configurable data retention periods
  - StreamingProcessor for memory-efficient large log processing

- Add comprehensive documentation:
  - docs/performance-tuning.md - benchmarking, optimization, caching
  - docs/troubleshooting.md - common issues and solutions
  - docs/logging-integrations.md - ELK, Graylog, Datadog, etc.
  - docs/plugin-development.md - complete plugin development guide

- Improve test coverage (84.41% → 85.07%):
  - ConditionalRuleFactoryInstanceTest (100% coverage)
  - GdprProcessorBuilderEdgeCasesTest (100% coverage)
  - StrategyEdgeCasesTest for ReDoS detection and type parsing
  - 78 new tests, 119 new assertions

- Update TODO.md with current statistics:
  - 141 PHP files, 1,346 tests, 85.07% line coverage

* chore: tests, update actions, sonarcloud issues

* chore: rector

* fix: more sonarcloud fixes

* chore: more fixes

* refactor: copilot review fix

* chore: rector
This commit is contained in:
2025-12-22 13:38:18 +02:00
committed by GitHub
parent b1eb567b92
commit 8866daaf33
112 changed files with 15391 additions and 607 deletions

315
docs/docker-development.md Normal file

@@ -0,0 +1,315 @@
# Docker Development Environment
This guide explains how to set up a Docker development environment for working with the Monolog GDPR Filter library.
## Quick Start
### Using Docker Compose
```bash
# Clone the repository
git clone https://github.com/ivuorinen/monolog-gdpr-filter.git
cd monolog-gdpr-filter
# Start the development environment
docker compose up -d
# Run tests
docker compose exec php composer test
# Run linting
docker compose exec php composer lint
```
## Docker Configuration Files
### docker/Dockerfile
```dockerfile
FROM php:8.2-cli-alpine
# Install system dependencies
RUN apk add --no-cache \
git \
unzip \
curl \
libzip-dev \
icu-dev \
&& docker-php-ext-install \
zip \
intl \
pcntl
# Install Composer
COPY --from=composer:2 /usr/bin/composer /usr/bin/composer
# Install Xdebug for code coverage (building it on Alpine also needs linux-headers)
RUN apk add --no-cache $PHPIZE_DEPS linux-headers \
&& pecl install xdebug \
&& docker-php-ext-enable xdebug
# Configure Xdebug
RUN echo "xdebug.mode=coverage,debug" >> /usr/local/etc/php/conf.d/docker-php-ext-xdebug.ini \
&& echo "xdebug.client_host=host.docker.internal" >> /usr/local/etc/php/conf.d/docker-php-ext-xdebug.ini
# Set working directory
WORKDIR /app
# Set recommended PHP settings for development
RUN echo "memory_limit=512M" >> /usr/local/etc/php/conf.d/docker-php-memory.ini \
&& echo "error_reporting=E_ALL" >> /usr/local/etc/php/conf.d/docker-php-errors.ini \
&& echo "display_errors=On" >> /usr/local/etc/php/conf.d/docker-php-errors.ini
# Create non-root user
RUN addgroup -g 1000 developer \
&& adduser -D -u 1000 -G developer developer
USER developer
CMD ["php", "-v"]
```
### docker/docker-compose.yml
```yaml
version: '3.8'
services:
  php:
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - ..:/app
      - composer-cache:/home/developer/.composer/cache
    working_dir: /app
    environment:
      - COMPOSER_HOME=/home/developer/.composer
      - XDEBUG_MODE=coverage
    stdin_open: true
    tty: true
    command: tail -f /dev/null
  # Optional: PHP 8.3 for testing compatibility
  php83:
    image: php:8.3-cli-alpine
    volumes:
      - ..:/app
    working_dir: /app
    profiles:
      - testing
    command: php -v
volumes:
  composer-cache:
```
## Running Tests
### All Tests
```bash
docker compose exec php composer test
```
### With Coverage Report
```bash
docker compose exec php composer test:coverage
```
### Specific Test File
```bash
docker compose exec php ./vendor/bin/phpunit tests/GdprProcessorTest.php
```
### Specific Test Method
```bash
docker compose exec php ./vendor/bin/phpunit --filter testEmailMasking
```
## Running Linting Tools
### All Linting
```bash
docker compose exec php composer lint
```
### Individual Tools
```bash
# PHP CodeSniffer
docker compose exec php ./vendor/bin/phpcs
# Auto-fix with PHPCBF
docker compose exec php ./vendor/bin/phpcbf
# Psalm
docker compose exec php ./vendor/bin/psalm
# PHPStan
docker compose exec php ./vendor/bin/phpstan analyse
# Rector (dry-run)
docker compose exec php ./vendor/bin/rector --dry-run
```
## Development Workflow
### Initial Setup
```bash
# Build containers
docker compose build
# Start services
docker compose up -d
# Install dependencies
docker compose exec php composer install
# Run initial checks
docker compose exec php composer lint
docker compose exec php composer test
```
### Daily Development
```bash
# Start environment
docker compose up -d
# Make changes...
# Run tests
docker compose exec php composer test
# Run linting
docker compose exec php composer lint
# Auto-fix issues
docker compose exec php composer lint:fix
```
### Testing Multiple PHP Versions
```bash
# Test with PHP 8.3
docker compose --profile testing run php83 php -v
docker compose --profile testing run php83 ./vendor/bin/phpunit
```
## Debugging
### Enable Xdebug
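The Docker image includes Xdebug. Configure your IDE to listen on port 9003, the Xdebug 3 default. Note that the Compose file sets `XDEBUG_MODE=coverage`, and this environment variable overrides `xdebug.mode` from the ini file, so step debugging requires running the command with `XDEBUG_MODE=debug` (or `coverage,debug`).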
For VS Code, add to `.vscode/launch.json`:
```json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Listen for Xdebug",
            "type": "php",
            "request": "launch",
            "port": 9003,
            "pathMappings": {
                "/app": "${workspaceFolder}"
            }
        }
    ]
}
```
### Interactive Shell
```bash
docker compose exec php sh
```
### View Logs
```bash
docker compose logs -f php
```
## CI/CD Integration
### GitHub Actions Example
```yaml
name: Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        php: ['8.2', '8.3']
    steps:
      - uses: actions/checkout@v4
      - name: Setup PHP
        uses: shivammathur/setup-php@v2
        with:
          php-version: ${{ matrix.php }}
          extensions: intl, zip
          coverage: xdebug
      - name: Install dependencies
        run: composer install --prefer-dist --no-progress
      - name: Run linting
        run: composer lint
      - name: Run tests
        run: composer test:coverage
```
## Troubleshooting
### Permission Issues
If you encounter permission issues:
```bash
# Fix ownership
docker compose exec -u root php chown -R developer:developer /app
# Or run as root temporarily
docker compose exec -u root php composer install
```
### Composer Memory Limit
```bash
docker compose exec php php -d memory_limit=-1 /usr/bin/composer install
```
### Clear Caches
```bash
# Clear composer cache
docker compose exec php composer clear-cache
# Clear Psalm cache
docker compose exec php ./vendor/bin/psalm --clear-cache
# Clear PHPStan cache
docker compose exec php ./vendor/bin/phpstan clear-result-cache
```
## See Also
- [Symfony Integration](symfony-integration.md)
- [PSR-3 Decorator](psr3-decorator.md)
- [Framework Examples](framework-examples.md)

372
docs/framework-examples.md Normal file

@@ -0,0 +1,372 @@
# Framework Integration Examples
This guide provides integration examples for various PHP frameworks.
## CakePHP
### Installation
```bash
composer require ivuorinen/monolog-gdpr-filter
```
### Configuration
Create a custom log engine in `src/Log/Engine/GdprFileLog.php`:
```php
<?php
declare(strict_types=1);
namespace App\Log\Engine;
use Cake\Log\Engine\FileLog;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Monolog\Level;
use Monolog\LogRecord;
use DateTimeImmutable;
class GdprFileLog extends FileLog
{
    protected GdprProcessor $gdprProcessor;
    public function __construct(array $config = [])
    {
        parent::__construct($config);
        $patterns = $config['gdpr_patterns'] ?? [
            '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
            '/\b\d{3}-\d{2}-\d{4}\b/' => '***-**-****',
        ];
        $this->gdprProcessor = new GdprProcessor($patterns);
    }
    // $message is left untyped: narrowing it to `string` would make this
    // method incompatible with the parent FileLog::log() signature.
    public function log($level, $message, array $context = []): void
    {
        $record = new LogRecord(
            datetime: new DateTimeImmutable(),
            channel: 'app',
            level: $this->convertLevel($level),
            message: (string) $message,
            context: $context
        );
        $processed = ($this->gdprProcessor)($record);
        parent::log($level, $processed->message, $processed->context);
    }
    private function convertLevel(mixed $level): Level
    {
        return match ($level) {
            'emergency' => Level::Emergency,
            'alert' => Level::Alert,
            'critical' => Level::Critical,
            'error' => Level::Error,
            'warning' => Level::Warning,
            'notice' => Level::Notice,
            'info' => Level::Info,
            'debug' => Level::Debug,
            default => Level::Info,
        };
    }
}
```
Configure in `config/app.php`:
```php
'Log' => [
'default' => [
'className' => \App\Log\Engine\GdprFileLog::class,
'path' => LOGS,
'file' => 'debug',
'levels' => ['notice', 'info', 'debug'],
'gdpr_patterns' => [
'/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
],
],
],
```
## CodeIgniter 4
### Configuration
Create a custom logger in `app/Libraries/GdprLogger.php`:
```php
<?php
namespace App\Libraries;
use CodeIgniter\Log\Logger;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Monolog\Level;
use Monolog\LogRecord;
use DateTimeImmutable;
class GdprLogger extends Logger
{
protected GdprProcessor $gdprProcessor;
public function __construct($config, bool $introspect = true)
{
parent::__construct($config, $introspect);
$patterns = $config->gdprPatterns ?? [
'/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
];
$this->gdprProcessor = new GdprProcessor($patterns);
}
public function log($level, $message, array $context = []): bool
{
$record = new LogRecord(
datetime: new DateTimeImmutable(),
channel: 'ci4',
level: $this->mapLevel($level),
message: (string) $message,
context: $context
);
$processed = ($this->gdprProcessor)($record);
return parent::log($level, $processed->message, $processed->context);
}
private function mapLevel(mixed $level): Level
{
return match (strtolower((string) $level)) {
'emergency' => Level::Emergency,
'alert' => Level::Alert,
'critical' => Level::Critical,
'error' => Level::Error,
'warning' => Level::Warning,
'notice' => Level::Notice,
'info' => Level::Info,
'debug' => Level::Debug,
default => Level::Info,
};
}
}
```
Register in `app/Config/Services.php`:
```php
public static function logger(bool $getShared = true): \App\Libraries\GdprLogger
{
if ($getShared) {
return static::getSharedInstance('logger');
}
return new \App\Libraries\GdprLogger(new \Config\Logger());
}
```
## Laminas (formerly Zend Framework)
### Service Configuration
```php
<?php
// config/autoload/logging.global.php
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Laminas\Log\Logger;
use Laminas\Log\Writer\Stream;
use Laminas\Log\Processor\ProcessorInterface;
use Psr\Container\ContainerInterface;
return [
'service_manager' => [
'factories' => [
GdprProcessor::class => function (ContainerInterface $container) {
$config = $container->get('config')['gdpr'] ?? [];
return new GdprProcessor(
$config['patterns'] ?? [],
$config['field_paths'] ?? []
);
},
'GdprLogProcessor' => function (ContainerInterface $container) {
$gdprProcessor = $container->get(GdprProcessor::class);
return new class($gdprProcessor) implements ProcessorInterface {
public function __construct(
private readonly GdprProcessor $gdprProcessor
) {}
public function process(array $event): array
{
// Convert to LogRecord, process, convert back
$record = new \Monolog\LogRecord(
datetime: new \DateTimeImmutable(),
channel: 'laminas',
level: \Monolog\Level::Info,
message: $event['message'] ?? '',
context: $event['extra'] ?? []
);
$processed = ($this->gdprProcessor)($record);
$event['message'] = $processed->message;
$event['extra'] = $processed->context;
return $event;
}
};
},
Logger::class => function (ContainerInterface $container) {
$logger = new Logger();
$logger->addWriter(new Stream('data/logs/app.log'));
$logger->addProcessor($container->get('GdprLogProcessor'));
return $logger;
},
],
],
'gdpr' => [
'patterns' => [
'/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
'/\b\d{3}-\d{2}-\d{4}\b/' => '***-**-****',
],
'field_paths' => [
'user.password' => '***REMOVED***',
],
],
];
```
## Yii2
### Component Configuration
```php
<?php
// config/web.php or config/console.php
return [
'components' => [
'log' => [
'traceLevel' => YII_DEBUG ? 3 : 0,
'targets' => [
[
'class' => 'app\components\GdprFileTarget',
'levels' => ['error', 'warning', 'info'],
'gdprPatterns' => [
'/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
],
],
],
],
],
];
```
Create `components/GdprFileTarget.php`:
```php
<?php
namespace app\components;
use yii\log\FileTarget;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Monolog\Level;
use Monolog\LogRecord;
use DateTimeImmutable;
class GdprFileTarget extends FileTarget
{
    public array $gdprPatterns = [];
    private ?GdprProcessor $processor = null;
    public function init(): void
    {
        parent::init();
        if (!empty($this->gdprPatterns)) {
            $this->processor = new GdprProcessor($this->gdprPatterns);
        }
    }
    public function formatMessage($message): string
    {
        if ($this->processor !== null) {
            [$text, $level, $category, $timestamp] = $message;
            $record = new LogRecord(
                datetime: new DateTimeImmutable('@' . $timestamp),
                channel: $category,
                level: Level::Info,
                // Parentheses are required: PHP 8 rejects `a ? b : c ?: d` without them
                message: is_string($text) ? $text : (json_encode($text) ?: ''),
                context: []
            );
            $processed = ($this->processor)($record);
            $message[0] = $processed->message;
        }
        return parent::formatMessage($message);
    }
}
```
## Generic PSR-15 Middleware
For any framework supporting PSR-15 middleware:
```php
<?php
namespace YourApp\Middleware;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use Psr\Http\Server\MiddlewareInterface;
use Psr\Http\Server\RequestHandlerInterface;
use Psr\Log\LoggerInterface;
class GdprLoggingMiddleware implements MiddlewareInterface
{
public function __construct(
private readonly LoggerInterface $logger,
private readonly GdprProcessor $gdprProcessor
) {
}
public function process(
ServerRequestInterface $request,
RequestHandlerInterface $handler
): ResponseInterface {
// Log request (with GDPR filtering applied via decorator)
$this->logger->info('Request received', [
'method' => $request->getMethod(),
'uri' => (string) $request->getUri(),
'body' => $request->getParsedBody(),
]);
$response = $handler->handle($request);
// Log response
$this->logger->info('Response sent', [
'status' => $response->getStatusCode(),
]);
return $response;
}
}
```
## See Also
- [Symfony Integration](symfony-integration.md)
- [PSR-3 Decorator](psr3-decorator.md)
- [Docker Development](docker-development.md)


@@ -0,0 +1,595 @@
# Logging Platform Integrations
This guide covers integrating the Monolog GDPR Filter with popular logging platforms and services.
## Table of Contents
- [ELK Stack (Elasticsearch, Logstash, Kibana)](#elk-stack)
- [Graylog](#graylog)
- [Datadog](#datadog)
- [New Relic](#new-relic)
- [Sentry](#sentry)
- [Papertrail](#papertrail)
- [Loggly](#loggly)
- [AWS CloudWatch](#aws-cloudwatch)
- [Google Cloud Logging](#google-cloud-logging)
- [Fluentd/Fluent Bit](#fluentdfluent-bit)
## ELK Stack
### Elasticsearch with Monolog
```php
<?php
use Monolog\Logger;
use Monolog\Handler\ElasticsearchHandler;
use Monolog\Formatter\ElasticsearchFormatter;
use Elastic\Elasticsearch\ClientBuilder;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
// Create Elasticsearch client
$client = ClientBuilder::create()
->setHosts(['localhost:9200'])
->build();
// Create handler
$handler = new ElasticsearchHandler($client, [
'index' => 'app-logs',
'type' => '_doc',
]);
$handler->setFormatter(new ElasticsearchFormatter('app-logs', '_doc'));
// Create logger with GDPR processor
$logger = new Logger('app');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
// Logs are now GDPR-compliant before reaching Elasticsearch
$logger->info('User login', ['email' => 'user@example.com', 'ip' => '192.168.1.1']);
```
### Logstash Integration
For Logstash, use the GELF handler or send JSON to a TCP/UDP input:
```php
<?php
use Monolog\Logger;
use Monolog\Handler\SocketHandler;
use Monolog\Formatter\JsonFormatter;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
$handler = new SocketHandler('tcp://logstash.example.com:5000');
$handler->setFormatter(new JsonFormatter());
$logger = new Logger('app');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
```
Logstash configuration:
```ruby
input {
tcp {
port => 5000
codec => json
}
}
output {
elasticsearch {
hosts => ["elasticsearch:9200"]
index => "app-logs-%{+YYYY.MM.dd}"
}
}
```
## Graylog
### GELF Handler Integration
```php
<?php
use Monolog\Logger;
use Monolog\Handler\GelfHandler;
use Gelf\Publisher;
use Gelf\Transport\UdpTransport;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
// Create GELF transport
$transport = new UdpTransport('graylog.example.com', 12201);
$publisher = new Publisher($transport);
// Create handler
$handler = new GelfHandler($publisher);
// Create logger with GDPR processor
$logger = new Logger('app');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
$logger->info('Payment processed', [
'user_email' => 'customer@example.com',
'card_last_four' => '4242',
]);
```
### Graylog Stream Configuration
Create a stream to filter GDPR-sensitive logs:
1. Create an extractor to identify masked fields
2. Set up alerts for potential data leaks (unmasked patterns)
```php
<?php
// Add metadata to help Graylog categorize
$logger->pushProcessor(function ($record) {
$record['extra']['gdpr_processed'] = true;
$record['extra']['app_version'] = '1.0.0';
return $record;
});
```
## Datadog
### Datadog Handler Integration
```php
<?php
use Monolog\Logger;
use Monolog\Handler\StreamHandler;
use Monolog\Formatter\JsonFormatter;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
// Datadog agent reads from file or stdout
$handler = new StreamHandler('php://stdout');
$handler->setFormatter(new JsonFormatter());
$logger = new Logger('app');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
// Add Datadog-specific context
$logger->pushProcessor(function ($record) {
$record['extra']['dd'] = [
'service' => 'my-php-app',
'env' => getenv('DD_ENV') ?: 'production',
'version' => '1.0.0',
];
return $record;
});
$logger->info('User action', ['user_id' => 123, 'email' => 'user@example.com']);
```
### Datadog APM Integration
```php
<?php
use DDTrace\GlobalTracer;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
// Add trace context to logs
$logger->pushProcessor(function ($record) {
$tracer = GlobalTracer::get();
$span = $tracer->getActiveSpan();
if ($span) {
$record['extra']['dd.trace_id'] = $span->getTraceId();
$record['extra']['dd.span_id'] = $span->getSpanId();
}
return $record;
});
```
## New Relic
### New Relic Handler Integration
```php
<?php
use Monolog\Logger;
use Monolog\Handler\NewRelicHandler;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
$handler = new NewRelicHandler(
level: Logger::ERROR,
appName: 'My PHP App'
);
$logger = new Logger('app');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
// Errors are sent to New Relic with masked PII
$logger->error('Authentication failed', [
'email' => 'user@example.com',
'ip' => '192.168.1.1',
]);
```
### Custom Attributes
```php
<?php
// Add New Relic custom attributes
$logger->pushProcessor(function ($record) {
if (function_exists('newrelic_add_custom_parameter')) {
newrelic_add_custom_parameter('log_level', $record['level_name']);
newrelic_add_custom_parameter('channel', $record['channel']);
}
return $record;
});
```
## Sentry
### Sentry Handler Integration
```php
<?php
use Monolog\Logger;
use Sentry\Monolog\Handler;
use Sentry\State\Hub;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
\Sentry\init(['dsn' => 'https://key@sentry.io/project']);
$handler = new Handler(Hub::getCurrent());
$logger = new Logger('app');
$logger->pushHandler($handler);
// IMPORTANT: logger-level processors run before any handler receives the record,
// so the Sentry handler only ever sees the masked data
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
$logger->error('Payment failed', [
'user_email' => 'customer@example.com',
'card_number' => '4111111111111111',
]);
```
### Sentry Breadcrumbs
```php
<?php
use Sentry\Breadcrumb;
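// Monolog runs processors in reverse push order (the last one pushed runs first).
// Push this breadcrumb processor BEFORE the GdprProcessor so that it runs after
// masking; pushed afterwards, it would see the unmasked record.
$logger->pushProcessor(function ($record) {
    \Sentry\addBreadcrumb(new Breadcrumb(
        Breadcrumb::LEVEL_INFO,
        Breadcrumb::TYPE_DEFAULT,
        $record['channel'],
        $record['message'], // Masked only if the GDPR processor has already run
        $record['context'] // Same caveat as the message
    ));
    return $record;
});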
```
## Papertrail
### Papertrail Handler Integration
```php
<?php
use Monolog\Logger;
use Monolog\Handler\SyslogUdpHandler;
use Monolog\Formatter\LineFormatter;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
$handler = new SyslogUdpHandler(
'logs.papertrailapp.com',
12345 // Your Papertrail port
);
$formatter = new LineFormatter(
"%channel%.%level_name%: %message% %context% %extra%\n",
null,
true,
true
);
$handler->setFormatter($formatter);
$logger = new Logger('my-app');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
```
## Loggly
### Loggly Handler Integration
```php
<?php
use Monolog\Logger;
use Monolog\Handler\LogglyHandler;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
$handler = new LogglyHandler('your-loggly-token/tag/monolog');
$logger = new Logger('app');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
$logger->info('User registered', [
'email' => 'newuser@example.com',
'phone' => '+1-555-123-4567',
]);
```
## AWS CloudWatch
### CloudWatch Handler Integration
```php
<?php
use Monolog\Logger;
use Aws\CloudWatchLogs\CloudWatchLogsClient;
use Maxbanton\Cwh\Handler\CloudWatch;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
$client = new CloudWatchLogsClient([
'region' => 'us-east-1',
'version' => 'latest',
]);
$handler = new CloudWatch(
    $client,
    'app-log-group',
    'app-log-stream',
    14 // Retention in days, passed positionally as in the Laravel factory below
);
$logger = new Logger('app');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
$logger->info('API request', [
'user_email' => 'api-user@example.com',
'endpoint' => '/api/v1/users',
]);
```
### CloudWatch with Laravel
```php
<?php
// config/logging.php
return [
'channels' => [
'cloudwatch' => [
'driver' => 'custom',
'via' => App\Logging\CloudWatchLoggerFactory::class,
'retention' => 14,
'group' => env('CLOUDWATCH_LOG_GROUP', 'laravel'),
'stream' => env('CLOUDWATCH_LOG_STREAM', 'app'),
],
],
];
```
```php
<?php
// app/Logging/CloudWatchLoggerFactory.php
namespace App\Logging;
use Aws\CloudWatchLogs\CloudWatchLogsClient;
use Maxbanton\Cwh\Handler\CloudWatch;
use Monolog\Logger;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
class CloudWatchLoggerFactory
{
public function __invoke(array $config): Logger
{
$client = new CloudWatchLogsClient([
'region' => config('services.aws.region'),
'version' => 'latest',
]);
$handler = new CloudWatch(
$client,
$config['group'],
$config['stream'],
$config['retention']
);
$logger = new Logger('cloudwatch');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
return $logger;
}
}
```
## Google Cloud Logging
### Google Cloud Handler Integration
```php
<?php
use Monolog\Logger;
use Google\Cloud\Logging\LoggingClient;
use Google\Cloud\Logging\PsrLogger;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
$logging = new LoggingClient([
'projectId' => 'your-project-id',
]);
$psrLogger = $logging->psrLogger('app-logs');
// Wrap in Monolog for processor support
$monologLogger = new Logger('app');
$monologLogger->pushHandler(new \Monolog\Handler\PsrHandler($psrLogger));
$monologLogger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
$monologLogger->info('User action', [
'email' => 'user@example.com',
'action' => 'login',
]);
```
## Fluentd/Fluent Bit
### Fluentd Integration
```php
<?php
use Monolog\Logger;
use Monolog\Handler\SocketHandler;
use Monolog\Formatter\JsonFormatter;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
// Send to Fluentd forward input
$handler = new SocketHandler('tcp://fluentd:24224');
$handler->setFormatter(new JsonFormatter());
$logger = new Logger('app');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
// Add Fluentd tag
$logger->pushProcessor(function ($record) {
$record['extra']['fluent_tag'] = 'app.logs';
return $record;
});
```
Fluentd configuration:
```ruby
<source>
@type forward
port 24224
</source>
<match app.**>
@type elasticsearch
host elasticsearch
port 9200
index_name app-logs
</match>
```
### Fluent Bit with File Tail
```php
<?php
use Monolog\Logger;
use Monolog\Handler\StreamHandler;
use Monolog\Formatter\JsonFormatter;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
// Write JSON logs to file for Fluent Bit to tail
$handler = new StreamHandler('/var/log/app/app.json.log');
$handler->setFormatter(new JsonFormatter());
$logger = new Logger('app');
$logger->pushHandler($handler);
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
```
Fluent Bit configuration:
```ini
[INPUT]
    Name   tail
    Path   /var/log/app/*.json.log
    Parser json
[OUTPUT]
    Name  es
    Host  elasticsearch
    Port  9200
    Index app-logs
```
## Best Practices
### 1. Always Process Before Sending
Ensure the GDPR processor runs before logs leave your application:
```php
<?php
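// Processors attached to the logger run before any handler receives the record,
// regardless of whether pushProcessor() is called before or after pushHandler()
$logger = new Logger('app');
$logger->pushHandler($externalHandler);
$logger->pushProcessor(new GdprProcessor($patterns)); // Masks the record before handlers see it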
```
### 2. Add Compliance Metadata
```php
<?php
$logger->pushProcessor(function ($record) {
$record['extra']['gdpr'] = [
'processed' => true,
'processor_version' => '3.0.0',
'timestamp' => date('c'),
];
return $record;
});
```
### 3. Monitor for Leaks
Set up alerts in your logging platform for unmasked PII patterns:
```json
{
"query": {
"regexp": {
"message": "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}"
}
}
}
```
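The same check can also run inside the application, for example in a test that inspects formatted log lines before they ship. A minimal sketch, assuming the email regex from the query above; `findUnmaskedEmails` is a hypothetical helper, not part of the library:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns any email-like strings still present in a log line.
 *
 * @return list<string>
 */
function findUnmaskedEmails(string $logLine): array
{
    preg_match_all('/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/', $logLine, $matches);

    return $matches[0];
}

$leaks = findUnmaskedEmails('User [email] invited jane@example.com');
if ($leaks !== []) {
    // Report the leak; a real setup might fail the test or raise an alert instead.
    fwrite(STDERR, 'Unmasked PII found: ' . implode(', ', $leaks) . PHP_EOL);
}
```

An empty result means no email-like text survived masking in that line.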
### 4. Retention Policies
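GDPR does not prescribe a fixed retention period; it requires that personal data be kept no longer than necessary for its purpose. Configure retention accordingly:
- Most platforms support automatic log deletion
- 30 days is a reasonable starting point for most operational logs
- Archive critical audit logs separately with longer retention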
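Where the platform cannot enforce retention itself, the cutoff check is simple to express in application code. A sketch under stated assumptions: `isExpired` is a hypothetical helper that is independent of the library's `RetentionPolicy` class, and the 30-day default is only the operational starting point mentioned above:

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: true when an entry is older than the retention window. */
function isExpired(DateTimeImmutable $loggedAt, DateTimeImmutable $now, int $retentionDays = 30): bool
{
    // Entries logged before the cutoff fall outside the retention window.
    return $loggedAt < $now->modify("-{$retentionDays} days");
}

$now = new DateTimeImmutable('2025-12-22 00:00:00 UTC');

var_dump(isExpired(new DateTimeImmutable('2025-11-01 00:00:00 UTC'), $now)); // older than 30 days
var_dump(isExpired(new DateTimeImmutable('2025-12-10 00:00:00 UTC'), $now)); // within the window
```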

453
docs/performance-tuning.md Normal file

@@ -0,0 +1,453 @@
# Performance Tuning Guide
This guide covers optimization strategies for the Monolog GDPR Filter library in high-throughput environments.
## Table of Contents
- [Benchmarking Your Setup](#benchmarking-your-setup)
- [Pattern Optimization](#pattern-optimization)
- [Memory Management](#memory-management)
- [Caching Strategies](#caching-strategies)
- [Rate Limiting](#rate-limiting)
- [Streaming Large Logs](#streaming-large-logs)
- [Production Configuration](#production-configuration)
## Benchmarking Your Setup
Before optimizing, establish baseline metrics:
```php
<?php
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
use Monolog\Level;
use Monolog\LogRecord;
$processor = new GdprProcessor(DefaultPatterns::all());
// The processor is invoked with a Monolog 3 LogRecord, as in the integration examples
$record = new LogRecord(
    datetime: new DateTimeImmutable(),
    channel: 'app',
    level: Level::Info,
    message: 'User john@example.com logged in from 192.168.1.100',
    context: [
        'user' => ['email' => 'john@example.com', 'ssn' => '123-45-6789'],
        'ip' => '192.168.1.100',
    ],
);
// Benchmark
$iterations = 10000;
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $processor($record);
}
$elapsed = microtime(true) - $start;
$perSecond = $iterations / $elapsed;
// PHP string interpolation has no format specifiers, so use printf()
printf("Processed %d records in %.4f seconds\n", $iterations, $elapsed);
printf("Throughput: %.0f records/second\n", $perSecond);
```
**Target benchmarks:**
- Simple patterns: 50,000+ records/second
- Complex patterns with nested context: 10,000+ records/second
- With audit logging: 5,000+ records/second
## Pattern Optimization
### 1. Order Patterns by Frequency
Place most frequently matched patterns first:
```php
<?php
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\MaskConstants;
// Good: Email (common) before SSN (rare)
$patterns = [
'/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => MaskConstants::MASK_EMAIL,
'/\b\d{3}-\d{2}-\d{4}\b/' => MaskConstants::MASK_SSN,
];
$processor = new GdprProcessor($patterns);
```
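Order matters for correctness as well as speed when patterns are applied one after another: an earlier pattern rewrites the text that later patterns see. A minimal sketch, assuming sequential application in array order (the library's internal loop may differ). `maskInOrder` is a hypothetical helper and the mask strings are placeholders:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: applies each pattern in array order.
 *
 * @param array<string, string> $patterns regex => replacement
 */
function maskInOrder(string $message, array $patterns): string
{
    foreach ($patterns as $pattern => $replacement) {
        // preg_replace() returns null on a PCRE error; keep the previous value then.
        $message = preg_replace($pattern, $replacement, $message) ?? $message;
    }

    return $message;
}

$patterns = [
    '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
    '/\b\d{3}-\d{2}-\d{4}\b/' => '***-**-****',
];

echo maskInOrder('john@example.com / 123-45-6789', $patterns), PHP_EOL;
```

Every pattern still scans the message in this sketch, so measure with the benchmark above before assuming that reordering alone yields a speed-up.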
### 2. Use Specific Patterns Over Generic
Specific patterns are faster than broad ones:
```php
<?php
// Slow: Generic catch-all
$slowPattern = '/\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/';
// Fast: Specific format
$fastPattern = '/\b\d{3}-\d{3}-\d{4}\b/';
```
### 3. Avoid Catastrophic Backtracking
```php
<?php
// Bad: Potential backtracking issues
$badPattern = '/.*@.*\..*/';
// Good: restricted character classes instead of unbounded `.*`
$goodPattern = '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/';
```
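When a pattern does exceed PCRE's limits, `preg_replace()` returns `null` instead of throwing. A small sketch of guarding against that case; `maskOrFallback` and the `[masking failed]` placeholder are illustrative names, not library API:

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: masks $input, failing closed if PCRE reports an error. */
function maskOrFallback(string $pattern, string $replacement, string $input): string
{
    $result = preg_replace($pattern, $replacement, $input);

    if ($result === null) {
        // Fail closed: never return the unmasked input after a PCRE error.
        error_log('PCRE error: ' . preg_last_error_msg());
        return '[masking failed]';
    }

    return $result;
}

$pattern = '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/';
echo maskOrFallback($pattern, '[email]', 'Contact john@example.com'), PHP_EOL;
```

Returning a placeholder rather than the original input trades log detail for safety, which is usually the right default for PII.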
### 4. Use Non-Capturing Groups
```php
<?php
// Slower: Capturing groups
$slowPattern = '/(foo|bar|baz)/';
// Faster: Non-capturing groups
$fastPattern = '/(?:foo|bar|baz)/';
```
### 5. Pre-validate Patterns
Use the PatternValidator to cache validation results:
```php
<?php
use Ivuorinen\MonologGdprFilter\PatternValidator;
$validator = new PatternValidator();
// Cache all patterns at startup
$validator->cacheAllPatterns($patterns);
```
## Memory Management
### 1. Limit Recursion Depth
```php
<?php
use Ivuorinen\MonologGdprFilter\GdprProcessor;
// Default is 10, reduce for memory-constrained environments
$processor = new GdprProcessor(
patterns: $patterns,
maxDepth: 5 // Limit nested array processing
);
```
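To make the effect of a depth limit concrete, here is a self-contained sketch of depth-limited recursive masking. It is an illustration only: the library's own traversal may differ, and the `maskNested()` function and the `[MAX DEPTH REACHED]` placeholder are assumptions.

```php
<?php

/**
 * Mask string values in a nested array, stopping at $maxDepth.
 * Illustrative only; the library's own traversal may differ.
 *
 * @param array<mixed> $data
 * @param array<string, string> $patterns regex => replacement
 * @return array<mixed>
 */
function maskNested(array $data, array $patterns, int $maxDepth, int $depth = 0): array
{
    foreach ($data as $key => $value) {
        if (is_string($value)) {
            $masked = preg_replace(array_keys($patterns), array_values($patterns), $value);
            $data[$key] = $masked ?? '[REDACTED]';
        } elseif (is_array($value)) {
            // Below the limit, recurse; at the limit, replace the subtree
            // instead of leaving it unmasked
            $data[$key] = $depth + 1 < $maxDepth
                ? maskNested($value, $patterns, $maxDepth, $depth + 1)
                : '[MAX DEPTH REACHED]';
        }
    }
    return $data;
}

$patterns = ['/\b\d{3}-\d{2}-\d{4}\b/' => '***-**-****'];
$context = ['user' => ['ssn' => '123-45-6789', 'meta' => ['note' => 'deep']]];

print_r(maskNested($context, $patterns, 2));
```

With a limit of 2, `user.ssn` is masked and the deeper `user.meta` subtree is replaced by the placeholder, which bounds both stack depth and the amount of data copied.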
### 2. Use Streaming for Large Logs
```php
<?php
use Ivuorinen\MonologGdprFilter\Streaming\StreamingProcessor;
use Ivuorinen\MonologGdprFilter\MaskingOrchestrator;
$orchestrator = new MaskingOrchestrator($patterns);
$streaming = new StreamingProcessor(
orchestrator: $orchestrator,
chunkSize: 500 // Process 500 records at a time
);
// Process large file with constant memory usage
$lineParser = fn(string $line): array => [
'message' => $line,
'context' => [],
];
foreach ($streaming->processFile('/var/log/large.log', $lineParser) as $record) {
// Handle processed record
}
```
### 3. Disable Audit Logging in High-Volume Scenarios
```php
<?php
use Ivuorinen\MonologGdprFilter\GdprProcessor;
// No audit logger = less memory allocation
$processor = new GdprProcessor(
patterns: $patterns,
auditLogger: null
);
```
## Caching Strategies
### 1. Pattern Compilation Caching
Patterns are compiled once and cached internally. Ensure you reuse processor instances:
```php
<?php
// Good: Singleton pattern
class ProcessorFactory
{
private static ?GdprProcessor $instance = null;
public static function getInstance(): GdprProcessor
{
if (self::$instance === null) {
self::$instance = new GdprProcessor(DefaultPatterns::all());
}
return self::$instance;
}
}
```
### 2. Result Caching for Repeated Values
For applications processing similar data repeatedly:
```php
<?php
class CachedGdprProcessor
{
private GdprProcessor $processor;
private array $cache = [];
private int $maxCacheSize = 1000;
public function __construct(GdprProcessor $processor)
{
$this->processor = $processor;
}
public function process(array $record): array
{
// Hash the message and context together to form the cache key
$key = md5(json_encode([$record['message'], $record['context']]));
if (isset($this->cache[$key])) {
return $this->cache[$key];
}
$result = ($this->processor)($record);
if (count($this->cache) >= $this->maxCacheSize) {
array_shift($this->cache);
}
$this->cache[$key] = $result;
return $result;
}
}
```
## Rate Limiting
### 1. Rate-Limited Audit Logging
Prevent audit log flooding:
```php
<?php
use Ivuorinen\MonologGdprFilter\RateLimitedAuditLogger;
use Ivuorinen\MonologGdprFilter\RateLimiter;
$rateLimiter = new RateLimiter(
maxEvents: 100, // Max 100 events
windowSeconds: 60, // Per 60 seconds
burstLimit: 20 // Allow burst of 20
);
$auditLogger = new RateLimitedAuditLogger(
baseLogger: fn($path, $original, $masked) => error_log("Masked: $path"),
rateLimiter: $rateLimiter
);
```
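For readers who want to see the mechanism behind a per-window limit, the following is a minimal sliding-window sketch. It is not the library's `RateLimiter`: the `SlidingWindowLimiter` class is hypothetical, it ignores burst handling, and the injectable clock exists only to make the behavior easy to test.

```php
<?php

/**
 * Minimal sliding-window limiter. Illustrative only; it is not the
 * library's RateLimiter and has no burst handling.
 */
final class SlidingWindowLimiter
{
    /** @var list<float> timestamps of allowed events inside the window */
    private array $events = [];

    public function __construct(
        private readonly int $maxEvents,
        private readonly float $windowSeconds,
        private readonly ?Closure $clock = null // injected so the window is testable
    ) {
    }

    public function allow(): bool
    {
        $now = $this->clock !== null ? ($this->clock)() : microtime(true);

        // Drop events that have left the window
        $cutoff = $now - $this->windowSeconds;
        $this->events = array_values(
            array_filter($this->events, fn (float $t): bool => $t > $cutoff)
        );

        if (count($this->events) >= $this->maxEvents) {
            return false; // over the limit: the caller skips the audit entry
        }

        $this->events[] = $now;
        return true;
    }
}

$now = 0.0;
$limiter = new SlidingWindowLimiter(2, 60.0, function () use (&$now): float {
    return $now;
});

var_dump($limiter->allow()); // first event in the window
var_dump($limiter->allow()); // second event, still within the limit
var_dump($limiter->allow()); // third event is rejected
$now = 61.0;                 // the window has passed
var_dump($limiter->allow()); // allowed again
```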
### 2. Sampling for High-Volume Logging
```php
<?php
class SampledProcessor
{
private GdprProcessor $processor;
private float $sampleRate;
public function __construct(GdprProcessor $processor, float $sampleRate = 0.1)
{
$this->processor = $processor;
$this->sampleRate = $sampleRate;
}
public function __invoke(array $record): array
{
// Audit only a sample of records
$shouldAudit = (mt_rand() / mt_getrandmax()) < $this->sampleRate;
if (!$shouldAudit) {
// Process without audit logging
return $this->processWithoutAudit($record);
}
return ($this->processor)($record);
}
private function processWithoutAudit(array $record): array
{
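// Placeholder: this returns the record UNMASKED. Replace it with a call to a
// second GdprProcessor built without an audit logger before production use.
return $record;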
}
}
```
## Streaming Large Logs
### 1. Chunk Size Optimization
```php
<?php
use Ivuorinen\MonologGdprFilter\Streaming\StreamingProcessor;
// For memory-constrained environments
$smallChunks = new StreamingProcessor($orchestrator, chunkSize: 100);
// For throughput-optimized environments
$largeChunks = new StreamingProcessor($orchestrator, chunkSize: 1000);
```
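The trade-off behind chunk size is that each chunk is held in memory while it is processed. A small generator makes this visible; the `chunked()` helper below is a hypothetical illustration and not the library's `StreamingProcessor`.

```php
<?php

/**
 * Yield fixed-size chunks from any iterable without loading it whole.
 * Illustrative helper; not the library's StreamingProcessor.
 *
 * @param iterable<mixed> $items
 * @return Generator<int, list<mixed>>
 */
function chunked(iterable $items, int $chunkSize): Generator
{
    $chunk = [];
    foreach ($items as $item) {
        $chunk[] = $item;
        if (count($chunk) === $chunkSize) {
            yield $chunk;
            $chunk = [];
        }
    }
    if ($chunk !== []) {
        yield $chunk; // final partial chunk
    }
}

$sizes = [];
foreach (chunked(range(1, 7), 3) as $chunk) {
    $sizes[] = count($chunk);
}
echo implode(',', $sizes), "\n"; // 3,3,1
```

Only one chunk exists at a time, so peak memory grows with `chunkSize` rather than with the size of the input.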
### 2. Parallel Processing
For multi-core systems, process chunks in parallel:
```php
<?php
// Using pcntl_fork for parallel processing
function processInParallel(array $files, StreamingProcessor $processor): void
{
$pids = [];
foreach ($files as $file) {
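$pid = pcntl_fork();
if ($pid === -1) {
// Fork failed: stop instead of treating -1 as a child PID
throw new RuntimeException("Could not fork a worker for {$file}");
}
if ($pid === 0) {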
// Child process
$lineParser = fn(string $line): array => ['message' => $line, 'context' => []];
foreach ($processor->processFile($file, $lineParser) as $record) {
// Process record
}
exit(0);
}
$pids[] = $pid;
}
// Wait for all children
foreach ($pids as $pid) {
pcntl_waitpid($pid, $status);
}
}
```
## Production Configuration
### 1. Minimal Pattern Set
Only include patterns you actually need:
```php
<?php
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
// Instead of DefaultPatterns::all(), use specific patterns
$patterns = array_merge(
DefaultPatterns::emails(),
DefaultPatterns::creditCards(),
// Only what you need
);
$processor = new GdprProcessor($patterns);
```
### 2. Disable Debug Features
```php
<?php
use Ivuorinen\MonologGdprFilter\Builder\GdprProcessorBuilder;
$processor = (new GdprProcessorBuilder())
->withDefaultPatterns()
->withMaxDepth(5) // Limit recursion
->withAuditLogger(null) // Disable audit logging
->build();
```
### 3. OPcache Configuration
Ensure OPcache is properly configured in `php.ini`:
```ini
opcache.enable=1
opcache.memory_consumption=256
opcache.interned_strings_buffer=16
opcache.max_accelerated_files=10000
opcache.jit=1255
opcache.jit_buffer_size=128M
```
### 4. Preloading (PHP 7.4+)
Create a preload script:
```php
<?php
// preload.php
require_once __DIR__ . '/vendor/autoload.php';
// Preload core classes
$classes = [
\Ivuorinen\MonologGdprFilter\GdprProcessor::class,
\Ivuorinen\MonologGdprFilter\MaskingOrchestrator::class,
\Ivuorinen\MonologGdprFilter\DefaultPatterns::class,
\Ivuorinen\MonologGdprFilter\PatternValidator::class,
];
foreach ($classes as $class) {
class_exists($class);
}
```
Configure in `php.ini`:
```ini
opcache.preload=/path/to/preload.php
opcache.preload_user=www-data
```
## Performance Checklist
- [ ] Benchmark baseline performance
- [ ] Order patterns by frequency
- [ ] Use specific patterns over generic
- [ ] Limit recursion depth appropriately
- [ ] Use streaming for large log files
- [ ] Implement rate limiting for audit logs
- [ ] Enable OPcache with JIT
- [ ] Consider preloading in production
- [ ] Reuse processor instances (singleton)
- [ ] Disable unnecessary features in production

docs/plugin-development.md Normal file

@@ -0,0 +1,599 @@
# Plugin Development Guide
This guide explains how to create custom plugins for the Monolog GDPR Filter library.
## Table of Contents
- [Introduction](#introduction)
- [Quick Start](#quick-start)
- [Plugin Interface](#plugin-interface)
- [Abstract Base Class](#abstract-base-class)
- [Registration](#registration)
- [Hook Execution Order](#hook-execution-order)
- [Priority System](#priority-system)
- [Configuration Contribution](#configuration-contribution)
- [Use Cases](#use-cases)
- [Best Practices](#best-practices)
## Introduction
Plugins extend the GDPR processor's functionality without modifying core code. Use plugins when you need to:
- Add custom masking patterns for your domain
- Transform messages before or after standard masking
- Enrich context with metadata
- Integrate with external systems
- Apply organization-specific compliance rules
### When to Use Plugins vs. Configuration
| Scenario | Use Plugin | Use Configuration |
|----------|-----------|-------------------|
| Add regex patterns | ✅ (via `getPatterns()`) | ✅ (via constructor) |
| Custom transformation logic | ✅ | ❌ |
| Conditional processing | ✅ | ❌ |
| Multiple reusable rules | ✅ | ❌ |
| Simple field masking | ❌ | ✅ |
## Quick Start
Create a minimal plugin in three steps:
### Step 1: Create the Plugin Class
```php
<?php
namespace App\Logging\Plugins;
use Ivuorinen\MonologGdprFilter\Plugins\AbstractMaskingPlugin;
class MyCompanyPlugin extends AbstractMaskingPlugin
{
public function getName(): string
{
return 'my-company-plugin';
}
public function getPatterns(): array
{
return [
'/INTERNAL-\d{6}/' => '[INTERNAL-ID]', // Internal ID format
'/EMP-[A-Z]{2}\d{4}/' => '[EMPLOYEE-ID]', // Employee IDs
];
}
}
```
### Step 2: Register the Plugin
```php
<?php
use Ivuorinen\MonologGdprFilter\Builder\GdprProcessorBuilder;
use App\Logging\Plugins\MyCompanyPlugin;
$processor = GdprProcessorBuilder::create()
->withDefaultPatterns()
->addPlugin(new MyCompanyPlugin())
->buildWithPlugins();
```
### Step 3: Use with Monolog
```php
<?php
use Monolog\Logger;
use Monolog\Handler\StreamHandler;
$logger = new Logger('app');
$logger->pushHandler(new StreamHandler('app.log'));
$logger->pushProcessor($processor);
// Internal IDs and employee IDs are now masked
$logger->info('User INTERNAL-123456 (EMP-AB1234) logged in');
// Output: User [INTERNAL-ID] ([EMPLOYEE-ID]) logged in
```
## Plugin Interface
All plugins must implement `MaskingPluginInterface`:
```php
interface MaskingPluginInterface
{
// Identification
public function getName(): string;
// Pre-processing hooks (before standard masking)
public function preProcessContext(array $context): array;
public function preProcessMessage(string $message): string;
// Post-processing hooks (after standard masking)
public function postProcessContext(array $context): array;
public function postProcessMessage(string $message): string;
// Configuration contribution
public function getPatterns(): array;
public function getFieldPaths(): array;
// Execution order control
public function getPriority(): int;
}
```
### Method Reference
| Method | Purpose | When Called |
|--------|---------|-------------|
| `getName()` | Unique identifier for debugging | On registration |
| `preProcessContext()` | Modify context before masking | Before core masking |
| `preProcessMessage()` | Modify message before masking | Before core masking |
| `postProcessContext()` | Modify context after masking | After core masking |
| `postProcessMessage()` | Modify message after masking | After core masking |
| `getPatterns()` | Provide regex patterns | During build |
| `getFieldPaths()` | Provide field paths to mask | During build |
| `getPriority()` | Control execution order | During sorting |
## Abstract Base Class
Extend `AbstractMaskingPlugin` to avoid implementing unused methods:
```php
<?php
namespace Ivuorinen\MonologGdprFilter\Plugins;
abstract class AbstractMaskingPlugin implements MaskingPluginInterface
{
public function __construct(protected readonly int $priority = 100)
{
}
// Default implementations return input unchanged
public function preProcessContext(array $context): array { return $context; }
public function postProcessContext(array $context): array { return $context; }
public function preProcessMessage(string $message): string { return $message; }
public function postProcessMessage(string $message): string { return $message; }
public function getPatterns(): array { return []; }
public function getFieldPaths(): array { return []; }
public function getPriority(): int { return $this->priority; }
}
```
### Benefits
- Override only the methods you need
- Default priority of 100 (customizable via constructor)
- All hooks pass data through unchanged by default
## Registration
Register plugins using `GdprProcessorBuilder`:
```php
<?php
use Ivuorinen\MonologGdprFilter\Builder\GdprProcessorBuilder;
use Ivuorinen\MonologGdprFilter\FieldMaskConfig;
// Single plugin
$processor = GdprProcessorBuilder::create()
->addPlugin($plugin)
->buildWithPlugins();
// Multiple plugins
$processor = GdprProcessorBuilder::create()
->addPlugins([$plugin1, $plugin2, $plugin3])
->buildWithPlugins();
// With other configuration
$processor = GdprProcessorBuilder::create()
->withDefaultPatterns()
->addPattern('/custom/', '[MASKED]')
->addFieldPath('secret', FieldMaskConfig::remove())
->addPlugin($plugin)
->withAuditLogger($auditLogger)
->buildWithPlugins();
```
### Return Types
```php
// No plugins: returns GdprProcessor (no wrapper overhead)
$processor = GdprProcessorBuilder::create()
->withDefaultPatterns()
->buildWithPlugins(); // GdprProcessor
// With plugins: returns PluginAwareProcessor (wraps GdprProcessor)
$processor = GdprProcessorBuilder::create()
->addPlugin($plugin)
->buildWithPlugins(); // PluginAwareProcessor
```
## Hook Execution Order
Understanding execution order is critical for plugins that interact:
```text
1. preProcessMessage() - Plugins in priority order (10, 20, 30...)
2. preProcessContext() - Plugins in priority order (10, 20, 30...)
3. [Core GdprProcessor masking]
4. postProcessMessage() - Plugins in REVERSE order (30, 20, 10...)
5. postProcessContext() - Plugins in REVERSE order (30, 20, 10...)
```
### Why Reverse Order for Post-Processing?
Post-processing runs in reverse to properly "unwrap" transformations:
```php
// Plugin A (priority 10) wraps: "data" -> "[A:data:A]"
// Plugin B (priority 20) wraps: "[A:data:A]" -> "[B:[A:data:A]:B]"
// Post-processing reverse order ensures proper unwrapping:
// Plugin B runs first: "[B:[A:masked:A]:B]" -> "[A:masked:A]"
// Plugin A runs second: "[A:masked:A]" -> "masked"
```
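The ordering can be simulated without the library. In this sketch the plugins are plain arrays with a priority and a name (hypothetical stand-ins, not real plugin objects), and the message and context hooks are collapsed into a single "pre" and "post" step for brevity.

```php
<?php

// Hypothetical stand-ins for plugins: each has a priority and a label
$plugins = [
    ['priority' => 20, 'name' => 'B'],
    ['priority' => 10, 'name' => 'A'],
    ['priority' => 30, 'name' => 'C'],
];

// Sort ascending by priority, the order used for pre-processing hooks
usort($plugins, fn (array $a, array $b): int => $a['priority'] <=> $b['priority']);

$trace = [];
foreach ($plugins as $plugin) {
    $trace[] = "pre:{$plugin['name']}";
}
$trace[] = 'core';
// Post-processing walks the same list in reverse
foreach (array_reverse($plugins) as $plugin) {
    $trace[] = "post:{$plugin['name']}";
}

echo implode(' -> ', $trace), "\n";
// pre:A -> pre:B -> pre:C -> core -> post:C -> post:B -> post:A
```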
## Priority System
Lower numbers execute earlier in pre-processing:
```php
// getName() is omitted from these classes for brevity
class HighPriorityPlugin extends AbstractMaskingPlugin
{
public function __construct()
{
parent::__construct(priority: 10); // Runs early
}
}
class NormalPriorityPlugin extends AbstractMaskingPlugin
{
// Default priority: 100
}
class LowPriorityPlugin extends AbstractMaskingPlugin
{
public function __construct()
{
parent::__construct(priority: 200); // Runs late
}
}
```
### Recommended Priority Ranges
| Range | Use Case | Example |
|-------|----------|---------|
| 1-49 | Security/validation | Input sanitization |
| 50-99 | Standard processing | Pattern masking |
| 100-149 | Business logic | Domain-specific rules |
| 150-199 | Enrichment | Adding metadata |
| 200+ | Cleanup/finalization | Removing temp fields |
## Configuration Contribution
Plugins can contribute patterns and field paths that are merged into the processor:
### Adding Patterns
```php
public function getPatterns(): array
{
return [
'/ACME-\d{8}/' => '[ACME-ORDER]',
'/INV-[A-Z]{2}-\d+/' => '[INVOICE]',
];
}
```
### Adding Field Paths
```php
use Ivuorinen\MonologGdprFilter\FieldMaskConfig;
public function getFieldPaths(): array
{
return [
// Static replacement
'api_key' => FieldMaskConfig::replace('[API_KEY]'),
// Remove field entirely
'internal.debug' => FieldMaskConfig::remove(),
// Apply regex to field value
'user.notes' => FieldMaskConfig::regexMask('/\d{3}-\d{2}-\d{4}/', '[SSN]'),
// Use processor's global patterns
'user.bio' => FieldMaskConfig::useProcessorPatterns(),
];
}
```
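Keys such as `internal.debug` and `user.notes` appear to address nested context arrays using dot notation. Under that assumption, path resolution can be sketched as follows. The `getByPath()` helper is hypothetical, and the library's own resolver may behave differently (for example with wildcards or keys that contain dots).

```php
<?php

/**
 * Resolve a dot-notation path against a nested array.
 * Illustrative helper; the library's own resolver may behave differently.
 *
 * @param array<mixed> $context
 */
function getByPath(array $context, string $path, mixed $default = null): mixed
{
    $current = $context;
    foreach (explode('.', $path) as $segment) {
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return $default;
        }
        $current = $current[$segment];
    }
    return $current;
}

$context = ['user' => ['notes' => 'SSN 123-45-6789'], 'api_key' => 'abc123'];

var_dump(getByPath($context, 'user.notes'));     // nested value
var_dump(getByPath($context, 'api_key'));        // top-level value
var_dump(getByPath($context, 'internal.debug')); // missing path yields the default
```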
## Use Cases
### Use Case 1: Message Transformation
Transform messages before masking:
```php
class NormalizePlugin extends AbstractMaskingPlugin
{
public function getName(): string
{
return 'normalize-plugin';
}
public function preProcessMessage(string $message): string
{
// Normalize whitespace before masking; keep the input if preg_replace() fails
return preg_replace('/\s+/', ' ', trim($message)) ?? $message;
}
}
```
### Use Case 2: Domain-Specific Patterns
Add patterns for your organization:
```php
use Ivuorinen\MonologGdprFilter\FieldMaskConfig;
class HealthcarePlugin extends AbstractMaskingPlugin
{
public function getName(): string
{
return 'healthcare-plugin';
}
public function getPatterns(): array
{
return [
// Medical Record Number
'/MRN-\d{10}/' => '[MRN]',
// National Provider Identifier
'/NPI-\d{10}/' => '[NPI]',
// DEA Number
'/DEA-[A-Z]{2}\d{7}/' => '[DEA]',
];
}
public function getFieldPaths(): array
{
return [
'patient.diagnosis' => FieldMaskConfig::replace('[PHI]'),
'patient.medications' => FieldMaskConfig::remove(),
];
}
}
```
### Use Case 3: Context Enrichment
Add metadata to context:
```php
class AuditPlugin extends AbstractMaskingPlugin
{
public function getName(): string
{
return 'audit-plugin';
}
public function __construct(private readonly string $environment)
{
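// Priority 150 is late for pre-processing hooks; post-processing hooks
// run in reverse order, so this plugin's postProcessContext() runs early
parent::__construct(priority: 150);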
}
public function postProcessContext(array $context): array
{
$context['_audit'] = [
'processed_at' => date('c'),
'environment' => $this->environment,
'plugin_version' => '1.0.0',
];
return $context;
}
}
```
### Use Case 4: Conditional Masking
Apply masking based on conditions:
```php
class EnvironmentAwarePlugin extends AbstractMaskingPlugin
{
public function getName(): string
{
return 'environment-aware-plugin';
}
public function preProcessContext(array $context): array
{
// Only mask in production
if (getenv('APP_ENV') !== 'production') {
return $context;
}
// Add extra masking for production
if (isset($context['debug_info'])) {
$context['debug_info'] = '[REDACTED IN PRODUCTION]';
}
return $context;
}
}
```
### Use Case 5: External Integration
Integrate with external services:
```php
class CompliancePlugin extends AbstractMaskingPlugin
{
public function getName(): string
{
return 'compliance-plugin';
}
public function __construct(
private readonly ComplianceService $service
) {
parent::__construct(priority: 50);
}
public function postProcessContext(array $context): array
{
// Log to compliance system
$this->service->recordMaskingEvent(
fields: array_keys($context),
timestamp: new \DateTimeImmutable()
);
return $context;
}
}
```
## Best Practices
### 1. Keep Plugins Focused
Each plugin should have a single responsibility:
```php
// Good: Single purpose
class EmailPatternPlugin extends AbstractMaskingPlugin { /* ... */ }
class PhonePatternPlugin extends AbstractMaskingPlugin { /* ... */ }
// Avoid: Multiple unrelated responsibilities
class EverythingPlugin extends AbstractMaskingPlugin { /* ... */ }
```
### 2. Use Descriptive Names
Plugin names should be unique and descriptive:
```php
// Good
public function getName(): string
{
return 'acme-healthcare-hipaa-v2';
}
// Avoid
public function getName(): string
{
return 'plugin1';
}
```
### 3. Handle Errors Gracefully
Plugins should not throw exceptions that break logging:
```php
public function preProcessContext(array $context): array
{
try {
// Risky operation
$context['processed'] = $this->riskyTransform($context);
} catch (\Throwable $e) {
// Log error but don't break logging
error_log("Plugin error: " . $e->getMessage());
}
return $context; // Always return context
}
```
### 4. Document Your Patterns
Add comments explaining pattern purpose:
```php
public function getPatterns(): array
{
return [
// ACME internal order numbers: ACME-YYYYMMDD-NNNN
'/ACME-\d{8}-\d{4}/' => '[ORDER-ID]',
// Employee badges: EMP followed by 6 digits
'/EMP\d{6}/' => '[EMPLOYEE]',
];
}
```
### 5. Test Your Plugins
Create comprehensive tests:
```php
use PHPUnit\Framework\TestCase;

class MyCompanyPluginTest extends TestCase
{
    public function testPatternMasking(): void
    {
        $plugin = new MyCompanyPlugin();
        // One sample input per pattern; a single shared sample cannot match them all
        $samples = [
            '/INTERNAL-\d{6}/' => 'INTERNAL-123456',
            '/EMP-[A-Z]{2}\d{4}/' => 'EMP-AB1234',
        ];
        foreach ($plugin->getPatterns() as $pattern => $replacement) {
            $this->assertArrayHasKey($pattern, $samples);
            $this->assertMatchesRegularExpression($pattern, $samples[$pattern]);
        }
    }

    public function testPreProcessing(): void
    {
        $plugin = new MyCompanyPlugin();
        $context = ['sensitive' => 'value'];
        $result = $plugin->preProcessContext($context);
        $this->assertArrayHasKey('sensitive', $result);
    }
}
```
### 6. Consider Performance
Avoid expensive operations in hooks that run for every log entry:
```php
// Good: Simple operations
public function preProcessMessage(string $message): string
{
return trim($message);
}
// Avoid: Heavy operations for every log
public function preProcessMessage(string $message): string
{
return $this->httpClient->validateMessage($message); // Slow!
}
```
### 7. Use Priority Thoughtfully
Consider how your plugin interacts with others:
```php
// Security validation should run early
class SecurityPlugin extends AbstractMaskingPlugin
{
public function __construct()
{
parent::__construct(priority: 10);
}
}
// Metadata enrichment should run late
class MetadataPlugin extends AbstractMaskingPlugin
{
public function __construct()
{
parent::__construct(priority: 180);
}
}
```

docs/psr3-decorator.md Normal file

@@ -0,0 +1,334 @@
# PSR-3 Logger Decorator Guide
This guide explains how to wrap any PSR-3 compatible logger with GDPR masking capabilities.
## Overview
The PSR-3 decorator pattern allows you to add GDPR filtering to any logger that implements `Psr\Log\LoggerInterface`, making the library compatible with virtually any PHP logging framework.
## Basic Usage
### Creating a PSR-3 Wrapper
Here's a simple decorator that wraps any PSR-3 logger:
```php
<?php
namespace YourApp\Logging;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Psr\Log\LoggerInterface;
use Psr\Log\LogLevel;
use Monolog\Level;
use Monolog\LogRecord;
use DateTimeImmutable;
use Stringable;
class GdprLoggerDecorator implements LoggerInterface
{
public function __construct(
private readonly LoggerInterface $innerLogger,
private readonly GdprProcessor $gdprProcessor
) {
}
public function emergency(string|Stringable $message, array $context = []): void
{
$this->log(LogLevel::EMERGENCY, $message, $context);
}
public function alert(string|Stringable $message, array $context = []): void
{
$this->log(LogLevel::ALERT, $message, $context);
}
public function critical(string|Stringable $message, array $context = []): void
{
$this->log(LogLevel::CRITICAL, $message, $context);
}
public function error(string|Stringable $message, array $context = []): void
{
$this->log(LogLevel::ERROR, $message, $context);
}
public function warning(string|Stringable $message, array $context = []): void
{
$this->log(LogLevel::WARNING, $message, $context);
}
public function notice(string|Stringable $message, array $context = []): void
{
$this->log(LogLevel::NOTICE, $message, $context);
}
public function info(string|Stringable $message, array $context = []): void
{
$this->log(LogLevel::INFO, $message, $context);
}
public function debug(string|Stringable $message, array $context = []): void
{
$this->log(LogLevel::DEBUG, $message, $context);
}
public function log($level, string|Stringable $message, array $context = []): void
{
// Convert PSR-3 level to Monolog level
$monologLevel = $this->convertLevel($level);
// Create a Monolog LogRecord for processing
$record = new LogRecord(
datetime: new DateTimeImmutable(),
channel: 'app',
level: $monologLevel,
message: (string) $message,
context: $context
);
// Apply GDPR processing
$processedRecord = ($this->gdprProcessor)($record);
// Pass to inner logger
$this->innerLogger->log($level, $processedRecord->message, $processedRecord->context);
}
private function convertLevel(mixed $level): Level
{
return match ($level) {
LogLevel::EMERGENCY => Level::Emergency,
LogLevel::ALERT => Level::Alert,
LogLevel::CRITICAL => Level::Critical,
LogLevel::ERROR => Level::Error,
LogLevel::WARNING => Level::Warning,
LogLevel::NOTICE => Level::Notice,
LogLevel::INFO => Level::Info,
LogLevel::DEBUG => Level::Debug,
default => Level::Info,
};
}
}
```
## Usage Examples
### With Any PSR-3 Logger
```php
<?php
use YourApp\Logging\GdprLoggerDecorator;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Monolog\Logger;
use Monolog\Handler\StreamHandler;
// Your existing PSR-3 logger (Monolog here, but any PSR-3 implementation works)
$existingLogger = new Logger('app');
$existingLogger->pushHandler(new StreamHandler('php://stdout'));
// Create GDPR processor
$gdprProcessor = new GdprProcessor([
'/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
'/\b\d{3}-\d{2}-\d{4}\b/' => '***-**-****',
]);
// Wrap with GDPR decorator
$logger = new GdprLoggerDecorator($existingLogger, $gdprProcessor);
// Use as normal
$logger->info('User john@example.com logged in with SSN 123-45-6789');
// Output: User [email] logged in with SSN ***-**-****
```
### With Dependency Injection
```php
<?php
use YourApp\Logging\GdprLoggerDecorator;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Psr\Log\LoggerInterface;
class UserService
{
public function __construct(
private readonly LoggerInterface $logger
) {
}
public function createUser(string $email, string $ssn): void
{
// Log will be automatically GDPR-filtered
$this->logger->info("Creating user: {email}, SSN: {ssn}", [
'email' => $email,
'ssn' => $ssn,
]);
}
}
// Container configuration (pseudo-code)
$container->register(GdprProcessor::class, function () {
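return new GdprProcessor([
'/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
// Needed so the SSN logged by UserService is masked as well
'/\b\d{3}-\d{2}-\d{4}\b/' => '***-**-****',
]);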
});
$container->register(LoggerInterface::class, function ($container) {
return new GdprLoggerDecorator(
$container->get('original_logger'),
$container->get(GdprProcessor::class)
);
});
```
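The `UserService` message uses PSR-3 placeholders (`{email}`, `{ssn}`), so the sensitive values live in the context array until the inner logger interpolates them. The sketch below, adapted from the example interpolation function in the PSR-3 specification, shows why masking the context is enough for the final message to be clean. The masked values shown are illustrative.

```php
<?php

/**
 * Replace {placeholders} in a message with context values,
 * adapted from the example implementation in the PSR-3 specification.
 *
 * @param array<string, mixed> $context
 */
function interpolate(string $message, array $context): string
{
    $replace = [];
    foreach ($context as $key => $value) {
        // Only scalars and stringable objects can be cast safely
        if (is_scalar($value) || $value instanceof Stringable) {
            $replace['{' . $key . '}'] = (string) $value;
        }
    }
    return strtr($message, $replace);
}

// Context as it might look after masking (values are illustrative)
$masked = ['email' => '[email]', 'ssn' => '***-**-****'];

echo interpolate('Creating user: {email}, SSN: {ssn}', $masked), "\n";
// Creating user: [email], SSN: ***-**-****
```

Whether interpolation happens at all depends on the inner logger; if it does not, the placeholders stay in the message and the masked values stay in the context.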
## Enhanced Decorator with Channel Support
```php
<?php
namespace YourApp\Logging;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Psr\Log\LoggerInterface;
use Psr\Log\LogLevel;
use Monolog\Level;
use Monolog\LogRecord;
use DateTimeImmutable;
use Stringable;
class GdprLoggerDecorator implements LoggerInterface
{
public function __construct(
private readonly LoggerInterface $innerLogger,
private readonly GdprProcessor $gdprProcessor,
private readonly string $channel = 'app'
) {
}
/**
* Create a new instance with a different channel.
*/
public function withChannel(string $channel): self
{
return new self($this->innerLogger, $this->gdprProcessor, $channel);
}
public function log($level, string|Stringable $message, array $context = []): void
{
$record = new LogRecord(
datetime: new DateTimeImmutable(),
channel: $this->channel,
level: $this->convertLevel($level),
message: (string) $message,
context: $context
);
$processedRecord = ($this->gdprProcessor)($record);
$this->innerLogger->log($level, $processedRecord->message, $processedRecord->context);
}
// ... other methods remain the same
}
```
## Using with Popular Frameworks
### Laravel
```php
<?php
// app/Providers/LoggingServiceProvider.php
namespace App\Providers;
use App\Logging\GdprLoggerDecorator;
use Illuminate\Support\ServiceProvider;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Psr\Log\LoggerInterface;
class LoggingServiceProvider extends ServiceProvider
{
public function register(): void
{
$this->app->extend(LoggerInterface::class, function ($logger) {
$processor = new GdprProcessor(
config('gdpr.patterns', [])
);
return new GdprLoggerDecorator($logger, $processor);
});
}
}
```
### Slim Framework
```php
<?php
// config/container.php
use DI\Container;
use YourApp\Logging\GdprLoggerDecorator;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Monolog\Logger;
use Monolog\Handler\StreamHandler;
use Psr\Log\LoggerInterface;
return [
LoggerInterface::class => function (Container $c) {
$baseLogger = new Logger('app');
$baseLogger->pushHandler(new StreamHandler('logs/app.log'));
$processor = new GdprProcessor([
'/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
]);
return new GdprLoggerDecorator($baseLogger, $processor);
},
];
```
## Testing Your Decorator
```php
<?php
namespace Tests\Logging;
use PHPUnit\Framework\TestCase;
use YourApp\Logging\GdprLoggerDecorator;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Psr\Log\LoggerInterface;
use Psr\Log\LogLevel;
class GdprLoggerDecoratorTest extends TestCase
{
public function testEmailIsMasked(): void
{
$logs = [];
$mockLogger = $this->createMock(LoggerInterface::class);
$mockLogger->method('log')
->willReturnCallback(function ($level, $message, $context) use (&$logs) {
$logs[] = ['level' => $level, 'message' => $message, 'context' => $context];
});
$processor = new GdprProcessor([
'/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
]);
$decorator = new GdprLoggerDecorator($mockLogger, $processor);
$decorator->info('Contact: john@example.com');
$this->assertCount(1, $logs);
$this->assertStringContainsString('[email]', $logs[0]['message']);
$this->assertStringNotContainsString('john@example.com', $logs[0]['message']);
}
}
```
## See Also
- [Symfony Integration](symfony-integration.md)
- [Framework Examples](framework-examples.md)

docs/symfony-integration.md Normal file

@@ -0,0 +1,264 @@
# Symfony Integration Guide
This guide explains how to integrate the Monolog GDPR Filter with Symfony applications.
## Installation
```bash
composer require ivuorinen/monolog-gdpr-filter
```
## Basic Service Configuration
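Add the GDPR processor as a service in `config/services.yaml`. The `monolog.processor` tag is what attaches it to Monolog; add a `handler` or `channel` attribute to the tag to limit its scope:
```yaml
services:
    App\Logging\GdprProcessor:
        class: Ivuorinen\MonologGdprFilter\GdprProcessor
        arguments:
            $patterns: '%gdpr.patterns%'
            $fieldPaths: '%gdpr.field_paths%'
            $customCallbacks: []
            $auditLogger: null
            $maxDepth: 100
            $dataTypeMasks: []
            $conditionalRules: []
        tags:
            - { name: monolog.processor }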
```
## Parameters Configuration
Define GDPR patterns in `config/services.yaml` or a dedicated parameters file:
```yaml
parameters:
    gdpr.patterns:
        '/\b\d{3}-\d{2}-\d{4}\b/': '***-**-****' # US SSN
        '/\b[A-Z]{2}\d{2}[A-Z0-9]{4}\d{7}([A-Z0-9]?){0,16}\b/': '****' # IBAN
        '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/': '[email]' # Email
        '/\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/': '****-****-****-****' # Credit Card
    gdpr.field_paths:
        'user.password': '***REMOVED***'
        'user.ssn': '***-**-****'
        'payment.card_number': '****-****-****-****'
```
## Monolog Handler Configuration
Configure the Monolog handlers in `config/packages/monolog.yaml`. MonologBundle handlers have no processor option; a service tagged `monolog.processor` is applied automatically, and the tag's `handler` attribute can restrict it to a single handler:
```yaml
monolog:
    handlers:
        main:
            type: stream
            path: "%kernel.logs_dir%/%kernel.environment%.log"
            level: debug
            channels: ["!event"]
            formatter: monolog.formatter.json
        # For production with file rotation
        production:
            type: rotating_file
            path: "%kernel.logs_dir%/%kernel.environment%.log"
            level: info
            max_files: 14
```
## Environment-Specific Configuration
Create environment-specific configurations:
### config/packages/dev/monolog.yaml
```yaml
monolog:
    handlers:
        main:
            type: stream
            path: "%kernel.logs_dir%/%kernel.environment%.log"
            level: debug
            # In dev, you might want less aggressive masking
```
### config/packages/prod/monolog.yaml
```yaml
monolog:
    handlers:
        main:
            type: fingers_crossed
            action_level: error
            handler: nested
            excluded_http_codes: [404, 405]
            buffer_size: 50
        nested:
            type: rotating_file
            path: "%kernel.logs_dir%/%kernel.environment%.log"
            level: info
            max_files: 14
            # Processors are attached through the monolog.processor service tag,
            # not through a handler option
```
## Advanced Configuration with Audit Logging
Enable audit logging for compliance tracking:
```yaml
services:
    App\Logging\AuditLogger:
        class: Ivuorinen\MonologGdprFilter\RateLimitedAuditLogger
        arguments:
            $auditLogger: '@App\Logging\AuditCallback'
            $maxRequestsPerMinute: 100
            $windowSeconds: 60
    App\Logging\AuditCallback:
        class: Closure
        factory: ['App\Logging\AuditCallbackFactory', 'create']
        arguments:
            $logger: '@monolog.logger.audit'
    App\Logging\GdprProcessor:
        class: Ivuorinen\MonologGdprFilter\GdprProcessor
        arguments:
            $patterns: '%gdpr.patterns%'
            $fieldPaths: '%gdpr.field_paths%'
            $auditLogger: '@App\Logging\AuditLogger'
        tags:
            - { name: monolog.processor }
```
Create the factory class:
```php
<?php
// src/Logging/AuditCallbackFactory.php
namespace App\Logging;
use Psr\Log\LoggerInterface;
class AuditCallbackFactory
{
public static function create(LoggerInterface $logger): callable
{
return function (string $path, mixed $original, mixed $masked) use ($logger): void {
$logger->info('GDPR masking applied', [
'path' => $path,
'original_type' => gettype($original),
'masked_preview' => substr((string) $masked, 0, 20) . '...',
]);
};
}
}
```
## Conditional Masking by Environment
Apply different masking rules based on log level or channel:
```yaml
services:
    App\Logging\ConditionalRuleFactory:
        class: App\Logging\ConditionalRuleFactory
    App\Logging\GdprProcessor:
        class: Ivuorinen\MonologGdprFilter\GdprProcessor
        arguments:
            $conditionalRules:
                error_only: '@=service("App\\Logging\\ConditionalRuleFactory").createErrorOnlyRule()'
```
```php
<?php
// src/Logging/ConditionalRuleFactory.php
namespace App\Logging;
use Monolog\Level;
use Monolog\LogRecord;
class ConditionalRuleFactory
{
public function createErrorOnlyRule(): callable
{
return fn(LogRecord $record): bool =>
$record->level->value >= Level::Error->value;
}
public function createChannelRule(array $channels): callable
{
return fn(LogRecord $record): bool =>
in_array($record->channel, $channels, true);
}
}
```
## Testing in Symfony
Create a test to verify GDPR filtering works:
```php
<?php
// tests/Logging/GdprProcessorTest.php
namespace App\Tests\Logging;

use DateTimeImmutable;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Monolog\Level;
use Monolog\LogRecord;
use PHPUnit\Framework\TestCase;

class GdprProcessorTest extends TestCase
{
    public function testEmailMasking(): void
    {
        $processor = new GdprProcessor([
            '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
        ]);

        $record = new LogRecord(
            datetime: new DateTimeImmutable(),
            channel: 'test',
            level: Level::Info,
            message: 'User logged in: user@example.com',
            context: []
        );

        $result = $processor($record);

        $this->assertStringContainsString('[email]', $result->message);
        $this->assertStringNotContainsString('user@example.com', $result->message);
    }
}
```
## Troubleshooting
### Patterns Not Matching
1. Verify regex patterns are valid: `preg_match('/your-pattern/', 'test-string')`
2. Check pattern escaping in YAML (may need quotes)
3. Enable debug mode to see which patterns are applied
### Performance Issues
1. Use the rate-limited audit logger
2. Consider caching pattern validation results
3. Profile with Symfony profiler
### Memory Issues
1. Set appropriate `maxDepth` to prevent deep recursion
2. Monitor rate limiter statistics
3. Use cleanup intervals for long-running processes
## See Also
- [PSR-3 Decorator Guide](psr3-decorator.md)
- [Framework Examples](framework-examples.md)
- [Docker Development](docker-development.md)

docs/troubleshooting.md

@@ -0,0 +1,530 @@
# Troubleshooting Guide
This guide helps diagnose and resolve common issues with the Monolog GDPR Filter library.
## Table of Contents
- [Installation Issues](#installation-issues)
- [Pattern Matching Problems](#pattern-matching-problems)
- [Performance Issues](#performance-issues)
- [Memory Problems](#memory-problems)
- [Integration Issues](#integration-issues)
- [Audit Logging Issues](#audit-logging-issues)
- [Error Messages Reference](#error-messages-reference)
## Installation Issues
### Composer Installation Fails
**Symptom:** `composer require` fails with dependency conflicts.
**Solution:**
```bash
# Check PHP version
php -v # Must be 8.2 or higher
# Clear Composer cache
composer clear-cache
# Update Composer
composer self-update
# Try again with verbose output
composer require ivuorinen/monolog-gdpr-filter -vvv
```
### Class Not Found Errors
**Symptom:** `Class 'Ivuorinen\MonologGdprFilter\GdprProcessor' not found`
**Solutions:**
1. Regenerate autoloader:
```bash
composer dump-autoload
```
2. Verify installation:
```bash
composer show ivuorinen/monolog-gdpr-filter
```
3. Check namespace in your code:
```php
<?php
// Correct
use Ivuorinen\MonologGdprFilter\GdprProcessor;
// Wrong
use MonologGdprFilter\GdprProcessor;
```
## Pattern Matching Problems
### Pattern Not Matching Expected Data
**Symptom:** Sensitive data is not being masked.
**Diagnostic steps:**
```php
<?php
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Ivuorinen\MonologGdprFilter\PatternValidator;
use Monolog\Level;
use Monolog\LogRecord;

$validator = new PatternValidator();
$pattern = '/your-pattern-here/';

// Test 1: Validate pattern syntax
$result = $validator->validate($pattern);
if (!$result['valid']) {
    echo "Invalid pattern: " . $result['error'] . "\n";
}

// Test 2: Test pattern directly
$testData = 'your test data with sensitive@email.com';
if (preg_match($pattern, $testData, $matches)) {
    echo "Pattern matches: " . print_r($matches, true);
} else {
    echo "Pattern does not match\n";
}

// Test 3: Test with processor (Monolog 3 passes LogRecord objects, not arrays)
$processor = new GdprProcessor([$pattern => '[MASKED]']);
$record = new LogRecord(
    datetime: new DateTimeImmutable(),
    channel: 'app',
    level: Level::Info,
    message: $testData,
    context: [],
);
$masked = $processor($record);
echo "Result: " . $masked->message . "\n";
```
### Pattern Matches Too Much
**Symptom:** Non-sensitive data is being masked.
**Solutions:**
1. Add word boundaries:
```php
<?php
// Too broad
$pattern = '/\d{4}/'; // Matches any 4 digits
// Better - with boundaries
$pattern = '/\b\d{4}\b/'; // Matches standalone 4-digit numbers
```
2. Use more specific patterns:
```php
<?php
// Too broad for credit cards
$pattern = '/\d{16}/';
// Better - credit card format
$pattern = '/\b(?:\d{4}[-\s]?){3}\d{4}\b/';
```
3. Add negative lookahead/lookbehind:
```php
<?php
// Skip 4-digit years that are part of a date such as 12/2024 or 2024/12
$pattern = '/(?<!\d{2}\/)\b\d{4}\b(?!\/\d{2})/';
```
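To compare the broad and narrowed patterns above side by side, the following sketch runs them against one sample line with plain `preg_replace()`, without any library classes. The sample string and the `[MASKED]` / `[CARD]` placeholders are made up for illustration.

```php
<?php
// Illustrative sample line; all values are invented for this sketch.
$line = 'Order 20240115 paid with card 4111 1111 1111 1111, PIN hint 1234';

// Too broad: every run of 4 digits is masked, including halves of the order number.
$broad = preg_replace('/\d{4}/', '[MASKED]', $line);

// With word boundaries: the 8-digit order number is left alone,
// but each standalone 4-digit group is still masked.
$bounded = preg_replace('/\b\d{4}\b/', '[MASKED]', $line);

// Card-shaped pattern: only the full 16-digit card number is masked.
$card = preg_replace('/\b(?:\d{4}[-\s]?){3}\d{4}\b/', '[CARD]', $line);

echo $broad, PHP_EOL;
echo $bounded, PHP_EOL;
echo $card, PHP_EOL; // Order 20240115 paid with card [CARD], PIN hint 1234
```

Running each candidate pattern against real (sanitized) log lines like this is usually the quickest way to see which one over-masks.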
### Special Characters in Patterns
**Symptom:** Pattern with special characters fails.
**Solution:** Escape special regex characters:
```php
<?php
// Wrong - unescaped special chars
$pattern = '/user.name@domain.com/';
// Correct - escaped dots
$pattern = '/user\.name@domain\.com/';
// Using preg_quote for dynamic patterns
$email = 'user.name@domain.com';
$pattern = '/' . preg_quote($email, '/') . '/';
```
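The difference between the escaped and unescaped forms is easy to check directly. This small sketch uses only built-in PCRE functions; the look-alike string is an invented example.

```php
<?php
$email = 'user.name@domain.com';

// Unescaped: each "." matches any character, so look-alike strings match too.
$loose = '/' . $email . '/';

// Escaped: preg_quote() makes every regex metacharacter literal; the second
// argument additionally escapes the "/" delimiter.
$strict = '/' . preg_quote($email, '/') . '/';

$lookalike = 'userXname@domainYcom';

var_dump(preg_match($loose, $lookalike));  // int(1): unintended match
var_dump(preg_match($strict, $lookalike)); // int(0)
var_dump(preg_match($strict, $email));     // int(1)
```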
## Performance Issues
### Slow Processing
**Symptom:** Log processing is slower than expected.
**Diagnostic:**
```php
<?php
// Assumes $processor and $record are already set up
$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    $processor($record);
}
$elapsed = microtime(true) - $start;
echo "1000 records: {$elapsed}s\n";
```
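When the total is too slow, it can help to find out which pattern is responsible. The sketch below times each pattern on its own with `preg_replace()` and `hrtime()`. The helper name `timePatterns()` and the sample patterns are assumptions for illustration; it measures raw regex cost only, not the library's own processing overhead.

```php
<?php
/**
 * Hypothetical helper: time each pattern separately against a sample message.
 *
 * @param array<string, string> $patterns regex => replacement
 * @return array<string, float> regex => elapsed milliseconds, slowest first
 */
function timePatterns(array $patterns, string $sample, int $iterations = 1000): array
{
    $timings = [];
    foreach ($patterns as $regex => $replacement) {
        $start = hrtime(true); // monotonic clock, in nanoseconds
        for ($i = 0; $i < $iterations; $i++) {
            preg_replace($regex, $replacement, $sample);
        }
        $timings[$regex] = (hrtime(true) - $start) / 1e6;
    }
    arsort($timings); // slowest pattern first, keys preserved

    return $timings;
}

$timings = timePatterns(
    [
        '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[email]',
        '/\b(?:\d{4}[-\s]?){3}\d{4}\b/' => '[card]',
    ],
    'User user@example.com paid with 4111 1111 1111 1111'
);

foreach ($timings as $regex => $ms) {
    printf("%8.3f ms  %s\n", $ms, $regex);
}
```

Use a sample message that resembles your real log lines; a pattern that is cheap on short strings can be expensive on long ones.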
**Solutions:**
1. Reduce pattern count:
```php
<?php
// Only include patterns you need
$patterns = DefaultPatterns::emails() + DefaultPatterns::creditCards();
```
2. Simplify complex patterns:
```php
<?php
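// Many alternatives: exact, but the engine may try each branch at every position
$slow = '/(january|february|march|april|may|june|july|august|september|october|november|december)/i';
// Cheaper to evaluate, but far broader: matches any capitalized word of
// 3-9 letters, not only month names. Use only if over-masking is acceptable.
$fast = '/\b[A-Z][a-z]{2,8}\b/';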
```
3. Limit recursion depth:
```php
<?php
$processor = new GdprProcessor($patterns, [], [], null, 5); // Max depth 5
```
See [Performance Tuning Guide](performance-tuning.md) for detailed optimization strategies.
### High CPU Usage
**Symptom:** Processing causes CPU spikes.
**Solutions:**
1. Check for catastrophic backtracking:
```php
<?php
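// Problematic pattern
$bad = '/.*@.*\..*/'; // Greedy wildcards on both sides force heavy backtracking
// Fixed pattern: no wildcards, no whitespace or extra "@" inside the match
$good = '/[^@\s]+@[^@\s.]+(?:\.[a-z0-9-]+)+/i';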
```
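2. Cap regex backtracking. PCRE has no timeout setting, but lowering `pcre.backtrack_limit` (default `1000000`) makes a runaway match fail early; `preg_match()` then returns `false` and `preg_replace()` returns `null`: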
```php
<?php
// Set PCRE backtrack limit
ini_set('pcre.backtrack_limit', '100000');
```
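A lower backtrack limit only helps if the failure is handled: when PCRE gives up, `preg_replace()` returns `null`, and code that ignores this can end up logging nothing or the unmasked input. The sketch below shows one way to fail closed around a single replacement. `maskOrRedact()` and its placeholder text are hypothetical and not part of the library; invalid UTF-8 with the `u` modifier is used here only because it is a simple, reliable way to trigger a PCRE runtime error.

```php
<?php
/**
 * Hypothetical helper: apply one regex replacement and fail closed.
 * If PCRE reports an error (preg_replace() returns null), return a fixed
 * placeholder instead of the unmasked input.
 */
function maskOrRedact(string $regex, string $replacement, string $input): string
{
    $result = preg_replace($regex, $replacement, $input);
    if ($result === null) {
        error_log('PCRE error while masking: ' . preg_last_error_msg());

        return '[REDACTED: masking failed]';
    }

    return $result;
}

// Normal case: the pattern runs and the digits are masked.
echo maskOrRedact('/\d{4}/', '****', 'PIN 1234'), PHP_EOL;

// Error case: the subject is not valid UTF-8, so PCRE refuses to run with /u.
echo maskOrRedact('/\d{4}/u', '****', "PIN 1234 \xFF"), PHP_EOL;
```

The same check applies to a backtrack-limit failure, which `preg_last_error()` reports as `PREG_BACKTRACK_LIMIT_ERROR`.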
## Memory Problems
### Out of Memory Errors
**Symptom:** `Allowed memory size exhausted`
**Solutions:**
1. Use streaming for large files:
```php
<?php
use Ivuorinen\MonologGdprFilter\Streaming\StreamingProcessor;
use Ivuorinen\MonologGdprFilter\MaskingOrchestrator;
$orchestrator = new MaskingOrchestrator($patterns);
$streaming = new StreamingProcessor($orchestrator, chunkSize: 100);
// Process file without loading entirely into memory
$lineParser = fn(string $line): array => ['message' => $line, 'context' => []];
foreach ($streaming->processFile($largefile, $lineParser) as $record) {
// Process one record at a time
}
```
2. Reduce recursion depth:
```php
<?php
$processor = new GdprProcessor($patterns, [], [], null, 3);
```
3. Disable audit logging:
```php
<?php
$processor = new GdprProcessor($patterns, [], [], null); // No audit logger
```
### Memory Leaks
**Symptom:** Memory usage grows over time in long-running processes.
**Solutions:**
1. Trigger garbage collection periodically so that reference cycles are freed:
```php
<?php
// In long-running workers
if ($processedCount % 10000 === 0) {
    gc_collect_cycles();
}
```
2. Use fresh processor instances for batch jobs:
```php
<?php
foreach ($batches as $batch) {
    $processor = new GdprProcessor($patterns); // Fresh instance
    foreach ($batch as $record) {
        $processor($record);
    }
    unset($processor); // Release memory
}
```
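Before applying either fix, it is worth confirming that memory really grows with the number of processed records. The sketch below samples `memory_get_usage()` at intervals in a worker-style loop. The helper `memoryGrowth()` is hypothetical, and the loop body deliberately retains data to simulate a leak; in a real worker the commented line would be the call to your processor.

```php
<?php
/**
 * Hypothetical helper: bytes of memory gained since $baselineBytes,
 * measured after forcing a garbage collection cycle.
 */
function memoryGrowth(int $baselineBytes): int
{
    gc_collect_cycles();

    return memory_get_usage() - $baselineBytes;
}

$baseline = memory_get_usage();
$retained = [];

for ($processed = 1; $processed <= 30000; $processed++) {
    // Stand-in for $processor($record). Keeping every 100th payload
    // simulates a leak so the growth is visible in the output.
    if ($processed % 100 === 0) {
        $retained[] = random_bytes(1024);
    }

    if ($processed % 10000 === 0) {
        printf(
            "%d records: %+d KiB since start\n",
            $processed,
            intdiv(memoryGrowth($baseline), 1024)
        );
    }
}
```

A figure that keeps climbing between samples points to retained references; a figure that levels off after the first interval usually does not indicate a leak.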
## Integration Issues
### Laravel Integration
**Symptom:** Processor not being applied to logs.
**Solutions:**
1. Verify service provider registration:
```php
<?php
// config/app.php
'providers' => [
    Ivuorinen\MonologGdprFilter\Laravel\GdprServiceProvider::class,
],
```
2. Check logging configuration:
```php
<?php
// config/logging.php
'channels' => [
    'stack' => [
        'driver' => 'stack',
        'channels' => ['gdpr'],
    ],
    'gdpr' => [
        'driver' => 'single',
        'path' => storage_path('logs/laravel.log'),
        'tap' => [GdprLogTap::class],
    ],
],
```
3. Clear config cache:
```bash
php artisan config:clear
php artisan cache:clear
```
### Monolog Integration
**Symptom:** Processor not working with Monolog logger.
**Solution:** Ensure processor is pushed to logger:
```php
<?php
use Ivuorinen\MonologGdprFilter\DefaultPatterns;
use Ivuorinen\MonologGdprFilter\GdprProcessor;
use Monolog\Handler\StreamHandler;
use Monolog\Logger;
$logger = new Logger('app');
$logger->pushHandler(new StreamHandler('app.log'));
$logger->pushProcessor(new GdprProcessor(DefaultPatterns::all()));
// Test it
$logger->info('User email: test@example.com');
```
### Symfony Integration
See [Symfony Integration Guide](symfony-integration.md) for detailed setup.
## Audit Logging Issues
### Audit Logger Not Receiving Events
**Symptom:** Audit callback never called.
**Solutions:**
1. Verify audit logger is set:
```php
<?php
$auditLogs = [];
$auditLogger = function (string $path, mixed $original, mixed $masked) use (&$auditLogs): void {
    $auditLogs[] = compact('path', 'original', 'masked');
};

$processor = new GdprProcessor(
    patterns: $patterns,
    auditLogger: $auditLogger
);
```
2. Verify masking is actually occurring:
```php
<?php
// Audit is only called when data is actually masked
$record = ['message' => 'No sensitive data here', 'context' => []];
// This won't trigger audit because nothing is masked
```
### Rate-Limited Audit Missing Events
**Symptom:** Some audit events are being dropped.
**Solution:** Adjust rate limit settings:
```php
<?php
use Ivuorinen\MonologGdprFilter\RateLimitedAuditLogger;

// Constructor arguments as named in the Symfony integration guide
$rateLimitedLogger = new RateLimitedAuditLogger(
    auditLogger: $baseLogger,
    maxRequestsPerMinute: 1000, // Increase limit
    windowSeconds: 60,
);
```
## Error Messages Reference
### InvalidRegexPatternException
**Message:** `Invalid regex pattern: [pattern]`
**Cause:** The pattern has invalid regex syntax.
**Solution:**
```php
<?php
// Test pattern before using
$pattern = '/[invalid/';
if (@preg_match($pattern, '') === false) {
    echo "Invalid pattern: " . preg_last_error_msg();
}
```
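For a syntax error, `preg_last_error_msg()` tends to be generic (it may simply report an internal error), while the suppressed warning carries the actual reason and offset. A minimal sketch of a helper that surfaces that warning is shown below; `regexError()` is a hypothetical name, not a library function, and it assumes no custom error handler is swallowing warnings.

```php
<?php
/**
 * Hypothetical helper: return null when $pattern compiles, otherwise the
 * most specific error text available.
 */
function regexError(string $pattern): ?string
{
    error_clear_last(); // make sure we only see the error from this call

    if (@preg_match($pattern, '') !== false) {
        return null;
    }

    // error_get_last() still records warnings suppressed with "@".
    return error_get_last()['message'] ?? preg_last_error_msg();
}

var_dump(regexError('/\d{4}/'));        // NULL: the pattern is valid
echo regexError('/[invalid/'), PHP_EOL; // prints the PCRE compilation warning
```

Running every configured pattern through a check like this at startup catches typos before the first log record is processed.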
### RecursionDepthExceededException
**Message:** `Maximum recursion depth exceeded`
**Cause:** Nested data structure exceeds max depth.
**Solutions:**
```php
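<?php
// Increase max depth
$processor = new GdprProcessor($patterns, [], [], null, 20);

// Or flatten your data before processing. Passing false as the second
// argument discards the original keys: leaf values are re-indexed from 0,
// so key-based field paths will no longer match them.
$flatContext = iterator_to_array(
    new RecursiveIteratorIterator(
        new RecursiveArrayIterator($context)
    ),
    false
);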
```
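To pick a sensible max depth rather than guessing, you can measure how deeply your context arrays are actually nested. The sketch below is a standalone helper, assuming depth is counted as the number of nested array levels; `arrayDepth()` is a hypothetical name, and the library may count depth slightly differently, so leave some headroom.

```php
<?php
/**
 * Hypothetical helper: nesting depth of an array.
 * A flat (or empty) array has depth 1; non-array values have depth 0.
 */
function arrayDepth(mixed $value): int
{
    if (!is_array($value)) {
        return 0;
    }

    $deepest = 0;
    foreach ($value as $child) {
        $deepest = max($deepest, arrayDepth($child));
    }

    return 1 + $deepest;
}

$context = [
    'user' => [
        'profile' => [
            'contact' => ['email' => 'user@example.com'],
        ],
    ],
];

echo arrayDepth($context), PHP_EOL; // 4
```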
### MaskingOperationFailedException
**Message:** `Masking operation failed: [details]`
**Cause:** An error occurred during masking.
**Solution:** Enable recovery mode:
```php
<?php
use Ivuorinen\MonologGdprFilter\Recovery\FallbackMaskStrategy;
use Ivuorinen\MonologGdprFilter\Recovery\FailureMode;
$fallback = new FallbackMaskStrategy(FailureMode::FAIL_SAFE);
// Use with your processor
```
### InvalidConfigurationException
**Message:** `Invalid configuration: [details]`
**Cause:** Invalid processor configuration.
**Solution:** Validate configuration:
```php
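<?php
use Ivuorinen\MonologGdprFilter\Builder\GdprProcessorBuilder;
// Namespace assumed here; without an import the catch below would never match
use Ivuorinen\MonologGdprFilter\Exceptions\InvalidConfigurationException;

try {
    $processor = (new GdprProcessorBuilder())
        ->addPattern('/valid-pattern/', '[MASKED]')
        ->build();
} catch (InvalidConfigurationException $e) {
    echo "Configuration error: " . $e->getMessage();
}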
```
## Getting Help
If you're still experiencing issues:
1. **Check the tests:** The test suite contains many usage examples:
```bash
ls tests/
```
2. **Enable debug mode:** Add verbose logging:
```php
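<?php
// Log the type of the original value only: writing the unmasked value to
// the error log would leak the data the processor is meant to protect.
$auditLogger = function (string $path, mixed $original, mixed $masked): void {
    $maskedText = is_scalar($masked) ? (string) $masked : get_debug_type($masked);
    error_log(sprintf('GDPR Mask: %s | %s -> %s', $path, get_debug_type($original), $maskedText));
};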
```
3. **Report issues:** Open an issue on GitHub with:
- PHP version (`php -v`)
- Library version (`composer show ivuorinen/monolog-gdpr-filter`)
- Minimal reproduction code
- Expected vs actual behavior