mirror of
https://github.com/ivuorinen/monolog-gdpr-filter.git
synced 2026-01-26 03:34:00 +00:00
feat: performance, integrations, advanced features (#2)
* feat: performance, integrations, advanced features
* chore: fix linting problems
* chore: suppressions and linting
* chore(lint): pre-commit linting, fixes
* feat: comprehensive input validation, security hardening, and regression testing
  - Add extensive input validation throughout codebase with proper error handling
  - Implement comprehensive security hardening with ReDoS protection and bounds checking
  - Add 3 new regression test suites covering critical bugs, security, and validation scenarios
  - Enhance rate limiting with memory management and configurable cleanup intervals
  - Update configuration security settings and improve Laravel integration
  - Fix TODO.md timestamps to reflect actual development timeline
  - Strengthen static analysis configuration and improve code quality standards
* feat: configure static analysis tools and enhance development workflow
  - Complete configuration of Psalm, PHPStan, and Rector for harmonious static analysis.
  - Fix invalid configurations and tool conflicts that prevented proper code quality analysis.
  - Add comprehensive safe analysis script with interactive workflow, backup/restore capabilities, and dry-run modes. Update documentation with linting policy requiring issue resolution over suppression.
  - Clean completed items from TODO to focus on actionable improvements.
  - All static analysis tools now work together seamlessly to provide code quality insights without breaking existing functionality.
* fix(test): update Invalid regex pattern expectation
* chore: phpstan, psalm fixes
* chore: phpstan, psalm fixes, more tests
* chore: tooling tweaks, cleanup
* chore: tweaks to get the tests pass
* fix(lint): rector config tweaks and successful run
* feat: refactoring, more tests, fixes, cleanup
* chore: deduplication, use constants
* chore: psalm fixes
* chore: ignore phpstan deliberate errors in tests
* chore: improve codebase, deduplicate code
* fix: lint
* chore: deduplication, codebase simplification, sonarqube fixes
* fix: resolve SonarQube reliability rating issues
  Fix useless object instantiation warnings in test files by assigning instantiated objects to variables. This resolves the SonarQube reliability rating issue (was C, now targeting A).
  Changes:
  - tests/Strategies/MaskingStrategiesTest.php: Fix 3 instances
  - tests/Strategies/FieldPathMaskingStrategyTest.php: Fix 1 instance
  The tests use expectException() to verify that constructors throw exceptions for invalid input. SonarQube flagged standalone `new` statements as useless. Fixed by assigning to variables with explicit unset() and fail() calls.
  All tests pass (623/623) and static analysis tools pass.
* fix: resolve more SonarQube detected issues
* fix: resolve psalm detected issues
* fix: resolve more SonarQube detected issues
* fix: resolve psalm detected issues
* fix: duplications
* fix: resolve SonarQube reliability rating issues
* fix: resolve psalm and phpstan detected issues
234
README.md
@@ -196,11 +196,245 @@ To automatically fix code style and static analysis issues:
composer lint:fix
```

## Performance Considerations

### Pattern Optimization

The library applies patterns sequentially, so pattern order can affect both the masked output and performance:

```php
// Good: More specific patterns first
$patterns = [
    '/\b\d{3}-\d{2}-\d{4}\b/' => '***SSN***', // Specific format
    '/\b\d+\b/' => '***NUMBER***', // Generic pattern last
];

// Avoid: Overly broad patterns
$patterns = [
    '/.*sensitive.*/' => '***MASKED***', // Too broad: masks the whole line and may be slow
];
```
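
To see why ordering matters, the following sketch applies the same two patterns in both orders using plain `preg_replace()`. The `applyInOrder()` helper is hypothetical and only imitates sequential application; it is not part of the library.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: applies patterns one after another, in array order.
function applyInOrder(array $patterns, string $message): string
{
    foreach ($patterns as $pattern => $replacement) {
        // Keep the previous value if preg_replace() reports an error (null).
        $message = preg_replace($pattern, $replacement, $message) ?? $message;
    }
    return $message;
}

$specificFirst = [
    '/\b\d{3}-\d{2}-\d{4}\b/' => '***SSN***',
    '/\b\d+\b/' => '***NUMBER***',
];
$genericFirst = array_reverse($specificFirst, true);

$message = 'SSN 123-45-6789, order 42';

echo applyInOrder($specificFirst, $message), "\n";
// SSN ***SSN***, order ***NUMBER***
echo applyInOrder($genericFirst, $message), "\n";
// SSN ***NUMBER***-***NUMBER***-***NUMBER***, order ***NUMBER***
```

With the generic pattern first, the digits are replaced before the SSN pattern runs, so the SSN pattern never matches.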

### Large Dataset Handling

For applications processing large volumes of logs, keep the pattern list short and prefer field paths or callbacks where the field is known:

```php
// Consider pattern count vs. performance
$processor = new GdprProcessor(
    $patterns,   // Keep to essential patterns only
    $fieldPaths, // More efficient than regex for known fields
    $callbacks   // Most efficient for complex logic
);
```

### Memory Usage

- **Regex Compilation**: Patterns are compiled on each use. Consider caching for high-volume applications.
- **Deep Nesting**: The `recursiveMask()` method processes nested arrays. Very deep structures may impact memory.
- **Audit Logging**: Be mindful of audit logger memory usage in high-volume scenarios.
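
One way to bound the cost of deep structures is to cap the recursion depth before the context reaches the processor. The sketch below is an assumption for illustration: `maskWithDepthLimit()`, its `$maxDepth` parameter, and the `***TRUNCATED***` marker are hypothetical and do not describe how the library's `recursiveMask()` behaves.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not the library's recursiveMask(): masks string leaves
 * in a nested array and stops descending once $maxDepth levels are reached.
 */
function maskWithDepthLimit(array $data, callable $mask, int $maxDepth, int $depth = 0): array
{
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            // Beyond the limit, replace the subtree with a marker instead of recursing.
            $data[$key] = $depth + 1 >= $maxDepth
                ? '***TRUNCATED***'
                : maskWithDepthLimit($value, $mask, $maxDepth, $depth + 1);
        } elseif (is_string($value)) {
            $data[$key] = $mask($value);
        }
    }
    return $data;
}

$mask = static fn (string $v): string
    => preg_replace('/\b\d{3}-\d{2}-\d{4}\b/', '***SSN***', $v) ?? $v;

$context = ['user' => ['ssn' => '123-45-6789', 'meta' => ['deep' => ['deeper' => 'x']]]];
$masked = maskWithDepthLimit($context, $mask, 3);

print_r($masked);
```

Here the third nesting level is replaced by the marker, so the amount of work per log record stays bounded regardless of how deep the input is.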

### Benchmarking

Test performance with your actual data patterns:

```php
$start = microtime(true);
$processor = new GdprProcessor($patterns);
$result = $processor->regExpMessage($yourLogMessage);
$time = microtime(true) - $start;
echo "Processing time: " . ($time * 1000) . "ms\n";
```
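
A single timed call is noisy, so it can help to average over many iterations. The sketch below uses a hypothetical `benchmark()` helper and times a plain `preg_replace()` call so that it runs on its own; to measure the library, the closure could call `$processor->regExpMessage($message)` instead.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: average time per call in milliseconds.
 */
function benchmark(callable $fn, int $iterations = 1000): float
{
    $fn(); // Warm-up run so one-off costs do not skew the average.
    $start = hrtime(true); // Monotonic clock, nanoseconds.
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / $iterations / 1e6;
}

$pattern = '/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/';
$message = 'User john.doe@example.com logged in from 10.0.0.1';

$avgMs = benchmark(static fn () => preg_replace($pattern, '***EMAIL***', $message));
printf("Average: %.4f ms per message\n", $avgMs);
```

`hrtime()` is used instead of `microtime()` because it is monotonic and is not affected by system clock adjustments.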

## Troubleshooting

### Common Issues

#### Pattern Not Matching

**Problem**: A custom regex pattern isn't masking the expected data.

**Solutions**:

```php
// 1. Test pattern in isolation
$testPattern = '/your-pattern/';
$testString = 'a sample log line that should match';
if (preg_match($testPattern, $testString)) {
    echo "Pattern matches!";
} else {
    echo "Pattern doesn't match.";
}

// 2. Validate pattern safety
try {
    GdprProcessor::validatePatterns([
        '/your-pattern/' => '***MASKED***'
    ]);
    echo "Pattern is valid and safe.";
} catch (InvalidArgumentException $e) {
    echo "Pattern error: " . $e->getMessage();
}

// 3. Enable audit logging to see what's happening
$auditLogger = function ($path, $original, $masked) {
    error_log("GDPR Debug: {$path} - Original type: " . gettype($original));
};
```

#### Performance Issues

**Problem**: Slow log processing with many patterns.

**Solutions**:

```php
// 1. Reduce pattern count
$essentialPatterns = [
    '/\b\d{3}-\d{2}-\d{4}\b/' => '***SSN***',
    '/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/' => '***EMAIL***',
];

// 2. Use field-specific masking instead of global patterns
$fieldPaths = [
    'user.email' => GdprProcessor::maskWithRegex(), // Only for specific fields
    'user.ssn' => GdprProcessor::replaceWith('***SSN***'),
];

// 3. Profile pattern performance
$start = microtime(true);
// ... processing
$duration = microtime(true) - $start;
if ($duration > 0.1) { // 100ms threshold
    error_log("Slow GDPR processing: {$duration}s");
}
```
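
When the total time is too high, timing each pattern separately can show which one is responsible. This is a minimal sketch with a hypothetical `profilePatterns()` helper built on plain `preg_replace()`; it does not use the library, and the sample text is made up.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: times each pattern separately, slowest first.
 *
 * @param array<string, string> $patterns regex => replacement
 * @return array<string, float> regex => milliseconds spent on $runs replacements
 */
function profilePatterns(array $patterns, string $sample, int $runs = 200): array
{
    $timings = [];
    foreach ($patterns as $pattern => $replacement) {
        $start = hrtime(true);
        for ($i = 0; $i < $runs; $i++) {
            preg_replace($pattern, $replacement, $sample);
        }
        $timings[$pattern] = (hrtime(true) - $start) / 1e6;
    }
    arsort($timings); // Slowest pattern first.
    return $timings;
}

$timings = profilePatterns([
    '/\b\d{3}-\d{2}-\d{4}\b/' => '***SSN***',
    '/.*sensitive.*/' => '***MASKED***',
], str_repeat('some sensitive log text 123-45-6789 ', 50));

foreach ($timings as $pattern => $ms) {
    printf("%-30s %.3f ms\n", $pattern, $ms);
}
```

Run it against a representative log line; patterns at the top of the list are the first candidates for tightening or for replacement with field-path masking.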

#### Audit Logging Issues

**Problem**: The audit logger is not being called, or it logs sensitive data.

**Solutions**:

```php
// 1. Use a callable audit logger that records metadata only
$auditLogger = function ($path, $original, $masked) {
    // SECURITY: Never log original sensitive data!
    $safeLog = [
        'path' => $path,
        'original_type' => gettype($original),
        'was_masked' => $original !== $masked,
        'timestamp' => date('c'),
    ];
    error_log('GDPR Audit: ' . json_encode($safeLog));
};

// 2. Test audit logger independently
$processor = new GdprProcessor($patterns, [], [], $auditLogger);
$processor->regExpMessage('test@example.com'); // Should trigger audit log

// 3. Inside the audit logger callback, check if masking actually occurred
if ($original === $masked) {
    // No masking happened - check your patterns
}
```
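
The callback itself can also be checked without a processor by invoking it directly with known values. The sketch below assumes the same `(path, original, masked)` callback shape used in the examples above; collecting entries in the `$entries` array instead of calling `error_log()` is only for illustration.

```php
<?php
declare(strict_types=1);

$entries = [];

// Records metadata only; the original value is never stored.
$auditLogger = function (string $path, mixed $original, mixed $masked) use (&$entries): void {
    $entries[] = [
        'path' => $path,
        'original_type' => gettype($original),
        'was_masked' => $original !== $masked,
    ];
};

// Call the logger directly with known values, without a processor in between.
$auditLogger('user.email', 'john@example.com', '***EMAIL***');
$auditLogger('user.id', 42, 42);

$json = json_encode($entries);
echo $json, "\n";
```

If the serialized output ever contains the original value, the logger is leaking data and should be fixed before it is attached to a processor.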

#### Laravel Integration Issues

**Problem**: GDPR processor not working in Laravel.

**Solutions**:

```php
// 1. Verify processor is registered
Log::info('Test message with email@example.com');
// Check logs to see if masking occurred

// 2. Check logging channel configuration
// In config/logging.php, ensure tap is properly configured
'single' => [
    'driver' => 'single',
    'path' => storage_path('logs/laravel.log'),
    'level' => 'debug',
    'tap' => [App\Logging\GdprTap::class], // Ensure this line exists
],

// 3. Debug in service provider
class AppServiceProvider extends ServiceProvider
{
    public function boot()
    {
        $logger = Log::getLogger();
        $processor = new GdprProcessor($patterns, $fieldPaths);
        $logger->pushProcessor($processor);

        // Test immediately
        Log::info('GDPR test: email@example.com should be masked');
    }
}
```

### Error Messages

#### "Invalid regex pattern"

- **Cause**: Pattern fails validation due to a syntax error or security risk
- **Solution**: Check pattern syntax and avoid nested quantifiers

#### "Compilation failed"

- **Cause**: PHP regex compilation error
- **Solution**: Test pattern with `preg_match()` in isolation

#### "Unknown modifier"

- **Cause**: Invalid regex modifiers or a malformed pattern, often an unescaped delimiter inside the pattern
- **Solution**: Use standard modifiers like `/pattern/i` for case-insensitive matching, and escape the delimiter where it appears in the pattern
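
The last two messages come from PHP itself, as warnings raised by the `preg_*` functions. The sketch below captures that warning text for a given pattern; `patternError()` is a hypothetical helper and the sample patterns are made up for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns null when the pattern compiles, otherwise the
 * warning text PHP emitted for it.
 */
function patternError(string $pattern): ?string
{
    $message = null;
    set_error_handler(static function (int $severity, string $text) use (&$message): bool {
        $message = $text;
        return true; // Handled: keep the warning out of the output.
    });
    try {
        preg_match($pattern, '');
    } finally {
        restore_error_handler();
    }
    return $message;
}

var_dump(patternError('/\d+/'));       // Compiles, so this prints NULL
var_dump(patternError('/(unclosed/')); // A "Compilation failed" warning
var_dump(patternError('/path/to/'));   // An "Unknown modifier" warning
```

In the last case the second `/` ends the pattern, so `to/` is read as modifiers; writing `'/path\/to/'` or choosing another delimiter such as `#` avoids the problem.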

### Debugging Tips

1. **Enable Error Logging**:

    ```php
    error_reporting(E_ALL);
    ini_set('display_errors', '1');
    ```

2. **Test Patterns Separately**:

    ```php
    foreach ($patterns as $pattern => $replacement) {
        echo "Testing: {$pattern}\n";
        $result = preg_replace($pattern, $replacement, 'test string');
        if ($result === null) {
            echo "Error in pattern: {$pattern}\n";
        }
    }
    ```

3. **Monitor Performance**:

    ```php
    $processor = new GdprProcessor($patterns, $fieldPaths, [], function($path, $orig, $masked) {
        if (microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'] > 1.0) {
            error_log("Slow GDPR processing detected");
        }
    });
    ```

### Getting Help

- **Documentation**: Check [CONTRIBUTING.md](CONTRIBUTING.md) for development setup
- **Security Issues**: See [SECURITY.md](SECURITY.md) for responsible disclosure
- **Bug Reports**: Create an issue on GitHub with a minimal reproduction example
- **Performance Issues**: Include profiling data and pattern counts

## Notable Implementation Details

- If a regex replacement in `regExpMessage` results in an empty string or the string "0", the original message is
  returned. This is covered by dedicated PHPUnit tests.
- If a regex pattern is invalid, the audit logger (if set) is called, and the original message is returned.
- All patterns are validated for security before use to prevent regex injection attacks.
- The library includes ReDoS (Regular Expression Denial of Service) protection.
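
The first two fallback rules can be illustrated with a small stand-in. This is a sketch of the described behavior only: `maskOrOriginal()` is hypothetical, it is not the library's `regExpMessage()`, and the arguments passed to the audit logger here are an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the fallback rules described above.
 */
function maskOrOriginal(array $patterns, string $message, ?callable $auditLogger = null): string
{
    $result = $message;
    foreach ($patterns as $pattern => $replacement) {
        // @ hides the PHP warning; an invalid pattern makes preg_replace() return null.
        $next = @preg_replace($pattern, $replacement, $result);
        if ($next === null) {
            // Invalid pattern: notify the audit logger (assumed arguments) and keep the original.
            if ($auditLogger !== null) {
                $auditLogger('preg_replace_error', $message, $pattern);
            }
            return $message;
        }
        $result = $next;
    }
    // An empty result or the string "0" falls back to the original message.
    return ($result === '' || $result === '0') ? $message : $result;
}

echo maskOrOriginal(['/secret/' => '***'], 'my secret'), "\n"; // my ***
echo maskOrOriginal(['/.+/' => ''], 'anything'), "\n";         // anything
echo maskOrOriginal(['/(broken/' => 'x'], 'keep me'), "\n";    // keep me
```

The practical consequence is that a bad pattern or an over-eager replacement leaves the log line unmasked rather than empty, which is worth keeping in mind when reviewing patterns.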

## Directory Structure