feat: add advanced architecture, documentation, and coverage improvements (#65)

* fix(style): resolve PHPCS line-length warnings in source files

* fix(style): resolve PHPCS line-length warnings in test files

* feat(audit): add structured audit logging with ErrorContext and AuditContext

- ErrorContext: standardized error information with sensitive data sanitization
- AuditContext: structured context for audit entries with operation types
- StructuredAuditLogger: enhanced audit logger wrapper with timing support

* feat(recovery): add recovery mechanism for failed masking operations

- FailureMode enum: FAIL_OPEN, FAIL_CLOSED, FAIL_SAFE modes
- RecoveryStrategy interface and RecoveryResult value object
- RetryStrategy: exponential backoff with configurable attempts
- FallbackMaskStrategy: type-aware fallback values

* feat(strategies): add CallbackMaskingStrategy for custom masking logic

- Wraps custom callbacks as MaskingStrategy implementations
- Factory methods: constant(), hash(), partial() for common use cases
- Supports exact match and prefix match for field paths

* docs: add framework integration guides and examples

- symfony-integration.md: Symfony service configuration and Monolog setup
- psr3-decorator.md: PSR-3 logger decorator pattern implementation
- framework-examples.md: CakePHP, CodeIgniter 4, Laminas, Yii2, PSR-15
- docker-development.md: Docker development environment guide

* chore(docker): add Docker development environment

- Dockerfile: PHP 8.2-cli-alpine with Xdebug for coverage
- docker-compose.yml: development services with volume mounts

* feat(demo): add interactive GDPR pattern tester playground

- PatternTester.php: pattern testing utility with strategy support
- index.php: web API endpoint with JSON response handling
- playground.html: interactive web interface for testing patterns

* docs(todo): update with completed medium priority items

- Mark all PHPCS warnings as fixed (81 → 0)
- Document new Audit and Recovery features
- Update test count to 1,068 tests with 2,953 assertions
- Move remaining items to low priority

* feat: add advanced architecture, documentation, and coverage improvements

- Add architecture improvements:
  - ArrayAccessorInterface and DotArrayAccessor for decoupled array access
  - MaskingOrchestrator for single-responsibility masking coordination
  - GdprProcessorBuilder for fluent configuration
  - MaskingPluginInterface and AbstractMaskingPlugin for plugin architecture
  - PluginAwareProcessor for plugin hook execution
  - AuditLoggerFactory for instance-based audit logger creation

- Add advanced features:
  - SerializedDataProcessor for handling print_r/var_export/serialize output
  - KAnonymizer with GeneralizationStrategy for GDPR k-anonymity
  - RetentionPolicy for configurable data retention periods
  - StreamingProcessor for memory-efficient large log processing

- Add comprehensive documentation:
  - docs/performance-tuning.md - benchmarking, optimization, caching
  - docs/troubleshooting.md - common issues and solutions
  - docs/logging-integrations.md - ELK, Graylog, Datadog, etc.
  - docs/plugin-development.md - complete plugin development guide

- Improve test coverage (84.41% → 85.07%):
  - ConditionalRuleFactoryInstanceTest (100% coverage)
  - GdprProcessorBuilderEdgeCasesTest (100% coverage)
  - StrategyEdgeCasesTest for ReDoS detection and type parsing
  - 78 new tests, 119 new assertions

- Update TODO.md with current statistics:
  - 141 PHP files, 1,346 tests, 85.07% line coverage

* chore: tests, update actions, sonarcloud issues

* chore: rector

* fix: more sonarcloud fixes

* chore: more fixes

* refactor: copilot review fix

* chore: rector
2025-12-22 13:38:18 +02:00
committed by GitHub
parent b1eb567b92
commit 8866daaf33
112 changed files with 15391 additions and 607 deletions

216
src/Audit/AuditContext.php Normal file

@@ -0,0 +1,216 @@
<?php
declare(strict_types=1);
namespace Ivuorinen\MonologGdprFilter\Audit;
/**
* Structured context for audit log entries.
*
* Provides a standardized format for tracking masking operations,
* including timing, retry attempts, and error information.
*
* @api
*/
final readonly class AuditContext
{
public const STATUS_SUCCESS = 'success';
public const STATUS_FAILED = 'failed';
public const STATUS_RECOVERED = 'recovered';
public const STATUS_SKIPPED = 'skipped';
public const OP_REGEX = 'regex';
public const OP_FIELD_PATH = 'field_path';
public const OP_CALLBACK = 'callback';
public const OP_DATA_TYPE = 'data_type';
public const OP_JSON = 'json';
public const OP_CONDITIONAL = 'conditional';
/**
* @param string $operationType Type of masking operation performed
* @param string $status Operation result status
* @param string|null $correlationId Unique ID linking related operations
* @param int $attemptNumber Retry attempt number (1 = first attempt)
* @param float $durationMs Operation duration in milliseconds
* @param ErrorContext|null $error Error details if operation failed
* @param array<string, mixed> $metadata Additional context information
*/
public function __construct(
public string $operationType,
public string $status = self::STATUS_SUCCESS,
public ?string $correlationId = null,
public int $attemptNumber = 1,
public float $durationMs = 0.0,
public ?ErrorContext $error = null,
public array $metadata = [],
) {
}
/**
* Create a success audit context.
*
* @param string $operationType The type of masking operation
* @param float $durationMs Operation duration in milliseconds
* @param array<string, mixed> $metadata Additional context
*/
public static function success(
string $operationType,
float $durationMs = 0.0,
array $metadata = []
): self {
return new self(
operationType: $operationType,
status: self::STATUS_SUCCESS,
durationMs: $durationMs,
metadata: $metadata,
);
}
/**
* Create a failed audit context.
*
* @param string $operationType The type of masking operation
* @param ErrorContext $error The error that occurred
* @param int $attemptNumber Which attempt this was
* @param float $durationMs Operation duration in milliseconds
* @param array<string, mixed> $metadata Additional context
*/
public static function failed(
string $operationType,
ErrorContext $error,
int $attemptNumber = 1,
float $durationMs = 0.0,
array $metadata = []
): self {
return new self(
operationType: $operationType,
status: self::STATUS_FAILED,
attemptNumber: $attemptNumber,
durationMs: $durationMs,
error: $error,
metadata: $metadata,
);
}
/**
* Create a recovered audit context (after retry/fallback).
*
* @param string $operationType The type of masking operation
* @param int $attemptNumber Attempt number on which the operation finally succeeded
* @param float $durationMs Total duration including retries
* @param array<string, mixed> $metadata Additional context
*/
public static function recovered(
string $operationType,
int $attemptNumber,
float $durationMs = 0.0,
array $metadata = []
): self {
return new self(
operationType: $operationType,
status: self::STATUS_RECOVERED,
attemptNumber: $attemptNumber,
durationMs: $durationMs,
metadata: $metadata,
);
}
/**
* Create a skipped audit context (conditional rule prevented masking).
*
* @param string $operationType The type of masking operation
* @param string $reason Why the operation was skipped
* @param array<string, mixed> $metadata Additional context
*/
public static function skipped(
string $operationType,
string $reason,
array $metadata = []
): self {
return new self(
operationType: $operationType,
status: self::STATUS_SKIPPED,
metadata: array_merge($metadata, ['skip_reason' => $reason]),
);
}
/**
* Create a copy with a correlation ID.
*/
public function withCorrelationId(string $correlationId): self
{
return new self(
operationType: $this->operationType,
status: $this->status,
correlationId: $correlationId,
attemptNumber: $this->attemptNumber,
durationMs: $this->durationMs,
error: $this->error,
metadata: $this->metadata,
);
}
/**
* Create a copy with additional metadata.
*
* @param array<string, mixed> $additionalMetadata
*/
public function withMetadata(array $additionalMetadata): self
{
return new self(
operationType: $this->operationType,
status: $this->status,
correlationId: $this->correlationId,
attemptNumber: $this->attemptNumber,
durationMs: $this->durationMs,
error: $this->error,
metadata: array_merge($this->metadata, $additionalMetadata),
);
}
/**
* Check if the operation succeeded, either directly or after recovery.
*/
public function isSuccess(): bool
{
return $this->status === self::STATUS_SUCCESS
|| $this->status === self::STATUS_RECOVERED;
}
/**
* Convert to array for serialization/logging.
*
* @return array<string, mixed>
*/
public function toArray(): array
{
$data = [
'operation_type' => $this->operationType,
'status' => $this->status,
'attempt_number' => $this->attemptNumber,
'duration_ms' => round($this->durationMs, 3),
];
if ($this->correlationId !== null) {
$data['correlation_id'] = $this->correlationId;
}
if ($this->error instanceof ErrorContext) {
$data['error'] = $this->error->toArray();
}
if ($this->metadata !== []) {
$data['metadata'] = $this->metadata;
}
return $data;
}
/**
* Generate a unique correlation ID for tracking related operations.
*/
public static function generateCorrelationId(): string
{
return bin2hex(random_bytes(8));
}
}
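
The copy-on-write methods above (`withCorrelationId()`, `withMetadata()`) and `generateCorrelationId()` can be illustrated with a minimal, self-contained sketch. `DemoContext` is a hypothetical stand-in reduced to the metadata handling, not the class from this commit; it assumes PHP 8.2 or later for `readonly` classes. It shows that the original instance stays untouched and that, with `array_merge()`, later string keys override earlier ones.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for AuditContext, reduced to the metadata handling.
final readonly class DemoContext
{
    /** @param array<string, mixed> $metadata */
    public function __construct(
        public string $status,
        public array $metadata = [],
    ) {
    }

    /** @param array<string, mixed> $additional */
    public function withMetadata(array $additional): self
    {
        // array_merge(): for string keys, values from the later array win.
        return new self($this->status, array_merge($this->metadata, $additional));
    }
}

$first = new DemoContext('success', ['path' => 'user.email', 'source' => 'rule']);
$second = $first->withMetadata(['source' => 'plugin', 'attempt' => 2]);

// Same construction as generateCorrelationId(): 8 random bytes as 16 hex characters.
$correlationId = bin2hex(random_bytes(8));
```

One consequence of the merge order, if the same semantics hold in `AuditContext::skipped()`: a caller-supplied `skip_reason` key in `$metadata` would be replaced by the `$reason` argument.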

147
src/Audit/ErrorContext.php Normal file

@@ -0,0 +1,147 @@
<?php
declare(strict_types=1);
namespace Ivuorinen\MonologGdprFilter\Audit;
use Throwable;
/**
* Standardized error information for audit logging.
*
* Captures error details in a structured format while ensuring
* sensitive information is sanitized before logging.
*
* @api
*/
final readonly class ErrorContext
{
/**
* @param string $errorType The type/class of error that occurred
* @param string $message Sanitized error message (sensitive data removed)
* @param int $code Error code if available
* @param string|null $file File where error occurred (optional)
* @param int|null $line Line number where error occurred (optional)
* @param array<string, mixed> $metadata Additional error metadata
*/
public function __construct(
public string $errorType,
public string $message,
public int $code = 0,
public ?string $file = null,
public ?int $line = null,
public array $metadata = [],
) {
}
/**
* Create an ErrorContext from a Throwable.
*
* @param Throwable $throwable The exception/error to capture
* @param bool $includeSensitive Whether to include potentially sensitive details
*                               (unsanitized message, file, line, first five trace frames)
*/
public static function fromThrowable(
Throwable $throwable,
bool $includeSensitive = false
): self {
$message = $includeSensitive
? $throwable->getMessage()
: self::sanitizeMessage($throwable->getMessage());
$metadata = [];
if ($includeSensitive) {
$metadata['trace'] = array_slice($throwable->getTrace(), 0, 5);
}
return new self(
errorType: $throwable::class,
message: $message,
code: (int) $throwable->getCode(),
file: $includeSensitive ? $throwable->getFile() : null,
line: $includeSensitive ? $throwable->getLine() : null,
metadata: $metadata,
);
}
/**
* Create an ErrorContext for a generic error.
*
* @param string $errorType The type of error
* @param string $message The error message (sanitized before it is stored)
* @param array<string, mixed> $metadata Additional context
*/
public static function create(
string $errorType,
string $message,
array $metadata = []
): self {
return new self(
errorType: $errorType,
message: self::sanitizeMessage($message),
metadata: $metadata,
);
}
/**
* Sanitize an error message to remove potentially sensitive information.
*
* @param string $message The original error message
*/
private static function sanitizeMessage(string $message): string
{
$patterns = [
// Passwords and secrets
'/password[=:]\s*[^\s,;]+/i' => 'password=[REDACTED]',
'/secret[=:]\s*[^\s,;]+/i' => 'secret=[REDACTED]',
'/api[_-]?key[=:]\s*[^\s,;]+/i' => 'api_key=[REDACTED]',
'/token[=:]\s*[^\s,;]+/i' => 'token=[REDACTED]',
'/bearer\s+\S+/i' => 'bearer [REDACTED]',
// Connection strings
'/:[^@]+@/' => ':[REDACTED]@',
'/user[=:]\s*[^\s,;@]+/i' => 'user=[REDACTED]',
'/host[=:]\s*[^\s,;]+/i' => 'host=[REDACTED]',
// File paths under common system roots (the whole path is redacted)
'/\/(?:var|home|etc|usr|opt)\/[^\s:]+/' => '/[PATH_REDACTED]',
];
$sanitized = $message;
foreach ($patterns as $pattern => $replacement) {
$result = preg_replace($pattern, $replacement, $sanitized);
if ($result !== null) {
$sanitized = $result;
}
}
return $sanitized;
}
/**
* Convert to array for serialization/logging.
*
* @return array<string, mixed>
*/
public function toArray(): array
{
$data = [
'error_type' => $this->errorType,
'message' => $this->message,
'code' => $this->code,
];
if ($this->file !== null) {
$data['file'] = $this->file;
}
if ($this->line !== null) {
$data['line'] = $this->line;
}
if ($this->metadata !== []) {
$data['metadata'] = $this->metadata;
}
return $data;
}
}
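
To make the effect of the sanitization patterns concrete, here is a minimal standalone sketch. `sanitizeDemo()` is a hypothetical free function that copies the pattern table and loop from `sanitizeMessage()` above; the sample messages are invented for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical copy of the pattern table in ErrorContext::sanitizeMessage().
function sanitizeDemo(string $message): string
{
    $patterns = [
        '/password[=:]\s*[^\s,;]+/i' => 'password=[REDACTED]',
        '/secret[=:]\s*[^\s,;]+/i' => 'secret=[REDACTED]',
        '/api[_-]?key[=:]\s*[^\s,;]+/i' => 'api_key=[REDACTED]',
        '/token[=:]\s*[^\s,;]+/i' => 'token=[REDACTED]',
        '/bearer\s+\S+/i' => 'bearer [REDACTED]',
        '/:[^@]+@/' => ':[REDACTED]@',
        '/user[=:]\s*[^\s,;@]+/i' => 'user=[REDACTED]',
        '/host[=:]\s*[^\s,;]+/i' => 'host=[REDACTED]',
        '/\/(?:var|home|etc|usr|opt)\/[^\s:]+/' => '/[PATH_REDACTED]',
    ];

    foreach ($patterns as $pattern => $replacement) {
        // preg_replace() returns null on failure; keep the previous value then.
        $message = preg_replace($pattern, $replacement, $message) ?? $message;
    }

    return $message;
}

// Key/value secrets are replaced up to the next whitespace, comma, or semicolon.
$a = sanitizeDemo('Login failed: password=hunter2, token: abc123');
// → "Login failed: password=[REDACTED], token=[REDACTED]"

// The entire matched path is replaced, including the file name.
$b = sanitizeDemo('Cannot open /var/www/app/config.php on line 3');
// → "Cannot open /[PATH_REDACTED] on line 3"

// The connection-string pattern spans everything from the first ':' to the next '@'.
$c = sanitizeDemo('Error: mail to bob@example.com bounced');
// → "Error:[REDACTED]@example.com bounced"
```

The third case shows that the `:[^@]+@` pattern errs on the side of over-redaction: any message containing a colon followed later by an `@` loses the text in between.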


@@ -0,0 +1,232 @@
<?php
declare(strict_types=1);
namespace Ivuorinen\MonologGdprFilter\Audit;
use Ivuorinen\MonologGdprFilter\RateLimitedAuditLogger;
/**
* Enhanced audit logger wrapper with structured context support.
*
* Wraps a base audit logger (callable or RateLimitedAuditLogger) and
* provides structured context information for better audit trails.
*
* @api
*/
final class StructuredAuditLogger
{
/** @var callable(string, mixed, mixed): void */
private $wrappedLogger;
/**
* @param callable|RateLimitedAuditLogger $auditLogger Base logger to wrap
* @param bool $includeTimestamp Whether to include timestamp in metadata
* @param bool $includeDuration Whether to include operation duration
*/
public function __construct(
callable|RateLimitedAuditLogger $auditLogger,
private readonly bool $includeTimestamp = true,
private readonly bool $includeDuration = true
) {
$this->wrappedLogger = $auditLogger;
}
/**
* Create a structured audit logger from a base logger.
*
* @param callable|RateLimitedAuditLogger $auditLogger Base logger
*/
public static function wrap(
callable|RateLimitedAuditLogger $auditLogger
): self {
return new self($auditLogger);
}
/**
* Log an audit entry with structured context.
*
* @param string $path The field path being masked
* @param mixed $original The original value
* @param mixed $masked The masked value
* @param AuditContext|null $context Structured audit context
*/
public function log(
string $path,
mixed $original,
mixed $masked,
?AuditContext $context = null
): void {
$enrichedContext = $context;
if ($enrichedContext instanceof AuditContext) {
$metadata = [];
if ($this->includeTimestamp) {
$metadata['timestamp'] = time();
$metadata['timestamp_micro'] = microtime(true);
}
if ($this->includeDuration && $enrichedContext->durationMs > 0) {
$metadata['duration_ms'] = $enrichedContext->durationMs;
}
if ($metadata !== []) {
$enrichedContext = $enrichedContext->withMetadata($metadata);
}
}
// Call the wrapped logger with the base three-argument signature.
// The structured context is not forwarded to it.
($this->wrappedLogger)($path, $original, $masked);
// The context is handed to logContext() separately
// (could be extended to log to a separate channel)
if ($enrichedContext instanceof AuditContext) {
$this->logContext($path, $enrichedContext);
}
}
/**
* Log a success operation.
*
* @param string $path The field path
* @param mixed $original The original value
* @param mixed $masked The masked value
* @param string $operationType Type of masking operation
* @param float $durationMs Duration in milliseconds
*/
public function logSuccess(
string $path,
mixed $original,
mixed $masked,
string $operationType,
float $durationMs = 0.0
): void {
$context = AuditContext::success($operationType, $durationMs, [
'path' => $path,
]);
$this->log($path, $original, $masked, $context);
}
/**
* Log a failed operation.
*
* @param string $path The field path
* @param mixed $original The original value
* @param string $operationType Type of masking operation
* @param ErrorContext $error Error information
* @param int $attemptNumber Which attempt failed
*/
public function logFailure(
string $path,
mixed $original,
string $operationType,
ErrorContext $error,
int $attemptNumber = 1
): void {
$context = AuditContext::failed(
$operationType,
$error,
$attemptNumber,
0.0,
['path' => $path]
);
// For failures, the "masked" value indicates the failure
$this->log($path, $original, '[MASKING_FAILED]', $context);
}
/**
* Log a recovered operation (after retry/fallback).
*
* @param string $path The field path
* @param mixed $original The original value
* @param mixed $masked The masked value (from recovery)
* @param string $operationType Type of masking operation
* @param int $attemptNumber Final successful attempt number
* @param float $totalDurationMs Total duration including retries
*/
public function logRecovery(
string $path,
mixed $original,
mixed $masked,
string $operationType,
int $attemptNumber,
float $totalDurationMs = 0.0
): void {
$context = AuditContext::recovered(
$operationType,
$attemptNumber,
$totalDurationMs,
['path' => $path]
);
$this->log($path, $original, $masked, $context);
}
/**
* Log a skipped operation.
*
* @param string $path The field path
* @param mixed $value The value that was not masked
* @param string $operationType Type of masking operation
* @param string $reason Why masking was skipped
*/
public function logSkipped(
string $path,
mixed $value,
string $operationType,
string $reason
): void {
$context = AuditContext::skipped($operationType, $reason, [
'path' => $path,
]);
$this->log($path, $value, $value, $context);
}
/**
* Start timing an operation.
*
* @return float Start time in seconds (Unix timestamp with microsecond precision)
*/
public function startTimer(): float
{
return microtime(true);
}
/**
* Calculate elapsed time since start.
*
* @param float $startTime From startTimer()
* @return float Duration in milliseconds
*/
public function elapsed(float $startTime): float
{
return (microtime(true) - $startTime) * 1000.0;
}
/**
* Log structured context (for extended audit trails).
*
* Extension point for sending context to a separate logging channel.
* Note: the class is declared final, so this method cannot be overridden
* unless that modifier is removed.
*/
protected function logContext(string $path, AuditContext $context): void
{
// Default implementation does nothing extra
unset($path, $context);
}
/**
* Get the wrapped logger for direct access if needed.
*
* @return callable
*/
public function getWrappedLogger(): callable
{
return $this->wrappedLogger;
}
}
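
The base logger contract and the timing helpers can be sketched without the library. The closure below has the three-argument shape that `log()` passes to the wrapped logger, and the duration is computed the same way as `startTimer()` and `elapsed()`. The `$entries` collector and the masking regex are assumptions made for this example only.

```php
<?php
declare(strict_types=1);

$entries = [];

// A base audit logger: (path, original value, masked value).
$auditLogger = function (string $path, mixed $original, mixed $masked) use (&$entries): void {
    $entries[] = ['path' => $path, 'original' => $original, 'masked' => $masked];
};

// microtime(true) returns seconds as a float, so multiply by 1000 for milliseconds.
$start = microtime(true);
$masked = preg_replace('/[^@]+@/', '***@', 'john@example.com');
$durationMs = (microtime(true) - $start) * 1000.0;

$auditLogger('user.email', 'john@example.com', $masked);
```

With the classes from this commit available, the same closure could presumably be passed to `StructuredAuditLogger::wrap()` and the measured duration to `logSuccess()`.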