feat: use our own actions in our workflows (#377)

* feat: use our own actions in our workflows

* fix: add missing inputs to validate-inputs, refactor node

* chore: cr comment fixes

* fix: update-validators formatting

* chore: update validators, add tests, conventions

* feat: validate severity with severity_enum

* feat: add 10 generic validators to improve input validation coverage

Add comprehensive validation system improvements across multiple phases:

Phase 2A - Quick Wins:
- Add multi_value_enum validator for 2-10 value enumerations
- Add exit_code_list validator for Unix/Linux exit codes (0-255)
- Refactor coverage_driver to use multi_value_enum

Phase 2B - High-Value Validators:
- Add key_value_list validator with shell injection prevention
- Add path_list validator with path traversal and glob support

Quick Wins - Additional Enums:
- Add network_mode validator for Docker network modes
- Add language_enum validator for language detection
- Add framework_mode validator for PHP framework modes
- Update boolean pattern to include 'push'

Phase 2C - Specialized Validators:
- Add json_format validator for JSON syntax validation
- Add cache_config validator for Docker BuildKit cache configs

Improvements:
- All validators include comprehensive security checks
- Pattern-based validation with clear error messages
- 23 new test methods with edge case coverage
- Update special case mappings for 20+ inputs
- Fix build-args mapping test expectation

Coverage impact: 22 actions now at 100% validation (88% → 92%)
Test suite: 762 → 785 tests (+23 tests, all passing)
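
As an illustration of the exit_code_list check described above, here is a minimal sketch. The actual implementation is not visible in this diff, so the function name `validate_exit_code_list` and the choice to treat an empty value as valid are assumptions modelled on the other validators in this change.

```python
def validate_exit_code_list(value: str) -> bool:
    """Return True for an empty value or a comma-separated list of codes in 0-255.

    Hypothetical helper: the real validator lives in ConventionBasedValidator
    and also records an error message, which this sketch omits.
    """
    if not value or not value.strip():
        return True  # Assumed optional, like the other validators in this change
    for raw_item in value.split(","):
        item = raw_item.strip()
        # Reject empty items, signs, decimals, and non-ASCII digits
        if not item.isascii() or not item.isdecimal():
            return False
        # Unix/Linux exit codes fit in one byte
        if int(item) > 255:
            return False
    return True
```

Under these assumptions `"0,1,255"` is accepted, while `"256"`, `"-1"`, and `"1,,2"` are rejected.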

* chore: regenerate rules.yml with improved validator coverage

Regenerate validation rules for all actions with new validators:

- compress-images: 86% → 100% (+1 input: ignore-paths)
- docker-build: 63% → 100% (+4 inputs: cache configs, platform-build-args)
- docker-publish: 73% → 100% (+1 input: build-args)
- language-version-detect: 67% → 100% (+1 input: language)
- php-tests: 89% (fixed framework→framework_mode mapping)
- prettier-lint: 86% → 100% (+2 inputs: file-pattern, plugins)
- security-scan: 86% (maintained coverage)

Overall: 23 of 25 actions now at 100% validation coverage (92%)

* fix: address PR #377 review comments

- Add | None type annotations to 6 optional parameters (PEP 604)
- Standardize injection pattern: remove @# from comma_separated_list validator
  (@ and # are not shell injection vectors, allows npm scoped packages)
- Remove dead code: unused value expression in key_value_list validator
- Update tests to reflect injection pattern changes
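
The effect of the injection pattern change can be sketched as follows. Both character classes are taken from the diff (the wider one from the removed php_extensions check, the narrower one from `_validate_comma_separated_list`); the helper name `is_flagged` is hypothetical and only exists for this illustration.

```python
import re

# Wider class from the removed check: also flags @ and #.
OLD_INJECTION = re.compile(r"[;&|`$()@#]")
# Narrower class used after the review fix.
NEW_INJECTION = re.compile(r"[;&|`$()]")


def is_flagged(pattern: re.Pattern, value: str) -> bool:
    """Return True when the value contains a character the pattern treats as unsafe."""
    return pattern.search(value) is not None


scoped_package = "@prettier/plugin-php"
print(is_flagged(OLD_INJECTION, scoped_package))  # flagged by the old class
print(is_flagged(NEW_INJECTION, scoped_package))  # no longer flagged
print(is_flagged(NEW_INJECTION, "foo; rm -rf /"))  # shell metacharacters still flagged
```

This covers only the injection check. Whether a given input then accepts `@` depends on the per-item pattern applied afterwards.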
2025-11-25 23:51:03 +02:00
committed by GitHub
parent e58465e5d3
commit 9aa16a8164
32 changed files with 2823 additions and 523 deletions

@@ -8,56 +8,62 @@ Centralized Python-based input validation for GitHub Actions with PCRE regex sup
### Inputs
| name | description | required | default |
|---------------------|-------------------------------------------------------------------------------------|----------|---------|
| `action` | <p>Action name to validate (alias for action-type)</p> | `false` | `""` |
| `action-type` | <p>Type of action to validate (e.g., csharp-publish, docker-build, eslint-lint)</p> | `false` | `""` |
| `rules-file` | <p>Path to validation rules file</p> | `false` | `""` |
| `fail-on-error` | <p>Whether to fail on validation errors</p> | `false` | `true` |
| `token` | <p>GitHub token for authentication</p> | `false` | `""` |
| `namespace` | <p>Namespace/username for validation</p> | `false` | `""` |
| `email` | <p>Email address for validation</p> | `false` | `""` |
| `username` | <p>Username for validation</p> | `false` | `""` |
| `dotnet-version` | <p>.NET version string</p> | `false` | `""` |
| `terraform-version` | <p>Terraform version string</p> | `false` | `""` |
| `tflint-version` | <p>TFLint version string</p> | `false` | `""` |
| `node-version` | <p>Node.js version string</p> | `false` | `""` |
| `force-version` | <p>Force version override</p> | `false` | `""` |
| `default-version` | <p>Default version fallback</p> | `false` | `""` |
| `image-name` | <p>Docker image name</p> | `false` | `""` |
| `tag` | <p>Docker image tag</p> | `false` | `""` |
| `architectures` | <p>Target architectures</p> | `false` | `""` |
| `dockerfile` | <p>Dockerfile path</p> | `false` | `""` |
| `context` | <p>Docker build context</p> | `false` | `""` |
| `build-args` | <p>Docker build arguments</p> | `false` | `""` |
| `buildx-version` | <p>Docker Buildx version</p> | `false` | `""` |
| `max-retries` | <p>Maximum retry attempts</p> | `false` | `""` |
| `image-quality` | <p>Image quality percentage</p> | `false` | `""` |
| `png-quality` | <p>PNG quality percentage</p> | `false` | `""` |
| `parallel-builds` | <p>Number of parallel builds</p> | `false` | `""` |
| `days-before-stale` | <p>Number of days before marking as stale</p> | `false` | `""` |
| `days-before-close` | <p>Number of days before closing stale items</p> | `false` | `""` |
| `pre-commit-config` | <p>Pre-commit configuration file path</p> | `false` | `""` |
| `base-branch` | <p>Base branch name</p> | `false` | `""` |
| `dry-run` | <p>Dry run mode</p> | `false` | `""` |
| `is_fiximus` | <p>Use Fiximus bot</p> | `false` | `""` |
| `prefix` | <p>Release tag prefix</p> | `false` | `""` |
| `language` | <p>Language to analyze (for CodeQL)</p> | `false` | `""` |
| `queries` | <p>CodeQL queries to run</p> | `false` | `""` |
| `packs` | <p>CodeQL query packs</p> | `false` | `""` |
| `config-file` | <p>CodeQL configuration file path</p> | `false` | `""` |
| `config` | <p>CodeQL configuration YAML string</p> | `false` | `""` |
| `build-mode` | <p>Build mode for compiled languages</p> | `false` | `""` |
| `source-root` | <p>Source code root directory</p> | `false` | `""` |
| `category` | <p>Analysis category</p> | `false` | `""` |
| `checkout-ref` | <p>Git reference to checkout</p> | `false` | `""` |
| `working-directory` | <p>Working directory for analysis</p> | `false` | `""` |
| `upload-results` | <p>Upload results to GitHub Security</p> | `false` | `""` |
| `ram` | <p>Memory in MB for CodeQL</p> | `false` | `""` |
| `threads` | <p>Number of threads for CodeQL</p> | `false` | `""` |
| `output` | <p>Output path for SARIF results</p> | `false` | `""` |
| `skip-queries` | <p>Skip running queries</p> | `false` | `""` |
| `add-snippets` | <p>Add code snippets to SARIF</p> | `false` | `""` |
| name | description | required | default |
|----------------------|-------------------------------------------------------------------------------------|----------|---------|
| `action` | <p>Action name to validate (alias for action-type)</p> | `false` | `""` |
| `action-type` | <p>Type of action to validate (e.g., csharp-publish, docker-build, eslint-lint)</p> | `false` | `""` |
| `rules-file` | <p>Path to validation rules file</p> | `false` | `""` |
| `fail-on-error` | <p>Whether to fail on validation errors</p> | `false` | `true` |
| `token` | <p>GitHub token for authentication</p> | `false` | `""` |
| `namespace` | <p>Namespace/username for validation</p> | `false` | `""` |
| `email` | <p>Email address for validation</p> | `false` | `""` |
| `username` | <p>Username for validation</p> | `false` | `""` |
| `dotnet-version` | <p>.NET version string</p> | `false` | `""` |
| `terraform-version` | <p>Terraform version string</p> | `false` | `""` |
| `tflint-version` | <p>TFLint version string</p> | `false` | `""` |
| `node-version` | <p>Node.js version string</p> | `false` | `""` |
| `force-version` | <p>Force version override</p> | `false` | `""` |
| `default-version` | <p>Default version fallback</p> | `false` | `""` |
| `image-name` | <p>Docker image name</p> | `false` | `""` |
| `tag` | <p>Docker image tag</p> | `false` | `""` |
| `architectures` | <p>Target architectures</p> | `false` | `""` |
| `dockerfile` | <p>Dockerfile path</p> | `false` | `""` |
| `context` | <p>Docker build context</p> | `false` | `""` |
| `build-args` | <p>Docker build arguments</p> | `false` | `""` |
| `buildx-version` | <p>Docker Buildx version</p> | `false` | `""` |
| `max-retries` | <p>Maximum retry attempts</p> | `false` | `""` |
| `image-quality` | <p>Image quality percentage</p> | `false` | `""` |
| `png-quality` | <p>PNG quality percentage</p> | `false` | `""` |
| `parallel-builds` | <p>Number of parallel builds</p> | `false` | `""` |
| `days-before-stale` | <p>Number of days before marking as stale</p> | `false` | `""` |
| `days-before-close` | <p>Number of days before closing stale items</p> | `false` | `""` |
| `pre-commit-config` | <p>Pre-commit configuration file path</p> | `false` | `""` |
| `base-branch` | <p>Base branch name</p> | `false` | `""` |
| `dry-run` | <p>Dry run mode</p> | `false` | `""` |
| `is_fiximus` | <p>Use Fiximus bot</p> | `false` | `""` |
| `prefix` | <p>Release tag prefix</p> | `false` | `""` |
| `language` | <p>Language to analyze (for CodeQL)</p> | `false` | `""` |
| `queries` | <p>CodeQL queries to run</p> | `false` | `""` |
| `packs` | <p>CodeQL query packs</p> | `false` | `""` |
| `config-file` | <p>CodeQL configuration file path</p> | `false` | `""` |
| `config` | <p>CodeQL configuration YAML string</p> | `false` | `""` |
| `build-mode` | <p>Build mode for compiled languages</p> | `false` | `""` |
| `source-root` | <p>Source code root directory</p> | `false` | `""` |
| `category` | <p>Analysis category</p> | `false` | `""` |
| `checkout-ref` | <p>Git reference to checkout</p> | `false` | `""` |
| `working-directory` | <p>Working directory for analysis</p> | `false` | `""` |
| `upload-results` | <p>Upload results to GitHub Security</p> | `false` | `""` |
| `ram` | <p>Memory in MB for CodeQL</p> | `false` | `""` |
| `threads` | <p>Number of threads for CodeQL</p> | `false` | `""` |
| `output` | <p>Output path for SARIF results</p> | `false` | `""` |
| `skip-queries` | <p>Skip running queries</p> | `false` | `""` |
| `add-snippets` | <p>Add code snippets to SARIF</p> | `false` | `""` |
| `gitleaks-license` | <p>Gitleaks license key</p> | `false` | `""` |
| `gitleaks-config` | <p>Gitleaks configuration file path</p> | `false` | `""` |
| `trivy-severity` | <p>Trivy severity levels to scan</p> | `false` | `""` |
| `trivy-scanners` | <p>Trivy scanner types to run</p> | `false` | `""` |
| `trivy-timeout` | <p>Trivy scan timeout</p> | `false` | `""` |
| `actionlint-enabled` | <p>Enable actionlint scanning</p> | `false` | `""` |
### Outputs
@@ -365,4 +371,40 @@ This action is a `composite` action.
#
# Required: false
# Default: ""
gitleaks-license:
# Gitleaks license key
#
# Required: false
# Default: ""
gitleaks-config:
# Gitleaks configuration file path
#
# Required: false
# Default: ""
trivy-severity:
# Trivy severity levels to scan
#
# Required: false
# Default: ""
trivy-scanners:
# Trivy scanner types to run
#
# Required: false
# Default: ""
trivy-timeout:
# Trivy scan timeout
#
# Required: false
# Default: ""
actionlint-enabled:
# Enable actionlint scanning
#
# Required: false
# Default: ""
```

@@ -173,6 +173,26 @@ inputs:
description: 'Add code snippets to SARIF'
required: false
# Security-scan specific inputs
gitleaks-license:
description: 'Gitleaks license key'
required: false
gitleaks-config:
description: 'Gitleaks configuration file path'
required: false
trivy-severity:
description: 'Trivy severity levels to scan'
required: false
trivy-scanners:
description: 'Trivy scanner types to run'
required: false
trivy-timeout:
description: 'Trivy scan timeout'
required: false
actionlint-enabled:
description: 'Enable actionlint scanning'
required: false
outputs:
validation-status:
description: 'Overall validation status (success/failure)'

@@ -114,7 +114,7 @@ class ValidationRuleGenerator:
"prefix": re.compile(r"\b(prefix|tag[_-]?prefix)\b", re.IGNORECASE),
# Boolean patterns (broad, should be lower priority)
"boolean": re.compile(
r"\b(dry-?run|verbose|enable|disable|auto|skip|force|cache|provenance|sbom|scan|sign|fail[_-]?on[_-]?error|nightly)\b",
r"\b(dry-?run|verbose|enable|disable|auto|skip|force|cache|provenance|sbom|scan|sign|push|fail[_-]?on[_-]?error|nightly)\b",
re.IGNORECASE,
),
# File extensions pattern
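
The boolean pattern change above can be checked in isolation. This sketch copies both versions of the regex from the hunk; the names `OLD_BOOLEAN` and `NEW_BOOLEAN` are introduced here for illustration and do not appear in the generator.

```python
import re

# Pattern before the change (no "push" alternative).
OLD_BOOLEAN = re.compile(
    r"\b(dry-?run|verbose|enable|disable|auto|skip|force|cache|provenance|sbom|scan|sign|fail[_-]?on[_-]?error|nightly)\b",
    re.IGNORECASE,
)
# Pattern after the change ("push" added).
NEW_BOOLEAN = re.compile(
    r"\b(dry-?run|verbose|enable|disable|auto|skip|force|cache|provenance|sbom|scan|sign|push|fail[_-]?on[_-]?error|nightly)\b",
    re.IGNORECASE,
)

for input_name in ("push", "push-image", "pushed"):
    old_hit = OLD_BOOLEAN.search(input_name) is not None
    new_hit = NEW_BOOLEAN.search(input_name) is not None
    print(f"{input_name}: old={old_hit} new={new_hit}")
```

Because of the `\b` word boundaries, `push` and `push-image` match the new pattern while `pushed` does not.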
@@ -160,36 +160,36 @@ class ValidationRuleGenerator:
"npm_token": "github_token",
"password": "github_token",
# Complex fields that should skip validation
"build-args": None, # Can be empty
"context": None, # Default handled
"cache-from": None, # Complex cache syntax
"cache-export": None, # Complex cache syntax
"cache-import": None, # Complex cache syntax
"build-contexts": None, # Complex syntax
"secrets": None, # Complex syntax
"platform-build-args": None, # JSON format
"extensions": None, # PHP extensions list
"tools": None, # PHP tools list
"build-args": "key_value_list", # Docker build arguments (KEY=VALUE format)
"context": "file_path", # Build context path
"cache-from": "cache_config", # Docker cache configuration
"cache-export": "cache_config", # Docker cache configuration
"cache-import": "cache_config", # Docker cache configuration
"build-contexts": "key_value_list", # Docker build contexts (KEY=VALUE format)
"secrets": "key_value_list", # Docker secrets (KEY=VALUE format)
"platform-build-args": "json_format", # JSON format for platform-specific args
"extensions": "php_extensions", # PHP extensions list
"tools": "linter_list", # PHP tools list - same pattern as linters
"framework": "framework_mode", # PHP framework mode (auto, laravel, generic)
"args": None, # Composer args
"stability": None, # Composer stability
"registry-url": "url", # URL format
"scope": "scope", # NPM scope
"plugins": None, # Prettier plugins
"plugins": "linter_list", # Prettier plugins - same pattern as linters
"file-extensions": "file_extensions", # File extension list
"file-pattern": None, # Glob pattern
"enable-linters": None, # Linter list
"disable-linters": None, # Linter list
"success-codes": None, # Exit code list
"retry-codes": None, # Exit code list
"ignore-paths": None, # Path patterns
"key-files": None, # Cache key files
"restore-keys": None, # Cache restore keys
"env-vars": None, # Environment variables
"file-pattern": "path_list", # Glob pattern for file paths
"enable-linters": "linter_list", # Linter list
"disable-linters": "linter_list", # Linter list
"success-codes": "exit_code_list", # Exit code list
"retry-codes": "exit_code_list", # Exit code list
"ignore-paths": "path_list", # Path patterns to ignore
"key-files": "path_list", # Cache key files (paths)
"restore-keys": "path_list", # Cache restore keys (paths)
"env-vars": "key_value_list", # Environment variables (KEY=VALUE format)
# Action-specific fields that need special handling
"type": None, # Cache type enum (npm, composer, go, etc.) - complex enum,
# skip validation
"paths": None, # File paths for caching (comma-separated) - complex format,
# skip validation
"paths": "path_list", # File paths for caching (comma-separated)
"command": None, # Shell command - complex format, skip validation for safety
"backoff-strategy": None, # Retry strategy enum - complex enum, skip validation
"shell": None, # Shell type enum - simple enum, skip validation
@@ -199,10 +199,13 @@ class ValidationRuleGenerator:
"retry-delay": "numeric_range_1_300", # Retry delay should support higher values
"max-warnings": "numeric_range_0_10000",
# version-file-parser specific fields
"language": None, # Simple enum (node, php, python, go, dotnet)
"tool-versions-key": None, # Simple string (nodejs, python, php, golang, dotnet)
"dockerfile-image": None, # Simple string (node, python, php, golang, dotnet)
"validation-regex": "regex_pattern", # Regex pattern - validate for ReDoS
# Docker network mode
"network": "network_mode", # Docker network mode (host, none, default)
# Language enum for version detection
"language": "language_enum", # Language type (php, python, go, dotnet)
}
def get_action_directories(self) -> list[str]:
@@ -314,7 +317,6 @@ class ValidationRuleGenerator:
"docker-publish": {
"registry": "registry_enum",
"cache-mode": "cache_mode",
"platforms": None, # Skip validation - complex platform format
},
"docker-publish-hub": {
"password": "docker_password",
@@ -354,26 +356,28 @@ class ValidationRuleGenerator:
"prettier-lint": {
"mode": "mode_enum",
},
"security-scan": {
"gitleaks-config": "file_path",
"trivy-severity": "severity_enum",
"trivy-scanners": "scanner_list",
"trivy-timeout": "timeout_with_unit",
"actionlint-enabled": "boolean",
"token": "github_token",
},
}
if action_name in action_overrides:
# Apply overrides for existing conventions
overrides.update(
{
input_name: override_value
for input_name, override_value in action_overrides[action_name].items()
if input_name in conventions
},
)
# Add missing inputs from overrides to conventions
for input_name, override_value in action_overrides[action_name].items():
if input_name not in conventions and input_name in action_data["inputs"]:
if input_name in action_data["inputs"]:
overrides[input_name] = override_value
# Update conventions to match override (or set to None if skipped)
conventions[input_name] = override_value
# Calculate statistics
total_inputs = len(action_data["inputs"])
validated_inputs = len(conventions)
skipped_inputs = sum(1 for v in overrides.values() if v is None)
validated_inputs = sum(1 for v in conventions.values() if v is not None)
skipped_inputs = sum(1 for v in conventions.values() if v is None)
coverage = round((validated_inputs / total_inputs) * 100) if total_inputs > 0 else 0
# Generate rules object with enhanced metadata
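
The statistics change in this hunk can be reproduced with a small standalone sketch. The function name `coverage_stats` and the sample input names are assumptions for illustration. The three expressions mirror the new lines above, where an input mapped to `None` counts as skipped rather than validated.

```python
from __future__ import annotations


def coverage_stats(
    conventions: dict[str, str | None], total_inputs: int
) -> tuple[int, int, int]:
    """Return (validated, skipped, coverage percent) the way the new code counts them."""
    validated = sum(1 for v in conventions.values() if v is not None)
    skipped = sum(1 for v in conventions.values() if v is None)
    coverage = round((validated / total_inputs) * 100) if total_inputs > 0 else 0
    return validated, skipped, coverage


# Hypothetical action with three inputs, one of which is deliberately skipped.
sample = {
    "build-args": "key_value_list",
    "network": "network_mode",
    "command": None,
}
print(coverage_stats(sample, total_inputs=3))  # (2, 1, 67)
```

With the previous `len(conventions)` count, the same sample would have reported 3 validated inputs and 100% coverage, even though `command` has no validator.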
@@ -432,8 +436,20 @@ class ValidationRuleGenerator:
# Use a custom yaml dumper to ensure proper indentation
class CustomYamlDumper(yaml.SafeDumper):
def increase_indent(self, flow: bool = False, *, indentless: bool = False) -> None: # noqa: FBT001, FBT002
return super().increase_indent(flow, indentless=indentless)
def increase_indent(self, flow: bool = False, *, indentless: bool = False) -> None: # noqa: FBT001, FBT002, ARG002 # type: ignore[override]
return super().increase_indent(flow, False)
def choose_scalar_style(self):
"""Choose appropriate quote style based on string content."""
if hasattr(self, "event") and hasattr(self.event, "value") and self.event.value: # type: ignore[attr-defined]
value = self.event.value # type: ignore[attr-defined]
# Use literal block style for multiline strings
if "\n" in value:
return "|"
# Use double quotes for strings with single quotes
if "'" in value:
return '"'
return super().choose_scalar_style()
yaml_content = yaml.dump(
rules,

File diff suppressed because it is too large

@@ -274,6 +274,71 @@ class TestDockerValidator:
result = self.validator.validate_inputs(inputs)
assert isinstance(result, bool)
def test_validate_registry_valid(self):
"""Test registry enum validation with valid values."""
valid_registries = [
"dockerhub",
"github",
"both",
]
for registry in valid_registries:
self.validator.errors = []
result = self.validator.validate_registry(registry)
assert result is True, f"Should accept registry: {registry}"
def test_validate_registry_invalid(self):
"""Test registry enum validation with invalid values."""
invalid_registries = [
"", # Empty
" ", # Whitespace only
"docker", # Wrong value (should be dockerhub)
"hub", # Wrong value
"ghcr", # Wrong value
"gcr", # Wrong value
"both,github", # Comma-separated not allowed
"DOCKERHUB", # Uppercase
"DockerHub", # Mixed case
"docker hub", # Space
"github.com", # Full URL not allowed
]
for registry in invalid_registries:
self.validator.errors = []
result = self.validator.validate_registry(registry)
assert result is False, f"Should reject registry: {registry}"
def test_validate_sbom_format_valid(self):
"""Test SBOM format validation with valid values."""
valid_formats = [
"spdx-json",
"cyclonedx-json",
"", # Empty is optional
]
for sbom_format in valid_formats:
self.validator.errors = []
result = self.validator.validate_sbom_format(sbom_format)
assert result is True, f"Should accept SBOM format: {sbom_format}"
def test_validate_sbom_format_invalid(self):
"""Test SBOM format validation with invalid values."""
invalid_formats = [
"spdx", # Missing -json suffix
"cyclonedx", # Missing -json suffix
"json", # Just json
"spdx-xml", # Wrong format
"cyclonedx-xml", # Wrong format
"SPDX-JSON", # Uppercase
"spdx json", # Space
"invalid", # Invalid value
]
for sbom_format in invalid_formats:
self.validator.errors = []
result = self.validator.validate_sbom_format(sbom_format)
assert result is False, f"Should reject SBOM format: {sbom_format}"
def test_empty_values_handling(self):
"""Test that empty values are handled appropriately."""
# Some Docker fields might be required, others optional
@@ -281,3 +346,5 @@ class TestDockerValidator:
assert isinstance(self.validator.validate_docker_tag(""), bool)
assert isinstance(self.validator.validate_architectures(""), bool)
assert isinstance(self.validator.validate_prefix(""), bool)
# Registry should reject empty values
assert self.validator.validate_registry("") is False
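
The behaviour these registry tests expect can be summarised with a minimal sketch. The real `validate_registry` method is not shown in this diff, so the standalone function below is an assumption that reproduces only the accept/reject outcomes listed in the tests. It omits the error recording that the validator class performs.

```python
# Accepted values, taken from test_validate_registry_valid.
VALID_REGISTRIES = ("dockerhub", "github", "both")


def validate_registry(value: str) -> bool:
    """Exact, case-sensitive membership check; empty input is rejected."""
    return value in VALID_REGISTRIES


for candidate in ("dockerhub", "DockerHub", "", "both,github"):
    print(repr(candidate), validate_registry(candidate))
```

Unlike most validators in this change, the registry check does not treat an empty string as an optional value, which is what the final assertion in `test_empty_values_handling` pins down.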

@@ -151,7 +151,7 @@ class TestValidationRuleGenerator:
generator = ValidationRuleGenerator()
# Test special cases from the mapping
assert generator.detect_validation_type("build-args", {}) is None
assert generator.detect_validation_type("build-args", {}) == "key_value_list"
assert generator.detect_validation_type("version", {}) == "flexible_version"
assert (
generator.detect_validation_type("dotnet-version", {}) == "dotnet_version"

@@ -556,13 +556,33 @@ class ConventionBasedValidator(BaseValidator):
self._validator_modules["codeql"] = codeql.CodeQLValidator()
return self._validator_modules["codeql"], f"validate_{validator_type}"
# PHP-specific validators
if validator_type in ["php_extensions", "coverage_driver", "mode_enum"]:
# Return self for PHP-specific validation methods
# Convention-based validators
if validator_type in [
"php_extensions",
"coverage_driver",
"mode_enum",
"binary_enum",
"multi_value_enum",
"report_format",
"format_enum",
"linter_list",
"timeout_with_unit",
"severity_enum",
"scanner_list",
"exit_code_list",
"key_value_list",
"path_list",
"network_mode",
"language_enum",
"framework_mode",
"json_format",
"cache_config",
]:
# Return self for validation methods implemented in this class
return self, f"_validate_{validator_type}"
# Package manager and report format validators
if validator_type in ["package_manager_enum", "report_format"]:
# Package manager validators
if validator_type in ["package_manager_enum"]:
# These could be in a separate module, but for now we'll put them in file validator
if "file" not in self._validator_modules:
from . import file
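
The dispatch convention in this hunk, returning `self` plus a method name built from the validator type, can be sketched in isolation. Everything below is a simplified stand-in: the class `SketchValidator`, its `resolve` and `run` methods, and the two-entry type set are hypothetical, and the network mode values come from the comment in the rule generator mapping.

```python
class SketchValidator:
    """Toy validator that resolves handlers by the _validate_<type> naming convention."""

    # Types handled by methods on this class (the real list is much longer).
    LOCAL_TYPES = {"network_mode", "mode_enum"}

    def _validate_network_mode(self, value: str, input_name: str) -> bool:
        # Values from the mapping comment: host, none, default; empty is optional.
        return value in ("", "host", "none", "default")

    def _validate_mode_enum(self, value: str, input_name: str) -> bool:
        return value in ("", "check", "fix")

    def resolve(self, validator_type: str):
        """Return (target object, method name), or (None, None) when unknown."""
        if validator_type in self.LOCAL_TYPES:
            return self, f"_validate_{validator_type}"
        return None, None

    def run(self, validator_type: str, value: str, input_name: str) -> bool:
        target, method_name = self.resolve(validator_type)
        if target is None:
            raise ValueError(f"No validator for type: {validator_type}")
        # Look the method up by name on the resolved target and call it.
        return getattr(target, method_name)(value, input_name)


validator = SketchValidator()
print(validator.run("network_mode", "host", "network"))
print(validator.run("mode_enum", "CHECK", "mode"))
```

Adding a new convention-based validator then only requires a new `_validate_<type>` method and an entry in the type list, which matches how the list in this hunk grows.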
@@ -592,9 +612,104 @@ class ConventionBasedValidator(BaseValidator):
# Default range
return 0, 100
def _validate_comma_separated_list(
self,
value: str,
input_name: str,
item_pattern: str | None = None,
valid_items: list | None = None,
check_injection: bool = False,
item_name: str = "item",
) -> bool:
"""Validate comma-separated list of items (generic validator).
This is a generic validator that can be used for any comma-separated list
with either pattern-based or enum-based validation.
Args:
value: The comma-separated list value
input_name: The input name for error messages
item_pattern: Regex pattern each item must match (default: alphanumeric+hyphens+underscores)
valid_items: Optional list of valid items for enum-style validation
check_injection: Whether to check for shell injection patterns
item_name: Descriptive name for items in error messages (e.g., "linter", "extension")
Returns:
True if valid, False otherwise
Examples:
>>> # Pattern-based validation
>>> validator._validate_comma_separated_list(
... "gosec,govet", "enable-linters",
... item_pattern=r'^[a-zA-Z0-9_-]+$',
... item_name="linter"
... )
True
>>> # Enum-based validation
>>> validator._validate_comma_separated_list(
... "vuln,config", "scanners",
... valid_items=["vuln", "config", "secret", "license"],
... item_name="scanner"
... )
True
"""
import re
if not value or value.strip() == "":
return True # Optional
# Security check for injection patterns
if check_injection and re.search(r"[;&|`$()]", value):
self.add_error(
f"Potential injection detected in {input_name}: {value}. "
f"Avoid using shell metacharacters (;, &, |, `, $, parentheses)"
)
return False
# Split by comma and validate each item
items = [item.strip() for item in value.split(",")]
for item in items:
if not item: # Empty after strip
self.add_error(f"Invalid {input_name}: {value}. Contains empty {item_name}")
return False
# Enum-based validation (if valid_items provided)
if valid_items is not None:
if item not in valid_items:
self.add_error(
f"Invalid {item_name} '{item}' in {input_name}. "
f"Must be one of: {', '.join(valid_items)}"
)
return False
# Pattern-based validation (if no valid_items and pattern provided)
elif item_pattern is not None:
if not re.match(item_pattern, item):
self.add_error(
f"Invalid {item_name} '{item}' in {input_name}. "
f"Must match pattern: alphanumeric with hyphens/underscores"
)
return False
# Default pattern if neither valid_items nor item_pattern provided
elif not re.match(r"^[a-zA-Z0-9_-]+$", item):
self.add_error(
f"Invalid {item_name} '{item}' in {input_name}. "
f"Must be alphanumeric with hyphens/underscores"
)
return False
return True
def _validate_php_extensions(self, value: str, input_name: str) -> bool:
"""Validate PHP extensions format.
Wrapper for comma-separated list validator with PHP extension-specific rules.
Allows alphanumeric characters, underscores, and spaces.
Checks for shell injection patterns.
Args:
value: The extensions value (comma-separated list)
input_name: The input name for error messages
@@ -602,59 +717,736 @@ class ConventionBasedValidator(BaseValidator):
Returns:
True if valid, False otherwise
"""
import re
return self._validate_comma_separated_list(
value,
input_name,
item_pattern=r"^[a-zA-Z0-9_\s]+$",
check_injection=True,
item_name="extension",
)
if not value:
return True
def _validate_binary_enum(
self,
value: str,
input_name: str,
valid_values: list | None = None,
case_sensitive: bool = True,
) -> bool:
"""Validate binary enum (two-value choice) (generic validator).
# Check for injection patterns
if re.search(r"[;&|`$()@#]", value):
self.add_error(f"Potential injection detected in {input_name}: {value}")
This is a generic validator for two-value enums (e.g., check/fix, enabled/disabled).
Args:
value: The enum value
input_name: The input name for error messages
valid_values: List of exactly 2 valid values (default: ["check", "fix"])
case_sensitive: Whether validation is case-sensitive (default: True)
Returns:
True if valid, False otherwise
Examples:
>>> # Default check/fix mode
>>> validator._validate_binary_enum("check", "mode")
True
>>> # Custom binary enum
>>> validator._validate_binary_enum(
... "enabled", "status",
... valid_values=["enabled", "disabled"]
... )
True
"""
if valid_values is None:
valid_values = ["check", "fix"]
if len(valid_values) != 2:
raise ValueError(
f"Binary enum requires exactly 2 valid values, got {len(valid_values)}"
)
if not value or value.strip() == "":
return True # Optional
# Case-insensitive comparison if needed
if not case_sensitive:
value_lower = value.lower()
valid_values_lower = [v.lower() for v in valid_values]
if value_lower not in valid_values_lower:
self.add_error(
f"Invalid {input_name}: {value}. Must be one of: {', '.join(valid_values)}"
)
return False
else:
if value not in valid_values:
self.add_error(
f"Invalid {input_name}: {value}. Must be one of: {', '.join(valid_values)}"
)
return False
return True
def _validate_format_enum(
self,
value: str,
input_name: str,
valid_formats: list | None = None,
allow_custom: bool = False,
) -> bool:
"""Validate output format enum (generic validator).
Generic validator for tool output formats (SARIF, JSON, XML, etc.).
Supports common formats across linting/analysis tools.
Args:
value: The format value
input_name: The input name for error messages
valid_formats: List of valid formats (default: comprehensive list)
allow_custom: Whether to allow formats not in the predefined list (default: False)
Returns:
True if valid, False otherwise
Examples:
>>> # Default comprehensive format list
>>> validator._validate_format_enum("json", "format")
True
>>> # Tool-specific format list
>>> validator._validate_format_enum(
... "sarif", "output-format",
... valid_formats=["json", "sarif", "text"]
... )
True
"""
if valid_formats is None:
# Comprehensive list of common formats across all tools
valid_formats = [
"checkstyle",
"colored-line-number",
"compact",
"github-actions",
"html",
"json",
"junit",
"junit-xml",
"line-number",
"sarif",
"stylish",
"tab",
"teamcity",
"xml",
]
if not value or value.strip() == "":
return True # Optional
# Check if format is valid
if value not in valid_formats and not allow_custom:
self.add_error(
f"Invalid {input_name}: {value}. Must be one of: {', '.join(valid_formats)}"
)
return False
# Check format - should be alphanumeric, underscores, commas, spaces only
if not re.match(r"^[a-zA-Z0-9_,\s]+$", value):
self.add_error(f"Invalid format for {input_name}: {value}")
return False
return True
def _validate_multi_value_enum(
self,
value: str,
input_name: str,
valid_values: list | None = None,
case_sensitive: bool = True,
min_values: int = 2,
max_values: int = 10,
) -> bool:
"""Validate multi-value enum (2-10 value choice) (generic validator).
Generic validator for enums with 2-10 predefined values.
For exactly 2 values, use _validate_binary_enum instead.
Args:
value: The enum value
input_name: The input name for error messages
valid_values: List of valid values (2-10 items required)
case_sensitive: Whether validation is case-sensitive (default: True)
min_values: Minimum number of valid values (default: 2)
max_values: Maximum number of valid values (default: 10)
Returns:
True if valid, False otherwise
Examples:
>>> # Framework selection (3 values)
>>> validator._validate_multi_value_enum(
... "laravel", "framework",
... valid_values=["auto", "laravel", "generic"]
... )
True
>>> # Language selection (4 values)
>>> validator._validate_multi_value_enum(
... "python", "language",
... valid_values=["php", "python", "go", "dotnet"]
... )
True
"""
if valid_values is None:
raise ValueError("valid_values is required for multi_value_enum validator")
# Validate valid_values count
if len(valid_values) < min_values:
raise ValueError(
f"Multi-value enum requires at least {min_values} valid values, got {len(valid_values)}"
)
if len(valid_values) > max_values:
raise ValueError(
f"Multi-value enum supports at most {max_values} valid values, got {len(valid_values)}"
)
if not value or value.strip() == "":
return True # Optional
# Case-insensitive comparison if needed
if not case_sensitive:
value_lower = value.lower()
valid_values_lower = [v.lower() for v in valid_values]
if value_lower not in valid_values_lower:
self.add_error(
f"Invalid {input_name}: {value}. Must be one of: {', '.join(valid_values)}"
)
return False
else:
if value not in valid_values:
self.add_error(
f"Invalid {input_name}: {value}. Must be one of: {', '.join(valid_values)}"
)
return False
return True
def _validate_coverage_driver(self, value: str, input_name: str) -> bool:
"""Validate coverage driver enum.
Wrapper for multi_value_enum validator with PHP coverage driver options.
Args:
value: The coverage driver value
input_name: The input name for error messages
Returns:
True if valid, False otherwise
Examples:
Valid: "xdebug", "pcov", "xdebug3", "none", ""
Invalid: "xdebug2", "XDEBUG", "coverage"
"""
valid_drivers = ["none", "xdebug", "pcov", "xdebug3"]
if value and value not in valid_drivers:
self.add_error(
f"Invalid {input_name}: {value}. Must be one of: {', '.join(valid_drivers)}"
)
return False
return True
return self._validate_multi_value_enum(
value,
input_name,
valid_values=["none", "xdebug", "pcov", "xdebug3"],
case_sensitive=True,
)
def _validate_mode_enum(self, value: str, input_name: str) -> bool:
"""Validate mode enum for linting actions.
Wrapper for binary_enum validator with check/fix modes.
Args:
value: The mode value
input_name: The input name for error messages
Returns:
True if valid, False otherwise
Examples:
Valid: "check", "fix", ""
Invalid: "invalid", "CHECK", "Fix"
"""
return self._validate_binary_enum(
value,
input_name,
valid_values=["check", "fix"],
case_sensitive=True,
)
def _validate_report_format(self, value: str, input_name: str) -> bool:
"""Validate report format for linting/analysis actions.
Wrapper for format_enum validator with comprehensive format list.
Supports multiple report formats used across different tools.
Args:
value: The report format value
input_name: The input name for error messages
Returns:
True if valid, False otherwise
Examples:
Valid: "json", "sarif", "checkstyle", "github-actions", ""
Invalid: "invalid", "txt", "pdf"
"""
return self._validate_format_enum(value, input_name)
def _validate_linter_list(self, value: str, input_name: str) -> bool:
"""Validate comma-separated list of linter names.
Wrapper for comma-separated list validator with linter-specific rules.
Allows alphanumeric characters, hyphens, and underscores.
Args:
value: The linter list value
input_name: The input name for error messages
Returns:
True if valid, False otherwise
Examples:
Valid: "gosec,govet,staticcheck", "errcheck"
Invalid: "gosec,,govet", "invalid linter", "linter@123"
"""
return self._validate_comma_separated_list(
value,
input_name,
item_pattern=r"^[a-zA-Z0-9_-]+$",
item_name="linter",
)
def _validate_timeout_with_unit(self, value: str, input_name: str) -> bool:
"""Validate timeout duration with unit (Go duration format).
Args:
value: The timeout value
input_name: The input name for error messages
Returns:
True if valid, False otherwise
Examples:
Valid: "5m", "30s", "1h", "500ms", ""
Invalid: "5", "1.5h", "5 m", "1h30m"
"""
import re
if not value or value.strip() == "":
return True # Optional
# Go duration format: number + unit (ns, us/µs, ms, s, m, h)
pattern = r"^[0-9]+(ns|us|µs|ms|s|m|h)$"
if not re.match(pattern, value):
self.add_error(
f"Invalid {input_name}: {value}. Expected format: number with unit "
"(e.g., 5m, 30s, 1h, 500ms)"
)
return False
return True
def _validate_severity_enum(self, value: str, input_name: str) -> bool:
"""Validate severity levels enum (generalized).
Generic validator for security tool severity levels.
Supports common severity formats used by various security tools.
Default levels: UNKNOWN, LOW, MEDIUM, HIGH, CRITICAL (Trivy/CVSSv3 style)
Matching is case-sensitive.
Args:
value: The severity value (comma-separated for multiple levels)
input_name: The input name for error messages
Returns:
True if valid, False otherwise
Examples:
Valid: "HIGH", "CRITICAL,HIGH", "LOW, MEDIUM", ""
Invalid: "high", "SEVERE", "HIGH,,LOW"
"""
if not value or value.strip() == "":
return True # Optional
# Standard severity levels (Trivy/CVSSv3/OWASP compatible)
# Can be extended for specific tools by creating tool-specific validators
valid_severities = ["UNKNOWN", "LOW", "MEDIUM", "HIGH", "CRITICAL"]
# Split by comma and validate each severity
severities = [s.strip() for s in value.split(",")]
for severity in severities:
if not severity: # Empty after strip
self.add_error(f"Invalid {input_name}: {value}. Contains empty severity level")
return False
# Case-sensitive validation
if severity not in valid_severities:
self.add_error(
f"Invalid {input_name}: {value}. Severity '{severity}' is not valid. "
f"Must be one of: {', '.join(valid_severities)}"
)
return False
return True
def _validate_scanner_list(self, value: str, input_name: str) -> bool:
"""Validate comma-separated list of scanner types (for Trivy).
Wrapper for comma-separated list validator with Trivy scanner enum validation.
Supports: vuln, config, secret, license
Args:
value: The scanner list value (comma-separated)
input_name: The input name for error messages
Returns:
True if valid, False otherwise
Examples:
Valid: "vuln,config,secret", "vuln", "config,license"
Invalid: "invalid", "vuln,invalid,config", "vuln,,config"
"""
return self._validate_comma_separated_list(
value,
input_name,
valid_items=["vuln", "config", "secret", "license"],
item_name="scanner",
)
def _validate_exit_code_list(self, value: str, input_name: str) -> bool:
"""Validate comma-separated list of exit codes.
Validates Unix/Linux exit codes (0-255) in comma-separated format.
Used for retry logic, success codes, and error handling.
Args:
value: The exit code list value (comma-separated integers)
input_name: The input name for error messages
Returns:
True if valid, False otherwise
Examples:
Valid: "0", "0,1,2", "5,10,15", "0,130", ""
Invalid: "256", "0,256", "-1", "0,abc", "0,,1"
"""
import re
if not value or value.strip() == "":
return True # Optional
# Split by comma and validate each exit code
codes = [code.strip() for code in value.split(",")]
for code in codes:
if not code: # Empty after strip
self.add_error(f"Invalid {input_name}: {value}. Contains empty exit code")
return False
# Check if code is numeric
if not re.match(r"^[0-9]+$", code):
self.add_error(
f"Invalid exit code '{code}' in {input_name}. "
f"Exit codes must be integers (0-255)"
)
return False
# Validate range (0-255 for Unix/Linux exit codes)
code_int = int(code)
if code_int < 0 or code_int > 255:
self.add_error(
f"Invalid exit code '{code}' in {input_name}. Exit codes must be in range 0-255"
)
return False
return True
def _validate_key_value_list(
self,
value: str,
input_name: str,
key_pattern: str | None = None,
check_injection: bool = True,
) -> bool:
"""Validate comma-separated list of key-value pairs (generic validator).
Validates KEY=VALUE,KEY2=VALUE2 format commonly used for Docker build-args,
environment variables, and other configuration parameters.
Args:
value: The key-value list value (comma-separated KEY=VALUE pairs)
input_name: The input name for error messages
key_pattern: Regex pattern for key validation (default: alphanumeric+underscores+hyphens)
check_injection: Whether to check for shell injection patterns in values (default: True)
Returns:
True if valid, False otherwise
Examples:
Valid: "KEY=value", "KEY1=value1,KEY2=value2", "BUILD_ARG=hello", "KEY=", ""
Invalid: "KEY", "=value", "KEY=value,", "KEY=val;whoami"
"""
import re
if not value or value.strip() == "":
return True # Optional
if key_pattern is None:
# Default: alphanumeric, underscores, hyphens (common for env vars and build args)
key_pattern = r"^[a-zA-Z0-9_-]+$"
# Security check for injection patterns in the entire value
if check_injection and re.search(r"[;&|`$()]", value):
self.add_error(
f"Potential injection detected in {input_name}: {value}. "
f"Avoid using shell metacharacters (;, &, |, `, $, parentheses)"
)
return False
# Split by comma and validate each key-value pair
pairs = [pair.strip() for pair in value.split(",")]
for pair in pairs:
if not pair: # Empty after strip
self.add_error(f"Invalid {input_name}: {value}. Contains empty key-value pair")
return False
# Check for KEY=VALUE format
if "=" not in pair:
self.add_error(
f"Invalid key-value pair '{pair}' in {input_name}. Expected format: KEY=VALUE"
)
return False
# Split by first = only (value may contain =)
parts = pair.split("=", 1)
key = parts[0].strip()
# Validate key is not empty
if not key:
self.add_error(
f"Invalid key-value pair '{pair}' in {input_name}. Key cannot be empty"
)
return False
# Validate key pattern
if not re.match(key_pattern, key):
self.add_error(
f"Invalid key '{key}' in {input_name}. "
f"Keys must be alphanumeric with underscores/hyphens"
)
return False
# Note: Value can be empty (KEY=) - this is valid for some use cases
# Value validation is optional and handled by the check_injection flag above
return True
def _validate_path_list(
self,
value: str,
input_name: str,
allow_glob: bool = True,
check_injection: bool = True,
) -> bool:
"""Validate comma-separated list of file paths or glob patterns (generic validator).
Validates file paths and glob patterns commonly used for ignore-paths,
restore-keys, file-pattern, and other path-based inputs.
Args:
value: The path list to validate
input_name: Name of the input being validated
allow_glob: Whether to allow glob patterns (*, **, ?, [], {})
check_injection: Whether to check for shell injection patterns
Examples:
Valid: "*.js", "src/**/*.ts", "dist/,build/", ".github/workflows/*", ""
Invalid: "../etc/passwd", "file;rm -rf /", "path|whoami"
Returns:
bool: True if valid, False otherwise
"""
import re
if not value or value.strip() == "":
return True # Optional
# Security check for injection patterns
if check_injection and re.search(r"[;&|`$()]", value):
self.add_error(
f"Potential injection detected in {input_name}: {value}. "
f"Avoid using shell metacharacters (;, &, |, `, $, parentheses)"
)
return False
# Split by comma and validate each path
paths = [path.strip() for path in value.split(",")]
for path in paths:
if not path: # Empty after strip
self.add_error(f"Invalid {input_name}: {value}. Contains empty path")
return False
# Check for path traversal attempts
if "../" in path or "/.." in path or path.startswith(".."):
self.add_error(
f"Path traversal detected in {input_name}: {path}. Avoid using '..' in paths"
)
return False
# Validate glob patterns if allowed
if allow_glob:
# Glob patterns are valid: *, **, ?, [], {}
# Check for valid glob characters
glob_pattern = r"^[a-zA-Z0-9_\-./\*\?\[\]\{\},@~+]+$"
if not re.match(glob_pattern, path):
self.add_error(
f"Invalid path '{path}' in {input_name}. "
f"Paths may contain alphanumeric characters, hyphens, underscores, "
f"slashes, and glob patterns (*, **, ?, [], {{}})"
)
return False
else:
# No glob patterns allowed: alphanumerics plus . _ - / @ ~ +
path_pattern = r"^[a-zA-Z0-9_\-./,@~+]+$"
if not re.match(path_pattern, path):
self.add_error(
f"Invalid path '{path}' in {input_name}. "
f"Paths may only contain alphanumeric characters, dots, hyphens, "
f"underscores, slashes, and the characters @, ~, +"
)
return False
return True
def _validate_network_mode(self, value: str, input_name: str) -> bool:
"""Validate Docker network mode enum.
Wrapper for multi_value_enum validator with Docker network mode options.
Examples:
Valid: "host", "none", "default", ""
Invalid: "bridge", "NONE", "custom"
Returns:
bool: True if valid, False otherwise
"""
return self._validate_multi_value_enum(
value,
input_name,
valid_values=["host", "none", "default"],
case_sensitive=True,
)
def _validate_language_enum(self, value: str, input_name: str) -> bool:
"""Validate language enum for version detection.
Wrapper for multi_value_enum validator with supported language options.
Examples:
Valid: "php", "python", "go", "dotnet", ""
Invalid: "node", "ruby", "PHP"
Returns:
bool: True if valid, False otherwise
"""
return self._validate_multi_value_enum(
value,
input_name,
valid_values=["php", "python", "go", "dotnet"],
case_sensitive=True,
)
def _validate_framework_mode(self, value: str, input_name: str) -> bool:
"""Validate PHP framework detection mode.
Wrapper for multi_value_enum validator with framework mode options.
Examples:
Valid: "auto", "laravel", "generic", ""
Invalid: "symfony", "Auto", "LARAVEL"
Returns:
bool: True if valid, False otherwise
"""
return self._validate_multi_value_enum(
value,
input_name,
valid_values=["auto", "laravel", "generic"],
case_sensitive=True,
)
def _validate_json_format(self, value: str, input_name: str) -> bool:
"""Validate JSON format string.
Validates that input is valid JSON. Used for structured configuration
data like platform-specific build arguments.
Examples:
Valid: '{"key":"value"}', '[]', '{"platforms":["linux/amd64"]}', ""
Invalid: '{invalid}', 'not json', '{key:value}'
Returns:
bool: True if valid, False otherwise
"""
import json
if not value or value.strip() == "":
return True # Optional
try:
json.loads(value)
return True
except json.JSONDecodeError as e:
self.add_error(f"Invalid JSON format in {input_name}: {value}. Error: {str(e)}")
return False
except Exception as e:
self.add_error(f"Failed to validate JSON in {input_name}: {str(e)}")
return False
def _validate_cache_config(self, value: str, input_name: str) -> bool:
"""Validate Docker BuildKit cache configuration.
Validates Docker cache export/import configuration format.
Common formats: type=registry,ref=..., type=local,dest=..., type=gha
Examples:
Valid: "type=registry,ref=user/repo:cache", "type=local,dest=/tmp/cache",
"type=gha", "type=inline", ""
Invalid: "invalid", "type=", "registry", "type=unknown"
Returns:
bool: True if valid, False otherwise
"""
import re
if not value or value.strip() == "":
return True # Optional
# Check basic format: type=value[,key=value,...]
# The type must be followed by a comma or the end of the string, so a value
# such as "type=ghaX" is rejected instead of being read as "gha".
type_match = re.match(r"^type=([a-z0-9-]+)(?:,|$)", value)
if not type_match:
self.add_error(
f"Invalid cache config in {input_name}: {value}. "
f"Must start with 'type=<cache-type>'"
)
return False
# Valid cache types
valid_types = ["registry", "local", "gha", "inline", "s3", "azblob", "oci"]
cache_type = type_match.group(1)
if cache_type not in valid_types:
self.add_error(
f"Invalid cache type '{cache_type}' in {input_name}. "
f"Valid types: {', '.join(valid_types)}"
)
return False
# Validate key=value pairs format
parts = value.split(",")
for part in parts:
if "=" not in part:
self.add_error(
f"Invalid cache config format in {input_name}: {value}. "
f"Each part must be in 'key=value' format"
)
return False
return True