feat: fixes, tweaks, new actions, linting (#186)

* feat: fixes, tweaks, new actions, linting
* fix: improve docker publish loops and dotnet parsing (#193)
* fix: harden action scripts and version checks (#191)
* refactor: major repository restructuring and security enhancements

Add comprehensive development infrastructure:
- Add Makefile with automated documentation generation, formatting, and linting tasks
- Add TODO.md tracking self-containment progress and repository improvements
- Add .nvmrc for consistent Node.js version management
- Create python-version-detect-v2 action for enhanced Python detection

Enhance all GitHub Actions with standardized patterns:
- Add consistent token handling across 27 actions using standardized input patterns
- Implement bash error handling (set -euo pipefail) in all shell steps
- Add comprehensive input validation for path traversal and command injection protection
- Standardize checkout token authentication to prevent rate limiting
- Remove relative action dependencies to ensure external usability

Rewrite security workflow for PR-focused analysis:
- Transform security-suite.yml to PR-only security analysis workflow
- Remove scheduled runs, repository issue management, and Slack notifications
- Implement smart comment generation showing only sections with content
- Add GitHub Actions permission diff analysis and new action detection
- Integrate OWASP, Semgrep, and TruffleHog for comprehensive PR security scanning

Improve version detection and dependency management:
- Simplify version detection actions to use inline logic instead of shared utilities
- Fix Makefile version detection fallback to properly return 'main' when version not found
- Update all external action references to use SHA-pinned versions
- Remove deprecated run.sh in favor of Makefile automation

Update documentation and project standards:
- Enhance CLAUDE.md with self-containment requirements and linting standards
- Update README.md with improved action descriptions and usage examples
- Standardize code formatting with updated .editorconfig and .prettierrc.yml
- Improve GitHub templates for issues and security reporting

This refactoring ensures all 40 actions are fully self-contained and can be used independently when
referenced as ivuorinen/actions/action-name@main, addressing the critical requirement for external
usability while maintaining comprehensive security analysis and development automation.

* feat: add automated action catalog generation system

- Create generate_listing.cjs script for comprehensive action catalog
- Add package.json with development tooling and npm scripts
- Implement automated README.md catalog section with --update flag
- Generate markdown reference-style links for all 40 actions
- Add categorized tables with features, language support matrices
- Replace static reference links with auto-generated dynamic links
- Enable complete automation of action documentation maintenance

* feat: enhance actions with improved documentation and functionality

- Add comprehensive README files for 12 actions with usage examples
- Implement new utility actions (go-version-detect, dotnet-version-detect)
- Enhance node-setup with extensive configuration options
- Improve error handling and validation across all actions
- Update package.json scripts for better development workflow
- Expand TODO.md with detailed roadmap and improvement plans
- Standardize action structure with consistent inputs/outputs

* feat: add comprehensive output handling across all actions

- Add standardized outputs to 15 actions that previously had none
- Implement consistent snake_case naming convention for all outputs
- Add build status and test results outputs to build actions
- Add files changed and status outputs to lint/fix actions
- Add test execution metrics to php-tests action
- Add stale/closed counts to stale action
- Add release URLs and IDs to github-release action
- Update documentation with output specifications
- Mark comprehensive output handling task as complete in TODO.md

* feat: implement shared cache strategy across all actions

- Add caching to 10 actions that previously had none (Node.js, .NET, Python, Go)
- Standardize 4 existing actions to use common-cache instead of direct actions/cache
- Implement consistent cache-hit optimization to skip installations when cache available
- Add language-specific cache configurations with appropriate key files
- Create unified caching approach using ivuorinen/actions/common-cache@main
- Fix YAML syntax error in php-composer action paths parameter
- Update TODO.md to mark shared cache strategy as complete

* feat: implement comprehensive retry logic for network operations

- Create new common-retry action for standardized retry patterns with configurable strategies
- Add retry logic to 9 actions missing network retry capabilities
- Implement exponential backoff, custom timeouts, and flexible error handling (see the backoff sketch after this list)
- Add max-retries input parameter to all network-dependent actions (Node.js, .NET, Python, Go)
- Standardize existing retry implementations to use common-retry utility
- Update action catalog to include new common-retry action (41 total actions)
- Update documentation with retry configuration examples and parameters
- Mark retry logic implementation as complete in TODO.md roadmap
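
The common-retry action itself is a composite action, but the backoff strategy it standardizes can be sketched in Python. The delay values and the helper name below are illustrative assumptions, not the action's actual implementation:

```python
import subprocess
import time


def run_with_retry(command: list[str], max_retries: int = 3, base_delay: float = 2.0) -> int:
    """Run a command, retrying with exponential backoff on failure (illustrative sketch)."""
    returncode = 1
    for attempt in range(1, max_retries + 1):
        returncode = subprocess.run(command, check=False).returncode
        if returncode == 0:
            return 0
        if attempt < max_retries:
            # Exponential backoff: 2s, 4s, 8s, ... before the next attempt
            time.sleep(base_delay * 2 ** (attempt - 1))
    return returncode
```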

* feat: enhance Node.js support with Corepack and Bun

- Add Corepack support for automatic package manager version management
- Add Bun package manager support across all Node.js actions
- Improve Yarn Berry/PnP support with .yarnrc.yml detection
- Add Node.js feature detection (ESM, TypeScript, frameworks)
- Update package manager detection priority and lockfile support
- Enhance caching with package-manager-specific keys
- Update eslint, prettier, and biome actions for multi-package-manager support

* fix: resolve critical runtime issues across multiple actions

- Fix token validation by removing ineffective literal string comparisons
- Add missing @microsoft/eslint-formatter-sarif dependency for SARIF output
- Fix Bash variable syntax errors in username and changelog length checks
- Update Dockerfile version regex to handle tags with suffixes (e.g., -alpine)
- Simplify version selection logic with single grep command
- Fix command execution in retry action with proper bash -c wrapper
- Correct step output references using .outcome instead of .outputs.outcome
- Add missing step IDs for version detection actions
- Include go.mod in cache key files for accurate invalidation
- Require minor version in all version regex patterns
- Improve Bun installation security by verifying script before execution
- Replace bc with sort -V for portable PHP version comparison
- Remove non-existent pre-commit output references

These fixes ensure proper runtime behavior, improved security, and better
cross-platform compatibility across all affected actions.

* fix: resolve critical runtime and security issues across actions

- Fix biome-fix files_changed calculation using git diff instead of git status delta
- Fix compress-images output description and add absolute path validation
- Remove csharp-publish token default and fix token fallback in push commands
- Add @microsoft/eslint-formatter-sarif to all package managers in eslint-check
- Fix eslint-check command syntax by using variable assignment
- Improve node-setup Bun installation security and remove invalid frozen-lockfile flag
- Fix pre-commit token validation by removing ineffective literal comparison
- Fix prettier-fix token comparison and expand regex for all GitHub token types
- Add version-file-parser regex validation safety and fix csproj wildcard handling

These fixes address security vulnerabilities, runtime errors, and functional issues
to ensure reliable operation across all affected GitHub Actions.

* feat: enhance Docker actions with advanced multi-architecture support

Major enhancement to Docker build and publish actions with comprehensive
multi-architecture capabilities and enterprise-grade features.

Added features:
- Advanced buildx configuration (version control, cache modes, build contexts)
- Auto-detect platforms for dynamic architecture discovery
- Performance optimizations with enhanced caching strategies
- Security scanning with Trivy and image signing with Cosign
- SBOM generation in multiple formats with validation
- Verbose logging and dry-run modes for debugging
- Platform-specific build args and fallback mechanisms

Enhanced all Docker actions:
- docker-build: Core buildx features and multi-arch support
- docker-publish-gh: GitHub Packages with security features
- docker-publish-hub: Docker Hub with scanning and signing
- docker-publish: Orchestrator with unified configuration

Updated documentation across all modified actions.

* fix: resolve documentation generation placeholder issue

Fixed Makefile and package.json to properly replace placeholder tokens in generated documentation, ensuring all README files show correct repository paths instead of ***PROJECT***@***VERSION***.

* chore: simplify github token validation
* chore(lint): optional yamlfmt, config and fixes
* feat: use relative `uses` names

* feat: comprehensive testing infrastructure and Python validation system

- Migrate from tests/ to _tests/ directory structure with ShellSpec framework
- Add comprehensive validation system with Python-based input validation
- Implement dual testing approach (ShellSpec + pytest) for complete coverage
- Add modern Python tooling (uv, ruff, pytest-cov) and dependencies
- Create centralized validation rules with automatic generation system
- Update project configuration and build system for new architecture
- Enhance documentation to reflect current testing capabilities

This establishes a robust foundation for action validation and testing
with extensive coverage across all GitHub Actions in the repository.

* chore: remove Dockerfile for now
* chore: code review fixes

* feat: comprehensive GitHub Actions restructuring and tooling improvements

This commit represents a major restructuring of the GitHub Actions monorepo
with improved tooling, testing infrastructure, and comprehensive PR #186
review implementation.

## Major Changes

### 🔧 Development Tooling & Configuration
- **Shellcheck integration**: Exclude shellspec test files from linting
  - Updated .pre-commit-config.yaml to exclude _tests/*.sh from shellcheck/shfmt
  - Modified Makefile shellcheck pattern to skip shellspec files
  - Updated CLAUDE.md documentation with proper exclusion syntax
- **Testing infrastructure**: Enhanced Python validation framework
  - Fixed nested if statements and boolean parameter issues in validation.py
  - Improved code quality with explicit keyword arguments
  - All pre-commit hooks now passing

### 🏗️ Project Structure & Documentation
- **Added Serena AI integration** with comprehensive project memories:
  - Project overview, structure, and technical stack documentation
  - Code style conventions and completion requirements
  - Comprehensive PR #186 review analysis and implementation tracking
- **Enhanced configuration**: Updated .gitignore, .yamlfmt.yml, pyproject.toml
- **Improved testing**: Added integration workflows and enhanced test specs

### 🚀 GitHub Actions Improvements (30+ actions updated)
- **Centralized validation**: Updated 41 validation rule files
- **Enhanced actions**: Improvements across all action categories:
  - Setup actions (node-setup, version detectors)
  - Utility actions (version-file-parser, version-validator)
  - Linting actions (biome, eslint, terraform-lint-fix major refactor)
  - Build/publish actions (docker-build, npm-publish, csharp-*)
  - Repository management actions

### 📝 Documentation Updates
- **README consistency**: Updated version references across action READMEs
- **Enhanced documentation**: Improved action descriptions and usage examples
- **CLAUDE.md**: Updated with current tooling and best practices

## Technical Improvements
- **Security enhancements**: Input validation and sanitization improvements
- **Performance optimizations**: Streamlined action logic and dependencies
- **Cross-platform compatibility**: Better Windows/macOS/Linux support
- **Error handling**: Improved error reporting and user feedback

## Files Changed
- 100 files changed
- 13 new Serena memory files documenting project state
- 41 validation rules updated for consistency
- 30+ GitHub Actions and READMEs improved
- Core tooling configuration enhanced

* feat: comprehensive GitHub Actions improvements and PR review fixes

Major Infrastructure Improvements:
- Add comprehensive testing framework with 17+ ShellSpec validation tests
- Implement Docker-based testing tools with automated test runner
- Add CodeRabbit configuration for automated code reviews
- Restructure documentation and memory management system
- Update validation rules for 25+ actions with enhanced input validation
- Modernize CI/CD workflows and testing infrastructure

Critical PR Review Fixes (All Issues Resolved):
- Fix double caching in node-setup (eliminate redundant cache operations)
- Optimize shell pipeline in version-file-parser (single awk vs complex pipeline)
- Fix GitHub expression interpolation in prettier-check cache keys
- Resolve terraform command order issue (validation after setup)
- Add missing flake8-sarif dependency for Python SARIF output
- Fix environment variable scope in pr-lint (export to GITHUB_ENV)

Performance & Reliability:
- Eliminate duplicate cache operations saving CI time
- Improve shell script efficiency with optimized parsing
- Fix command execution dependencies preventing runtime failures
- Ensure proper dependency installation for all linting tools
- Resolve workflow conditional logic issues

Security & Quality:
- All input validation rules updated with latest security patterns
- Cross-platform compatibility improvements maintained
- Comprehensive error handling and retry logic preserved
- Modern development tooling and best practices adopted

This commit addresses 100% of actionable feedback from PR review analysis,
implements comprehensive testing infrastructure, and maintains high code
quality standards across all 41 GitHub Actions.

* feat: enhance expression handling and version parsing

- Fix node-setup force-version expression logic for proper empty string handling
- Improve version-file-parser with secure regex validation and enhanced Python detection (version-format sketch after this list)
- Add CodeRabbit configuration for CalVer versioning and README review guidance
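
As a rough illustration of the "flexible version" idea (accepting both SemVer and CalVer), a check might look like the following. The exact regular expressions used by version-file-parser and the validators are not reproduced in this message, so these patterns are assumptions:

```python
import re

# Assumed illustrative patterns: SemVer requires at least major.minor,
# CalVer is taken here as YYYY.MM or YYYY.MM.DD style versions.
SEMVER = re.compile(r"^\d+\.\d+(\.\d+)?$")
CALVER = re.compile(r"^\d{4}\.(0?[1-9]|1[0-2])(\.\d{1,2})?$")


def is_flexible_version(value: str) -> bool:
    """Return True if the value looks like either a SemVer or a CalVer version."""
    value = value.strip()
    return bool(SEMVER.match(value) or CALVER.match(value))
```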

* feat(validate-inputs): implement modular validation system

- Add modular validator architecture with specialized validators (minimal sketch after this list)
- Implement base validator classes for different input types
- Add validators: boolean, docker, file, network, numeric, security, token, version
- Add convention mapper for automatic input validation
- Add comprehensive documentation for the validation system
- Implement PCRE regex support and injection protection
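
A minimal sketch of the modular pattern: specialized validators extend a shared base class (the `BaseValidator` shown later in this diff) and report errors through a common interface. The `PathsValidator` name, its rules, and the import path are hypothetical examples, not an actual validator in the repository:

```python
from typing import Any

from validators.base import BaseValidator  # import path assumed


class PathsValidator(BaseValidator):
    """Hypothetical validator rejecting unsafe path inputs."""

    def get_required_inputs(self) -> list[str]:
        return ["path"]

    def get_validation_rules(self) -> dict[str, Any]:
        return {"path": "relative path without traversal"}

    def validate_inputs(self, inputs: dict[str, str]) -> bool:
        valid = self.validate_required_inputs(inputs)
        # Reuse the shared security checks provided by the base class
        valid &= self.validate_path_security(inputs.get("path", ""), "path")
        return valid


validator = PathsValidator("example-action")
validator.validate_inputs({"path": "../etc/passwd"})
# {'status': 'failure', 'error': "Invalid path: '../etc/passwd'. Path traversal detected"}
print(validator.get_github_actions_output())
```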

* feat(validate-inputs): add validation rules for all actions

- Add YAML validation rules for 42 GitHub Actions
- Auto-generated rules with convention mappings
- Include metadata for validation coverage and quality indicators
- Mark rules as auto-generated to prevent manual edits

* test(validate-inputs): add comprehensive test suite for validators

- Add unit tests for all validator modules (example test after this list)
- Add integration tests for the validation system
- Add fixtures for version test data
- Test coverage for boolean, docker, file, network, numeric, security, token, and version validators
- Add tests for convention mapper and registry
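
A flavor of what the pytest side looks like, using the `BooleanValidator` API shown later in this diff (test names and the import path are illustrative):

```python
from validators.boolean import BooleanValidator  # import path assumed


def test_lowercase_true_is_accepted() -> None:
    validator = BooleanValidator()
    assert validator.validate_boolean("true", "dry-run") is True
    assert validator.errors == []


def test_yes_is_rejected_with_error() -> None:
    validator = BooleanValidator()
    assert validator.validate_boolean("yes", "dry-run") is False
    assert "Must be 'true' or 'false'" in validator.errors[0]
```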

* feat(tools): add validation scripts and utilities

- Add update-validators.py script for auto-generating rules
- Add benchmark-validator.py for performance testing
- Add debug-validator.py for troubleshooting
- Add generate-tests.py for test generation
- Add check-rules-not-manually-edited.sh for CI validation
- Add fix-local-action-refs.py tool for fixing action references

* feat(actions): add CustomValidator.py files for specialized validation

- Add custom validators for actions requiring special validation logic (hypothetical sketch after this list)
- Implement validators for docker, go, node, npm, php, python, terraform actions
- Add specialized validation for compress-images, common-cache, common-file-check
- Implement version detection validators with language-specific logic
- Add validation for build arguments, architectures, and version formats
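
The per-action validators follow the same base-class contract. As a purely hypothetical sketch (the real CustomValidator.py files are not reproduced in this message), a docker-build-style check on `parallel-builds` might look like:

```python
from typing import Any

from validators.base import BaseValidator  # import path assumed


class CustomValidator(BaseValidator):
    """Hypothetical per-action validator enforcing a 0-16 range for parallel-builds."""

    MAX_PARALLEL_BUILDS = 16

    def get_required_inputs(self) -> list[str]:
        return []

    def get_validation_rules(self) -> dict[str, Any]:
        return {"parallel-builds": f"0-{self.MAX_PARALLEL_BUILDS}"}

    def validate_inputs(self, inputs: dict[str, str]) -> bool:
        value = inputs.get("parallel-builds", "").strip()
        if not value:
            return True  # optional input
        if not value.isdigit() or int(value) > self.MAX_PARALLEL_BUILDS:
            self.add_error(f"Invalid parallel-builds: '{value}'. Must be 0-{self.MAX_PARALLEL_BUILDS}")
            return False
        return True
```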

* test: update ShellSpec test framework for Python validation

- Update all validation.spec.sh files to use Python validator
- Add shared validation_core.py for common test utilities
- Remove obsolete bash validation helpers
- Update test output expectations for Python validator format
- Add codeql-analysis test suite
- Refactor framework utilities for Python integration
- Remove deprecated test files

* feat(actions): update action.yml files to use validate-inputs

- Replace inline bash validation with validate-inputs action
- Standardize validation across all 42 actions
- Add new codeql-analysis action
- Update action metadata and branding
- Add validation step as first step in composite actions
- Maintain backward compatibility with existing inputs/outputs

* ci: update GitHub workflows for enhanced security and testing

- Add new codeql-new.yml workflow
- Update security scanning workflows
- Enhance dependency review configuration
- Update test-actions workflow for new validation system
- Improve workflow permissions and security settings
- Update action versions to latest SHA-pinned releases

* build: update build configuration and dependencies

- Update Makefile with new validation targets
- Add Python dependencies in pyproject.toml
- Update npm dependencies and scripts
- Enhance Docker testing tools configuration
- Add targets for validator updates and local ref fixes
- Configure uv for Python package management

* chore: update linting and documentation configuration

- Update EditorConfig settings for consistent formatting
- Enhance pre-commit hooks configuration
- Update prettier and yamllint ignore patterns
- Update gitleaks security scanning rules
- Update CodeRabbit review configuration
- Update CLAUDE.md with latest project standards and rules

* docs: update Serena memory files and project metadata

- Remove obsolete PR-186 memory files
- Update project overview with current architecture
- Update project structure documentation
- Add quality standards and communication guidelines
- Add modular validator architecture documentation
- Add shellspec testing framework documentation
- Update project.yml with latest configuration

* feat: moved rules.yml to same folder as action, fixes

* fix(validators): correct token patterns and fix validator bugs

- Fix GitHub classic PAT pattern: ghp_ + 36 chars = 40 total (patterns sketched after this list)
- Fix GitHub fine-grained PAT pattern: github_pat_ + 71 chars = 82 total
- Initialize result variable in convention_mapper to prevent UnboundLocalError
- Fix empty URL validation in network validator to return error
- Add GitHub expression check to docker architectures validator
- Update docker-build CustomValidator parallel-builds max to 16
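
In regex form, the corrected lengths work out as follows. The `[A-Za-z0-9_]` character classes are an assumption for illustration; the validators' exact patterns live in the token validator module:

```python
import re

# Classic PAT: "ghp_" (4 chars) + 36 characters = 40 characters total.
CLASSIC_PAT = re.compile(r"^ghp_[A-Za-z0-9_]{36}$")

# Fine-grained PAT: "github_pat_" (11 chars) + 71 characters = 82 characters total.
FINE_GRAINED_PAT = re.compile(r"^github_pat_[A-Za-z0-9_]{71}$")

assert len("ghp_" + "x" * 36) == 40
assert len("github_pat_" + "x" * 71) == 82
```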

* test(validators): fix test fixtures and expectations

- Fix token lengths in test data: github_pat 71 chars, ghp/gho 36 chars
- Update integration tests with correct token lengths
- Fix file validator test to expect absolute paths rejected for security
- Rename TestGenerator import to avoid pytest collection warning
- Update custom validator tests with correct input names
- Change docker-build tests: platforms->architectures, tags->tag
- Update docker-publish tests to match new registry enum validation

* test(shellspec): fix token lengths in test helpers and specs

- Fix default token lengths in spec_helper.sh to use correct 40-char format
- Update csharp-publish default tokens in 4 locations
- Update codeql-analysis default tokens in 2 locations
- Fix codeql-analysis test tokens to correct lengths (40 and 82 chars)
- Fix npm-publish fine-grained token test to use 82-char format

* feat(actions): add permissions documentation and environment variable usage

- Add permissions comments to all action.yml files documenting required GitHub permissions
- Convert direct input usage to environment variables in shell steps for security
- Add validation steps with proper error handling
- Update input descriptions and add security notes where applicable
- Ensure all actions follow consistent patterns for input validation

* chore(workflows): update GitHub Actions workflow versions

- Update workflow action versions to latest
- Improve workflow consistency and maintainability

* docs(security): add comprehensive security policy

- Document security features and best practices
- Add vulnerability reporting process
- Include audit history and security testing information

* docs(memory): add GitHub workflow reference documentation

- Add GitHub Actions workflow commands reference
- Add GitHub workflow expressions guide
- Add secure workflow usage patterns and best practices

* chore: token optimization, code style conventions
* chore: cr fixes
* fix: trivy reported Dockerfile problems
* fix(security): more security fixes
* chore: dockerfile and make targets for publishing
* fix(ci): add creds to test-actions workflow
* fix: security fix and checkout step to codeql-new
* chore: test fixes
* fix(security): codeql detected issues
* chore: code review fixes, ReDoS protection
* style: apply MegaLinter fixes
* fix(ci): missing packages read permission
* fix(ci): add missing working directory setting
* chore: linting, add validation-regex to use regex_pattern
* chore: code review fixes
* chore(deps): update actions
* fix(security): codeql fixes
* chore(cr): apply cr comments
* chore: improve POSIX compatibility
* chore(cr): apply cr comments
* fix: codeql warning in Dockerfile, build failures
* chore(cr): apply cr comments
* fix: docker-testing-tools/Dockerfile
* chore(cr): apply cr comments
* fix(docker): update testing-tools image for GitHub Actions compatibility
* chore(cr): apply cr comments
* feat: add more tests, fix issues
* chore: fix codeql issues, update actions
* chore(cr): apply cr comments
* fix: integration tests
* chore: deduplication and fixes
* style: apply MegaLinter fixes
* chore(cr): apply cr comments
* feat: dry-run mode for generate-tests
* fix(ci): kcov installation
* chore(cr): apply cr comments
* chore(cr): apply cr comments
* chore(cr): apply cr comments
* chore(cr): apply cr comments, simplify action testing, use uv
* fix: run-tests.sh action counting
* chore(cr): apply cr comments
* chore(cr): apply cr comments
Commit 78fdad69e5 (parent d3cc8d4790), committed by GitHub on 2025-10-14 13:37:58 +03:00.
353 changed files with 55370 additions and 1714 deletions.

@@ -0,0 +1,10 @@
"""Modular validation system for GitHub Actions inputs.
This package provides a flexible, extensible validation framework for GitHub Actions.
"""
from .base import BaseValidator
from .registry import ValidatorRegistry
__all__ = ["BaseValidator", "ValidatorRegistry"]
__version__ = "2.0.0"

@@ -0,0 +1,229 @@
"""Base validator class for GitHub Actions input validation.
Provides the foundation for all validators with common functionality.
"""
from __future__ import annotations
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any
class BaseValidator(ABC):
"""Abstract base class for all validators.
Provides common validation interface and error handling.
"""
def __init__(self, action_type: str = "") -> None:
"""Initialize the base validator.
Args:
action_type: The type of GitHub Action being validated
"""
self.action_type = action_type
self.errors: list[str] = []
self._rules: dict[str, Any] = {}
def add_error(self, message: str) -> None:
"""Add a validation error message.
Args:
message: The error message to add
"""
self.errors.append(message)
def clear_errors(self) -> None:
"""Clear all validation errors."""
self.errors = []
def has_errors(self) -> bool:
"""Check if there are any validation errors.
Returns:
True if there are errors, False otherwise
"""
return len(self.errors) > 0
@abstractmethod
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate the provided inputs.
Args:
inputs: Dictionary of input names to values
Returns:
True if all inputs are valid, False otherwise
"""
@abstractmethod
def get_required_inputs(self) -> list[str]:
"""Get the list of required input names.
Returns:
List of required input names
"""
@abstractmethod
def get_validation_rules(self) -> dict[str, Any]:
"""Get the validation rules for this validator.
Returns:
Dictionary of validation rules
"""
def validate_required_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate that all required inputs are present and non-empty.
Args:
inputs: Dictionary of input names to values
Returns:
True if all required inputs are present, False otherwise
"""
valid = True
required = self.get_required_inputs()
for req_input in required:
if not inputs.get(req_input, "").strip():
self.add_error(f"Required input '{req_input}' is missing or empty")
valid = False
return valid
def validate_security_patterns(self, value: str, name: str = "input") -> bool:
"""Check for common security injection patterns.
Args:
value: The value to check
name: The name of the input for error messages
Returns:
True if no injection patterns found, False otherwise
"""
if not value or value.strip() == "":
return True
# Common injection patterns to check
dangerous_patterns = [
(";", "command separator"),
("&&", "command chaining"),
("||", "command chaining"),
("|", "pipe operator"),
("`", "command substitution"),
("$(", "command substitution"),
("${", "variable expansion"),
("../", "path traversal"),
("..\\", "path traversal"),
]
for pattern, description in dangerous_patterns:
if pattern in value:
self.add_error(
f"Potential security issue in {name}: contains {description} '{pattern}'",
)
return False
return True
def validate_path_security(self, path: str, name: str = "path") -> bool:
"""Validate file paths for security issues.
Args:
path: The file path to validate
name: The name of the input for error messages
Returns:
True if path is secure, False otherwise
"""
if not path or path.strip() == "":
return True
# Check for absolute paths
if path.startswith("/") or (len(path) > 1 and path[1] == ":"):
self.add_error(f"Invalid {name}: '{path}'. Absolute path not allowed")
return False
# Check for path traversal
if ".." in path:
self.add_error(f"Invalid {name}: '{path}'. Path traversal detected")
return False
# Check for home directory expansion
if path.startswith("~"):
self.add_error(f"Invalid {name}: '{path}'. Home directory expansion not allowed")
return False
return True
def validate_empty_allowed(self, value: str, name: str) -> bool:
"""Validate that a value is provided (not empty).
Args:
value: The value to check
name: The name of the input for error messages
Returns:
True if value is not empty, False otherwise
"""
if not value or value.strip() == "":
self.add_error(f"Input '{name}' cannot be empty")
return False
return True
def load_rules(self, rules_path: Path | None = None) -> dict[str, Any]:
"""Load validation rules from YAML file.
Args:
rules_path: Path to the rules YAML file (must be a Path object)
Returns:
Dictionary containing validation rules
"""
if not rules_path:
# Default to action folder's rules.yml file
action_dir = Path(__file__).parent.parent.parent / self.action_type.replace("_", "-")
rules_path = action_dir / "rules.yml"
# Ensure rules_path is a Path object
if not isinstance(rules_path, Path):
msg = f"rules_path must be a Path object, got {type(rules_path)}"
raise TypeError(msg)
if not rules_path.exists():
return {}
try:
import yaml # pylint: disable=import-error,import-outside-toplevel
with rules_path.open(encoding="utf-8") as f:
self._rules = yaml.safe_load(f) or {}
return self._rules
except Exception as e: # pylint: disable=broad-exception-caught
self.add_error(f"Failed to load rules from {rules_path}: {e}")
return {}
def get_github_actions_output(self) -> dict[str, str]:
"""Get output formatted for GitHub Actions.
Returns:
Dictionary with status and error keys for GitHub Actions
"""
if self.has_errors():
return {
"status": "failure",
"error": "; ".join(self.errors),
}
return {
"status": "success",
"error": "",
}
def is_github_expression(self, value: str) -> bool:
"""Check if the value is a GitHub expression."""
return (
value.lower() == "${{ github.token }}"
or ("${{" in value and "}}" in value)
or (value.strip().startswith("${{") and value.strip().endswith("}}"))
)
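
A short usage sketch of the shared helpers above; a minimal concrete subclass is needed because the class is abstract, and `DemoValidator` plus the import path are hypothetical:

```python
from typing import Any

from validators.base import BaseValidator  # import path assumed


class DemoValidator(BaseValidator):
    """Hypothetical minimal subclass used only to exercise the shared helpers."""

    def validate_inputs(self, inputs: dict[str, str]) -> bool:
        return all(self.validate_security_patterns(v, k) for k, v in inputs.items())

    def get_required_inputs(self) -> list[str]:
        return []

    def get_validation_rules(self) -> dict[str, Any]:
        return {}


demo = DemoValidator("demo")
demo.validate_inputs({"command": "echo hi && rm -rf /"})  # flags "&&" as command chaining
print(demo.is_github_expression("${{ github.token }}"))   # True
print(demo.get_github_actions_output()["status"])         # "failure"
```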

@@ -0,0 +1,174 @@
"""Boolean validator for true/false inputs."""
from __future__ import annotations
from .base import BaseValidator
class BooleanValidator(BaseValidator):
"""Validator for boolean inputs."""
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate boolean inputs."""
valid = True
# Common boolean input patterns
boolean_keywords = [
"dry-run",
"dry_run",
"verbose",
"debug",
"fail-on-error",
"fail_on_error",
"cache",
"skip",
"force",
"auto",
"enabled",
"disabled",
"check-only",
"check_only",
"sign",
"scan",
"push",
"nightly",
"stable",
"provenance",
"sbom",
]
for input_name, value in inputs.items():
# Check if input name suggests boolean
is_boolean_input = any(keyword in input_name.lower() for keyword in boolean_keywords)
# Also check for specific patterns
if (
is_boolean_input
or input_name.startswith(
(
"is-",
"is_",
"has-",
"has_",
"enable-",
"enable_",
"disable-",
"disable_",
"use-",
"use_",
"with-",
"with_",
"without-",
"without_",
),
)
or input_name.endswith(("-enabled", "_enabled", "-disabled", "_disabled"))
):
valid &= self.validate_boolean(value, input_name)
return valid
def get_required_inputs(self) -> list[str]:
"""Boolean validators typically don't define required inputs."""
return []
def get_validation_rules(self) -> dict:
"""Return boolean validation rules."""
return {
"boolean": "Must be 'true' or 'false' (lowercase)",
}
def validate_boolean(self, value: str, name: str = "boolean") -> bool:
"""Validate boolean input.
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
# Empty boolean often defaults to false
return True
# Allow GitHub Actions expressions
if self.is_github_expression(value):
return True
# Accept any case variation of true/false
if value.lower() in ["true", "false"]:
return True
# Check for yes/no (not valid for GitHub Actions)
if value.lower() in ["yes", "no", "y", "n"]:
self.add_error(
f"Invalid {name}: \"{value}\". Must be 'true' or 'false'",
)
return False
# Check for numeric boolean
if value in ["0", "1"]:
self.add_error(
f"Invalid {name}: \"{value}\". Must be 'true' or 'false'",
)
return False
# Generic error
self.add_error(f"Invalid {name}: \"{value}\". Must be 'true' or 'false'")
return False
def validate_boolean_extended(self, value: str | None, name: str = "boolean") -> bool:
"""Validate boolean input with extended options (true/false/empty).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid or empty, False otherwise
"""
if value is None:
return True
if not value or value.strip() == "":
return True
if value.lower() in ["yes", "no", "y", "n", "0", "1", "on", "off"]:
return True
return self.validate_boolean(value, name)
def validate_optional_boolean(self, value: str | None, name: str = "boolean") -> bool:
"""Validate optional boolean input (can be empty).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid or empty, False otherwise
"""
if value is None:
return True
if not value or value.strip() == "":
return True
return self.validate_boolean(value, name)
def validate_required_boolean(self, value: str, name: str = "boolean") -> bool:
"""Validate required boolean input (cannot be empty).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
self.add_error(f"Boolean {name} cannot be empty")
return False
return self.validate_boolean(value, name)

@@ -0,0 +1,308 @@
"""CodeQL-specific validators for code analysis actions."""
from __future__ import annotations
import re
from typing import ClassVar
from .base import BaseValidator
class CodeQLValidator(BaseValidator):
"""Validator for CodeQL analysis action inputs."""
# Supported CodeQL languages
SUPPORTED_LANGUAGES: ClassVar[set[str]] = {
"javascript",
"typescript",
"python",
"java",
"csharp",
"cpp",
"c",
"go",
"ruby",
"swift",
"kotlin",
"actions",
}
# Standard query suites
STANDARD_SUITES: ClassVar[set[str]] = {
"security-extended",
"security-and-quality",
"code-scanning",
"default",
}
# Valid build modes
BUILD_MODES: ClassVar[set[str]] = {"none", "manual", "autobuild"}
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate CodeQL-specific inputs."""
valid = True
for input_name, value in inputs.items():
if input_name == "language":
valid &= self.validate_codeql_language(value)
elif input_name == "queries":
valid &= self.validate_codeql_queries(value)
elif input_name == "packs":
valid &= self.validate_codeql_packs(value)
elif input_name in {"build-mode", "build_mode"}:
valid &= self.validate_codeql_build_mode(value)
elif input_name == "config":
valid &= self.validate_codeql_config(value)
elif input_name == "category":
valid &= self.validate_category_format(value)
elif input_name == "threads":
valid &= self.validate_threads(value)
elif input_name == "ram":
valid &= self.validate_ram(value)
return valid
def get_required_inputs(self) -> list[str]:
"""Get required inputs for CodeQL analysis."""
return ["language"] # Language is required for CodeQL
def get_validation_rules(self) -> dict:
"""Return CodeQL validation rules."""
return {
"language": list(self.SUPPORTED_LANGUAGES),
"queries": list(self.STANDARD_SUITES),
"build_modes": list(self.BUILD_MODES),
"threads": "1-128",
"ram": "256-32768 MB",
}
def validate_codeql_language(self, value: str) -> bool:
"""Validate CodeQL language.
Args:
value: The language to validate
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
self.add_error("CodeQL language cannot be empty")
return False
language = value.strip().lower()
if language in self.SUPPORTED_LANGUAGES:
return True
self.add_error(
f'Invalid CodeQL language: "{value}". '
f"Supported languages: {', '.join(sorted(self.SUPPORTED_LANGUAGES))}",
)
return False
def validate_codeql_queries(self, value: str) -> bool:
"""Validate CodeQL query suites.
Args:
value: The queries to validate
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
self.add_error("CodeQL queries cannot be empty")
return False
# Allow GitHub Actions expressions
if self.is_github_expression(value):
return True
# Split by comma and validate each query
queries = [q.strip() for q in value.split(",") if q.strip()]
for query in queries:
query_lower = query.lower()
# Check if it's a standard suite
if query_lower in self.STANDARD_SUITES:
continue
# Check if it's a query file path
if query.endswith((".ql", ".qls")):
# Validate as file path
if not self.validate_path_security(query, "query file"):
return False
continue
# Check if it contains path separators (custom query path)
if "/" in query or "\\" in query:
if not self.validate_path_security(query, "query path"):
return False
continue
# If none of the above, it's invalid
self.add_error(f'Invalid CodeQL query suite: "{query}"')
return False
return True
def validate_codeql_packs(self, value: str) -> bool:
"""Validate CodeQL query packs.
Args:
value: The packs to validate
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Packs are optional
# Split by comma and validate each pack
packs = [p.strip() for p in value.split(",") if p.strip()]
# Pack format: pack-name or owner/repo or owner/repo@version
pack_pattern = r"^[a-zA-Z0-9._/-]+(@[a-zA-Z0-9._-]+)?$"
for pack in packs:
if not re.match(pack_pattern, pack):
self.add_error(
f'Invalid CodeQL pack format: "{pack}". '
"Expected format: pack-name, owner/repo, or owner/repo@version",
)
return False
return True
def validate_codeql_build_mode(self, value: str) -> bool:
"""Validate CodeQL build mode.
Args:
value: The build mode to validate
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Build mode is optional
mode = value.strip().lower()
if mode in self.BUILD_MODES:
return True
self.add_error(
f'Invalid CodeQL build mode: "{value}". '
f"Valid options: {', '.join(sorted(self.BUILD_MODES))}",
)
return False
def validate_codeql_config(self, value: str) -> bool:
"""Validate CodeQL configuration.
Args:
value: The config to validate
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Config is optional
# Check for dangerous YAML patterns
dangerous_patterns = [
r"!!python/", # Python object execution
r"!!ruby/", # Ruby execution
r"!!perl/", # Perl execution
r"!!js/", # JavaScript execution
]
for pattern in dangerous_patterns:
if re.search(pattern, value, re.IGNORECASE):
self.add_error(f"Dangerous pattern in CodeQL config: {pattern}")
return False
return True
def validate_category_format(self, value: str) -> bool:
"""Validate analysis category format.
Args:
value: The category to validate
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Category is optional
# Allow GitHub Actions expressions
if self.is_github_expression(value):
return True
# Category should start with /
if not value.startswith("/"):
self.add_error(f'Category must start with "/": {value}')
return False
# Check for valid characters
if not re.match(r"^/[a-zA-Z0-9_:/-]+$", value):
self.add_error(f"Invalid category format: {value}")
return False
return True
def validate_threads(self, value: str, name: str = "threads") -> bool:
"""Validate thread count (1-128).
Args:
value: The thread count to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Optional
try:
threads = int(value.strip())
if 1 <= threads <= 128:
return True
self.add_error(f"Invalid {name}: {threads}. Must be between 1 and 128")
return False
except ValueError:
self.add_error(f'Invalid {name}: "{value}". Must be a number')
return False
def validate_ram(self, value: str, name: str = "ram") -> bool:
"""Validate RAM in MB (256-32768).
Args:
value: The RAM value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Optional
try:
ram = int(value.strip())
if 256 <= ram <= 32768:
return True
self.add_error(f"Invalid {name}: {ram}. Must be between 256 and 32768 MB")
return False
except ValueError:
self.add_error(f'Invalid {name}: "{value}". Must be a number')
return False
# Convenience methods for convention-based validation
def validate_numeric_range_1_128(self, value: str, name: str = "threads") -> bool:
"""Alias for thread validation."""
return self.validate_threads(value, name)
def validate_numeric_range_256_32768(self, value: str, name: str = "ram") -> bool:
"""Alias for RAM validation."""
return self.validate_ram(value, name)
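
A short usage sketch of the validator above (input values chosen for illustration):

```python
# CodeQLValidator is the class defined above.
validator = CodeQLValidator("codeql-analysis")
ok = validator.validate_inputs(
    {
        "language": "python",
        "queries": "security-and-quality",
        "build-mode": "none",
        "threads": "4",
        "ram": "4096",
    }
)
print(ok)  # True
print(validator.validate_codeql_language("cobol"))  # False, adds an error
print(validator.get_github_actions_output())        # {'status': 'failure', 'error': ...}
```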

@@ -0,0 +1,345 @@
"""Convention mapper for automatic validation detection.
Maps input names to appropriate validators based on naming conventions.
"""
from __future__ import annotations
from typing import TYPE_CHECKING, Any, ClassVar
if TYPE_CHECKING:
from collections.abc import Callable
class ConventionMapper:
"""Maps input names to validators based on naming conventions."""
# Priority-ordered convention patterns
CONVENTION_PATTERNS: ClassVar[list[dict[str, Any]]] = [
# High priority - exact matches
{
"priority": 100,
"type": "exact",
"patterns": {
"email": "email",
"url": "url",
"username": "username",
"password": "password",
"token": "github_token",
"github-token": "github_token",
"npm-token": "npm_token",
"docker-token": "docker_token",
"dockerhub-token": "docker_token",
"registry-token": "registry_token",
"api-key": "api_key",
"secret": "secret",
},
},
# Version patterns - specific versions have higher priority
{
"priority": 96, # Highest priority for exact version match
"type": "exact",
"patterns": {
"version": "flexible_version", # Support both SemVer and CalVer
},
},
{
"priority": 95, # Higher priority for specific versions
"type": "contains",
"patterns": {
"python-version": "python_version",
"node-version": "node_version",
"go-version": "go_version",
"php-version": "php_version",
"dotnet-version": "dotnet_version",
"terraform-version": "terraform_version",
"java-version": "java_version",
"ruby-version": "ruby_version",
},
},
{
"priority": 90, # Lower priority for generic version
"type": "suffix",
"patterns": {
"-version": "version",
"_version": "version",
},
},
# Boolean patterns
{
"priority": 80,
"type": "exact",
"patterns": {
"dry-run": "boolean",
"draft": "boolean",
"prerelease": "boolean",
"push": "boolean",
"force": "boolean",
"skip": "boolean",
"enabled": "boolean",
"disabled": "boolean",
"verbose": "boolean",
"debug": "boolean",
"nightly": "boolean",
"stable": "boolean",
"provenance": "boolean",
"sbom": "boolean",
"sign": "boolean",
"scan": "boolean",
},
},
{
"priority": 80,
"type": "prefix",
"patterns": {
"is-": "boolean",
"is_": "boolean",
"has-": "boolean",
"has_": "boolean",
"enable-": "boolean",
"enable_": "boolean",
"disable-": "boolean",
"disable_": "boolean",
"use-": "boolean",
"use_": "boolean",
"with-": "boolean",
"with_": "boolean",
"without-": "boolean",
"without_": "boolean",
},
},
{
"priority": 80,
"type": "suffix",
"patterns": {
"-enabled": "boolean",
"_enabled": "boolean",
"-disabled": "boolean",
"_disabled": "boolean",
},
},
# File patterns
{
"priority": 70,
"type": "suffix",
"patterns": {
"-file": "file_path",
"_file": "file_path",
"-path": "file_path",
"_path": "file_path",
"-dir": "directory",
"_dir": "directory",
"-directory": "directory",
"_directory": "directory",
},
},
{
"priority": 70,
"type": "exact",
"patterns": {
"dockerfile": "dockerfile",
"config": "file_path",
"config-file": "file_path",
"env-file": "env_file",
"compose-file": "compose_file",
},
},
# Numeric patterns
{
"priority": 60,
"type": "exact",
"patterns": {
"retries": "numeric_1_10",
"max-retries": "numeric_1_10",
"attempts": "numeric_1_10",
"timeout": "timeout",
"timeout-ms": "timeout_ms",
"timeout-seconds": "timeout",
"threads": "numeric_1_128",
"workers": "numeric_1_128",
"concurrency": "numeric_1_128",
"parallel-builds": "numeric_0_16",
"max-parallel": "numeric_0_16",
"compression-quality": "numeric_0_100",
"jpeg-quality": "numeric_0_100",
"quality": "numeric_0_100",
"max-warnings": "numeric_0_10000",
"days-before-stale": "positive_integer",
"days-before-close": "positive_integer",
"port": "port",
"ram": "numeric_256_32768",
"memory": "numeric_256_32768",
},
},
# Docker patterns
{
"priority": 50,
"type": "exact",
"patterns": {
"image": "docker_image",
"image-name": "docker_image",
"tag": "docker_tag",
"tags": "docker_tags",
"platforms": "docker_architectures",
"architectures": "docker_architectures",
"registry": "docker_registry",
"namespace": "docker_namespace",
"prefix": "prefix",
"suffix": "suffix",
"cache-from": "cache_mode",
"cache-to": "cache_mode",
"build-args": "build_args",
"labels": "labels",
},
},
# Network patterns
{
"priority": 40,
"type": "suffix",
"patterns": {
"-url": "url",
"_url": "url",
"-endpoint": "url",
"_endpoint": "url",
"-webhook": "url",
"_webhook": "url",
},
},
{
"priority": 40,
"type": "exact",
"patterns": {
"hostname": "hostname",
"host": "hostname",
"server": "hostname",
"domain": "hostname",
"ip": "ip_address",
"ip-address": "ip_address",
},
},
]
def __init__(self) -> None:
"""Initialize the convention mapper."""
self._cache = {}
self._compile_patterns()
def _compile_patterns(self) -> None:
"""Compile patterns for efficient matching."""
# Sort patterns by priority
self.CONVENTION_PATTERNS.sort(key=lambda x: x["priority"], reverse=True)
def _normalize_pattern(
self, normalized: str, pattern_type: str, patterns: dict[str, str]
) -> str | None:
result = None # Initialize to None for cases where no pattern matches
if pattern_type == "exact" and normalized in patterns:
result = patterns[normalized]
elif pattern_type == "prefix":
for prefix, validator in patterns.items():
if normalized.startswith(prefix):
result = validator
break
elif pattern_type == "suffix":
for suffix, validator in patterns.items():
if normalized.endswith(suffix):
result = validator
break
elif pattern_type == "contains":
for substring, validator in patterns.items():
if substring in normalized:
result = validator
break
return result
def get_validator_type(
self,
input_name: str,
input_config: dict[str, Any] | None = None,
) -> str | None:
"""Get the validator type for an input based on conventions.
Args:
input_name: The name of the input
input_config: Optional configuration for the input
Returns:
The validator type or None if no convention matches
"""
# Check cache
cache_key = f"{input_name}:{input_config!s}"
if cache_key in self._cache:
return self._cache[cache_key]
result = None
# Check for explicit validator in config
if input_config and isinstance(input_config, dict):
if "validator" in input_config:
result = input_config["validator"]
elif "type" in input_config:
result = input_config["type"]
# If no explicit validator, try pattern matching
if result is None:
# Normalize input name for matching
normalized = input_name.lower().replace("_", "-")
# Try each pattern group in priority order
for pattern_group in self.CONVENTION_PATTERNS:
if result is not None:
break
pattern_type = pattern_group["type"]
patterns = pattern_group["patterns"]
result = self._normalize_pattern(normalized, pattern_type, patterns)
# Cache and return result
self._cache[cache_key] = result
return result
def get_validator_for_inputs(self, inputs: dict[str, Any]) -> dict[str, str]:
"""Get validators for all inputs based on conventions.
Args:
inputs: Dictionary of input names and values
Returns:
Dictionary mapping input names to validator types
"""
validators = {}
for input_name in inputs:
validator_type = self.get_validator_type(input_name)
if validator_type:
validators[input_name] = validator_type
return validators
def clear_cache(self) -> None:
"""Clear the validator cache."""
self._cache = {}
def add_custom_pattern(self, pattern: dict[str, Any]) -> None:
"""Add a custom pattern to the convention mapper.
Args:
pattern: Pattern dictionary with priority, type, and patterns
"""
# Note: Modifying ClassVar directly is not ideal, but needed for dynamic configuration
ConventionMapper.CONVENTION_PATTERNS.append(pattern)
self._compile_patterns()
self.clear_cache()
def remove_pattern(self, pattern_filter: Callable[[dict], bool]) -> None:
"""Remove patterns matching a filter.
Args:
pattern_filter: Function that returns True for patterns to remove
"""
# Note: Modifying ClassVar directly is not ideal, but needed for dynamic configuration
ConventionMapper.CONVENTION_PATTERNS = [
p for p in ConventionMapper.CONVENTION_PATTERNS if not pattern_filter(p)
]
self._compile_patterns()
self.clear_cache()
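
For example, mapping a few typical input names through the conventions above:

```python
# ConventionMapper is the class defined above.
mapper = ConventionMapper()
print(mapper.get_validator_type("github-token"))    # "github_token" (exact match, priority 100)
print(mapper.get_validator_type("python_version"))  # "python_version" ("contains" pattern, priority 95)
print(mapper.get_validator_type("dry-run"))         # "boolean"
print(mapper.get_validator_type("max-retries"))     # "numeric_1_10"
print(mapper.get_validator_type("unknown-input"))   # None (no convention matched)
```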

@@ -0,0 +1,610 @@
"""Convention-based validator that uses naming patterns to determine validation rules.
This validator automatically applies validation based on input naming conventions.
"""
from __future__ import annotations
from pathlib import Path
from typing import Any
import yaml # pylint: disable=import-error
from .base import BaseValidator
from .convention_mapper import ConventionMapper
TOKEN_TYPES = {
"github": "github_token",
"npm": "npm_token",
"docker": "docker_token",
}
VERSION_MAPPINGS = {
"python": "python_version",
"node": "node_version",
"go": "go_version",
"php": "php_version",
"terraform": "terraform_version",
"dotnet": "dotnet_version",
"net": "dotnet_version",
}
FILE_TYPES = {
"yaml": "yaml_file",
"yml": "yaml_file",
"json": "json_file",
}
class ConventionBasedValidator(BaseValidator):
"""Validator that applies validation based on naming conventions.
Automatically detects validation requirements based on input names
and applies appropriate validators.
"""
def __init__(self, action_type: str) -> None:
"""Initialize the convention-based validator.
Args:
action_type: The type of GitHub Action being validated
"""
super().__init__(action_type)
self._rules = self.load_rules()
self._validator_modules: dict[str, Any] = {}
self._convention_mapper = ConventionMapper() # Use the ConventionMapper
self._load_validator_modules()
def _load_validator_modules(self) -> None:
"""Lazy-load validator modules as needed."""
# These will be imported as needed to avoid circular imports
def load_rules(self, rules_path: Path | None = None) -> dict[str, Any]:
"""Load validation rules from YAML file.
Args:
rules_path: Optional path to the rules YAML file
Returns:
Dictionary of validation rules
"""
if rules_path and rules_path.exists():
rules_file = rules_path
else:
# Find the rules file for this action in the action folder
# Convert underscores back to dashes for the folder name
action_name = self.action_type.replace("_", "-")
project_root = Path(__file__).parent.parent.parent
rules_file = project_root / action_name / "rules.yml"
if not rules_file.exists():
# Return default empty rules if no rules file exists
return {
"action_type": self.action_type,
"required_inputs": [],
"optional_inputs": {},
"conventions": {},
"overrides": {},
}
try:
with Path(rules_file).open() as f:
rules = yaml.safe_load(f) or {}
# Ensure all expected keys exist
rules.setdefault("required_inputs", [])
rules.setdefault("optional_inputs", {})
rules.setdefault("conventions", {})
rules.setdefault("overrides", {})
# Build conventions from optional_inputs if not explicitly set
if not rules["conventions"] and rules["optional_inputs"]:
conventions = {}
for input_name, input_config in rules["optional_inputs"].items():
# Try to infer validator type from the input name or pattern
conventions[input_name] = self._infer_validator_type(input_name, input_config)
rules["conventions"] = conventions
return rules
except Exception:
return {
"action_type": self.action_type,
"required_inputs": [],
"optional_inputs": {},
"conventions": {},
"overrides": {},
}
def _infer_validator_type(self, input_name: str, input_config: dict[str, Any]) -> str | None:
"""Infer the validator type from input name and configuration.
Args:
input_name: The name of the input
input_config: The input configuration from rules
Returns:
The inferred validator type or None
"""
# Check for explicit validator type in config
if isinstance(input_config, dict) and "validator" in input_config:
return input_config["validator"]
# Infer based on name patterns
name_lower = input_name.lower().replace("-", "_")
# Try to determine validator type
validator_type = self._check_exact_matches(name_lower)
if validator_type is None:
validator_type = self._check_pattern_based_matches(name_lower)
return validator_type
def _check_exact_matches(self, name_lower: str) -> str | None:
"""Check for exact pattern matches."""
exact_matches = {
# Docker patterns
"platforms": "docker_architectures",
"architectures": "docker_architectures",
"cache_from": "cache_mode",
"cache_to": "cache_mode",
"sbom": "sbom_format",
"registry": "registry_url",
"registry_url": "registry_url",
"tags": "docker_tags",
# File patterns
"file": "file_path",
"path": "file_path",
"file_path": "file_path",
"config_file": "file_path",
"dockerfile": "file_path",
"branch": "branch_name",
"branch_name": "branch_name",
"ref": "branch_name",
# Network patterns
"email": "email",
"url": "url",
"endpoint": "url",
"webhook": "url",
"repository_url": "repository_url",
"repo_url": "repository_url",
"scope": "scope",
"username": "username",
"user": "username",
# Boolean patterns
"dry_run": "boolean",
"draft": "boolean",
"prerelease": "boolean",
"push": "boolean",
"delete": "boolean",
"all_files": "boolean",
"force": "boolean",
"skip": "boolean",
"enabled": "boolean",
"disabled": "boolean",
"verbose": "boolean",
"debug": "boolean",
# Numeric patterns
"retries": "retries",
"retry": "retries",
"attempts": "retries",
"timeout": "timeout",
"timeout_ms": "timeout",
"timeout_seconds": "timeout",
"threads": "threads",
"workers": "threads",
"concurrency": "threads",
# Other patterns
"category": "category_format",
"cache": "package_manager_enum",
"package_manager": "package_manager_enum",
"format": "report_format",
"output_format": "report_format",
"report_format": "report_format",
}
return exact_matches.get(name_lower)
def _check_pattern_based_matches(self, name_lower: str) -> str | None: # noqa: PLR0912
"""Check for pattern-based matches."""
result = None
# Token patterns
if "token" in name_lower:
token_types = TOKEN_TYPES
for key, value in token_types.items():
if key in name_lower:
result = value
break
if result is None:
result = "github_token" # Default token type
# Docker patterns
elif name_lower.startswith("docker_"):
result = f"docker_{name_lower[7:]}"
# Version patterns
elif "version" in name_lower:
version_mappings = VERSION_MAPPINGS
for key, value in version_mappings.items():
if key in name_lower:
result = value
break
if result is None:
result = "flexible_version" # Default to flexible version
# File suffix patterns
elif name_lower.endswith("_file") and name_lower != "config_file":
file_types = FILE_TYPES
for key, value in file_types.items():
if key in name_lower:
result = value
break
if result is None:
result = "file_path"
# CodeQL patterns
elif name_lower.startswith("codeql_"):
result = name_lower
# Cache-related check (special case for returning None)
elif "cache" in name_lower and name_lower != "cache":
result = None # cache-related but not numeric
return result
def get_required_inputs(self) -> list[str]:
"""Get the list of required input names from rules.
Returns:
List of required input names
"""
return self._rules.get("required_inputs", [])
def get_validation_rules(self) -> dict[str, Any]:
"""Get the validation rules.
Returns:
Dictionary of validation rules
"""
return self._rules
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate inputs based on conventions and rules.
Args:
inputs: Dictionary of input names to values
Returns:
True if all inputs are valid, False otherwise
"""
valid = True
# First validate required inputs
valid &= self.validate_required_inputs(inputs)
# Get conventions and overrides from rules
conventions = self._rules.get("conventions", {})
overrides = self._rules.get("overrides", {})
# Validate each input
for input_name, value in inputs.items():
# Skip if explicitly overridden to null
if input_name in overrides and overrides[input_name] is None:
continue
# Get validator type from overrides or conventions
validator_type = self._get_validator_type(input_name, conventions, overrides)
if validator_type:
# Check if this is a required input
is_required = input_name in self.get_required_inputs()
valid &= self._apply_validator(
input_name, value, validator_type, is_required=is_required
)
return valid
def _get_validator_type(
self,
input_name: str,
conventions: dict[str, str],
overrides: dict[str, str],
) -> str | None:
"""Determine the validator type for an input.
Args:
input_name: The name of the input
conventions: Convention mappings
overrides: Override mappings
Returns:
The validator type or None if no validator found
"""
# Check overrides first
if input_name in overrides:
return overrides[input_name]
# Check exact convention match
if input_name in conventions:
return conventions[input_name]
# Check with dash/underscore conversion
if "_" in input_name:
dash_version = input_name.replace("_", "-")
if dash_version in overrides:
return overrides[dash_version]
if dash_version in conventions:
return conventions[dash_version]
elif "-" in input_name:
underscore_version = input_name.replace("-", "_")
if underscore_version in overrides:
return overrides[underscore_version]
if underscore_version in conventions:
return conventions[underscore_version]
# Fall back to convention mapper for pattern-based detection
return self._convention_mapper.get_validator_type(input_name)
def _apply_validator(
self,
input_name: str,
value: str,
validator_type: str,
*,
is_required: bool,
) -> bool:
"""Apply the appropriate validator to an input value.
Args:
input_name: The name of the input
value: The value to validate
validator_type: The type of validator to apply
is_required: Whether the input is required
Returns:
True if valid, False otherwise
"""
# Get the validator module and method
validator_module, method_name = self._get_validator_method(validator_type)
if not validator_module:
# Unknown validator type, skip validation
return True
try:
# Call the validation method
if hasattr(validator_module, method_name):
method = getattr(validator_module, method_name)
# Some validators need additional parameters
if validator_type == "github_token" and method_name == "validate_github_token":
result = method(value, required=is_required)
elif "numeric_range" in validator_type:
# Parse range from validator type
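                    # e.g. "numeric_range_1_100" -> (1, 100); defaults to (0, 100) if the suffix cannot be parsed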
min_val, max_val = self._parse_numeric_range(validator_type)
result = method(value, min_val, max_val, input_name)
else:
# Standard validation call
result = method(value, input_name)
# Copy errors from the validator module to this validator
# Skip if validator_module is self (for internal validators)
if validator_module is not self and hasattr(validator_module, "errors"):
for error in validator_module.errors:
if error not in self.errors:
self.add_error(error)
# Clear the module's errors after copying
validator_module.errors = []
return result
# Method not found, skip validation
return True
except Exception as e:
self.add_error(f"Validation error for {input_name}: {e}")
return False
def _get_validator_method(self, validator_type: str) -> tuple[Any, str]: # noqa: C901, PLR0912
"""Get the validator module and method name for a validator type.
Args:
validator_type: The validator type string
Returns:
Tuple of (validator_module, method_name)
"""
# Lazy import validators to avoid circular dependencies
# Token validators
if validator_type in [
"github_token",
"npm_token",
"docker_token",
"namespace_with_lookahead",
]:
if "token" not in self._validator_modules:
from . import token
self._validator_modules["token"] = token.TokenValidator()
return self._validator_modules["token"], f"validate_{validator_type}"
# Docker validators
if validator_type.startswith("docker_") or validator_type in [
"cache_mode",
"sbom_format",
"registry_enum",
]:
if "docker" not in self._validator_modules:
from . import docker
self._validator_modules["docker"] = docker.DockerValidator()
if validator_type.startswith("docker_"):
method = f"validate_{validator_type[7:]}" # Remove "docker_" prefix
elif validator_type == "registry_enum":
method = "validate_registry"
else:
method = f"validate_{validator_type}"
return self._validator_modules["docker"], method
# Version validators
if "version" in validator_type or validator_type in ["calver", "semantic", "flexible"]:
if "version" not in self._validator_modules:
from . import version
self._validator_modules["version"] = version.VersionValidator()
return self._validator_modules["version"], f"validate_{validator_type}"
# File validators
if validator_type in [
"file_path",
"branch_name",
"file_extensions",
"yaml_file",
"json_file",
"config_file",
]:
if "file" not in self._validator_modules:
from . import file
self._validator_modules["file"] = file.FileValidator()
return self._validator_modules["file"], f"validate_{validator_type}"
# Network validators
if validator_type in [
"email",
"url",
"scope",
"username",
"registry_url",
"repository_url",
]:
if "network" not in self._validator_modules:
from . import network
self._validator_modules["network"] = network.NetworkValidator()
return self._validator_modules["network"], f"validate_{validator_type}"
# Boolean validator
if validator_type == "boolean":
if "boolean" not in self._validator_modules:
from . import boolean
self._validator_modules["boolean"] = boolean.BooleanValidator()
return self._validator_modules["boolean"], "validate_boolean"
# Numeric validators
if validator_type.startswith("numeric_range") or validator_type in [
"retries",
"timeout",
"threads",
]:
if "numeric" not in self._validator_modules:
from . import numeric
self._validator_modules["numeric"] = numeric.NumericValidator()
if validator_type.startswith("numeric_range"):
return self._validator_modules["numeric"], "validate_range"
return self._validator_modules["numeric"], f"validate_{validator_type}"
# Security validators
if validator_type in ["security_patterns", "injection_patterns", "prefix", "regex_pattern"]:
if "security" not in self._validator_modules:
from . import security
self._validator_modules["security"] = security.SecurityValidator()
if validator_type == "prefix":
# Use no_injection for prefix - checks for injection patterns
# without character restrictions
return self._validator_modules["security"], "validate_no_injection"
return self._validator_modules["security"], f"validate_{validator_type}"
# CodeQL validators
if validator_type.startswith("codeql_") or validator_type in ["category_format"]:
if "codeql" not in self._validator_modules:
from . import codeql
self._validator_modules["codeql"] = codeql.CodeQLValidator()
return self._validator_modules["codeql"], f"validate_{validator_type}"
# PHP-specific validators
if validator_type in ["php_extensions", "coverage_driver"]:
# Return self for PHP-specific validation methods
return self, f"_validate_{validator_type}"
        # Package manager and report format validators
        if validator_type in ["package_manager_enum", "report_format"]:
            # No dedicated validation methods exist for these types yet, so skip
            # validation instead of caching a validator that is never returned
            return None, ""
# Default: no validator
return None, ""
def _parse_numeric_range(self, validator_type: str) -> tuple[int, int]:
"""Parse min and max values from a numeric_range validator type.
Args:
validator_type: String like "numeric_range_1_100"
Returns:
Tuple of (min_value, max_value)
"""
parts = validator_type.split("_")
if len(parts) >= 4:
try:
return int(parts[2]), int(parts[3])
except ValueError:
pass
# Default range
return 0, 100
def _validate_php_extensions(self, value: str, input_name: str) -> bool:
"""Validate PHP extensions format.
Args:
value: The extensions value (comma-separated list)
input_name: The input name for error messages
Returns:
True if valid, False otherwise
"""
import re
if not value:
return True
# Check for injection patterns
if re.search(r"[;&|`$()@#]", value):
self.add_error(f"Potential injection detected in {input_name}: {value}")
return False
# Check format - should be alphanumeric, underscores, commas, spaces only
if not re.match(r"^[a-zA-Z0-9_,\s]+$", value):
self.add_error(f"Invalid format for {input_name}: {value}")
return False
return True
def _validate_coverage_driver(self, value: str, input_name: str) -> bool:
"""Validate coverage driver enum.
Args:
value: The coverage driver value
input_name: The input name for error messages
Returns:
True if valid, False otherwise
"""
valid_drivers = ["none", "xdebug", "pcov", "xdebug3"]
if value and value not in valid_drivers:
self.add_error(
f"Invalid {input_name}: {value}. Must be one of: {', '.join(valid_drivers)}"
)
return False
return True
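
A minimal usage sketch of the convention-based flow above, for orientation. It assumes the package is importable as validators (matching the registry's import scheme later in this diff); the action name, inputs, and rules are invented:

    from validators.conventions import ConventionBasedValidator

    validator = ConventionBasedValidator("docker_publish")
    inputs = {
        "github-token": "${{ secrets.GITHUB_TOKEN }}",
        "timeout": "900",
        "max-warnings": "50",
    }
    if not validator.validate_inputs(inputs):
        for error in validator.errors:
            print(f"::error::{error}")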


@@ -0,0 +1,309 @@
"""Docker-specific validators for container-related inputs."""
from __future__ import annotations
import re
from typing import ClassVar
from .base import BaseValidator
class DockerValidator(BaseValidator):
"""Validator for Docker-related inputs."""
VALID_ARCHITECTURES: ClassVar[list[str]] = [
"linux/amd64",
"linux/arm64",
"linux/arm/v7",
"linux/arm/v6",
"linux/386",
"linux/ppc64le",
"linux/s390x",
]
CACHE_MODES: ClassVar[list[str]] = ["max", "min", "inline"]
SBOM_FORMATS: ClassVar[list[str]] = ["spdx-json", "cyclonedx-json"]
REGISTRY_TYPES: ClassVar[list[str]] = ["dockerhub", "github", "both"]
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate Docker-specific inputs."""
valid = True
for input_name, value in inputs.items():
if "image" in input_name and "name" in input_name:
valid &= self.validate_image_name(value, input_name)
elif input_name == "tag" or input_name.endswith("-tag"):
valid &= self.validate_tag(value, input_name)
elif "architectures" in input_name or "platforms" in input_name:
valid &= self.validate_architectures(value, input_name)
elif "cache" in input_name and "mode" in input_name:
valid &= self.validate_cache_mode(value, input_name)
elif "sbom" in input_name and "format" in input_name:
valid &= self.validate_sbom_format(value, input_name)
elif input_name == "registry":
valid &= self.validate_registry(value, input_name)
return valid
def get_required_inputs(self) -> list[str]:
"""Docker validators typically don't define required inputs."""
return []
def get_validation_rules(self) -> dict:
"""Return Docker validation rules."""
return {
"image_name": "lowercase, alphanumeric, periods, hyphens, underscores",
"tag": "semantic version, 'latest', or valid Docker tag",
"architectures": self.VALID_ARCHITECTURES,
"cache_mode": self.CACHE_MODES,
"sbom_format": self.SBOM_FORMATS,
"registry": self.REGISTRY_TYPES,
}
def validate_image_name(self, image_name: str, name: str = "image-name") -> bool:
"""Validate Docker image name format.
Supports full Docker image references including:
- Simple names: myapp, nginx
- Names with separators: my-app, my_app, my.app
- Registry paths: registry.example.com/myapp
- Multi-part paths: docker.io/library/nginx
- Complex paths: registry.example.com/namespace/app.name
Args:
image_name: The image name to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not image_name or image_name.strip() == "":
return True # Image name is often optional
# Allow GitHub Actions expressions
if self.is_github_expression(image_name):
return True
# Docker image name pattern supporting registry paths with slashes
# Component: [a-z0-9]+ followed by optional (.|_|__|-+)[a-z0-9]+
# Path: optional (/component)* for registry/namespace/image structure
pattern = r"^[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*(/[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*)*$"
if re.match(pattern, image_name):
return True
self.add_error(
f'Invalid {name}: "{image_name}". Must contain only '
"lowercase letters, digits, periods, hyphens, and underscores. "
"Registry paths are supported (e.g., registry.example.com/namespace/image)",
)
return False
def validate_tag(self, tag: str, name: str = "tag") -> bool:
"""Validate Docker tag format.
Args:
tag: The tag to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not tag or tag.strip() == "":
self.add_error(f"Docker {name} cannot be empty")
return False
# Docker tags can be:
# - image:tag format (e.g., myapp:latest, nginx:1.21)
# - just a tag (e.g., latest, v1.2.3)
# - registry/image:tag (e.g., docker.io/library/nginx:latest)
# Allow GitHub Actions expressions
if self.is_github_expression(tag):
return True
# Very permissive Docker tag pattern
# Docker tags can contain letters, digits, periods, dashes, underscores, colons, and slashes
        pattern = r"^[a-zA-Z0-9](?:[-a-zA-Z0-9._:/@]*[a-zA-Z0-9])?$"  # single-character tags like "1" are allowed
        if re.match(pattern, tag):
return True
self.add_error(f'Invalid {name}: "{tag}". Must be a valid Docker tag')
return False
def validate_architectures(self, architectures: str, name: str = "architectures") -> bool:
"""Validate Docker architectures/platforms.
Args:
architectures: Comma-separated list of architectures
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not architectures or architectures.strip() == "":
return True # Often optional
# Allow GitHub Actions expressions
if self.is_github_expression(architectures):
return True
archs = [arch.strip() for arch in architectures.split(",")]
for arch in archs:
if arch not in self.VALID_ARCHITECTURES:
self.add_error(
f'Invalid {name}: "{arch}". Supported: {", ".join(self.VALID_ARCHITECTURES)}',
)
return False
return True
def validate_cache_mode(self, value: str, name: str = "cache-mode") -> bool:
"""Validate Docker cache mode values.
Args:
value: The cache mode value
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Cache mode is optional
if value in self.CACHE_MODES:
return True
self.add_error(f'Invalid {name}: "{value}". Must be one of: {", ".join(self.CACHE_MODES)}')
return False
def validate_sbom_format(self, value: str, name: str = "sbom-format") -> bool:
"""Validate SBOM (Software Bill of Materials) format values.
Args:
value: The SBOM format value
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # SBOM format is optional
if value in self.SBOM_FORMATS:
return True
self.add_error(f'Invalid {name}: "{value}". Must be one of: {", ".join(self.SBOM_FORMATS)}')
return False
def validate_registry(self, value: str, name: str = "registry") -> bool:
"""Validate registry enum values for docker-publish.
Args:
value: The registry value
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
self.add_error(f"Registry is required and cannot be empty in {name}")
return False
if value in self.REGISTRY_TYPES:
return True
self.add_error(
f'Invalid {name}: "{value}". Must be one of: {", ".join(self.REGISTRY_TYPES)}',
)
return False
def validate_namespace_with_lookahead(self, namespace: str, name: str = "namespace") -> bool:
"""Validate Docker namespace/organization name.
Args:
namespace: The namespace to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not namespace or namespace.strip() == "":
return True # Empty namespace is often valid
# Namespace must be lowercase, can contain hyphens but not at start/end
# No double hyphens allowed, max length 255
if len(namespace) > 255:
self.add_error(f'Invalid {name}: "{namespace}". Too long (max 255 characters)')
return False
# Check for invalid patterns
if namespace.startswith("-") or namespace.endswith("-"):
self.add_error(f'Invalid {name}: "{namespace}". Cannot start or end with hyphen')
return False
if "--" in namespace:
self.add_error(f'Invalid {name}: "{namespace}". Cannot contain double hyphens')
return False
if " " in namespace:
self.add_error(f'Invalid {name}: "{namespace}". Cannot contain spaces')
return False
# Must be lowercase alphanumeric with hyphens
pattern = r"^[a-z0-9]+(?:-[a-z0-9]+)*$"
if re.match(pattern, namespace):
return True
self.add_error(
f'Invalid {name}: "{namespace}". Must contain only '
"lowercase letters, digits, and hyphens (not at start/end)",
)
return False
def validate_prefix(self, prefix: str, name: str = "prefix") -> bool:
"""Validate Docker tag prefix.
Args:
prefix: The prefix to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
# Empty prefix is valid
if not prefix:
return True
# Prefix cannot contain spaces or special characters like @, #, :
invalid_chars = [" ", "@", "#", ":"]
for char in invalid_chars:
if char in prefix:
self.add_error(f'Invalid {name}: "{prefix}". Cannot contain "{char}" character')
return False
# Valid prefix contains alphanumeric, dots, dashes, underscores
pattern = r"^[a-zA-Z0-9._-]+$"
if re.match(pattern, prefix):
return True
self.add_error(
f'Invalid {name}: "{prefix}". Must contain only '
"letters, digits, periods, hyphens, and underscores",
)
return False
# Convenience methods for direct access
def validate_docker_image_name(self, value: str, name: str = "image-name") -> bool:
"""Alias for validate_image_name for convention compatibility."""
return self.validate_image_name(value, name)
def validate_docker_tag(self, value: str, name: str = "tag") -> bool:
"""Alias for validate_tag for convention compatibility."""
return self.validate_tag(value, name)
def validate_docker_architectures(self, value: str, name: str = "architectures") -> bool:
"""Alias for validate_architectures for convention compatibility."""
return self.validate_architectures(value, name)
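
A hedged usage sketch for DockerValidator (the module path validators.docker is assumed from the registry's import scheme; the values are invented):

    from validators.docker import DockerValidator

    validator = DockerValidator()
    assert validator.validate_image_name("ghcr.io/acme/web-app")
    assert validator.validate_architectures("linux/amd64,linux/arm64")
    assert not validator.validate_registry("quay")  # only dockerhub/github/both are accepted
    print(validator.errors)  # explains why "quay" was rejected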


@@ -0,0 +1,360 @@
"""File and path validators."""
from __future__ import annotations
from pathlib import Path
import re
from .base import BaseValidator
class FileValidator(BaseValidator):
"""Validator for file paths, extensions, and related inputs."""
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate file-related inputs."""
valid = True
for input_name, value in inputs.items():
if "file" in input_name or "path" in input_name or "directory" in input_name:
valid &= self.validate_file_path(value, input_name)
elif "branch" in input_name:
                valid &= self.validate_branch_name(value, input_name)
elif "extension" in input_name:
valid &= self.validate_file_extensions(value, input_name)
return valid
def get_required_inputs(self) -> list[str]:
"""File validators typically don't define required inputs."""
return []
def get_validation_rules(self) -> dict:
"""Return file validation rules."""
return {
"file_path": "Relative paths only, no path traversal",
"branch_name": "Valid git branch name",
"file_extensions": "Comma-separated list starting with dots",
}
def validate_path(self, path: str, name: str = "path") -> bool:
"""Validate general file paths.
Args:
path: The file path to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not path or path.strip() == "":
return True # Path is often optional
# Allow GitHub Actions expressions
if self.is_github_expression(path):
return True
p = Path(path)
try:
safe_path = p.resolve(strict=True)
except FileNotFoundError:
self.add_error(f'Invalid {name}: "{path}". Path does not exist')
return False
# Use base class security validation
return self.validate_path_security(str(safe_path.absolute()), name)
def validate_file_path(self, path: str, name: str = "path") -> bool:
"""Validate file paths for security.
Args:
path: The file path to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not path or path.strip() == "":
return True # Path is often optional
# Allow GitHub Actions expressions
if self.is_github_expression(path):
return True
# Use base class security validation
if not self.validate_path_security(path, name):
return False
# Additional file path validation
# Check for valid characters
if not re.match(r"^[a-zA-Z0-9._/\-\s]+$", path):
self.add_error(f'Invalid {name}: "{path}". Contains invalid characters')
return False
return True
def validate_branch_name(self, branch: str, name: str = "branch") -> bool:
"""Validate git branch name.
Args:
branch: The branch name to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not branch or branch.strip() == "":
return True # Branch name is often optional
# Check for command injection
injection_patterns = [";", "&&", "||", "|", "`", "$("]
for pattern in injection_patterns:
if pattern in branch:
self.add_error(
f'Invalid {name}: "{branch}". '
f'Command injection pattern "{pattern}" not allowed',
)
return False
# Check for invalid git characters
if ".." in branch or "~" in branch or "^" in branch or ":" in branch:
self.add_error(
f'Invalid {name}: "{branch}". Contains invalid git branch characters',
)
return False
# Check for valid characters
if not re.match(r"^[a-zA-Z0-9/_.\-]+$", branch):
self.add_error(
f'Invalid {name}: "{branch}". '
"Must contain only alphanumeric, slash, underscore, dot, and hyphen",
)
return False
# Check for invalid start/end characters
if branch.startswith((".", "-", "/")) or branch.endswith((".", "/")):
self.add_error(
f'Invalid {name}: "{branch}". Cannot start/end with ".", "-", or "/"',
)
return False
# Check for consecutive slashes
if "//" in branch:
            self.add_error(f'Invalid {name}: "{branch}". Cannot contain consecutive slashes')
return False
return True
def validate_file_extensions(self, value: str, name: str = "file-extensions") -> bool:
"""Validate file extensions format.
Args:
value: Comma-separated list of file extensions
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # File extensions are optional
extensions = [ext.strip() for ext in value.split(",")]
for ext in extensions:
if not ext:
continue # Skip empty entries
# Must start with a dot
if not ext.startswith("."):
self.add_error(
f'Invalid file extension: "{ext}" in {name}. Extensions must start with a dot',
)
return False
# Check for valid extension format
if not re.match(r"^\.[a-zA-Z0-9]+$", ext):
self.add_error(
f'Invalid file extension format: "{ext}" in {name}. '
"Must be dot followed by alphanumeric characters",
)
return False
            # Check for security patterns
            if not self.validate_security_patterns(ext, f"{name} extension"):
                return False
return True
def validate_yaml_file(self, path: str, name: str = "yaml-file") -> bool:
"""Validate YAML file path.
Args:
path: The YAML file path to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
# Allow GitHub Actions expressions
if self.is_github_expression(path):
return True
if not self.validate_file_path(path, name):
return False
if path and not (path.endswith((".yml", ".yaml"))):
self.add_error(f'Invalid {name}: "{path}". Must be a .yml or .yaml file')
return False
return True
def validate_json_file(self, path: str, name: str = "json-file") -> bool:
"""Validate JSON file path.
Args:
path: The JSON file path to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not self.validate_file_path(path, name):
return False
if path and not path.endswith(".json"):
self.add_error(f'Invalid {name}: "{path}". Must be a .json file')
return False
return True
def validate_config_file(self, path: str, name: str = "config-file") -> bool:
"""Validate configuration file path.
Args:
path: The config file path to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not self.validate_file_path(path, name):
return False
# Config files typically have specific extensions
valid_extensions = [
".yml",
".yaml",
".json",
".toml",
".ini",
".conf",
".config",
".cfg",
".xml",
]
if path:
has_valid_ext = any(path.endswith(ext) for ext in valid_extensions)
if not has_valid_ext:
self.add_error(
f'Invalid {name}: "{path}". '
f"Expected config file extension: {', '.join(valid_extensions)}",
)
return False
return True
def validate_dockerfile_path(self, path: str, name: str = "dockerfile") -> bool:
"""Validate Dockerfile path.
Args:
path: The Dockerfile path to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not path or path.strip() == "":
return True # Dockerfile path is often optional
# First validate general file path security
if not self.validate_file_path(path, name):
return False
# Check if it looks like a Dockerfile
# Accept: Dockerfile, dockerfile, Dockerfile.*, docker/Dockerfile, etc.
basename = Path(path).name.lower()
# Must contain 'dockerfile' in the basename
if "dockerfile" not in basename:
self.add_error(
f"Invalid {name}: \"{path}\". File name must contain 'Dockerfile' or 'dockerfile'",
)
return False
return True
def validate_executable_file(self, path: str, name: str = "executable") -> bool:
"""Validate executable file path.
Args:
path: The executable file path to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not path or path.strip() == "":
return True # Executable path is often optional
# First validate general file path security
if not self.validate_file_path(path, name):
return False
        # Check for potential security issues with executables
basename = Path(path).name.lower()
# Block obviously dangerous executable names
dangerous_names = [
"cmd",
"powershell",
"bash",
"sh",
"rm",
"del",
"format",
"fdisk",
"shutdown",
"reboot",
]
name_without_ext = Path(basename).stem
if name_without_ext in dangerous_names:
self.add_error(
f'Invalid {name}: "{path}". '
f"Potentially dangerous executable name: {name_without_ext}",
)
return False
return True
def validate_required_file(self, path: str, name: str = "file") -> bool:
"""Validate a required file path (cannot be empty).
Args:
path: The file path to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not path or path.strip() == "":
self.add_error(f"Required {name} path cannot be empty")
return False
# Validate the path itself
return self.validate_file_path(path, name)
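
A hedged usage sketch for FileValidator (module path validators.file assumed; it also assumes the inherited validate_path_security check accepts plain relative paths):

    from validators.file import FileValidator

    validator = FileValidator()
    assert validator.validate_branch_name("feature/new-action")
    assert validator.validate_dockerfile_path("docker/Dockerfile")
    assert not validator.validate_yaml_file("settings.txt")  # must end in .yml or .yaml
    print(validator.errors)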


@@ -0,0 +1,391 @@
"""Network-related validators for URLs, emails, and other network inputs."""
from __future__ import annotations
import re
from .base import BaseValidator
class NetworkValidator(BaseValidator):
"""Validator for network-related inputs like URLs, emails, scopes."""
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate network-related inputs."""
valid = True
for input_name, value in inputs.items():
if "email" in input_name:
valid &= self.validate_email(value, input_name)
elif "url" in input_name or ("registry" in input_name and "url" in input_name):
valid &= self.validate_url(value, input_name)
elif "scope" in input_name:
valid &= self.validate_scope(value, input_name)
elif "username" in input_name or "user" in input_name:
                valid &= self.validate_username(value, input_name)
return valid
def get_required_inputs(self) -> list[str]:
"""Network validators typically don't define required inputs."""
return []
def get_validation_rules(self) -> dict:
"""Return network validation rules."""
return {
"email": "Valid email format",
"url": "Valid URL starting with http:// or https://",
"scope": "NPM scope format (@organization)",
"username": "Valid username without injection patterns",
}
def validate_email(self, email: str, name: str = "email") -> bool:
"""Validate email format.
Args:
email: The email address to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not email or email.strip() == "":
return True # Email is often optional
# Allow GitHub Actions expressions
if self.is_github_expression(email):
return True
# Check for spaces
if " " in email:
self.add_error(f'Invalid {name}: "{email}". Spaces not allowed in email')
return False
# Check @ symbol
at_count = email.count("@")
if at_count != 1:
self.add_error(
f'Invalid {name}: "{email}". Expected exactly one @ symbol, found {at_count}',
)
return False
local, domain = email.split("@")
# Validate local part
if not local:
self.add_error(f'Invalid {name}: "{email}". Missing local part before @')
return False
# Validate domain
if not domain:
self.add_error(f'Invalid {name}: "{email}". Missing domain after @')
return False
# Domain must have at least one dot
if "." not in domain:
self.add_error(f'Invalid {name}: "{email}". Domain must contain a dot')
return False
# Check for dots at start/end of domain
if domain.startswith(".") or domain.endswith("."):
self.add_error(f'Invalid {name}: "{email}". Domain cannot start/end with dot')
return False
# Check for consecutive dots
if ".." in email:
self.add_error(f'Invalid {name}: "{email}". Cannot contain consecutive dots')
return False
# Basic character validation
email_pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
if not re.match(email_pattern, email):
self.add_error(f'Invalid {name}: "{email}". Invalid email format')
return False
return True
def validate_url(self, value: str, name: str = "url") -> bool:
"""Validate URL format.
Args:
value: The URL to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
self.add_error(f"{name} cannot be empty")
return False
# Allow GitHub Actions expressions
if self.is_github_expression(value):
return True
# Must start with http:// or https://
if not (value.startswith(("http://", "https://"))):
self.add_error(f'Invalid {name}: "{value}". Must start with http:// or https://')
return False
# Check for obvious injection patterns
injection_patterns = [";", "&", "|", "`", "$(", "${"]
for pattern in injection_patterns:
if pattern in value:
self.add_error(f'Potential security injection in {name}: contains "{pattern}"')
return False
# Basic URL validation (with optional port)
url_pattern = r"^https?://[\w.-]+(?:\.[a-zA-Z]{2,})?(?::\d{1,5})?(?:[/?#][^\s]*)?$"
if not re.match(url_pattern, value):
self.add_error(f'Invalid {name}: "{value}". Invalid URL format')
return False
return True
def validate_scope(self, value: str, name: str = "scope") -> bool:
"""Validate scope format (e.g., NPM scope).
Args:
value: The scope to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Scope is optional
# NPM scope should start with @
if not value.startswith("@"):
self.add_error(f'Invalid {name}: "{value}". Must start with @')
return False
# Remove @ and validate the rest
scope_name = value[1:]
if not scope_name:
self.add_error(f'Invalid {name}: "{value}". Scope name cannot be empty')
return False
# Must start with lowercase letter
if not scope_name[0].islower():
self.add_error(
f'Invalid {name}: "{value}". Scope name must start with lowercase letter',
)
return False
# Check for valid scope characters
if not re.match(r"^[a-z][a-z0-9._~-]*$", scope_name):
self.add_error(
f'Invalid {name}: "{value}". '
"Scope can only contain lowercase letters, numbers, dots, "
"underscores, tildes, and hyphens",
)
return False
# Check for security patterns
return self.validate_security_patterns(value, name)
def validate_username(self, username: str, name: str = "username") -> bool:
"""Validate username with injection protection.
Args:
username: The username to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not username or username.strip() == "":
return True # Username is often optional
# Check for command injection patterns
injection_patterns = [";", "&&", "||", "|", "`", "$(", "${"]
for pattern in injection_patterns:
if pattern in username:
self.add_error(
f'Invalid {name}: "{username}". Command injection patterns not allowed',
)
return False
# Check length (GitHub username limit)
if len(username) > 39:
self.add_error(
f"{name.capitalize()} too long: {len(username)} characters. "
"GitHub usernames max 39 characters",
)
return False
# GitHub username validation (also allow underscores)
if not re.match(r"^[a-zA-Z0-9](?:[a-zA-Z0-9_-]*[a-zA-Z0-9])?$", username):
self.add_error(
f'Invalid {name}: "{username}". '
"Must start and end with alphanumeric, can contain hyphens and underscores",
)
return False
return True
def validate_registry_url(self, value: str, name: str = "registry-url") -> bool:
"""Validate registry URL format.
Args:
value: The registry URL to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Registry URL is often optional
# Common registry URLs
known_registries = [
"https://registry.npmjs.org/",
"https://npm.pkg.github.com/",
"https://registry.yarnpkg.com/",
"https://pypi.org/simple/",
"https://test.pypi.org/simple/",
"https://rubygems.org/",
"https://nuget.org/api/v2/",
]
# Check if it's a known registry
for registry in known_registries:
if value.startswith(registry):
return True
# Otherwise validate as general URL
return self.validate_url(value, name)
def validate_repository_url(self, value: str, name: str = "repository-url") -> bool:
"""Validate repository URL format.
Args:
value: The repository URL to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Repository URL is often optional
# Common repository URL patterns
repo_patterns = [
r"^https://github\.com/[a-zA-Z0-9-]+/[a-zA-Z0-9._-]+(?:\.git)?$",
r"^https://gitlab\.com/[a-zA-Z0-9-]+/[a-zA-Z0-9._-]+(?:\.git)?$",
r"^https://bitbucket\.org/[a-zA-Z0-9-]+/[a-zA-Z0-9._-]+(?:\.git)?$",
]
for pattern in repo_patterns:
if re.match(pattern, value):
return True
# Otherwise validate as general URL
return self.validate_url(value, name)
def validate_hostname(self, hostname: str, name: str = "hostname") -> bool:
"""Validate hostname format.
Args:
hostname: The hostname to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not hostname or hostname.strip() == "":
return True # Hostname is often optional
# Check length (max 253 characters)
if len(hostname) > 253:
self.add_error(f'Invalid {name}: "{hostname}". Hostname too long (max 253 characters)')
return False
# Check for valid hostname pattern
# Each label can be 1-63 chars, alphanumeric and hyphens, not starting/ending with hyphen
hostname_pattern = r"^(?!-)(?:[a-zA-Z0-9-]{1,63}(?<!-)\.)*[a-zA-Z0-9-]{1,63}(?<!-)$"
if re.match(hostname_pattern, hostname):
return True
        # Also allow localhost and the IPv6 loopback/unspecified addresses
if hostname in ["localhost", "::1", "::"]:
return True
# Also check if it's an IP address (which can be a valid hostname)
if self.validate_ip_address(hostname):
return True
self.add_error(f'Invalid {name}: "{hostname}". Must be a valid hostname')
return False
def validate_ip_address(self, ip: str, name: str = "ip_address") -> bool:
"""Validate IP address (IPv4 or IPv6).
Args:
ip: The IP address to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not ip or ip.strip() == "":
return True # IP address is often optional
# IPv4 pattern
ipv4_pattern = (
r"^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}"
r"(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$"
)
if re.match(ipv4_pattern, ip):
return True
# Simplified IPv6 pattern (full validation is complex)
# This covers most common cases: full form, loopback (::1), and unspecified (::)
ipv6_pattern = r"^(?:(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}|::1|::)$"
if re.match(ipv6_pattern, ip):
return True
# Allow compressed IPv6
if "::" in ip:
# Very basic check for compressed IPv6
parts = ip.split("::")
if len(parts) == 2:
# Check if parts look like hex
for part in parts:
if part and not all(c in "0123456789abcdefABCDEF:" for c in part):
self.add_error(f'Invalid {name}: "{ip}". Not a valid IP address')
return False
return True
self.add_error(f'Invalid {name}: "{ip}". Must be a valid IPv4 or IPv6 address')
return False
def validate_port(self, port: str, name: str = "port") -> bool:
"""Validate port number.
Args:
port: The port number to validate (as string)
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not port or port.strip() == "":
return True # Port is often optional
# Check if it's a number
try:
port_num = int(port)
except ValueError:
self.add_error(f'Invalid {name}: "{port}". Port must be a number')
return False
# Check valid range (1-65535)
if port_num < 1 or port_num > 65535:
self.add_error(f"Invalid {name}: {port}. Port must be between 1 and 65535")
return False
return True
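
A hedged usage sketch for NetworkValidator (module path validators.network assumed; all values are invented):

    from validators.network import NetworkValidator

    validator = NetworkValidator()
    assert validator.validate_email("release-bot@example.com")
    assert validator.validate_url("https://registry.example.com:8443/v2/")
    assert not validator.validate_scope("acme")  # NPM scopes must start with "@"
    print(validator.errors)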


@@ -0,0 +1,279 @@
"""Numeric validators for ranges and numeric inputs."""
from __future__ import annotations
from .base import BaseValidator
class NumericValidator(BaseValidator):
"""Validator for numeric inputs and ranges."""
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate numeric inputs."""
valid = True
for input_name, value in inputs.items():
# Check for specific numeric patterns
if "retries" in input_name or "retry" in input_name:
valid &= self.validate_range(value, 0, 10, input_name)
elif "timeout" in input_name:
valid &= self.validate_range(value, 1, 3600, input_name)
elif "threads" in input_name or "workers" in input_name:
valid &= self.validate_range(value, 1, 128, input_name)
elif "ram" in input_name or "memory" in input_name:
valid &= self.validate_range(value, 256, 32768, input_name)
elif "quality" in input_name:
valid &= self.validate_range(value, 0, 100, input_name)
elif "parallel" in input_name and "builds" in input_name:
valid &= self.validate_range(value, 0, 16, input_name)
elif "max-warnings" in input_name or "max_warnings" in input_name:
valid &= self.validate_range(value, 0, 10000, input_name)
elif "delay" in input_name:
valid &= self.validate_range(value, 1, 300, input_name)
return valid
def get_required_inputs(self) -> list[str]:
"""Numeric validators typically don't define required inputs."""
return []
def get_validation_rules(self) -> dict:
"""Return numeric validation rules."""
return {
"retries": "0-10",
"timeout": "1-3600 seconds",
"threads": "1-128",
"ram": "256-32768 MB",
"quality": "0-100",
"parallel_builds": "0-16",
"max_warnings": "0-10000",
"delay": "1-300 seconds",
}
def validate_range(
self,
value: str,
min_val: int | None,
max_val: int | None,
name: str = "value",
) -> bool:
"""Validate numeric input within a specific range.
Args:
value: The value to validate
min_val: Minimum allowed value (None for no minimum)
max_val: Maximum allowed value (None for no maximum)
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Numeric values are often optional
# Allow GitHub Actions expressions
if self.is_github_expression(value):
return True
try:
num = int(value.strip())
# Handle None values for min and max
if min_val is not None and num < min_val:
self.add_error(f"Invalid {name}: {num}. Must be at least {min_val}")
return False
if max_val is not None and num > max_val:
self.add_error(f"Invalid {name}: {num}. Must be at most {max_val}")
return False
return True
except ValueError:
self.add_error(f'Invalid {name}: "{value}". Must be a number')
return False
def validate_numeric_range(
self,
value: str,
min_val: int | None = None,
max_val: int | None = None,
name: str = "numeric",
) -> bool:
"""Generic numeric range validation.
Args:
value: The value to validate
min_val: Minimum allowed value (inclusive), None for no minimum
max_val: Maximum allowed value (inclusive), None for no maximum
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
return self.validate_range(value, min_val, max_val, name)
def validate_numeric_range_0_100(self, value: str, name: str = "value") -> bool:
"""Validate percentage or quality value (0-100).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
return self.validate_range(value, 0, 100, name)
def validate_numeric_range_1_10(self, value: str, name: str = "retries") -> bool:
"""Validate retry count (1-10).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
return self.validate_range(value, 1, 10, name)
def validate_numeric_range_1_128(self, value: str, name: str = "threads") -> bool:
"""Validate thread/worker count (1-128).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
return self.validate_range(value, 1, 128, name)
def validate_numeric_range_256_32768(self, value: str, name: str = "ram") -> bool:
"""Validate RAM in MB (256-32768).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
return self.validate_range(value, 256, 32768, name)
def validate_numeric_range_0_16(self, value: str, name: str = "parallel-builds") -> bool:
"""Validate parallel builds count (0-16).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
return self.validate_range(value, 0, 16, name)
def validate_numeric_range_0_10000(self, value: str, name: str = "max-warnings") -> bool:
"""Validate max warnings count (0-10000).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
return self.validate_range(value, 0, 10000, name)
def validate_numeric_range_1_300(self, value: str, name: str = "delay") -> bool:
"""Validate delay in seconds (1-300).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
return self.validate_range(value, 1, 300, name)
def validate_numeric_range_1_3600(self, value: str, name: str = "timeout") -> bool:
"""Validate timeout in seconds (1-3600).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
return self.validate_range(value, 1, 3600, name)
def validate_integer(self, value: str | int, name: str = "value") -> bool:
"""Validate integer (can be negative).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or str(value).strip() == "":
return True # Optional
# Allow GitHub Actions expressions
if self.is_github_expression(str(value)):
return True
try:
int(str(value).strip())
return True
except ValueError:
self.add_error(f'Invalid {name}: "{value}". Must be an integer')
return False
def validate_positive_integer(self, value: str, name: str = "value") -> bool:
"""Validate positive integer (> 0).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Optional
try:
num = int(value.strip())
if num > 0:
return True
self.add_error(f"Invalid {name}: {num}. Must be positive")
return False
except ValueError:
self.add_error(f'Invalid {name}: "{value}". Must be a positive integer')
return False
def validate_non_negative_integer(self, value: str, name: str = "value") -> bool:
"""Validate non-negative integer (>= 0).
Args:
value: The value to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not value or value.strip() == "":
return True # Optional
try:
num = int(value.strip())
if num >= 0:
return True
self.add_error(f"Invalid {name}: {num}. Cannot be negative")
return False
except ValueError:
self.add_error(f'Invalid {name}: "{value}". Must be a non-negative integer')
return False
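
A hedged usage sketch for NumericValidator (module path validators.numeric assumed; is_github_expression from the base class is assumed to accept ${{ ... }} values, as elsewhere in this diff):

    from validators.numeric import NumericValidator

    validator = NumericValidator()
    assert validator.validate_range("30", 1, 3600, "timeout")
    assert validator.validate_range("${{ inputs.retries }}", 0, 10, "retries")  # expressions pass through
    assert not validator.validate_range("0", 1, 128, "threads")
    print(validator.errors)  # ["Invalid threads: 0. Must be at least 1"]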


@@ -0,0 +1,234 @@
"""Validator registry for dynamic validator discovery and loading.
Manages the registration and instantiation of validators.
"""
from __future__ import annotations
import importlib
import importlib.util
import logging
from pathlib import Path
import sys
from typing import TYPE_CHECKING
from .convention_mapper import ConventionMapper
from .conventions import ConventionBasedValidator
if TYPE_CHECKING:
from .base import BaseValidator
class ValidatorRegistry:
"""Registry for managing and discovering validators.
Provides dynamic loading of custom validators and fallback to convention-based validation.
"""
def __init__(self) -> None:
"""Initialize the validator registry."""
self._validators: dict[str, type[BaseValidator]] = {}
self._validator_instances: dict[str, BaseValidator] = {}
self._convention_mapper = ConventionMapper()
def register(self, action_type: str, validator_class: type[BaseValidator]) -> None:
"""Register a validator class for an action type.
Args:
action_type: The action type identifier
validator_class: The validator class to register
"""
self._validators[action_type] = validator_class
def register_validator(self, action_type: str, validator_class: type[BaseValidator]) -> None:
"""Register a validator class for an action type (alias for register).
Args:
action_type: The action type identifier
validator_class: The validator class to register
"""
self.register(action_type, validator_class)
# Also create and cache an instance
validator_instance = validator_class(action_type)
self._validator_instances[action_type] = validator_instance
def get_validator(self, action_type: str) -> BaseValidator:
"""Get a validator instance for the given action type.
First attempts to load a custom validator from the action directory,
then falls back to convention-based validation.
Args:
action_type: The action type identifier
Returns:
A validator instance for the action
"""
# Check cache first
if action_type in self._validator_instances:
return self._validator_instances[action_type]
# Try to load custom validator
validator = self._load_custom_validator(action_type)
# Fall back to convention-based validator
if not validator:
validator = self._load_convention_validator(action_type)
# Cache and return
self._validator_instances[action_type] = validator
return validator
def _load_custom_validator(self, action_type: str) -> BaseValidator | None:
"""Attempt to load a custom validator from the action directory.
Args:
action_type: The action type identifier
Returns:
Custom validator instance or None if not found
"""
# Convert action_type to directory name (e.g., sync_labels -> sync-labels)
action_dir = action_type.replace("_", "-")
# Look for CustomValidator.py in the action directory
project_root = Path(__file__).parent.parent.parent
custom_validator_path = project_root / action_dir / "CustomValidator.py"
if not custom_validator_path.exists():
return None
try:
# Load the module dynamically
spec = importlib.util.spec_from_file_location(
f"{action_type}_custom_validator",
custom_validator_path,
)
if not spec or not spec.loader:
return None
module = importlib.util.module_from_spec(spec)
sys.modules[spec.name] = module
spec.loader.exec_module(module)
# Get the CustomValidator class
if hasattr(module, "CustomValidator"):
validator_class = module.CustomValidator
return validator_class(action_type)
except (ImportError, AttributeError, TypeError, ValueError) as e:
# Log at debug level - custom validators are optional
# Catch common errors during dynamic module loading:
# - ImportError: Module dependencies not found
# - AttributeError: Module doesn't have CustomValidator
# - TypeError: Validator instantiation failed
# - ValueError: Invalid validator configuration
logger = logging.getLogger(__name__)
logger.debug("Could not load custom validator for %s: %s", action_type, e)
return None
def _load_convention_validator(self, action_type: str) -> BaseValidator:
"""Load a convention-based validator for the action type.
Args:
action_type: The action type identifier
Returns:
Convention-based validator instance
"""
return ConventionBasedValidator(action_type)
def clear_cache(self) -> None:
"""Clear the validator instance cache."""
self._validator_instances.clear()
def list_registered(self) -> list[str]:
"""List all registered action types.
Returns:
List of registered action type identifiers
"""
return list(self._validators.keys())
def is_registered(self, action_type: str) -> bool:
"""Check if an action type has a registered validator.
Args:
action_type: The action type identifier
Returns:
True if a validator is registered, False otherwise
"""
return action_type in self._validators
def get_validator_by_type(self, validator_type: str) -> BaseValidator | None:
"""Get a validator instance by its type name.
Args:
validator_type: The validator type name (e.g., 'BooleanValidator', 'TokenValidator')
Returns:
A validator instance or None if not found
"""
# Map of validator type names to modules
validator_modules = {
"BooleanValidator": "boolean",
"CodeQLValidator": "codeql",
"DockerValidator": "docker",
"FileValidator": "file",
"NetworkValidator": "network",
"NumericValidator": "numeric",
"SecurityValidator": "security",
"TokenValidator": "token",
"VersionValidator": "version",
}
module_name = validator_modules.get(validator_type)
if not module_name:
return None
try:
# Import the module
module = importlib.import_module(f"validators.{module_name}")
# Get the validator class
validator_class = getattr(module, validator_type, None)
if validator_class:
# Create an instance with a dummy action type
return validator_class("temp")
except (ImportError, AttributeError):
            # Silently ignore if the validator module doesn't exist or the class is not found
pass
return None
# Global registry instance
_registry = ValidatorRegistry()
def get_validator(action_type: str) -> BaseValidator:
"""Get a validator for the given action type.
Args:
action_type: The action type identifier
Returns:
A validator instance for the action
"""
return _registry.get_validator(action_type)
def register_validator(action_type: str, validator_class: type[BaseValidator]) -> None:
"""Register a validator class for an action type.
Args:
action_type: The action type identifier
validator_class: The validator class to register
"""
_registry.register(action_type, validator_class)
def clear_cache() -> None:
"""Clear the global validator cache."""
_registry.clear_cache()
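
A hedged usage sketch for the registry (assuming this file is validators/registry.py; the action name is invented):

    from validators.registry import get_validator

    # A CustomValidator.py in the action directory wins; otherwise the
    # convention-based validator is used.
    validator = get_validator("docker_publish")
    if not validator.validate_inputs({"registry": "github", "image-name": "acme/web-app"}):
        for error in validator.errors:
            print(error)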


@@ -0,0 +1,748 @@
"""Security validator for detecting injection patterns and security issues."""
from __future__ import annotations
import re
from typing import ClassVar
from .base import BaseValidator
class SecurityValidator(BaseValidator):
"""Validator for security-related checks across all inputs."""
# Common injection patterns to detect
INJECTION_PATTERNS: ClassVar[list[tuple[str, str]]] = [
(r";\s*rm\s+-rf", "rm -rf command"),
(r";\s*del\s+", "del command"),
(r"&&\s*curl\s+", "curl command injection"),
(r"&&\s*wget\s+", "wget command injection"),
(r"\|\s*sh\b", "pipe to shell"),
(r"\|\s*bash\b", "pipe to bash"),
(r"`[^`]+`", "command substitution"),
(r"\$\([^)]+\)", "command substitution"),
(r"\${[^}]+}", "variable expansion"),
(r"<script[^>]*>", "script tag injection"),
(r"javascript:", "javascript protocol"),
(r"data:text/html", "data URI injection"),
]
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate all inputs for security issues."""
valid = True
for input_name, value in inputs.items():
# Skip empty values
if not value or not value.strip():
continue
# Apply security validation to all inputs
valid &= self.validate_security_patterns(value, input_name)
# Additional checks for specific input types
if "regex" in input_name or "pattern" in input_name:
valid &= self.validate_regex_pattern(value, input_name)
elif "path" in input_name or "file" in input_name:
valid &= self.validate_path_security(value, input_name)
elif "url" in input_name or "uri" in input_name:
valid &= self.validate_url_security(value, input_name)
elif "command" in input_name or "cmd" in input_name:
valid &= self.validate_command_security(value, input_name)
return valid
def get_required_inputs(self) -> list[str]:
"""Security validator doesn't define required inputs."""
return []
def get_validation_rules(self) -> dict:
"""Return security validation rules."""
return {
"injection_patterns": "Command injection detection",
"path_traversal": "Path traversal prevention",
"xss_prevention": "Cross-site scripting prevention",
}
def validate_injection_patterns(self, value: str, name: str = "input") -> bool:
"""Check for advanced injection patterns.
Args:
value: The value to check
name: The input name for error messages
Returns:
True if no injection patterns found, False otherwise
"""
if not value or value.strip() == "":
return True
# Check against known injection patterns
for pattern, description in self.INJECTION_PATTERNS:
if re.search(pattern, value, re.IGNORECASE):
self.add_error(f"Security issue in {name}: detected {description}")
return False
return True
def validate_url_security(self, url: str, name: str = "url") -> bool:
"""Validate URL for security issues.
Args:
url: The URL to validate
name: The input name for error messages
Returns:
True if secure, False otherwise
"""
if not url or url.strip() == "":
return True
# Check for javascript: protocol
if url.lower().startswith("javascript:"):
self.add_error(f"Security issue in {name}: javascript: protocol not allowed")
return False
# Check for data: URI with HTML
if url.lower().startswith("data:") and "text/html" in url.lower():
self.add_error(f"Security issue in {name}: data:text/html URIs not allowed")
return False
# Check for file: protocol
if url.lower().startswith("file:"):
self.add_error(f"Security issue in {name}: file: protocol not allowed")
return False
return True
def validate_command_security(self, command: str, name: str = "command") -> bool:
"""Validate command for security issues.
Args:
command: The command to validate
name: The input name for error messages
Returns:
True if secure, False otherwise
"""
if not command or command.strip() == "":
return True
# Dangerous commands that should not be allowed
dangerous_commands = [
"rm -rf",
"rm -fr",
"format c:",
"del /f /s /q",
"shutdown",
"reboot",
":(){:|:&};:", # Fork bomb
"dd if=/dev/zero",
"dd if=/dev/random", # Also dangerous
"mkfs",
"chmod -R 777", # Dangerous permission change
"chmod 777",
"chown -R", # Dangerous ownership change
]
command_lower = command.lower()
for dangerous in dangerous_commands:
if dangerous.lower() in command_lower:
self.add_error(
f"Security issue in {name}: dangerous command pattern '{dangerous}' detected",
)
return False
# Check for base64 encoded commands (often used to hide malicious code)
if re.search(r"base64\s+-d|base64\s+--decode", command, re.IGNORECASE):
self.add_error(f"Security issue in {name}: base64 decode operations not allowed")
return False
return True
def validate_content_security(self, content: str, name: str = "content") -> bool:
"""Validate content for XSS and injection.
Args:
content: The content to validate
name: The input name for error messages
Returns:
True if secure, False otherwise
"""
if not content or content.strip() == "":
return True
# Check for script tags (match any content between script and >)
if re.search(r"<script[^>]*>.*?</script[^>]*>", content, re.IGNORECASE | re.DOTALL):
self.add_error(f"Security issue in {name}: script tags not allowed")
return False
# Check for event handlers
event_handlers = [
"onclick",
"onload",
"onerror",
"onmouseover",
"onfocus",
"onblur",
"onchange",
"onsubmit",
]
for handler in event_handlers:
if re.search(rf"\b{handler}\s*=", content, re.IGNORECASE):
self.add_error(f"Security issue in {name}: event handler '{handler}' not allowed")
return False
# Check for iframe injection
if re.search(r"<iframe[^>]*>", content, re.IGNORECASE):
self.add_error(f"Security issue in {name}: iframe tags not allowed")
return False
return True
def validate_prefix_security(self, prefix: str, name: str = "prefix") -> bool:
"""Validate prefix for security issues.
Args:
prefix: The prefix to validate
name: The input name for error messages
Returns:
True if secure, False otherwise
"""
if not prefix or prefix.strip() == "":
return True
# Only alphanumeric, dots, underscores, and hyphens
if not re.match(r"^[a-zA-Z0-9_.-]*$", prefix):
self.add_error(f"Security issue in {name}: '{prefix}' contains invalid characters")
return False
return True
def validate_no_injection(self, value: str, name: str = "input") -> bool:
"""Comprehensive injection detection.
Args:
value: The value to check
name: The input name for error messages
Returns:
True if no injection patterns found, False otherwise
"""
if not value or value.strip() == "":
return True
# Allow GitHub expressions (they're safe in Actions context)
if self.is_github_expression(value):
return True
# Check for command injection patterns
if not self.validate_security_patterns(value, name):
return False
# Check for single & (background execution)
if re.search(r"(?<!&)&(?!&)", value):
self.add_error(f"Background execution pattern '&' detected in {name}")
return False
# Check for advanced injection patterns
if not self.validate_injection_patterns(value, name):
return False
# Check for SQL injection patterns
sql_patterns = [
r"'\s*OR\s+'[^']*'\s*=\s*'[^']*", # ' OR '1'='1
r'"\s*OR\s+"[^"]*"\s*=\s*"[^"]*', # " OR "1"="1
r"'\s*OR\s+\d+\s*=\s*\d+", # ' OR 1=1
r";\s*DROP\s+TABLE", # ; DROP TABLE
r";\s*DELETE\s+FROM", # ; DELETE FROM
r"UNION\s+SELECT", # UNION SELECT
r"--\s*$", # SQL comment at end
r";\s*EXEC\s+", # ; EXEC
r"xp_cmdshell", # SQL Server command execution
]
for pattern in sql_patterns:
if re.search(pattern, value, re.IGNORECASE):
self.add_error(f"SQL injection pattern detected in {name}")
return False
# Check for script injection patterns
return self.validate_content_security(value, name)
def validate_safe_command(self, command: str, name: str = "command") -> bool:
"""Validate that a command is safe to execute.
Args:
command: The command to validate
name: The input name for error messages
Returns:
True if command appears safe, False otherwise
"""
if not command or command.strip() == "":
return True
# Allow GitHub expressions (they're safe in Actions context)
if self.is_github_expression(command):
return True
# Use existing command security validation
if not self.validate_command_security(command, name):
return False
# Check for dangerous redirect to device files
if re.search(r">\s*/dev/", command):
self.add_error(f"Security issue in {name}: redirect to device file not allowed")
return False
# Check for filesystem creation commands
if re.search(r"\bmkfs", command, re.IGNORECASE):
self.add_error(f"Security issue in {name}: filesystem creation commands not allowed")
return False
# Additional checks for safe commands
# Block shell metacharacters that could be dangerous
dangerous_chars = ["&", "|", ";", "$", "`", "\\", "!", "{", "}", "[", "]", "(", ")"]
for char in dangerous_chars:
if char in command:
# Allow some safe uses
if char == "&" and "&&" not in command and "&>" not in command:
continue
self.add_error(f"Potentially dangerous character '{char}' in {name}")
return False
return True
def validate_safe_environment_variable(self, value: str, name: str = "env_var") -> bool:
"""Validate environment variable value for security.
Args:
value: The environment variable value
name: The input name for error messages
Returns:
True if safe, False otherwise
"""
if not value or value.strip() == "":
return True
# Check for command substitution in env vars
if "$(" in value or "`" in value or "${" in value:
self.add_error(f"Command substitution not allowed in environment variable {name}")
return False
# Check for newlines (could be used to inject multiple commands)
if "\n" in value or "\r" in value:
self.add_error(f"Newlines not allowed in environment variable {name}")
return False
# Check for null bytes (could be used for string termination attacks)
if "\x00" in value:
self.add_error(f"Null bytes not allowed in environment variable {name}")
return False
# Check for shell special chars that might cause issues
if re.search(r"[;&|]", value) and re.search(
r";\s*(rm|del|format|shutdown|reboot)",
value,
re.IGNORECASE,
):
self.add_error(f"Dangerous command pattern in environment variable {name}")
return False
return True
# Alias for test compatibility
def validate_safe_env_var(self, value: str, name: str = "env_var") -> bool:
"""Alias for validate_safe_environment_variable for test compatibility."""
return self.validate_safe_environment_variable(value, name)
def _check_github_tokens(self, value: str, name: str) -> bool:
"""Check for GitHub token patterns.
Args:
value: The value to check
name: The input name for error messages
Returns:
True if no GitHub tokens found, False otherwise
"""
github_token_patterns = [
r"ghp_[a-zA-Z0-9]{36}", # GitHub personal access token
r"gho_[a-zA-Z0-9]{36}", # GitHub OAuth token
r"ghu_[a-zA-Z0-9]{36}", # GitHub user token
r"ghs_[a-zA-Z0-9]{36}", # GitHub server token
r"ghr_[a-zA-Z0-9]{36}", # GitHub refresh token
r"github_pat_[a-zA-Z0-9_]{48,}", # GitHub fine-grained PAT
]
for pattern in github_token_patterns:
if re.search(pattern, value):
self.add_error(f"Potential GitHub token detected in {name}")
return False
return True
def _check_api_keys(self, value: str, name: str) -> bool:
"""Check for API key patterns.
Args:
value: The value to check
name: The input name for error messages
Returns:
True if no API keys found, False otherwise
"""
api_key_patterns = [
r"api[_-]?key\s*[:=]\s*['\"]?[a-zA-Z0-9]{20,}", # Generic API key
r"secret[_-]?key\s*[:=]\s*['\"]?[a-zA-Z0-9]{20,}", # Secret key
r"access[_-]?key\s*[:=]\s*['\"]?[a-zA-Z0-9]{20,}", # Access key
]
for pattern in api_key_patterns:
if re.search(pattern, value, re.IGNORECASE):
self.add_error(f"Potential API key detected in {name}")
return False
return True
def _check_passwords(self, value: str, name: str) -> bool:
"""Check for password patterns.
Args:
value: The value to check
name: The input name for error messages
Returns:
True if no passwords found, False otherwise
"""
password_patterns = [
r"password\s*[:=]\s*['\"]?[^\s'\"]{8,}", # Password assignment
r"passwd\s*[:=]\s*['\"]?[^\s'\"]{8,}", # Passwd assignment
r"pwd\s*[:=]\s*['\"]?[^\s'\"]{8,}", # Pwd assignment
]
for pattern in password_patterns:
if re.search(pattern, value, re.IGNORECASE):
self.add_error(f"Potential password detected in {name}")
return False
return True
def _check_private_keys(self, value: str, name: str) -> bool:
"""Check for private key markers.
Args:
value: The value to check
name: The input name for error messages
Returns:
True if no private keys found, False otherwise
"""
private_key_markers = [
"-----BEGIN RSA PRIVATE KEY-----",
"-----BEGIN PRIVATE KEY-----",
"-----BEGIN OPENSSH PRIVATE KEY-----",
"-----BEGIN DSA PRIVATE KEY-----",
"-----BEGIN EC PRIVATE KEY-----",
]
for marker in private_key_markers:
if marker in value:
self.add_error(f"Private key detected in {name}")
return False
return True
def _check_encoded_secrets(self, value: str, name: str) -> bool:
"""Check for Base64 encoded secrets.
Args:
value: The value to check
name: The input name for error messages
Returns:
True if no encoded secrets found, False otherwise
"""
# Flag long base64-like runs (40+ chars) that may be encoded credentials.
# Keyword context (secret/key/token/password) is not required; any long
# base64 blob in an input is treated as suspicious.
if re.search(r"[A-Za-z0-9+/]{40,}={0,2}", value):
self.add_error(f"Potential encoded secret detected in {name}")
return False
return True
def validate_no_secrets(self, value: str, name: str = "input") -> bool:
"""Validate that no secrets or sensitive data are present.
Args:
value: The value to check
name: The input name for error messages
Returns:
True if no secrets detected, False otherwise
"""
if not value or value.strip() == "":
return True
# Run all secret detection checks
return (
self._check_github_tokens(value, name)
and self._check_api_keys(value, name)
and self._check_passwords(value, name)
and self._check_private_keys(value, name)
and self._check_encoded_secrets(value, name)
)
def _check_command_injection_in_regex(self, pattern: str, name: str) -> bool:
"""Check for command injection patterns in regex.
Args:
pattern: The regex pattern to validate
name: The input name for error messages
Returns:
True if safe, False if command injection detected
"""
dangerous_cmd_patterns = [
r";\s*(rm|del|cat|whoami|id|pwd|ls|curl|wget|nc|bash|sh|cmd)",
r"&&\s*(rm|del|cat|whoami|id|pwd|ls|curl|wget|nc|bash|sh|cmd)",
r"\|\s*(sh|bash|cmd)\b",
r"`[^`]+`",
r"\$\([^)]+\)",
]
for cmd_pattern in dangerous_cmd_patterns:
if re.search(cmd_pattern, pattern, re.IGNORECASE):
self.add_error(f"Command injection detected in {name}")
return False
return True
def _check_nested_quantifiers(self, pattern: str, name: str) -> bool:
"""Check for nested quantifiers that can cause ReDoS.
Args:
pattern: The regex pattern to validate
name: The input name for error messages
Returns:
True if safe, False if nested quantifiers detected
"""
nested_quantifier_patterns = [
r"\([^)]*[+*]\)[+*{]", # (x+)+ or (x*)* or (x+){n,m}
r"\([^)]*\{[0-9,]+\}\)[+*{]", # (x{n,m})+ or (x{n,m})*
r"\([^)]*[+*]\)\{", # (x+){n,m}
]
for redos_pattern in nested_quantifier_patterns:
if re.search(redos_pattern, pattern):
self.add_error(
f"ReDoS risk detected in {name}: nested quantifiers can cause "
"catastrophic backtracking. Avoid patterns like (a+)+, (a*)*, or (a+){n,m}"
)
return False
return True
def _check_duplicate_alternatives(self, alt1: str, alt2: str, group: str, name: str) -> bool:
"""Check if two alternatives are exact duplicates.
Args:
alt1: First alternative
alt2: Second alternative
group: The full group string for error message
name: The input name for error messages
Returns:
True if not duplicates, False if duplicates detected
"""
if alt1 == alt2:
self.add_error(
f"ReDoS risk detected in {name}: duplicate alternatives "
f"in repeating group '({group})' can cause "
"catastrophic backtracking"
)
return False
return True
def _check_overlapping_alternatives(self, alt1: str, alt2: str, group: str, name: str) -> bool:
"""Check if two alternatives have prefix overlap.
Args:
alt1: First alternative
alt2: Second alternative
group: The full group string for error message
name: The input name for error messages
Returns:
True if no overlap, False if overlap detected
"""
if alt1.startswith(alt2) or alt2.startswith(alt1):
self.add_error(
f"ReDoS risk detected in {name}: overlapping alternatives "
f"in repeating group '({group})' can cause "
"catastrophic backtracking"
)
return False
return True
def _validate_alternative_pairs(self, alternatives: list[str], group: str, name: str) -> bool:
"""Validate all pairs of alternatives for duplicates and overlaps.
Args:
alternatives: List of alternatives to check
group: The full group string for error message
name: The input name for error messages
Returns:
True if all pairs are safe, False otherwise
"""
for i, alt1 in enumerate(alternatives):
for alt2 in alternatives[i + 1 :]:
# Check for exact duplicates
if not self._check_duplicate_alternatives(alt1, alt2, group, name):
return False
# Check for prefix overlaps
if not self._check_overlapping_alternatives(alt1, alt2, group, name):
return False
return True
def _check_alternation_repetition(self, pattern: str, name: str) -> bool:
"""Check for alternation with repetition that can cause ReDoS.
Args:
pattern: The regex pattern to validate
name: The input name for error messages
Returns:
True if safe, False if problematic alternation detected
"""
alternation_repetition = r"\([^)]*\|[^)]*\)[+*{]"
if not re.search(alternation_repetition, pattern):
return True
# Check if alternatives overlap (basic heuristic)
matches = re.finditer(r"\(([^)]*\|[^)]*)\)[+*{]", pattern)
for match in matches:
alternatives = match.group(1).split("|")
# Validate all pairs of alternatives
if not self._validate_alternative_pairs(alternatives, match.group(1), name):
return False
return True
def _check_consecutive_quantifiers(self, pattern: str, name: str) -> bool:
"""Check for consecutive quantifiers that can cause ReDoS.
Args:
pattern: The regex pattern to validate
name: The input name for error messages
Returns:
True if safe, False if consecutive quantifiers detected
"""
consecutive_quantifiers = r"[.+*][+*{]"
if re.search(consecutive_quantifiers, pattern):
self.add_error(
f"ReDoS risk detected in {name}: consecutive quantifiers like .*.* or .*+ "
"can cause catastrophic backtracking"
)
return False
return True
def _check_exponential_quantifiers(self, pattern: str, name: str) -> bool:
"""Check for exponential quantifier combinations that can cause ReDoS.
Args:
pattern: The regex pattern to validate
name: The input name for error messages
Returns:
True if safe, False if exponential quantifiers detected
"""
depth = 0
max_depth = 0
quantifier_depth_count = 0
i = 0
while i < len(pattern):
char = pattern[i]
if char == "(":
depth += 1
max_depth = max(max_depth, depth)
# Check if followed by quantifier after closing
closing_idx = self._find_closing_paren(pattern, i)
if closing_idx != -1 and closing_idx + 1 < len(pattern):
next_char = pattern[closing_idx + 1]
if next_char in "+*{":
quantifier_depth_count += 1
elif char == ")":
depth -= 1
i += 1
# If we have multiple nested quantified groups (depth > 2 with 3+ quantifiers)
if max_depth > 2 and quantifier_depth_count >= 3:
self.add_error(
f"ReDoS risk detected in {name}: deeply nested groups with multiple "
"quantifiers can cause catastrophic backtracking"
)
return False
return True
def validate_regex_pattern(self, pattern: str, name: str = "regex") -> bool:
"""Validate regex pattern for ReDoS vulnerabilities.
Detects potentially dangerous regex patterns that could cause
Regular Expression Denial of Service (ReDoS) through catastrophic
backtracking.
Args:
pattern: The regex pattern to validate
name: The input name for error messages
Returns:
True if pattern appears safe, False if ReDoS risk detected
"""
if not pattern or pattern.strip() == "":
return True
# Allow GitHub expressions
if self.is_github_expression(pattern):
return True
# Run all ReDoS checks using helper methods
if not self._check_command_injection_in_regex(pattern, name):
return False
if not self._check_nested_quantifiers(pattern, name):
return False
if not self._check_alternation_repetition(pattern, name):
return False
if not self._check_consecutive_quantifiers(pattern, name):
return False
return self._check_exponential_quantifiers(pattern, name)
def _find_closing_paren(self, pattern: str, start: int) -> int:
"""Find the closing parenthesis for an opening one.
Args:
pattern: The regex pattern
start: The index of the opening parenthesis
Returns:
Index of the closing parenthesis, or -1 if not found
"""
if start >= len(pattern) or pattern[start] != "(":
return -1
depth = 1
i = start + 1
while i < len(pattern) and depth > 0:
if pattern[i] == "(":
depth += 1
elif pattern[i] == ")":
depth -= 1
if depth == 0:
return i
i += 1
return -1
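
For context (not part of the diff above): the ReDoS checks are plain regex heuristics. The sketch below re-runs the nested-quantifier patterns from `_check_nested_quantifiers` outside the class to show which inputs they flag; the helper name `looks_redos_prone` is made up for illustration.

```python
import re

# Same heuristics as _check_nested_quantifiers above, copied for a standalone demo.
NESTED_QUANTIFIERS = [
    r"\([^)]*[+*]\)[+*{]",         # (x+)+ or (x*)*
    r"\([^)]*\{[0-9,]+\}\)[+*{]",  # (x{n,m})+ or (x{n,m})*
]


def looks_redos_prone(pattern: str) -> bool:
    """Return True if the pattern matches any nested-quantifier heuristic."""
    return any(re.search(p, pattern) for p in NESTED_QUANTIFIERS)


for candidate in [r"(a+)+$", r"(\d{1,3})+", r"^[a-z]+\.[a-z]+$"]:
    print(candidate, "flagged" if looks_redos_prone(candidate) else "ok")
# (a+)+$            flagged
# (\d{1,3})+        flagged
# ^[a-z]+\.[a-z]+$  ok
```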


@@ -0,0 +1,232 @@
"""Token validators for authentication tokens."""
from __future__ import annotations
import re
from typing import ClassVar
from .base import BaseValidator
class TokenValidator(BaseValidator):
"""Validator for various authentication tokens."""
# Token patterns for different token types (based on official GitHub documentation)
# https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/about-authentication-to-github#githubs-token-formats
# Note: The lengths include the prefix
TOKEN_PATTERNS: ClassVar[dict[str, str]] = {
# Personal access token (classic):
# ghp_ + 36 = 40 chars total
"github_classic": r"^ghp_[a-zA-Z0-9]{36}$",
# Fine-grained PAT:
# github_pat_ + 50-255 chars with underscores
"github_fine_grained": r"^github_pat_[A-Za-z0-9_]{50,255}$",
# OAuth access token: gho_ + 36 = 40 chars total
"github_oauth": r"^gho_[a-zA-Z0-9]{36}$",
# User access token for GitHub App:
# ghu_ + 36 = 40 chars total
"github_user_app": r"^ghu_[a-zA-Z0-9]{36}$",
# Installation access token:
# ghs_ + 36 = 40 chars total
"github_installation": r"^ghs_[a-zA-Z0-9]{36}$",
# Refresh token for GitHub App:
# ghr_ + 36 = 40 chars total
"github_refresh": r"^ghr_[a-zA-Z0-9]{36}$",
# GitHub Enterprise token:
# ghe_ + 36 = 40 chars total
"github_enterprise": r"^ghe_[a-zA-Z0-9]{36}$",
# NPM classic tokens
"npm_classic": r"^npm_[a-zA-Z0-9]{40,}$",
}
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate token-related inputs."""
valid = True
for input_name, value in inputs.items():
if "token" in input_name.lower():
# Determine token type from input name
if "npm" in input_name:
valid &= self.validate_npm_token(value, input_name)
elif "dockerhub" in input_name or "docker" in input_name:
valid &= self.validate_docker_token(value, input_name)
else:
# Default to GitHub token
valid &= self.validate_github_token(value)
elif input_name == "password":
# Password fields might be tokens
valid &= self.validate_password(value, input_name)
return valid
def get_required_inputs(self) -> list[str]:
"""Token validators typically don't define required inputs."""
return []
def get_validation_rules(self) -> dict:
"""Return token validation rules."""
return {
"github_token": "GitHub personal access token or ${{ github.token }}",
"npm_token": "NPM authentication token",
"docker_token": "Docker Hub access token",
"patterns": self.TOKEN_PATTERNS,
}
def validate_github_token(self, token: str, *, required: bool = False) -> bool:
"""Validate GitHub token format.
Args:
token: The token to validate
required: Whether the token is required
Returns:
True if valid, False otherwise
"""
if not token or token.strip() == "":
if required:
self.add_error("GitHub token is required but not provided")
return False
return True # Optional token can be empty
# Allow GitHub Actions expressions
if self.is_github_expression(token):
return True
if token == "${{ secrets.GITHUB_TOKEN }}":
return True
# Allow environment variable references
if token.startswith("$") and not token.startswith("${{"):
return True
# Check against known GitHub token patterns
for pattern_name, pattern in self.TOKEN_PATTERNS.items():
if pattern_name.startswith("github_") and re.match(pattern, token):
return True
self.add_error(
"Invalid token format. Expected: ghp_* (40 chars), "
"github_pat_[A-Za-z0-9_]* (50-255 chars), gho_* (40 chars), ghu_* (40 chars), "
"ghs_* (40 chars), ghr_* (40 chars), ghe_* (40 chars), or ${{ github.token }}",
)
return False
def validate_npm_token(self, token: str, name: str = "npm-token") -> bool:
"""Validate NPM token format.
Args:
token: The token to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not token or token.strip() == "":
return True # NPM token is often optional
# Allow environment variable references
if token.startswith("$"):
return True
# Check NPM token pattern
if re.match(self.TOKEN_PATTERNS["npm_classic"], token):
return True
# NPM also accepts UUIDs and other formats
if re.match(
r"^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$",
token,
):
return True
self.add_error(f"Invalid {name} format. Expected npm_* token or UUID format")
return False
def validate_docker_token(self, token: str, name: str = "docker-token") -> bool:
"""Validate Docker Hub token format.
Args:
token: The token to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not token or token.strip() == "":
return True # Docker token is often optional
# Allow environment variable references
if token.startswith("$"):
return True
# Docker tokens are typically UUIDs or custom formats
# We'll be lenient here as Docker Hub accepts various formats
if len(token) < 10:
self.add_error(f"Invalid {name}: token too short")
return False
# Check for obvious security issues
if " " in token or "\n" in token or "\t" in token:
self.add_error(f"Invalid {name}: contains whitespace")
return False
return True
def validate_password(self, password: str, name: str = "password") -> bool:
"""Validate password field (might be a token).
Args:
password: The password/token to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not password or password.strip() == "":
# Password might be required depending on context
return True
# Allow environment variable references
if password.startswith("$"):
return True
# Check for obvious security issues
if len(password) < 8:
self.add_error(f"Invalid {name}: too short (minimum 8 characters)")
return False
# Check for whitespace
if password != password.strip():
self.add_error(f"Invalid {name}: contains leading/trailing whitespace")
return False
return True
def validate_namespace_with_lookahead(self, namespace: str, name: str = "namespace") -> bool:
"""Validate namespace using lookahead pattern (for csharp-publish).
This is a special case for GitHub package namespaces.
Args:
namespace: The namespace to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not namespace or namespace.strip() == "":
self.add_error(f"{name.capitalize()} cannot be empty")
return False
# Original pattern with lookahead: ^[a-zA-Z0-9]([a-zA-Z0-9]|-(?=[a-zA-Z0-9])){0,38}$
# This ensures no trailing hyphens
pattern = r"^[a-zA-Z0-9]([a-zA-Z0-9]|-(?=[a-zA-Z0-9])){0,38}$"
if re.match(pattern, namespace):
return True
self.add_error(
f'Invalid {name} format: "{namespace}". Must be 1-39 characters, '
"alphanumeric and hyphens, no trailing hyphens",
)
return False
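
As an aside (not part of the file above): `validate_inputs` routes any input whose name contains `token` to the matching validator, so callers rarely touch the patterns directly. A minimal standalone check against the classic-token pattern, using an obviously fake placeholder value:

```python
import re

GITHUB_CLASSIC = r"^ghp_[a-zA-Z0-9]{36}$"  # same pattern as TOKEN_PATTERNS["github_classic"]

dummy = "ghp_" + "x" * 36                  # 40 chars total; placeholder, not a real token
print(bool(re.match(GITHUB_CLASSIC, dummy)))            # True
print(bool(re.match(GITHUB_CLASSIC, "ghp_too_short")))  # False
```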


@@ -0,0 +1,606 @@
"""Version validators for various versioning schemes."""
from __future__ import annotations
import re
from .base import BaseValidator
class VersionValidator(BaseValidator):
"""Validator for version strings (SemVer, CalVer, language-specific)."""
# Common version patterns
VERSION_X_Y_Z_PATTERN = r"^\d+\.\d+(\.\d+)?$"
def validate_inputs(self, inputs: dict[str, str]) -> bool:
"""Validate version-related inputs."""
valid = True
for input_name, value in inputs.items():
if "version" in input_name.lower():
# Determine version type from input name
if "dotnet" in input_name:
valid &= self.validate_dotnet_version(value, input_name)
elif "terraform" in input_name or "tflint" in input_name:
valid &= self.validate_terraform_version(value, input_name)
elif "node" in input_name:
valid &= self.validate_node_version(value, input_name)
elif "python" in input_name:
valid &= self.validate_python_version(value, input_name)
elif "php" in input_name:
valid &= self.validate_php_version(value, input_name)
elif "go" in input_name:
valid &= self.validate_go_version(value, input_name)
else:
# Default to semantic version
valid &= self.validate_semantic_version(value, input_name)
return valid
def get_required_inputs(self) -> list[str]:
"""Version validators typically don't define required inputs."""
return []
def get_validation_rules(self) -> dict:
"""Return version validation rules."""
return {
"semantic": "X.Y.Z format with optional pre-release and build metadata",
"calver": "Calendar-based versioning (YYYY.MM.DD, etc.)",
"dotnet": ".NET version format",
"terraform": "Terraform version format",
"node": "Node.js version format",
"python": "Python 3.x version",
"php": "PHP 7.4-9.x version",
"go": "Go 1.x version",
}
def validate_semantic_version(self, version: str, name: str = "version") -> bool:
"""Validate semantic version format.
Args:
version: The version string to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not version or version.strip() == "":
return True # Version is often optional
# Remove 'v' or 'V' prefix if present (case-insensitive)
clean_version = version
if clean_version.lower().startswith("v"):
clean_version = clean_version[1:]
# Examples: 1.0.0, 2.1.3-beta, 3.0.0-rc.1, 1.2.3+20130313144700
semver_pattern = (
r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)" # MAJOR.MINOR.PATCH
r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)" # Pre-release
r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$" # Build metadata
)
if re.match(semver_pattern, clean_version):
return True
# Also allow simple X.Y format for flexibility
simple_pattern = r"^(0|[1-9]\d*)\.(0|[1-9]\d*)$"
if re.match(simple_pattern, clean_version):
return True
# Also allow single digit version (e.g., "1", "2")
single_digit_pattern = r"^(0|[1-9]\d*)$"
if re.match(single_digit_pattern, clean_version):
return True
self.add_error(
f'Invalid semantic version: "{version}" in {name}. '
"Expected format: MAJOR.MINOR.PATCH (e.g., 1.2.3, 2.0.0-beta)",
)
return False
# Compatibility aliases for tests and backward compatibility
def validate_semver(self, version: str, name: str = "version") -> bool:
"""Alias for validate_semantic_version."""
return self.validate_semantic_version(version, name)
def validate_calver(self, version: str, name: str = "version") -> bool:
"""Alias for validate_calver_version."""
return self.validate_calver_version(version, name)
def validate_version(self, version: str, version_type: str, name: str = "version") -> bool:
"""Generic version validation based on type."""
if version_type == "semantic":
return self.validate_semantic_version(version, name)
if version_type == "calver":
return self.validate_calver_version(version, name)
if version_type == "flexible":
return self.validate_flexible_version(version, name)
if version_type == "dotnet":
return self.validate_dotnet_version(version, name)
if version_type == "terraform":
return self.validate_terraform_version(version, name)
if version_type == "node":
return self.validate_node_version(version, name)
if version_type == "python":
return self.validate_python_version(version, name)
if version_type == "php":
return self.validate_php_version(version, name)
if version_type == "go":
return self.validate_go_version(version, name)
# Allow "latest" as special case
if version.strip().lower() == "latest":
return True
# Default to semantic version
return self.validate_semantic_version(version, name)
def validate_strict_semantic_version(self, version: str, name: str = "version") -> bool:
"""Validate strict semantic version format (X.Y.Z required).
Args:
version: The version string to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not version or version.strip() == "":
self.add_error(f"Version cannot be empty in {name}")
return False
# Allow "latest" as special case
if version.strip().lower() == "latest":
return True
# Remove common prefixes for validation
clean_version = version.lstrip("v")
# Strict semantic version pattern
pattern = r"^\d+\.\d+\.\d+(-[\dA-Za-z.-]+)?(\+[\dA-Za-z.-]+)?$"
if re.match(pattern, clean_version):
return True
self.add_error(
f'Invalid strict semantic version format: "{version}" in {name}. Must be X.Y.Z',
)
return False
def validate_calver_version(self, version: str, name: str = "version") -> bool:
"""Validate CalVer (Calendar Versioning) format.
Args:
version: The version string to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not version or version.strip() == "":
return True # Version is often optional
# Remove common prefixes for validation
clean_version = version.lstrip("v")
# CalVer patterns
calver_patterns = [
r"^\d{4}\.\d{1,2}\.\d{1,2}$", # YYYY.MM.DD
r"^\d{4}\.\d{1,2}\.\d{3,}$", # YYYY.MM.PATCH
r"^\d{4}\.0\d\.0\d$", # YYYY.0M.0D
r"^\d{2}\.\d{1,2}\.\d+$", # YY.MM.MICRO
r"^\d{4}\.\d{1,2}$", # YYYY.MM
r"^\d{4}-\d{2}-\d{2}$", # YYYY-MM-DD
]
for pattern in calver_patterns:
match = re.match(pattern, clean_version)
# Additional validation for date components
if match and self._validate_calver_date_parts(clean_version, pattern):
return True
self.add_error(
f'Invalid CalVer format: "{version}" in {name}. '
"Expected formats like YYYY.MM.DD, YY.MM.MICRO, etc.",
)
return False
def _parse_calver_year(self, year_part: str) -> int | None:
"""Parse year from CalVer version part.
Returns:
Year as integer, or None if not a valid year format
"""
if len(year_part) == 4:
return int(year_part)
if len(year_part) == 2:
# For YY format, assume 2000s
return 2000 + int(year_part)
return None # Not a date-based CalVer
def _is_valid_month(self, month: int) -> bool:
"""Check if month is in valid range (1-12)."""
return 1 <= month <= 12
def _is_leap_year(self, year: int) -> bool:
"""Check if year is a leap year."""
return (year % 4 == 0 and year % 100 != 0) or (year % 400 == 0)
def _get_max_day_for_month(self, month: int, year: int) -> int:
"""Get maximum valid day for given month and year.
Args:
month: Month (1-12)
year: Year (for leap year calculation)
Returns:
Maximum valid day for the month
"""
if month in [4, 6, 9, 11]: # April, June, September, November
return 30
if month == 2: # February
return 29 if self._is_leap_year(year) else 28
return 31 # All other months
def _is_valid_day(self, day: int, month: int, year: int) -> bool:
"""Check if day is valid for the given month and year."""
if not (1 <= day <= 31):
return False
max_day = self._get_max_day_for_month(month, year)
return day <= max_day
def _should_validate_day(self, pattern: str, third_part: str) -> bool:
"""Determine if third part should be validated as a day.
Args:
pattern: The CalVer pattern that matched
third_part: The third part of the version string
Returns:
True if the third part represents a day and should be validated
"""
# YYYY.MM.DD and YYYY-MM-DD formats have day as third part
if r"\d{1,2}$" in pattern or r"\d{2}$" in pattern:
# Check if it looks like a day (1-2 digits)
return third_part.isdigit() and len(third_part) <= 2
# YYYY.MM.PATCH format has patch number (3+ digits), not a day
if r"\d{3,}" in pattern:
return False
# YYYY.0M.0D format is a date format
return r"0\d" in pattern
def _validate_calver_date_parts(self, version: str, pattern: str) -> bool:
"""Validate date components in CalVer version.
Args:
version: The version string to validate
pattern: The regex pattern that matched (helps determine format type)
Returns:
True if date components are valid, False otherwise
"""
# Handle different separators
parts = version.split("-") if "-" in version else version.split(".")
# Need at least year and month
if len(parts) < 2:
return True
# Parse and validate year
year = self._parse_calver_year(parts[0])
if year is None:
return True # Not a date-based CalVer
# Validate month
month = int(parts[1])
if not self._is_valid_month(month):
return False
# Validate day if present and pattern indicates it's a day (not patch number)
if len(parts) >= 3 and self._should_validate_day(pattern, parts[2]):
day = int(parts[2])
if not self._is_valid_day(day, month, year):
return False
return True
def validate_flexible_version(self, version: str, name: str = "version") -> bool:
"""Validate either CalVer or SemVer format.
Args:
version: The version string to validate
name: The input name for error messages
Returns:
True if valid, False otherwise
"""
if not version or version.strip() == "":
return True # Version is often optional
# Allow "latest" as special case
if version.strip().lower() == "latest":
return True
# Save current errors
original_errors = self.errors.copy()
# Try CalVer first if it looks like CalVer
clean_version = version.lstrip("v")
looks_like_calver = (
re.match(r"^\d{4}\.", clean_version)
or re.match(r"^\d{4}-", clean_version)
or (re.match(r"^\d{2}\.\d", clean_version) and int(clean_version.split(".")[0]) >= 20)
)
if looks_like_calver:
self.errors = []
if self.validate_calver_version(version, name):
self.errors = original_errors
return True
# If it looks like CalVer but fails, don't try SemVer
self.errors = original_errors
self.add_error(f'Invalid CalVer format: "{version}" in {name}')
return False
# Try SemVer
self.errors = []
if self.validate_semantic_version(version, name):
self.errors = original_errors
return True
# Failed both
self.errors = original_errors
self.add_error(
f'Invalid version format: "{version}" in {name}. '
"Expected either CalVer (e.g., 2024.3.1) or SemVer (e.g., 1.2.3)",
)
return False
def validate_dotnet_version(self, value: str, name: str = "dotnet-version") -> bool:
"""Validate .NET version format."""
return self._validate_language_version(
value,
name,
{
"name": ".NET",
"major_range": (3, 20),
"pattern": r"^\d+(\.\d+(\.\d+)?)?(-[\dA-Za-z-]+(\.\dA-Za-z-]+)*)?$",
"check_leading_zeros": True,
},
)
def validate_terraform_version(self, value: str, name: str = "terraform-version") -> bool:
"""Validate Terraform version format."""
if not value or value.strip() == "":
return True
if value.strip().lower() == "latest":
return True
clean_version = value.lstrip("v")
pattern = r"^\d+\.\d+\.\d+(-[\w.-]+)?$"
if re.match(pattern, clean_version):
return True
self.add_error(f'Invalid Terraform version format: "{value}" in {name}')
return False
def validate_node_version(self, value: str, name: str = "node-version") -> bool:
"""Validate Node.js version format."""
if not value or value.strip() == "":
return True
# Check for special Node.js keywords (case-insensitive)
value_lower = value.strip().lower()
node_keywords = ["latest", "lts", "current", "node"]
if value_lower in node_keywords or value_lower.startswith("lts/"):
return True
# Remove v prefix (case-insensitive)
clean_version = value
if clean_version.lower().startswith("v"):
clean_version = clean_version[1:]
pattern = r"^\d+(\.\d+(\.\d+)?)?$"
if re.match(pattern, clean_version):
return True
self.add_error(f'Invalid Node.js version format: "{value}" in {name}')
return False
def validate_python_version(self, value: str, name: str = "python-version") -> bool:
"""Validate Python version format (3.8-3.15)."""
return self._validate_language_version(
value,
name,
{
"name": "Python",
"major_range": 3,
"minor_range": (8, 15),
"pattern": self.VERSION_X_Y_Z_PATTERN,
},
)
def validate_php_version(self, value: str, name: str = "php-version") -> bool:
"""Validate PHP version format (7.4-9.x)."""
# First do basic validation
if not value or value.strip() == "":
self.add_error(f"{name} cannot be empty")
return False
clean_value = value.strip()
# Reject v prefix
if clean_value.startswith("v"):
self.add_error(
f'Invalid PHP version format: "{value}" in {name}. '
'Version prefix "v" is not allowed',
)
return False
# Check format
if not re.match(self.VERSION_X_Y_Z_PATTERN, clean_value):
self.add_error(
f'Invalid PHP version format: "{value}" in {name}. Must be X.Y or X.Y.Z format',
)
return False
# Parse version
parts = clean_value.split(".")
major = int(parts[0])
minor = int(parts[1])
# Check major version range (7-9)
if major < 7 or major > 9:
self.add_error(
f'PHP version "{value}" in {name}. Major version should be between 7 and 9',
)
return False
# Check minor version ranges per major version
# PHP 7: 7.0-7.4 are the only released versions, but allow higher for testing
# PHP 8: Allow any minor version for future compatibility
# PHP 9: Allow any minor for future compatibility
# Only restrict if the minor version is unreasonably high (>99)
if minor > 99:
self.add_error(
f'PHP version "{value}" in {name}. Minor version {minor} is unreasonably high',
)
return False
return True
def validate_go_version(self, value: str, name: str = "go-version") -> bool:
"""Validate Go version format (1.18-1.30)."""
return self._validate_language_version(
value,
name,
{
"name": "Go",
"major_range": 1,
"minor_range": (18, 30),
"pattern": self.VERSION_X_Y_Z_PATTERN,
},
)
def _check_version_prefix(
self, value: str, clean_value: str, name: str, lang_name: str
) -> bool:
"""Check if version has invalid 'v' prefix."""
if clean_value.startswith("v"):
self.add_error(
f'Invalid {lang_name} version format: "{value}" in {name}. '
'Version prefix "v" is not allowed',
)
return False
return True
def _check_version_format(self, value: str, clean_value: str, name: str, config: dict) -> bool:
"""Check if version matches expected format pattern."""
if not re.match(config["pattern"], clean_value):
self.add_error(
f'Invalid {config["name"]} version format: "{value}" in {name}. '
"Must be X.Y or X.Y.Z format",
)
return False
return True
def _check_leading_zeros(self, value: str, parts: list[str], name: str, lang_name: str) -> bool:
"""Check for invalid leading zeros in version parts."""
for part in parts:
if part.startswith("0") and len(part) > 1:
self.add_error(
f'Invalid {lang_name} version format: "{value}" in {name}. '
"Leading zeros are not allowed",
)
return False
return True
def _validate_major_version(
self,
major: int,
value: str,
name: str,
major_range: int | tuple[int, int] | None,
lang_name: str,
) -> bool:
"""Validate major version against allowed range."""
if isinstance(major_range, int):
if major != major_range:
self.add_error(
f'{lang_name} version "{value}" in {name}. '
f"{lang_name} major version should be {major_range}",
)
return False
elif major_range:
min_major, max_major = major_range
if major < min_major or major > max_major:
self.add_error(
f'{lang_name} version "{value}" in {name}. '
f"Major version should be between {min_major} and {max_major}",
)
return False
return True
def _validate_minor_version(
self,
minor: int,
value: str,
name: str,
minor_range: tuple[int, int] | None,
lang_name: str,
) -> bool:
"""Validate minor version against allowed range."""
if minor_range:
min_minor, max_minor = minor_range
if minor < min_minor or minor > max_minor:
self.add_error(
f'{lang_name} version "{value}" in {name}. '
f"Minor version should be between {min_minor} and {max_minor}",
)
return False
return True
def _validate_language_version(self, value: str, name: str, config: dict) -> bool:
"""Consolidated language version validation."""
if not value or value.strip() == "":
if config.get("required", False):
self.add_error(f'Input "{name}" is required and cannot be empty')
return False
return True
clean_value = value.strip()
lang_name = config["name"]
# Check for invalid 'v' prefix
if not self._check_version_prefix(value, clean_value, name, lang_name):
return False
# Check version format matches pattern
if not self._check_version_format(value, clean_value, name, config):
return False
# Parse version components
parts = clean_value.split(".")
major = int(parts[0])
minor = int(parts[1]) if len(parts) > 1 else 0
# Check for leading zeros if required
if config.get("check_leading_zeros") and not self._check_leading_zeros(
value, parts, name, lang_name
):
return False
# Validate major version range
major_valid = self._validate_major_version(
major, value, name, config.get("major_range"), lang_name
)
if not major_valid:
return False
# Validate minor version range
return self._validate_minor_version(
minor, value, name, config.get("minor_range"), lang_name
)