actions/_tests/shared/validation_core.py
Ismo Vuorinen 78fdad69e5 feat: fixes, tweaks, new actions, linting (#186)
* feat: fixes, tweaks, new actions, linting
* fix: improve docker publish loops and dotnet parsing (#193)
* fix: harden action scripts and version checks (#191)
* refactor: major repository restructuring and security enhancements

Add comprehensive development infrastructure:
- Add Makefile with automated documentation generation, formatting, and linting tasks
- Add TODO.md tracking self-containment progress and repository improvements
- Add .nvmrc for consistent Node.js version management
- Create python-version-detect-v2 action for enhanced Python detection

Enhance all GitHub Actions with standardized patterns:
- Add consistent token handling across 27 actions using standardized input patterns
- Implement bash error handling (set -euo pipefail) in all shell steps
- Add comprehensive input validation for path traversal and command injection protection
- Standardize checkout token authentication to prevent rate limiting
- Remove relative action dependencies to ensure external usability

Rewrite security workflow for PR-focused analysis:
- Transform security-suite.yml to PR-only security analysis workflow
- Remove scheduled runs, repository issue management, and Slack notifications
- Implement smart comment generation showing only sections with content
- Add GitHub Actions permission diff analysis and new action detection
- Integrate OWASP, Semgrep, and TruffleHog for comprehensive PR security scanning

Improve version detection and dependency management:
- Simplify version detection actions to use inline logic instead of shared utilities
- Fix Makefile version detection fallback to properly return 'main' when version not found
- Update all external action references to use SHA-pinned versions
- Remove deprecated run.sh in favor of Makefile automation

Update documentation and project standards:
- Enhance CLAUDE.md with self-containment requirements and linting standards
- Update README.md with improved action descriptions and usage examples
- Standardize code formatting with updated .editorconfig and .prettierrc.yml
- Improve GitHub templates for issues and security reporting

This refactoring ensures all 40 actions are fully self-contained and can be used independently when
referenced as ivuorinen/actions/action-name@main, addressing the critical requirement for external
usability while maintaining comprehensive security analysis and development automation.

* feat: add automated action catalog generation system

- Create generate_listing.cjs script for comprehensive action catalog
- Add package.json with development tooling and npm scripts
- Implement automated README.md catalog section with --update flag
- Generate markdown reference-style links for all 40 actions
- Add categorized tables with features, language support matrices
- Replace static reference links with auto-generated dynamic links
- Enable complete automation of action documentation maintenance

* feat: enhance actions with improved documentation and functionality

- Add comprehensive README files for 12 actions with usage examples
- Implement new utility actions (go-version-detect, dotnet-version-detect)
- Enhance node-setup with extensive configuration options
- Improve error handling and validation across all actions
- Update package.json scripts for better development workflow
- Expand TODO.md with detailed roadmap and improvement plans
- Standardize action structure with consistent inputs/outputs

* feat: add comprehensive output handling across all actions

- Add standardized outputs to 15 actions that previously had none
- Implement consistent snake_case naming convention for all outputs
- Add build status and test results outputs to build actions
- Add files changed and status outputs to lint/fix actions
- Add test execution metrics to php-tests action
- Add stale/closed counts to stale action
- Add release URLs and IDs to github-release action
- Update documentation with output specifications
- Mark comprehensive output handling task as complete in TODO.md

* feat: implement shared cache strategy across all actions

- Add caching to 10 actions that previously had none (Node.js, .NET, Python, Go)
- Standardize 4 existing actions to use common-cache instead of direct actions/cache
- Implement consistent cache-hit optimization to skip installations when cache available
- Add language-specific cache configurations with appropriate key files
- Create unified caching approach using ivuorinen/actions/common-cache@main
- Fix YAML syntax error in php-composer action paths parameter
- Update TODO.md to mark shared cache strategy as complete

* feat: implement comprehensive retry logic for network operations

- Create new common-retry action for standardized retry patterns with configurable strategies
- Add retry logic to 9 actions missing network retry capabilities
- Implement exponential backoff, custom timeouts, and flexible error handling
- Add max-retries input parameter to all network-dependent actions (Node.js, .NET, Python, Go)
- Standardize existing retry implementations to use common-retry utility
- Update action catalog to include new common-retry action (41 total actions)
- Update documentation with retry configuration examples and parameters
- Mark retry logic implementation as complete in TODO.md roadmap
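
The retry behaviour described above (a bounded number of attempts with exponential backoff) can be sketched in Python. This is only an illustration of the pattern, not the implementation of the common-retry action; `retry_with_backoff`, `base_delay`, and the injectable `sleep` parameter are hypothetical names introduced for this example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    *,
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying failures with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            # Delay doubles on every failed attempt: base, 2*base, 4*base, ...
            sleep(base_delay * 2**attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller would normally leave it at the default.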

* feat: enhance Node.js support with Corepack and Bun

- Add Corepack support for automatic package manager version management
- Add Bun package manager support across all Node.js actions
- Improve Yarn Berry/PnP support with .yarnrc.yml detection
- Add Node.js feature detection (ESM, TypeScript, frameworks)
- Update package manager detection priority and lockfile support
- Enhance caching with package-manager-specific keys
- Update eslint, prettier, and biome actions for multi-package-manager support

* fix: resolve critical runtime issues across multiple actions

- Fix token validation by removing ineffective literal string comparisons
- Add missing @microsoft/eslint-formatter-sarif dependency for SARIF output
- Fix Bash variable syntax errors in username and changelog length checks
- Update Dockerfile version regex to handle tags with suffixes (e.g., -alpine)
- Simplify version selection logic with single grep command
- Fix command execution in retry action with proper bash -c wrapper
- Correct step output references using .outcome instead of .outputs.outcome
- Add missing step IDs for version detection actions
- Include go.mod in cache key files for accurate invalidation
- Require minor version in all version regex patterns
- Improve Bun installation security by verifying script before execution
- Replace bc with sort -V for portable PHP version comparison
- Remove non-existent pre-commit output references

These fixes ensure proper runtime behavior, improved security, and better
cross-platform compatibility across all affected actions.

* fix: resolve critical runtime and security issues across actions

- Fix biome-fix files_changed calculation using git diff instead of git status delta
- Fix compress-images output description and add absolute path validation
- Remove csharp-publish token default and fix token fallback in push commands
- Add @microsoft/eslint-formatter-sarif to all package managers in eslint-check
- Fix eslint-check command syntax by using variable assignment
- Improve node-setup Bun installation security and remove invalid frozen-lockfile flag
- Fix pre-commit token validation by removing ineffective literal comparison
- Fix prettier-fix token comparison and expand regex for all GitHub token types
- Add version-file-parser regex validation safety and fix csproj wildcard handling

These fixes address security vulnerabilities, runtime errors, and functional issues
to ensure reliable operation across all affected GitHub Actions.

* feat: enhance Docker actions with advanced multi-architecture support

Major enhancement to Docker build and publish actions with comprehensive
multi-architecture capabilities and enterprise-grade features.

Added features:
- Advanced buildx configuration (version control, cache modes, build contexts)
- Auto-detect platforms for dynamic architecture discovery
- Performance optimizations with enhanced caching strategies
- Security scanning with Trivy and image signing with Cosign
- SBOM generation in multiple formats with validation
- Verbose logging and dry-run modes for debugging
- Platform-specific build args and fallback mechanisms

Enhanced all Docker actions:
- docker-build: Core buildx features and multi-arch support
- docker-publish-gh: GitHub Packages with security features
- docker-publish-hub: Docker Hub with scanning and signing
- docker-publish: Orchestrator with unified configuration

Updated documentation across all modified actions.

* fix: resolve documentation generation placeholder issue

Fixed Makefile and package.json to properly replace placeholder tokens in generated documentation, ensuring all README files show correct repository paths instead of ***PROJECT***@***VERSION***.

* chore: simplify github token validation
* chore(lint): optional yamlfmt, config and fixes
* feat: use relative `uses` names

* feat: comprehensive testing infrastructure and Python validation system

- Migrate from tests/ to _tests/ directory structure with ShellSpec framework
- Add comprehensive validation system with Python-based input validation
- Implement dual testing approach (ShellSpec + pytest) for complete coverage
- Add modern Python tooling (uv, ruff, pytest-cov) and dependencies
- Create centralized validation rules with automatic generation system
- Update project configuration and build system for new architecture
- Enhance documentation to reflect current testing capabilities

This establishes a robust foundation for action validation and testing
with extensive coverage across all GitHub Actions in the repository.

* chore: remove Dockerfile for now
* chore: code review fixes

* feat: comprehensive GitHub Actions restructuring and tooling improvements

This commit represents a major restructuring of the GitHub Actions monorepo
with improved tooling, testing infrastructure, and comprehensive PR #186
review implementation.

## Major Changes

### 🔧 Development Tooling & Configuration
- **Shellcheck integration**: Exclude shellspec test files from linting
  - Updated .pre-commit-config.yaml to exclude _tests/*.sh from shellcheck/shfmt
  - Modified Makefile shellcheck pattern to skip shellspec files
  - Updated CLAUDE.md documentation with proper exclusion syntax
- **Testing infrastructure**: Enhanced Python validation framework
  - Fixed nested if statements and boolean parameter issues in validation.py
  - Improved code quality with explicit keyword arguments
  - All pre-commit hooks now passing

### 🏗️ Project Structure & Documentation
- **Added Serena AI integration** with comprehensive project memories:
  - Project overview, structure, and technical stack documentation
  - Code style conventions and completion requirements
  - Comprehensive PR #186 review analysis and implementation tracking
- **Enhanced configuration**: Updated .gitignore, .yamlfmt.yml, pyproject.toml
- **Improved testing**: Added integration workflows and enhanced test specs

### 🚀 GitHub Actions Improvements (30+ actions updated)
- **Centralized validation**: Updated 41 validation rule files
- **Enhanced actions**: Improvements across all action categories:
  - Setup actions (node-setup, version detectors)
  - Utility actions (version-file-parser, version-validator)
  - Linting actions (biome, eslint, terraform-lint-fix major refactor)
  - Build/publish actions (docker-build, npm-publish, csharp-*)
  - Repository management actions

### 📝 Documentation Updates
- **README consistency**: Updated version references across action READMEs
- **Enhanced documentation**: Improved action descriptions and usage examples
- **CLAUDE.md**: Updated with current tooling and best practices

## Technical Improvements
- **Security enhancements**: Input validation and sanitization improvements
- **Performance optimizations**: Streamlined action logic and dependencies
- **Cross-platform compatibility**: Better Windows/macOS/Linux support
- **Error handling**: Improved error reporting and user feedback

## Files Changed
- 100 files changed
- 13 new Serena memory files documenting project state
- 41 validation rules updated for consistency
- 30+ GitHub Actions and READMEs improved
- Core tooling configuration enhanced

* feat: comprehensive GitHub Actions improvements and PR review fixes

Major Infrastructure Improvements:
- Add comprehensive testing framework with 17+ ShellSpec validation tests
- Implement Docker-based testing tools with automated test runner
- Add CodeRabbit configuration for automated code reviews
- Restructure documentation and memory management system
- Update validation rules for 25+ actions with enhanced input validation
- Modernize CI/CD workflows and testing infrastructure

Critical PR Review Fixes (All Issues Resolved):
- Fix double caching in node-setup (eliminate redundant cache operations)
- Optimize shell pipeline in version-file-parser (single awk vs complex pipeline)
- Fix GitHub expression interpolation in prettier-check cache keys
- Resolve terraform command order issue (validation after setup)
- Add missing flake8-sarif dependency for Python SARIF output
- Fix environment variable scope in pr-lint (export to GITHUB_ENV)

Performance & Reliability:
- Eliminate duplicate cache operations saving CI time
- Improve shell script efficiency with optimized parsing
- Fix command execution dependencies preventing runtime failures
- Ensure proper dependency installation for all linting tools
- Resolve workflow conditional logic issues

Security & Quality:
- All input validation rules updated with latest security patterns
- Cross-platform compatibility improvements maintained
- Comprehensive error handling and retry logic preserved
- Modern development tooling and best practices adopted

This commit addresses 100% of actionable feedback from PR review analysis,
implements comprehensive testing infrastructure, and maintains high code
quality standards across all 41 GitHub Actions.

* feat: enhance expression handling and version parsing

- Fix node-setup force-version expression logic for proper empty string handling
- Improve version-file-parser with secure regex validation and enhanced Python detection
- Add CodeRabbit configuration for CalVer versioning and README review guidance

* feat(validate-inputs): implement modular validation system

- Add modular validator architecture with specialized validators
- Implement base validator classes for different input types
- Add validators: boolean, docker, file, network, numeric, security, token, version
- Add convention mapper for automatic input validation
- Add comprehensive documentation for the validation system
- Implement PCRE regex support and injection protection
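
A rough sketch of how a modular validator layout with a convention mapper might fit together. The class names, the `CONVENTIONS` table, and `validate_input` are hypothetical and chosen for illustration only; the actual validate-inputs code may be organised differently. The `(is_valid, error_message)` return shape follows the convention used in validation_core.py.

```python
from abc import ABC, abstractmethod


class BaseValidator(ABC):
    """Common interface: every validator returns (is_valid, error_message)."""

    @abstractmethod
    def validate(self, value: str) -> tuple[bool, str]:
        """Validate a single input value."""


class BooleanValidator(BaseValidator):
    def validate(self, value: str) -> tuple[bool, str]:
        if value.lower() in ("true", "false"):
            return True, ""
        return False, f"expected 'true' or 'false', got '{value}'"


class NumericValidator(BaseValidator):
    def validate(self, value: str) -> tuple[bool, str]:
        if value.isascii() and value.isdigit():
            return True, ""
        return False, f"expected a non-negative integer, got '{value}'"


# Hypothetical convention table: input name -> validator instance.
CONVENTIONS = {
    "dry-run": BooleanValidator(),
    "max-retries": NumericValidator(),
}


def validate_input(name: str, value: str) -> tuple[bool, str]:
    """Pick a validator by input name; unknown names pass through."""
    validator = CONVENTIONS.get(name)
    if validator is None:
        return True, ""  # no convention matched, nothing to check
    return validator.validate(value)
```

Keeping each validator behind one small interface is what lets a mapper select them by input name without knowing their internals.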

* feat(validate-inputs): add validation rules for all actions

- Add YAML validation rules for 42 GitHub Actions
- Auto-generated rules with convention mappings
- Include metadata for validation coverage and quality indicators
- Mark rules as auto-generated to prevent manual edits

* test(validate-inputs): add comprehensive test suite for validators

- Add unit tests for all validator modules
- Add integration tests for the validation system
- Add fixtures for version test data
- Test coverage for boolean, docker, file, network, numeric, security, token, and version validators
- Add tests for convention mapper and registry

* feat(tools): add validation scripts and utilities

- Add update-validators.py script for auto-generating rules
- Add benchmark-validator.py for performance testing
- Add debug-validator.py for troubleshooting
- Add generate-tests.py for test generation
- Add check-rules-not-manually-edited.sh for CI validation
- Add fix-local-action-refs.py tool for fixing action references

* feat(actions): add CustomValidator.py files for specialized validation

- Add custom validators for actions requiring special validation logic
- Implement validators for docker, go, node, npm, php, python, terraform actions
- Add specialized validation for compress-images, common-cache, common-file-check
- Implement version detection validators with language-specific logic
- Add validation for build arguments, architectures, and version formats

* test: update ShellSpec test framework for Python validation

- Update all validation.spec.sh files to use Python validator
- Add shared validation_core.py for common test utilities
- Remove obsolete bash validation helpers
- Update test output expectations for Python validator format
- Add codeql-analysis test suite
- Refactor framework utilities for Python integration
- Remove deprecated test files

* feat(actions): update action.yml files to use validate-inputs

- Replace inline bash validation with validate-inputs action
- Standardize validation across all 42 actions
- Add new codeql-analysis action
- Update action metadata and branding
- Add validation step as first step in composite actions
- Maintain backward compatibility with existing inputs/outputs

* ci: update GitHub workflows for enhanced security and testing

- Add new codeql-new.yml workflow
- Update security scanning workflows
- Enhance dependency review configuration
- Update test-actions workflow for new validation system
- Improve workflow permissions and security settings
- Update action versions to latest SHA-pinned releases

* build: update build configuration and dependencies

- Update Makefile with new validation targets
- Add Python dependencies in pyproject.toml
- Update npm dependencies and scripts
- Enhance Docker testing tools configuration
- Add targets for validator updates and local ref fixes
- Configure uv for Python package management

* chore: update linting and documentation configuration

- Update EditorConfig settings for consistent formatting
- Enhance pre-commit hooks configuration
- Update prettier and yamllint ignore patterns
- Update gitleaks security scanning rules
- Update CodeRabbit review configuration
- Update CLAUDE.md with latest project standards and rules

* docs: update Serena memory files and project metadata

- Remove obsolete PR-186 memory files
- Update project overview with current architecture
- Update project structure documentation
- Add quality standards and communication guidelines
- Add modular validator architecture documentation
- Add shellspec testing framework documentation
- Update project.yml with latest configuration

* feat: moved rules.yml to same folder as action, fixes

* fix(validators): correct token patterns and fix validator bugs

- Fix GitHub classic PAT pattern: ghp_ + 36 chars = 40 total
- Fix GitHub fine-grained PAT pattern: github_pat_ + 71 chars = 82 total
- Initialize result variable in convention_mapper to prevent UnboundLocalError
- Fix empty URL validation in network validator to return error
- Add GitHub expression check to docker architectures validator
- Update docker-build CustomValidator parallel-builds max to 16
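
The length arithmetic above can be checked with a few lines of Python. The two regexes are copied from `TOKEN_PATTERNS` in validation_core.py; the token values are dummy strings built only to have the right lengths. Note that the file's fine-grained pattern accepts 50-255 characters after the prefix, so the 71-character form is one valid case rather than the only one.

```python
import re

# Patterns as they appear in TOKEN_PATTERNS (validation_core.py).
CLASSIC = re.compile(r"^gh[efpousr]_[a-zA-Z0-9]{36}$")
FINE_GRAINED = re.compile(r"^github_pat_[A-Za-z0-9_]{50,255}$")

# Dummy tokens with the documented lengths (not real credentials).
classic_token = "ghp_" + "a" * 36  # 4 + 36 = 40 characters
fine_grained_token = "github_pat_" + "A" * 71  # 11 + 71 = 82 characters

print(len(classic_token), bool(CLASSIC.match(classic_token)))  # 40 True
print(len(fine_grained_token), bool(FINE_GRAINED.match(fine_grained_token)))  # 82 True

# One character short of the classic format is rejected.
print(bool(CLASSIC.match("ghp_" + "a" * 35)))  # False
```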

* test(validators): fix test fixtures and expectations

- Fix token lengths in test data: github_pat 71 chars, ghp/gho 36 chars
- Update integration tests with correct token lengths
- Fix file validator test to expect absolute paths rejected for security
- Rename TestGenerator import to avoid pytest collection warning
- Update custom validator tests with correct input names
- Change docker-build tests: platforms->architectures, tags->tag
- Update docker-publish tests to match new registry enum validation

* test(shellspec): fix token lengths in test helpers and specs

- Fix default token lengths in spec_helper.sh to use correct 40-char format
- Update csharp-publish default tokens in 4 locations
- Update codeql-analysis default tokens in 2 locations
- Fix codeql-analysis test tokens to correct lengths (40 and 82 chars)
- Fix npm-publish fine-grained token test to use 82-char format

* feat(actions): add permissions documentation and environment variable usage

- Add permissions comments to all action.yml files documenting required GitHub permissions
- Convert direct input usage to environment variables in shell steps for security
- Add validation steps with proper error handling
- Update input descriptions and add security notes where applicable
- Ensure all actions follow consistent patterns for input validation

* chore(workflows): update GitHub Actions workflow versions

- Update workflow action versions to latest
- Improve workflow consistency and maintainability

* docs(security): add comprehensive security policy

- Document security features and best practices
- Add vulnerability reporting process
- Include audit history and security testing information

* docs(memory): add GitHub workflow reference documentation

- Add GitHub Actions workflow commands reference
- Add GitHub workflow expressions guide
- Add secure workflow usage patterns and best practices

* chore: token optimization, code style conventions
* chore: cr fixes
* fix: trivy reported Dockerfile problems
* fix(security): more security fixes
* chore: dockerfile and make targets for publishing
* fix(ci): add creds to test-actions workflow
* fix: security fix and checkout step to codeql-new
* chore: test fixes
* fix(security): codeql detected issues
* chore: code review fixes, ReDoS protection
* style: apply MegaLinter fixes
* fix(ci): missing packages read permission
* fix(ci): add missing working directory setting
* chore: linting, add validation-regex to use regex_pattern
* chore: code review fixes
* chore(deps): update actions
* fix(security): codeql fixes
* chore(cr): apply cr comments
* chore: improve POSIX compatibility
* chore(cr): apply cr comments
* fix: codeql warning in Dockerfile, build failures
* chore(cr): apply cr comments
* fix: docker-testing-tools/Dockerfile
* chore(cr): apply cr comments
* fix(docker): update testing-tools image for GitHub Actions compatibility
* chore(cr): apply cr comments
* feat: add more tests, fix issues
* chore: fix codeql issues, update actions
* chore(cr): apply cr comments
* fix: integration tests
* chore: deduplication and fixes
* style: apply MegaLinter fixes
* chore(cr): apply cr comments
* feat: dry-run mode for generate-tests
* fix(ci): kcov installation
* chore(cr): apply cr comments
* chore(cr): apply cr comments
* chore(cr): apply cr comments
* chore(cr): apply cr comments, simplify action testing, use uv
* fix: run-tests.sh action counting
* chore(cr): apply cr comments
* chore(cr): apply cr comments
2025-10-14 13:37:58 +03:00

883 lines
33 KiB
Python
Executable File

#!/usr/bin/env python3
"""
Shared validation core module for GitHub Actions.

This module consolidates all validation logic to eliminate duplication between
the framework validation and the centralized validator. It provides:

1. Standardized token patterns (resolved GitHub documentation discrepancies)
2. Common validation functions
3. Unified security validation
4. Centralized YAML parsing utilities
5. Command-line interface for ShellSpec test integration

This replaces inline Python code in ShellSpec tests and duplicate functions
across multiple files.
"""

from __future__ import annotations

import argparse
from pathlib import Path
import re
import sys
from typing import Any

import yaml  # pylint: disable=import-error


class ValidationCore:
    """Core validation functionality with standardized patterns and functions."""

    # Standardized token patterns - resolved based on GitHub documentation
    # Fine-grained tokens are 50-255 characters with underscores
    TOKEN_PATTERNS = {
        "classic": r"^gh[efpousr]_[a-zA-Z0-9]{36}$",
        "fine_grained": r"^github_pat_[A-Za-z0-9_]{50,255}$",  # 50-255 chars with underscores
        "installation": r"^ghs_[a-zA-Z0-9]{36}$",
        "npm_classic": r"^npm_[a-zA-Z0-9]{40,}$",  # NPM classic tokens
    }

    # Injection detection pattern - characters commonly used in command injection
    INJECTION_CHARS_PATTERN = r"[;&|`$()]"

    # Security injection patterns
    SECURITY_PATTERNS = [
        r";\s*(rm|del|format|shutdown|reboot)",
        r"&&\s*(rm|del|format|shutdown|reboot)",
        r"\|\s*(rm|del|format|shutdown|reboot)",
        r"`[^`]*`",  # Command substitution
        r"\$\([^)]*\)",  # Command substitution
        # Path traversal only dangerous when combined with commands
        r"\.\./.*;\s*(rm|del|format|shutdown|reboot)",
        r"\.\.\\+.*;\s*(rm|del|format|shutdown|reboot)",  # Windows: ..\ or ..\\ patterns
    ]

    def __init__(self):
        """Initialize the validation core."""

    def validate_github_token(self, token: str, *, required: bool = False) -> tuple[bool, str]:
        """
        Validate GitHub token format using standardized PCRE patterns.

        Args:
            token: The token to validate
            required: Whether the token is required

        Returns:
            Tuple of (is_valid, error_message)
        """
        if not token or token.strip() == "":
            if required:
                return False, "Token is required but not provided"
            return True, ""

        # Allow GitHub Actions expressions
        if token == "${{ github.token }}" or (token.startswith("${{") and token.endswith("}}")):
            return True, ""

        # Allow environment variable references (e.g., $GITHUB_TOKEN)
        if re.match(r"^\$[A-Za-z_][\w]*$", token):
            return True, ""

        # Check against standardized token patterns
        for _token_type, pattern in self.TOKEN_PATTERNS.items():
            if re.match(pattern, token):
                return True, ""

        return (
            False,
            "Invalid token format. Expected: gh[efpousr]_* (36 chars), "
            "github_pat_[A-Za-z0-9_]* (50-255 chars), ghs_* (36 chars), or npm_* (40+ chars)",
        )

    def validate_namespace_with_lookahead(self, namespace: str) -> tuple[bool, str]:
        """
        Validate namespace using lookahead pattern for .NET namespaces.

        Args:
            namespace: The namespace to validate

        Returns:
            Tuple of (is_valid, error_message)
        """
        if not namespace or namespace.strip() == "":
            return False, "Namespace cannot be empty"

        # Pattern with lookahead ensures hyphens are only allowed when followed by alphanumeric
        pattern = r"^[a-zA-Z0-9]([a-zA-Z0-9]|-(?=[a-zA-Z0-9])){0,38}$"
        if re.match(pattern, namespace):
            return True, ""

        return (
            False,
            "Invalid namespace format. Must be 1-39 characters, "
            "alphanumeric and hyphens, no trailing hyphens",
        )

    def validate_security_patterns(
        self,
        input_value: str,
        input_name: str = "",
    ) -> tuple[bool, str]:
        """
        Check for common security injection patterns.

        Args:
            input_value: The value to validate
            input_name: Name of the input (for context)

        Returns:
            Tuple of (is_valid, error_message)
        """
        # Allow empty values for most inputs (they're often optional)
        if not input_value or input_value.strip() == "":
            return True, ""

        for pattern in self.SECURITY_PATTERNS:
            if re.search(pattern, input_value, re.IGNORECASE):
                return (
                    False,
                    f"Potential security injection pattern detected in {input_name or 'input'}",
                )

        return True, ""

    def validate_boolean(self, value: str, input_name: str) -> tuple[bool, str]:
        """Validate boolean input with intelligent fallback for misclassified inputs."""
        # Handle empty values
        if not value:
            return True, ""

        # Standard boolean values
        if value.lower() in ["true", "false"]:
            return True, ""

        # Intelligent fallback for misclassified inputs
        # If input name suggests it should accept paths/directories, validate as such
        if any(
            keyword in input_name.lower()
            for keyword in ["directories", "directory", "path", "file"]
        ):
            return self.validate_cache_directories(value)

        return False, f"Input '{input_name}' must be 'true' or 'false'"

    def validate_version_format(
        self,
        value: str,
        *,
        allow_v_prefix: bool = False,
    ) -> tuple[bool, str]:
        """Validate semantic version format."""
        if value.lower() == "latest":
            return True, ""

        if not allow_v_prefix and value.startswith("v"):
            return False, f"Version should not start with 'v': {value}"

        value = value.removeprefix("v")  # Remove v prefix for validation

        # Split validation to reduce complexity
        # Base version: major.minor.patch (or simpler forms)
        base_pattern = r"^[\d]+(\.[\d]+)?(\.[\d]+)?$"
        # Version with prerelease/build: major.minor.patch-prerelease+build
        extended_pattern = r"^[\d]+(\.[\d]+)?(\.[\d]+)?[-+][0-9A-Za-z.-]+$"

        if re.match(base_pattern, value) or re.match(extended_pattern, value):
            return True, ""

        return False, f"Invalid version format: {value}"

    def validate_file_path(self, value: str, *, allow_traversal: bool = False) -> tuple[bool, str]:
        """Validate file path format."""
        if not value:
            return True, ""

        # Check for injection patterns
        if re.search(self.INJECTION_CHARS_PATTERN, value):
            return False, f"Potential injection detected in file path: {value}"

        # Check for path traversal (unless explicitly allowed)
        if not allow_traversal and ("../" in value or "..\\" in value):
            return False, f"Path traversal not allowed: {value}"

        # Check for absolute paths (often not allowed)
        if value.startswith("/") or (len(value) > 1 and value[1] == ":"):
            return False, f"Absolute paths not allowed: {value}"

        return True, ""

    def validate_docker_image_name(self, value: str) -> tuple[bool, str]:
        """Validate docker image name format."""
        if not value:
            return True, ""

        # Split validation into parts to reduce regex complexity
        # Valid format: lowercase alphanumeric with separators (., _, __, -) and optional namespace
        if not re.match(r"^[a-z0-9]", value):
            return False, f"Invalid docker image name format: {value}"
        if not re.match(r"^[a-z0-9._/-]+$", value):
            return False, f"Invalid docker image name format: {value}"

        # Check for invalid patterns
        if value.endswith((".", "_", "-", "/")):
            return False, f"Invalid docker image name format: {value}"
        if "//" in value or ".." in value:
            return False, f"Invalid docker image name format: {value}"

        return True, ""

    def validate_docker_tag(self, value: str) -> tuple[bool, str]:
        """Validate Docker tag format."""
        if not value:
            return True, ""

        # Docker tags must be valid ASCII and may contain lowercase and uppercase letters,
        # digits, underscores, periods and dashes. Cannot start with period or dash.
        # Max length is 128 characters.
        if len(value) > 128:
            return False, f"Docker tag too long (max 128 characters): {value}"
        if not re.match(r"^[a-zA-Z0-9_][a-zA-Z0-9._-]*$", value):
            return False, f"Invalid docker tag format: {value}"

        return True, ""

    def validate_php_extensions(self, value: str) -> tuple[bool, str]:
        """Validate PHP extensions format."""
        if not value:
            return True, ""

        if re.search(r"[;&|`$()@#]", value):
            return False, f"Potential injection detected in PHP extensions: {value}"
        if not re.match(r"^[a-zA-Z0-9_,\s]+$", value):
            return False, f"Invalid PHP extensions format: {value}"

        return True, ""

    def validate_coverage_driver(self, value: str) -> tuple[bool, str]:
        """Validate coverage driver."""
        if value not in ["none", "xdebug", "pcov", "xdebug3"]:
            return False, "Invalid coverage driver. Must be 'none', 'xdebug', 'pcov', or 'xdebug3'"
        return True, ""
def validate_numeric_range(self, value: str, min_val: int, max_val: int) -> tuple[bool, str]:
"""Validate numeric value within range."""
try:
num = int(value)
if min_val <= num <= max_val:
return True, ""
return False, f"Value must be between {min_val} and {max_val}, got {num}"
except ValueError:
return False, f"Invalid numeric value: {value}"
def validate_php_version(self, value: str) -> tuple[bool, str]:
"""Validate PHP version format (allows X.Y and X.Y.Z)."""
if not value:
return True, ""
# PHP versions can be X.Y or X.Y.Z format
if re.match(r"^[\d]+\.[\d]+(\.[\d]+)?$", value):
return True, ""
return False, f"Invalid PHP version format: {value}"
def validate_composer_version(self, value: str) -> tuple[bool, str]:
"""Validate Composer version (1 or 2)."""
if value in ["1", "2"]:
return True, ""
return False, f"Invalid Composer version. Must be '1' or '2', got '{value}'"
def validate_stability(self, value: str) -> tuple[bool, str]:
"""Validate Composer stability."""
valid_stabilities = ["stable", "RC", "beta", "alpha", "dev"]
if value in valid_stabilities:
return True, ""
return False, f"Invalid stability. Must be one of: {', '.join(valid_stabilities)}"

    def validate_cache_directories(self, value: str) -> tuple[bool, str]:
        """Validate cache directories (comma-separated paths)."""
        if not value:
            return True, ""
        # Split by comma and validate each directory
        directories = [d.strip() for d in value.split(",")]
        for directory in directories:
            if not directory:
                continue
            # Basic path validation
            if re.search(self.INJECTION_CHARS_PATTERN, directory):
                return False, f"Potential injection detected in directory path: {directory}"
            # Check for path traversal (both Unix and Windows)
            if re.search(r"\.\.[/\\]", directory):
                return False, f"Path traversal not allowed in directory: {directory}"
            # Check for absolute paths
            if directory.startswith("/") or (len(directory) > 1 and directory[1] == ":"):
                return False, f"Absolute paths not allowed in directory: {directory}"
        return True, ""

    def validate_tools(self, value: str) -> tuple[bool, str]:
        """Validate Composer tools format (allows @ for stability flags like dev-master@dev)."""
        if not value:
            return True, ""
        # Check for injection patterns (@ removed to allow Composer stability flags)
        if re.search(self.INJECTION_CHARS_PATTERN, value):
            return False, f"Potential injection detected in tools: {value}"
        return True, ""

    def validate_numeric_range_1_10(self, value: str) -> tuple[bool, str]:
        """Validate numeric value between 1 and 10."""
        return self.validate_numeric_range(value, 1, 10)

    def validate_enhanced_business_logic(
        self,
        action_name: str,
        input_name: str,
        value: str,
    ) -> tuple[bool | None, str]:
        """
        Enhanced business logic validation for specific action/input combinations.

        Returns (None, "") if no enhanced validation applies, otherwise returns validation result.
        """
        if not value:  # Empty values are generally allowed, except for specific cases
            # Some inputs should not be empty even if they're optional
            if action_name == "php-composer" and input_name in ["composer-version"]:
                return False, f"Empty {input_name} is not allowed"
            return None, ""
        # PHP Composer specific validations
        if action_name == "php-composer":
            return self._validate_php_composer_business_logic(input_name, value)
        # Prettier-check specific validations
        if action_name == "prettier-check":
            return self._validate_prettier_check_business_logic(input_name, value)
        # Add more action-specific validations here as needed
        return None, ""  # No enhanced validation applies

    def _validate_composer_version(self, value: str) -> tuple[bool, str]:
        """Validate composer version input."""
        if value not in ["1", "2"]:
            return False, f"Composer version must be '1' or '2', got '{value}'"
        return True, ""

    def _validate_stability(self, value: str) -> tuple[bool, str]:
        """Validate stability input."""
        valid_stabilities = ["stable", "RC", "beta", "alpha", "dev"]
        if value not in valid_stabilities:
            return (
                False,
                f"Invalid stability '{value}'. Must be one of: {', '.join(valid_stabilities)}",
            )
        return True, ""

    def _validate_php_version(self, value: str) -> tuple[bool, str]:
        """Validate PHP version input."""
        if not re.match(r"^[\d]+\.[\d]+(\.[\d]+)?$", value):
            return False, f"Invalid PHP version format: {value}"
        try:
            major, minor = value.split(".")[:2]
            major_num, minor_num = int(major), int(minor)
            if major_num < 7:
                return False, f"PHP version {value} is too old (minimum 7.0)"
            if major_num > 20:
                return False, f"Invalid PHP version: {value}"
            if minor_num < 0 or minor_num > 99:
                return False, f"Invalid PHP version: {value}"
        except (ValueError, IndexError):
            return False, f"Invalid PHP version format: {value}"
        return True, ""

    def _validate_extensions(self, value: str) -> tuple[bool, str]:
        """Validate PHP extensions input."""
        if re.search(r"[@#$&*(){}\[\]|\\]", value):
            return False, f"Invalid characters in PHP extensions: {value}"
        return True, ""

    def _validate_tools(self, value: str) -> tuple[bool, str]:
        """Validate tools input (@ allowed for Composer stability flags like dev-master@dev)."""
        if re.search(r"[#$&*(){}\[\]|\\]", value):
            return False, f"Invalid characters in tools specification: {value}"
        return True, ""

    def _validate_args(self, value: str) -> tuple[bool, str]:
        """Validate args input."""
        if re.search(self.INJECTION_CHARS_PATTERN, value):
            return False, f"Potentially dangerous characters in args: {value}"
        return True, ""

    def _validate_php_composer_business_logic(
        self,
        input_name: str,
        value: str,
    ) -> tuple[bool | None, str]:
        """Business logic validation specific to php-composer action."""
        validators = {
            "composer-version": self._validate_composer_version,
            "stability": self._validate_stability,
            "php": self._validate_php_version,
            "extensions": self._validate_extensions,
            "tools": self._validate_tools,
            "args": self._validate_args,
        }
        if input_name in validators:
            is_valid, error_msg = validators[input_name](value)
            return is_valid, error_msg
        return None, ""  # No specific validation for this input

    def _validate_file_pattern_security(self, value: str) -> tuple[bool, str]:
        """Validate file-pattern for security issues."""
        if ".." in value:
            return False, "Path traversal detected in file-pattern"
        if value.startswith("/"):
            return False, "Absolute path not allowed in file-pattern"
        if "$" in value:
            return False, "Shell expansion not allowed in file-pattern"
        return True, ""

    def _validate_plugins_security(self, value: str) -> tuple[bool, str]:
        """Validate plugins for security issues."""
        if re.search(self.INJECTION_CHARS_PATTERN, value):
            return False, "Potentially dangerous characters in plugins"
        if re.search(r"\$\{.*\}", value):
            return False, "Variable expansion not allowed in plugins"
        if re.search(r"\$\(.*\)", value):
            return False, "Command substitution not allowed in plugins"
        return True, ""

    def _validate_prettier_check_business_logic(
        self,
        input_name: str,
        value: str,
    ) -> tuple[bool | None, str]:
        """Business logic validation specific to prettier-check action."""
        # Handle prettier-version specially (accepts "latest" or semantic version)
        if input_name == "prettier-version":
            if value == "latest":
                return True, ""
            # Otherwise validate as semantic version
            return None, ""  # Let standard semantic version validation handle it
        # Validate file-pattern for security issues
        if input_name == "file-pattern":
            return self._validate_file_pattern_security(value)
        # Validate report-format enum
        if input_name == "report-format":
            if value == "":
                return False, "report-format cannot be empty"
            if value not in ["json", "sarif"]:
                return False, f"Invalid report-format: {value}"
            return True, ""
        # Validate plugins for security issues
        if input_name == "plugins":
            return self._validate_plugins_security(value)
        return None, ""  # No specific validation for this input


class ActionFileParser:
    """Parser for GitHub Action YAML files."""

    @staticmethod
    def load_action_file(action_file: str) -> dict[str, Any]:
        """Load and parse an action.yml file."""
        try:
            with Path(action_file).open(encoding="utf-8") as f:
                return yaml.safe_load(f)
        except (OSError, yaml.YAMLError) as e:
            msg = f"Failed to load action file {action_file}: {e}"
            raise ValueError(msg) from e

    @staticmethod
    def get_action_name(action_file: str) -> str:
        """Get the action name from an action.yml file."""
        try:
            data = ActionFileParser.load_action_file(action_file)
            return data.get("name", "Unknown")
        except (OSError, ValueError, yaml.YAMLError, AttributeError):
            return "Unknown"

    @staticmethod
    def get_action_inputs(action_file: str) -> list[str]:
        """Get all input names from an action.yml file."""
        try:
            data = ActionFileParser.load_action_file(action_file)
            inputs = data.get("inputs", {})
            return list(inputs.keys())
        except (OSError, ValueError, yaml.YAMLError, AttributeError):
            return []

    @staticmethod
    def get_action_outputs(action_file: str) -> list[str]:
        """Get all output names from an action.yml file."""
        try:
            data = ActionFileParser.load_action_file(action_file)
            outputs = data.get("outputs", {})
            return list(outputs.keys())
        except (OSError, ValueError, yaml.YAMLError, AttributeError):
            return []

    @staticmethod
    def _get_required_property(input_data: dict, property_name: str) -> str:
        """Get the required/optional property."""
        is_required = input_data.get("required") in [True, "true"]
        if property_name == "required":
            return "required" if is_required else "optional"
        return "optional" if not is_required else "required"

    @staticmethod
    def _get_default_property(input_data: dict) -> str:
        """Get the default property."""
        default_value = input_data.get("default", "")
        return str(default_value) if default_value else "no-default"

    @staticmethod
    def _get_description_property(input_data: dict) -> str:
        """Get the description property."""
        description = input_data.get("description", "")
        return description if description else "no-description"

    @staticmethod
    def _get_all_optional_property(inputs: dict) -> str:
        """Get the all_optional property (list of required inputs)."""
        required_inputs = [k for k, v in inputs.items() if v.get("required") in [True, "true"]]
        return "none" if not required_inputs else ",".join(required_inputs)

    @staticmethod
    def get_input_property(action_file: str, input_name: str, property_name: str) -> str:
        """
        Get a property of an input from an action.yml file.

        Args:
            action_file: Path to the action.yml file
            input_name: Name of the input to check
            property_name: Property to check (required, optional, default, description,
                all_optional)

        Returns:
            - For 'required': 'required' or 'optional'
            - For 'optional': 'optional' or 'required'
            - For 'default': the default value or 'no-default'
            - For 'description': the description or 'no-description'
            - For 'all_optional': 'none' if no required inputs, else comma-separated list
        """
        try:
            data = ActionFileParser.load_action_file(action_file)
            inputs = data.get("inputs", {})
            input_data = inputs.get(input_name, {})
            property_handlers = {
                "required": lambda: ActionFileParser._get_required_property(
                    input_data, property_name
                ),
                "optional": lambda: ActionFileParser._get_required_property(
                    input_data, property_name
                ),
                "default": lambda: ActionFileParser._get_default_property(input_data),
                "description": lambda: ActionFileParser._get_description_property(input_data),
                "all_optional": lambda: ActionFileParser._get_all_optional_property(inputs),
            }
            if property_name in property_handlers:
                return property_handlers[property_name]()
            return f"unknown-property-{property_name}"
        except (OSError, ValueError, yaml.YAMLError, AttributeError, KeyError) as e:
            return f"error: {e}"


def resolve_action_file_path(action_dir: str) -> str:
    """Resolve the path to the action.yml file."""
    action_dir_path = Path(action_dir)
    if not action_dir_path.is_absolute():
        # If relative, assume we're in _tests/shared and actions are at ../../
        script_dir = Path(__file__).resolve().parent
        project_root = script_dir.parent.parent
        return str(project_root / action_dir / "action.yml")
    return f"{action_dir}/action.yml"


def _apply_validation_by_type(
    validator: ValidationCore,
    validation_type: str,
    input_value: str,
    input_name: str,
    required_inputs: list,
) -> tuple[bool, str]:
    """Apply validation based on the validation type."""
    validation_map = {
        "github_token": lambda: validator.validate_github_token(
            input_value, required=input_name in required_inputs
        ),
        "namespace_with_lookahead": lambda: validator.validate_namespace_with_lookahead(
            input_value,
        ),
        "boolean": lambda: validator.validate_boolean(input_value, input_name),
        "file_path": lambda: validator.validate_file_path(input_value),
        "docker_image_name": lambda: validator.validate_docker_image_name(input_value),
        "docker_tag": lambda: validator.validate_docker_tag(input_value),
        "php_extensions": lambda: validator.validate_php_extensions(input_value),
        "coverage_driver": lambda: validator.validate_coverage_driver(input_value),
        "php_version": lambda: validator.validate_php_version(input_value),
        "composer_version": lambda: validator.validate_composer_version(input_value),
        "stability": lambda: validator.validate_stability(input_value),
        "cache_directories": lambda: validator.validate_cache_directories(input_value),
        "tools": lambda: validator.validate_tools(input_value),
        "numeric_range_1_10": lambda: validator.validate_numeric_range_1_10(input_value),
    }
    # Handle version formats
    if validation_type in ["semantic_version", "calver_version", "flexible_version"]:
        return validator.validate_version_format(input_value)
    if validation_type == "terraform_version":
        return validator.validate_version_format(input_value, allow_v_prefix=True)
    # Use validation map for other types
    if validation_type in validation_map:
        return validation_map[validation_type]()
    return True, ""  # Unknown validation type, assume valid


def _load_and_validate_rules(
    rules_file: Path,
    input_name: str,
    input_value: str,
) -> tuple[str | None, dict, list]:
    """Load validation rules and look up the validation type for an input."""
    try:
        with Path(rules_file).open(encoding="utf-8") as f:
            rules_data = yaml.safe_load(f)
        conventions = rules_data.get("conventions", {})
        overrides = rules_data.get("overrides", {})
        required_inputs = rules_data.get("required_inputs", [])
        # A required input that is empty gets no validation type. required_inputs is
        # still returned so the caller's "cannot be empty" check can fire; returning
        # an empty list here would silently skip that check.
        if input_name in required_inputs and (not input_value or input_value.strip() == ""):
            return None, rules_data, required_inputs
        # Get validation type (overrides take precedence over conventions)
        validation_type = overrides.get(input_name, conventions.get(input_name))
        return validation_type, rules_data, required_inputs
    except (OSError, yaml.YAMLError, KeyError, AttributeError):
        return None, {}, []


def validate_input(action_dir: str, input_name: str, input_value: str) -> tuple[bool | None, str]:
    """
    Validate an input value for a specific action.

    This is the main validation entry point that replaces the complex
    validation logic in the original framework.
    """
    validator = ValidationCore()
    # Always perform security validation first
    security_valid, security_error = validator.validate_security_patterns(input_value, input_name)
    if not security_valid:
        return False, security_error
    # Get action name for business logic and rules
    action_name = Path(action_dir).name
    # Check enhanced business logic first (takes precedence over general rules)
    enhanced_validation = validator.validate_enhanced_business_logic(
        action_name,
        input_name,
        input_value,
    )
    if enhanced_validation[0] is not None:  # If enhanced validation has an opinion
        return enhanced_validation
    # Load validation rules from action folder
    script_dir = Path(__file__).resolve().parent
    project_root = script_dir.parent.parent
    rules_file = project_root / action_name / "rules.yml"
    if rules_file.exists():
        validation_type, _rules_data, required_inputs = _load_and_validate_rules(
            rules_file,
            input_name,
            input_value,
        )
        # Check for required input error
        if input_name in required_inputs and (not input_value or input_value.strip() == ""):
            return False, f"Required input '{input_name}' cannot be empty"
        if validation_type:
            try:
                return _apply_validation_by_type(
                    validator,
                    validation_type,
                    input_value,
                    input_name,
                    required_inputs,
                )
            except (ValueError, AttributeError, KeyError, TypeError) as e:
                print(
                    f"Warning: Could not apply validation for {action_name}: {e}",
                    file=sys.stderr,
                )
    # If no specific validation found, the security check is sufficient
    return True, ""


def _handle_legacy_interface():
    """Handle legacy CLI interface for backward compatibility."""
    if len(sys.argv) == 5 and all(not arg.startswith("-") for arg in sys.argv[1:]):
        action_dir, input_name, input_value, expected_result = sys.argv[1:5]
        is_valid, error_msg = validate_input(action_dir, input_name, input_value)
        actual_result = "success" if is_valid else "failure"
        if actual_result == expected_result:
            sys.exit(0)
        else:
            print(f"Expected {expected_result}, got {actual_result}: {error_msg}", file=sys.stderr)
            sys.exit(1)
    return False  # Not legacy interface


def _create_argument_parser():
    """Create and configure the argument parser."""
    parser = argparse.ArgumentParser(
        description="Shared validation core for GitHub Actions",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Validate an input value
  python3 validation_core.py --validate action-dir input-name input-value

  # Get input property
  python3 validation_core.py --property action.yml input-name required

  # List inputs
  python3 validation_core.py --inputs action.yml

  # List outputs
  python3 validation_core.py --outputs action.yml

  # Get action name
  python3 validation_core.py --name action.yml
        """,
    )
    mode_group = parser.add_mutually_exclusive_group(required=True)
    mode_group.add_argument(
        "--validate",
        nargs=3,
        metavar=("ACTION_DIR", "INPUT_NAME", "INPUT_VALUE"),
        help="Validate an input value",
    )
    mode_group.add_argument(
        "--property",
        nargs=3,
        metavar=("ACTION_FILE", "INPUT_NAME", "PROPERTY"),
        help="Get input property",
    )
    mode_group.add_argument("--inputs", metavar="ACTION_FILE", help="List action inputs")
    mode_group.add_argument("--outputs", metavar="ACTION_FILE", help="List action outputs")
    mode_group.add_argument("--name", metavar="ACTION_FILE", help="Get action name")
    mode_group.add_argument(
        "--validate-yaml",
        metavar="YAML_FILE",
        help="Validate YAML file syntax",
    )
    return parser


def _handle_validate_command(args):
    """Handle the validate command."""
    action_dir, input_name, input_value = args.validate
    is_valid, error_msg = validate_input(action_dir, input_name, input_value)
    if is_valid:
        sys.exit(0)
    else:
        print(f"INVALID: {error_msg}", file=sys.stderr)
        sys.exit(1)


def _handle_property_command(args):
    """Handle the property command."""
    action_file, input_name, property_name = args.property
    result = ActionFileParser.get_input_property(action_file, input_name, property_name)
    print(result)


def _handle_inputs_command(args):
    """Handle the inputs command."""
    inputs = ActionFileParser.get_action_inputs(args.inputs)
    for input_name in inputs:
        print(input_name)


def _handle_outputs_command(args):
    """Handle the outputs command."""
    outputs = ActionFileParser.get_action_outputs(args.outputs)
    for output_name in outputs:
        print(output_name)


def _handle_name_command(args):
    """Handle the name command."""
    name = ActionFileParser.get_action_name(args.name)
    print(name)


def _handle_validate_yaml_command(args):
    """Handle the validate-yaml command."""
    try:
        with Path(args.validate_yaml).open(encoding="utf-8") as f:
            yaml.safe_load(f)
        sys.exit(0)
    except (OSError, yaml.YAMLError) as e:
        print(f"Invalid YAML: {e}", file=sys.stderr)
        sys.exit(1)


def _execute_command(args):
    """Execute the appropriate command based on arguments."""
    command_handlers = {
        "validate": _handle_validate_command,
        "property": _handle_property_command,
        "inputs": _handle_inputs_command,
        "outputs": _handle_outputs_command,
        "name": _handle_name_command,
        "validate_yaml": _handle_validate_yaml_command,
    }
    for command, handler in command_handlers.items():
        if getattr(args, command, None):
            handler(args)
            return


def main():
    """Command-line interface for validation core."""
    # Handle legacy interface first
    _handle_legacy_interface()
    # Parse arguments and execute command
    parser = _create_argument_parser()
    args = parser.parse_args()
    try:
        _execute_command(args)
    except (ValueError, OSError, AttributeError) as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
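

# Illustrative sketch, not part of the module: the business-logic layer above returns
# a tri-state result, where (None, "") means "no opinion" and the caller falls through
# to the next layer. The standalone example below mimics that pattern under simplified
# assumptions. The names check_composer_version, check_php_version, CHECKS,
# business_logic, and validate are hypothetical and exist only in this sketch; the
# two checks loosely mirror _validate_composer_version and _validate_php_version.

```python
import re
from typing import Callable, Optional


def check_composer_version(value: str) -> tuple[bool, str]:
    """Accept only Composer major versions 1 and 2."""
    if value in ("1", "2"):
        return True, ""
    return False, f"Composer version must be '1' or '2', got '{value}'"


def check_php_version(value: str) -> tuple[bool, str]:
    """Accept X.Y or X.Y.Z with a major version of at least 7."""
    if not re.match(r"^\d+\.\d+(\.\d+)?$", value):
        return False, f"Invalid PHP version format: {value}"
    if int(value.split(".")[0]) < 7:
        return False, f"PHP version {value} is too old (minimum 7.0)"
    return True, ""


# Hypothetical dispatch table for this sketch only (input name -> check function).
CHECKS: dict[str, Callable[[str], tuple[bool, str]]] = {
    "composer-version": check_composer_version,
    "php": check_php_version,
}


def business_logic(input_name: str, value: str) -> tuple[Optional[bool], str]:
    """Return (None, "") when no specific check applies to this input."""
    check = CHECKS.get(input_name)
    if check is None:
        return None, ""
    return check(value)


def validate(input_name: str, value: str) -> tuple[bool, str]:
    """Treat "no opinion" as valid, as the final fallback in validate_input does."""
    is_valid, message = business_logic(input_name, value)
    if is_valid is None:
        return True, ""
    return is_valid, message
```

# In this sketch, validate("php", "8.2") is accepted, validate("php", "5.6") is rejected
# as too old, and an input with no registered check (for example "args") is accepted
# because the business-logic layer has no opinion about it.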