feat: fixes, tweaks, new actions, linting (#186)

* feat: fixes, tweaks, new actions, linting
* fix: improve docker publish loops and dotnet parsing (#193)
* fix: harden action scripts and version checks (#191)
* refactor: major repository restructuring and security enhancements

Add comprehensive development infrastructure:
- Add Makefile with automated documentation generation, formatting, and linting tasks
- Add TODO.md tracking self-containment progress and repository improvements
- Add .nvmrc for consistent Node.js version management
- Create python-version-detect-v2 action for enhanced Python detection

Enhance all GitHub Actions with standardized patterns:
- Add consistent token handling across 27 actions using standardized input patterns
- Implement bash error handling (set -euo pipefail) in all shell steps
- Add comprehensive input validation for path traversal and command injection protection
- Standardize checkout token authentication to prevent rate limiting
- Remove relative action dependencies to ensure external usability
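
A minimal sketch of the kind of path traversal check described above, written in Python for illustration rather than in the shell used by the action steps. The helper name `is_safe_relative_path` is hypothetical and does not appear in the repository:

```python
from pathlib import PurePosixPath


def is_safe_relative_path(value: str) -> bool:
    """Return True only for non-empty relative paths without '..' segments."""
    # Reject empty values and embedded null bytes outright.
    if not value or "\x00" in value:
        return False
    path = PurePosixPath(value)
    # Absolute paths could escape the workspace, so they are rejected.
    if path.is_absolute():
        return False
    # Any '..' component is treated as a traversal attempt.
    return ".." not in path.parts
```

Under these assumptions, `is_safe_relative_path("src/app")` is accepted, while `../etc/passwd` and `/etc/passwd` are rejected.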

Rewrite security workflow for PR-focused analysis:
- Transform security-suite.yml to PR-only security analysis workflow
- Remove scheduled runs, repository issue management, and Slack notifications
- Implement smart comment generation showing only sections with content
- Add GitHub Actions permission diff analysis and new action detection
- Integrate OWASP, Semgrep, and TruffleHog for comprehensive PR security scanning

Improve version detection and dependency management:
- Simplify version detection actions to use inline logic instead of shared utilities
- Fix Makefile version detection fallback to properly return 'main' when version not found
- Update all external action references to use SHA-pinned versions
- Remove deprecated run.sh in favor of Makefile automation

Update documentation and project standards:
- Enhance CLAUDE.md with self-containment requirements and linting standards
- Update README.md with improved action descriptions and usage examples
- Standardize code formatting with updated .editorconfig and .prettierrc.yml
- Improve GitHub templates for issues and security reporting

This refactoring ensures all 40 actions are fully self-contained and can be used independently when
referenced as ivuorinen/actions/action-name@main, addressing the critical requirement for external
usability while maintaining comprehensive security analysis and development automation.

* feat: add automated action catalog generation system

- Create generate_listing.cjs script for comprehensive action catalog
- Add package.json with development tooling and npm scripts
- Implement automated README.md catalog section with --update flag
- Generate markdown reference-style links for all 40 actions
- Add categorized tables with features, language support matrices
- Replace static reference links with auto-generated dynamic links
- Enable complete automation of action documentation maintenance

* feat: enhance actions with improved documentation and functionality

- Add comprehensive README files for 12 actions with usage examples
- Implement new utility actions (go-version-detect, dotnet-version-detect)
- Enhance node-setup with extensive configuration options
- Improve error handling and validation across all actions
- Update package.json scripts for better development workflow
- Expand TODO.md with detailed roadmap and improvement plans
- Standardize action structure with consistent inputs/outputs

* feat: add comprehensive output handling across all actions

- Add standardized outputs to 15 actions that previously had none
- Implement consistent snake_case naming convention for all outputs
- Add build status and test results outputs to build actions
- Add files changed and status outputs to lint/fix actions
- Add test execution metrics to php-tests action
- Add stale/closed counts to stale action
- Add release URLs and IDs to github-release action
- Update documentation with output specifications
- Mark comprehensive output handling task as complete in TODO.md

* feat: implement shared cache strategy across all actions

- Add caching to 10 actions that previously had none (Node.js, .NET, Python, Go)
- Standardize 4 existing actions to use common-cache instead of direct actions/cache
- Implement consistent cache-hit optimization to skip installations when cache available
- Add language-specific cache configurations with appropriate key files
- Create unified caching approach using ivuorinen/actions/common-cache@main
- Fix YAML syntax error in php-composer action paths parameter
- Update TODO.md to mark shared cache strategy as complete

* feat: implement comprehensive retry logic for network operations

- Create new common-retry action for standardized retry patterns with configurable strategies
- Add retry logic to 9 actions missing network retry capabilities
- Implement exponential backoff, custom timeouts, and flexible error handling
- Add max-retries input parameter to all network-dependent actions (Node.js, .NET, Python, Go)
- Standardize existing retry implementations to use common-retry utility
- Update action catalog to include new common-retry action (41 total actions)
- Update documentation with retry configuration examples and parameters
- Mark retry logic implementation as complete in TODO.md roadmap
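
The retry pattern with exponential backoff can be sketched roughly as follows. This is an illustration in Python, not the common-retry action itself; the function name `retry_with_backoff` and its parameters are assumptions chosen to mirror the max-retries input mentioned above:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying failures with exponentially growing delays."""
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= max_retries:
                raise  # Retries exhausted: surface the last error.
            # The delay doubles on each retry: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

The `sleep` parameter is injectable so the delays can be observed in tests without actually waiting.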

* feat: enhance Node.js support with Corepack and Bun

- Add Corepack support for automatic package manager version management
- Add Bun package manager support across all Node.js actions
- Improve Yarn Berry/PnP support with .yarnrc.yml detection
- Add Node.js feature detection (ESM, TypeScript, frameworks)
- Update package manager detection priority and lockfile support
- Enhance caching with package-manager-specific keys
- Update eslint, prettier, and biome actions for multi-package-manager support
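
Lockfile-based package manager detection could look roughly like the sketch below. The priority order shown here is an assumption for illustration only; the commit does not state the actual order, and `detect_package_manager` is a hypothetical name:

```python
from pathlib import Path

# Assumed priority order (first match wins); the real order may differ.
LOCKFILE_PRIORITY = [
    ("bun.lockb", "bun"),
    ("pnpm-lock.yaml", "pnpm"),
    ("yarn.lock", "yarn"),
    ("package-lock.json", "npm"),
]


def detect_package_manager(project_dir: str) -> str:
    """Pick a package manager from the first lockfile found in project_dir."""
    root = Path(project_dir)
    for lockfile, manager in LOCKFILE_PRIORITY:
        if (root / lockfile).is_file():
            return manager
    # No lockfile present: fall back to npm (also an assumption).
    return "npm"
```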

* fix: resolve critical runtime issues across multiple actions

- Fix token validation by removing ineffective literal string comparisons
- Add missing @microsoft/eslint-formatter-sarif dependency for SARIF output
- Fix Bash variable syntax errors in username and changelog length checks
- Update Dockerfile version regex to handle tags with suffixes (e.g., -alpine)
- Simplify version selection logic with single grep command
- Fix command execution in retry action with proper bash -c wrapper
- Correct step output references using .outcome instead of .outputs.outcome
- Add missing step IDs for version detection actions
- Include go.mod in cache key files for accurate invalidation
- Require minor version in all version regex patterns
- Improve Bun installation security by verifying script before execution
- Replace bc with sort -V for portable PHP version comparison
- Remove non-existent pre-commit output references

These fixes ensure proper runtime behavior, improved security, and better
cross-platform compatibility across all affected actions.
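
Two of the fixes above (requiring a minor version, and comparing versions component by component instead of as decimals) can be illustrated with a short Python sketch. The pattern and the helper `parse_version` are hypothetical stand-ins, not the regexes used in the actions:

```python
from __future__ import annotations

import re

# Hypothetical pattern: major.minor is required; the patch component and a
# "-suffix" such as "-alpine" are optional.
VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)(?:\.(\d+))?(?:-[0-9A-Za-z.]+)?$")


def parse_version(tag: str) -> tuple[int, int, int] | None:
    """Return (major, minor, patch), or None if the tag has no minor version."""
    match = VERSION_RE.match(tag.strip())
    if match is None:
        return None
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)
```

Comparing the resulting tuples orders `8.10.1` after `8.9.3`, which is the behaviour `sort -V` gives and a decimal comparison does not.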

* fix: resolve critical runtime and security issues across actions

- Fix biome-fix files_changed calculation using git diff instead of git status delta
- Fix compress-images output description and add absolute path validation
- Remove csharp-publish token default and fix token fallback in push commands
- Add @microsoft/eslint-formatter-sarif to all package managers in eslint-check
- Fix eslint-check command syntax by using variable assignment
- Improve node-setup Bun installation security and remove invalid frozen-lockfile flag
- Fix pre-commit token validation by removing ineffective literal comparison
- Fix prettier-fix token comparison and expand regex for all GitHub token types
- Add version-file-parser regex validation safety and fix csproj wildcard handling

These fixes address security vulnerabilities, runtime errors, and functional issues
to ensure reliable operation across all affected GitHub Actions.

* feat: enhance Docker actions with advanced multi-architecture support

Major enhancement to Docker build and publish actions with comprehensive
multi-architecture capabilities and enterprise-grade features.

Added features:
- Advanced buildx configuration (version control, cache modes, build contexts)
- Auto-detect platforms for dynamic architecture discovery
- Performance optimizations with enhanced caching strategies
- Security scanning with Trivy and image signing with Cosign
- SBOM generation in multiple formats with validation
- Verbose logging and dry-run modes for debugging
- Platform-specific build args and fallback mechanisms

Enhanced all Docker actions:
- docker-build: Core buildx features and multi-arch support
- docker-publish-gh: GitHub Packages with security features
- docker-publish-hub: Docker Hub with scanning and signing
- docker-publish: Orchestrator with unified configuration

Updated documentation across all modified actions.

* fix: resolve documentation generation placeholder issue

Fixed Makefile and package.json to properly replace placeholder tokens in generated documentation, ensuring all README files show correct repository paths instead of ***PROJECT***@***VERSION***.

* chore: simplify github token validation
* chore(lint): optional yamlfmt, config and fixes
* feat: use relative `uses` names

* feat: comprehensive testing infrastructure and Python validation system

- Migrate from tests/ to _tests/ directory structure with ShellSpec framework
- Add comprehensive validation system with Python-based input validation
- Implement dual testing approach (ShellSpec + pytest) for complete coverage
- Add modern Python tooling (uv, ruff, pytest-cov) and dependencies
- Create centralized validation rules with automatic generation system
- Update project configuration and build system for new architecture
- Enhance documentation to reflect current testing capabilities

This establishes a robust foundation for action validation and testing
with extensive coverage across all GitHub Actions in the repository.

* chore: remove Dockerfile for now
* chore: code review fixes

* feat: comprehensive GitHub Actions restructuring and tooling improvements

This commit represents a major restructuring of the GitHub Actions monorepo
with improved tooling, testing infrastructure, and comprehensive PR #186
review implementation.

## Major Changes

### 🔧 Development Tooling & Configuration
- **Shellcheck integration**: Exclude shellspec test files from linting
  - Updated .pre-commit-config.yaml to exclude _tests/*.sh from shellcheck/shfmt
  - Modified Makefile shellcheck pattern to skip shellspec files
  - Updated CLAUDE.md documentation with proper exclusion syntax
- **Testing infrastructure**: Enhanced Python validation framework
  - Fixed nested if statements and boolean parameter issues in validation.py
  - Improved code quality with explicit keyword arguments
  - All pre-commit hooks now passing

### 🏗️ Project Structure & Documentation
- **Added Serena AI integration** with comprehensive project memories:
  - Project overview, structure, and technical stack documentation
  - Code style conventions and completion requirements
  - Comprehensive PR #186 review analysis and implementation tracking
- **Enhanced configuration**: Updated .gitignore, .yamlfmt.yml, pyproject.toml
- **Improved testing**: Added integration workflows and enhanced test specs

### 🚀 GitHub Actions Improvements (30+ actions updated)
- **Centralized validation**: Updated 41 validation rule files
- **Enhanced actions**: Improvements across all action categories:
  - Setup actions (node-setup, version detectors)
  - Utility actions (version-file-parser, version-validator)
  - Linting actions (biome, eslint, terraform-lint-fix major refactor)
  - Build/publish actions (docker-build, npm-publish, csharp-*)
  - Repository management actions

### 📝 Documentation Updates
- **README consistency**: Updated version references across action READMEs
- **Enhanced documentation**: Improved action descriptions and usage examples
- **CLAUDE.md**: Updated with current tooling and best practices

## Technical Improvements
- **Security enhancements**: Input validation and sanitization improvements
- **Performance optimizations**: Streamlined action logic and dependencies
- **Cross-platform compatibility**: Better Windows/macOS/Linux support
- **Error handling**: Improved error reporting and user feedback

## Files Changed
- 100 files changed
- 13 new Serena memory files documenting project state
- 41 validation rules updated for consistency
- 30+ GitHub Actions and READMEs improved
- Core tooling configuration enhanced

* feat: comprehensive GitHub Actions improvements and PR review fixes

Major Infrastructure Improvements:
- Add comprehensive testing framework with 17+ ShellSpec validation tests
- Implement Docker-based testing tools with automated test runner
- Add CodeRabbit configuration for automated code reviews
- Restructure documentation and memory management system
- Update validation rules for 25+ actions with enhanced input validation
- Modernize CI/CD workflows and testing infrastructure

Critical PR Review Fixes (All Issues Resolved):
- Fix double caching in node-setup (eliminate redundant cache operations)
- Optimize shell pipeline in version-file-parser (single awk vs complex pipeline)
- Fix GitHub expression interpolation in prettier-check cache keys
- Resolve terraform command order issue (validation after setup)
- Add missing flake8-sarif dependency for Python SARIF output
- Fix environment variable scope in pr-lint (export to GITHUB_ENV)

Performance & Reliability:
- Eliminate duplicate cache operations saving CI time
- Improve shell script efficiency with optimized parsing
- Fix command execution dependencies preventing runtime failures
- Ensure proper dependency installation for all linting tools
- Resolve workflow conditional logic issues

Security & Quality:
- All input validation rules updated with latest security patterns
- Cross-platform compatibility improvements maintained
- Comprehensive error handling and retry logic preserved
- Modern development tooling and best practices adopted

This commit addresses 100% of actionable feedback from PR review analysis,
implements comprehensive testing infrastructure, and maintains high code
quality standards across all 41 GitHub Actions.

* feat: enhance expression handling and version parsing

- Fix node-setup force-version expression logic for proper empty string handling
- Improve version-file-parser with secure regex validation and enhanced Python detection
- Add CodeRabbit configuration for CalVer versioning and README review guidance

* feat(validate-inputs): implement modular validation system

- Add modular validator architecture with specialized validators
- Implement base validator classes for different input types
- Add validators: boolean, docker, file, network, numeric, security, token, version
- Add convention mapper for automatic input validation
- Add comprehensive documentation for the validation system
- Implement PCRE regex support and injection protection

* feat(validate-inputs): add validation rules for all actions

- Add YAML validation rules for 42 GitHub Actions
- Auto-generated rules with convention mappings
- Include metadata for validation coverage and quality indicators
- Mark rules as auto-generated to prevent manual edits

* test(validate-inputs): add comprehensive test suite for validators

- Add unit tests for all validator modules
- Add integration tests for the validation system
- Add fixtures for version test data
- Test coverage for boolean, docker, file, network, numeric, security, token, and version validators
- Add tests for convention mapper and registry

* feat(tools): add validation scripts and utilities

- Add update-validators.py script for auto-generating rules
- Add benchmark-validator.py for performance testing
- Add debug-validator.py for troubleshooting
- Add generate-tests.py for test generation
- Add check-rules-not-manually-edited.sh for CI validation
- Add fix-local-action-refs.py tool for fixing action references

* feat(actions): add CustomValidator.py files for specialized validation

- Add custom validators for actions requiring special validation logic
- Implement validators for docker, go, node, npm, php, python, terraform actions
- Add specialized validation for compress-images, common-cache, common-file-check
- Implement version detection validators with language-specific logic
- Add validation for build arguments, architectures, and version formats

* test: update ShellSpec test framework for Python validation

- Update all validation.spec.sh files to use Python validator
- Add shared validation_core.py for common test utilities
- Remove obsolete bash validation helpers
- Update test output expectations for Python validator format
- Add codeql-analysis test suite
- Refactor framework utilities for Python integration
- Remove deprecated test files

* feat(actions): update action.yml files to use validate-inputs

- Replace inline bash validation with validate-inputs action
- Standardize validation across all 42 actions
- Add new codeql-analysis action
- Update action metadata and branding
- Add validation step as first step in composite actions
- Maintain backward compatibility with existing inputs/outputs

* ci: update GitHub workflows for enhanced security and testing

- Add new codeql-new.yml workflow
- Update security scanning workflows
- Enhance dependency review configuration
- Update test-actions workflow for new validation system
- Improve workflow permissions and security settings
- Update action versions to latest SHA-pinned releases

* build: update build configuration and dependencies

- Update Makefile with new validation targets
- Add Python dependencies in pyproject.toml
- Update npm dependencies and scripts
- Enhance Docker testing tools configuration
- Add targets for validator updates and local ref fixes
- Configure uv for Python package management

* chore: update linting and documentation configuration

- Update EditorConfig settings for consistent formatting
- Enhance pre-commit hooks configuration
- Update prettier and yamllint ignore patterns
- Update gitleaks security scanning rules
- Update CodeRabbit review configuration
- Update CLAUDE.md with latest project standards and rules

* docs: update Serena memory files and project metadata

- Remove obsolete PR-186 memory files
- Update project overview with current architecture
- Update project structure documentation
- Add quality standards and communication guidelines
- Add modular validator architecture documentation
- Add shellspec testing framework documentation
- Update project.yml with latest configuration

* feat: moved rules.yml to same folder as action, fixes

* fix(validators): correct token patterns and fix validator bugs

- Fix GitHub classic PAT pattern: ghp_ + 36 chars = 40 total
- Fix GitHub fine-grained PAT pattern: github_pat_ + 71 chars = 82 total
- Initialize result variable in convention_mapper to prevent UnboundLocalError
- Fix empty URL validation in network validator to return error
- Add GitHub expression check to docker architectures validator
- Update docker-build CustomValidator parallel-builds max to 16
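
The token length arithmetic above can be checked with a small sketch. The prefixes and lengths come from the bullets above; the allowed character sets are assumptions, and the constant names are hypothetical:

```python
import re

# Classic PAT: "ghp_" (4 chars) + 36 chars = 40 total.
CLASSIC_PAT = re.compile(r"^ghp_[A-Za-z0-9]{36}$")
# Fine-grained PAT: "github_pat_" (11 chars) + 71 chars = 82 total.
FINE_GRAINED_PAT = re.compile(r"^github_pat_[A-Za-z0-9_]{71}$")

classic_example = "ghp_" + "a" * 36
fine_grained_example = "github_pat_" + "b" * 71

print(len(classic_example), bool(CLASSIC_PAT.match(classic_example)))
print(len(fine_grained_example), bool(FINE_GRAINED_PAT.match(fine_grained_example)))
```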

* test(validators): fix test fixtures and expectations

- Fix token lengths in test data: github_pat 71 chars, ghp/gho 36 chars
- Update integration tests with correct token lengths
- Fix file validator test to expect absolute paths rejected for security
- Rename TestGenerator import to avoid pytest collection warning
- Update custom validator tests with correct input names
- Change docker-build tests: platforms->architectures, tags->tag
- Update docker-publish tests to match new registry enum validation

* test(shellspec): fix token lengths in test helpers and specs

- Fix default token lengths in spec_helper.sh to use correct 40-char format
- Update csharp-publish default tokens in 4 locations
- Update codeql-analysis default tokens in 2 locations
- Fix codeql-analysis test tokens to correct lengths (40 and 82 chars)
- Fix npm-publish fine-grained token test to use 82-char format

* feat(actions): add permissions documentation and environment variable usage

- Add permissions comments to all action.yml files documenting required GitHub permissions
- Convert direct input usage to environment variables in shell steps for security
- Add validation steps with proper error handling
- Update input descriptions and add security notes where applicable
- Ensure all actions follow consistent patterns for input validation

* chore(workflows): update GitHub Actions workflow versions

- Update workflow action versions to latest
- Improve workflow consistency and maintainability

* docs(security): add comprehensive security policy

- Document security features and best practices
- Add vulnerability reporting process
- Include audit history and security testing information

* docs(memory): add GitHub workflow reference documentation

- Add GitHub Actions workflow commands reference
- Add GitHub workflow expressions guide
- Add secure workflow usage patterns and best practices

* chore: token optimization, code style conventions
* chore: cr fixes
* fix: trivy reported Dockerfile problems
* fix(security): more security fixes
* chore: dockerfile and make targets for publishing
* fix(ci): add creds to test-actions workflow
* fix: security fix and checkout step to codeql-new
* chore: test fixes
* fix(security): codeql detected issues
* chore: code review fixes, ReDoS protection
* style: apply MegaLinter fixes
* fix(ci): missing packages read permission
* fix(ci): add missing working directory setting
* chore: linting, add validation-regex to use regex_pattern
* chore: code review fixes
* chore(deps): update actions
* fix(security): codeql fixes
* chore(cr): apply cr comments
* chore: improve POSIX compatibility
* chore(cr): apply cr comments
* fix: codeql warning in Dockerfile, build failures
* chore(cr): apply cr comments
* fix: docker-testing-tools/Dockerfile
* chore(cr): apply cr comments
* fix(docker): update testing-tools image for GitHub Actions compatibility
* chore(cr): apply cr comments
* feat: add more tests, fix issues
* chore: fix codeql issues, update actions
* chore(cr): apply cr comments
* fix: integration tests
* chore: deduplication and fixes
* style: apply MegaLinter fixes
* chore(cr): apply cr comments
* feat: dry-run mode for generate-tests
* fix(ci): kcov installation
* chore(cr): apply cr comments
* chore(cr): apply cr comments
* chore(cr): apply cr comments
* chore(cr): apply cr comments, simplify action testing, use uv
* fix: run-tests.sh action counting
* chore(cr): apply cr comments
* chore(cr): apply cr comments
2025-10-14 13:37:58 +03:00
committed by GitHub
parent d3cc8d4790
commit 78fdad69e5
353 changed files with 55370 additions and 1714 deletions


@@ -0,0 +1,748 @@
"""Security validator for detecting injection patterns and security issues."""
from __future__ import annotations

import re
from typing import ClassVar

from .base import BaseValidator


class SecurityValidator(BaseValidator):
    """Validator for security-related checks across all inputs."""

    # Common injection patterns to detect
    INJECTION_PATTERNS: ClassVar[list[tuple[str, str]]] = [
        (r";\s*rm\s+-rf", "rm -rf command"),
        (r";\s*del\s+", "del command"),
        (r"&&\s*curl\s+", "curl command injection"),
        (r"&&\s*wget\s+", "wget command injection"),
        (r"\|\s*sh\b", "pipe to shell"),
        (r"\|\s*bash\b", "pipe to bash"),
        (r"`[^`]+`", "command substitution"),
        (r"\$\([^)]+\)", "command substitution"),
        (r"\${[^}]+}", "variable expansion"),
        (r"<script[^>]*>", "script tag injection"),
        (r"javascript:", "javascript protocol"),
        (r"data:text/html", "data URI injection"),
    ]
    def validate_inputs(self, inputs: dict[str, str]) -> bool:
        """Validate all inputs for security issues."""
        valid = True
        for input_name, value in inputs.items():
            # Skip empty values
            if not value or not value.strip():
                continue
            # Apply security validation to all inputs
            valid &= self.validate_security_patterns(value, input_name)
            # Additional checks for specific input types
            if "regex" in input_name or "pattern" in input_name:
                valid &= self.validate_regex_pattern(value, input_name)
            elif "path" in input_name or "file" in input_name:
                valid &= self.validate_path_security(value, input_name)
            elif "url" in input_name or "uri" in input_name:
                valid &= self.validate_url_security(value, input_name)
            elif "command" in input_name or "cmd" in input_name:
                valid &= self.validate_command_security(value, input_name)
        return valid
    def get_required_inputs(self) -> list[str]:
        """Security validator doesn't define required inputs."""
        return []

    def get_validation_rules(self) -> dict:
        """Return security validation rules."""
        return {
            "injection_patterns": "Command injection detection",
            "path_traversal": "Path traversal prevention",
            "xss_prevention": "Cross-site scripting prevention",
        }
    def validate_injection_patterns(self, value: str, name: str = "input") -> bool:
        """Check for advanced injection patterns.

        Args:
            value: The value to check
            name: The input name for error messages

        Returns:
            True if no injection patterns found, False otherwise
        """
        if not value or value.strip() == "":
            return True
        # Check against known injection patterns
        for pattern, description in self.INJECTION_PATTERNS:
            if re.search(pattern, value, re.IGNORECASE):
                self.add_error(f"Security issue in {name}: detected {description}")
                return False
        return True
    def validate_url_security(self, url: str, name: str = "url") -> bool:
        """Validate URL for security issues.

        Args:
            url: The URL to validate
            name: The input name for error messages

        Returns:
            True if secure, False otherwise
        """
        if not url or url.strip() == "":
            return True
        # Check for javascript: protocol
        if url.lower().startswith("javascript:"):
            self.add_error(f"Security issue in {name}: javascript: protocol not allowed")
            return False
        # Check for data: URI with HTML
        if url.lower().startswith("data:") and "text/html" in url.lower():
            self.add_error(f"Security issue in {name}: data:text/html URIs not allowed")
            return False
        # Check for file: protocol
        if url.lower().startswith("file:"):
            self.add_error(f"Security issue in {name}: file: protocol not allowed")
            return False
        return True
    def validate_command_security(self, command: str, name: str = "command") -> bool:
        """Validate command for security issues.

        Args:
            command: The command to validate
            name: The input name for error messages

        Returns:
            True if secure, False otherwise
        """
        if not command or command.strip() == "":
            return True
        # Dangerous commands that should not be allowed
        dangerous_commands = [
            "rm -rf",
            "rm -fr",
            "format c:",
            "del /f /s /q",
            "shutdown",
            "reboot",
            ":(){:|:&};:",  # Fork bomb
            "dd if=/dev/zero",
            "dd if=/dev/random",  # Also dangerous
            "mkfs",
            "chmod -R 777",  # Dangerous permission change
            "chmod 777",
            "chown -R",  # Dangerous ownership change
        ]
        command_lower = command.lower()
        for dangerous in dangerous_commands:
            if dangerous.lower() in command_lower:
                self.add_error(
                    f"Security issue in {name}: dangerous command pattern '{dangerous}' detected",
                )
                return False
        # Check for base64 encoded commands (often used to hide malicious code)
        if re.search(r"base64\s+-d|base64\s+--decode", command, re.IGNORECASE):
            self.add_error(f"Security issue in {name}: base64 decode operations not allowed")
            return False
        return True
    def validate_content_security(self, content: str, name: str = "content") -> bool:
        """Validate content for XSS and injection.

        Args:
            content: The content to validate
            name: The input name for error messages

        Returns:
            True if secure, False otherwise
        """
        if not content or content.strip() == "":
            return True
        # Check for script tags (match any content between script and >)
        if re.search(r"<script[^>]*>.*?</script[^>]*>", content, re.IGNORECASE | re.DOTALL):
            self.add_error(f"Security issue in {name}: script tags not allowed")
            return False
        # Check for event handlers
        event_handlers = [
            "onclick",
            "onload",
            "onerror",
            "onmouseover",
            "onfocus",
            "onblur",
            "onchange",
            "onsubmit",
        ]
        for handler in event_handlers:
            if re.search(rf"\b{handler}\s*=", content, re.IGNORECASE):
                self.add_error(f"Security issue in {name}: event handler '{handler}' not allowed")
                return False
        # Check for iframe injection
        if re.search(r"<iframe[^>]*>", content, re.IGNORECASE):
            self.add_error(f"Security issue in {name}: iframe tags not allowed")
            return False
        return True
    def validate_prefix_security(self, prefix: str, name: str = "prefix") -> bool:
        """Validate prefix for security issues.

        Args:
            prefix: The prefix to validate
            name: The input name for error messages

        Returns:
            True if secure, False otherwise
        """
        if not prefix or prefix.strip() == "":
            return True
        # Only alphanumeric, dots, underscores, and hyphens
        if not re.match(r"^[a-zA-Z0-9_.-]*$", prefix):
            self.add_error(f"Security issue in {name}: '{prefix}' contains invalid characters")
            return False
        return True
    def validate_no_injection(self, value: str, name: str = "input") -> bool:
        """Comprehensive injection detection.

        Args:
            value: The value to check
            name: The input name for error messages

        Returns:
            True if no injection patterns found, False otherwise
        """
        if not value or value.strip() == "":
            return True
        # Allow GitHub expressions (they're safe in Actions context)
        if self.is_github_expression(value):
            return True
        # Check for command injection patterns
        if not self.validate_security_patterns(value, name):
            return False
        # Check for single & (background execution)
        if re.search(r"(?<!&)&(?!&)", value):
            self.add_error(f"Background execution pattern '&' detected in {name}")
            return False
        # Check for advanced injection patterns
        if not self.validate_injection_patterns(value, name):
            return False
        # Check for SQL injection patterns
        sql_patterns = [
            r"'\s*OR\s+'[^']*'\s*=\s*'[^']*",  # ' OR '1'='1
            r'"\s*OR\s+"[^"]*"\s*=\s*"[^"]*',  # " OR "1"="1
            r"'\s*OR\s+\d+\s*=\s*\d+",  # ' OR 1=1
            r";\s*DROP\s+TABLE",  # ; DROP TABLE
            r";\s*DELETE\s+FROM",  # ; DELETE FROM
            r"UNION\s+SELECT",  # UNION SELECT
            r"--\s*$",  # SQL comment at end
            r";\s*EXEC\s+",  # ; EXEC
            r"xp_cmdshell",  # SQL Server command execution
        ]
        for pattern in sql_patterns:
            if re.search(pattern, value, re.IGNORECASE):
                self.add_error(f"SQL injection pattern detected in {name}")
                return False
        # Check for script injection patterns
        return self.validate_content_security(value, name)
    def validate_safe_command(self, command: str, name: str = "command") -> bool:
        """Validate that a command is safe to execute.

        Args:
            command: The command to validate
            name: The input name for error messages

        Returns:
            True if command appears safe, False otherwise
        """
        if not command or command.strip() == "":
            return True
        # Allow GitHub expressions (they're safe in Actions context)
        if self.is_github_expression(command):
            return True
        # Use existing command security validation
        if not self.validate_command_security(command, name):
            return False
        # Check for dangerous redirect to device files
        if re.search(r">\s*/dev/", command):
            self.add_error(f"Security issue in {name}: redirect to device file not allowed")
            return False
        # Check for filesystem creation commands
        if re.search(r"\bmkfs", command, re.IGNORECASE):
            self.add_error(f"Security issue in {name}: filesystem creation commands not allowed")
            return False
        # Additional checks for safe commands
        # Block shell metacharacters that could be dangerous
        dangerous_chars = ["&", "|", ";", "$", "`", "\\", "!", "{", "}", "[", "]", "(", ")"]
        for char in dangerous_chars:
            if char in command:
                # Allow some safe uses
                if char == "&" and "&&" not in command and "&>" not in command:
                    continue
                self.add_error(f"Potentially dangerous character '{char}' in {name}")
                return False
        return True
    def validate_safe_environment_variable(self, value: str, name: str = "env_var") -> bool:
        """Validate environment variable value for security.

        Args:
            value: The environment variable value
            name: The input name for error messages

        Returns:
            True if safe, False otherwise
        """
        if not value or value.strip() == "":
            return True

        # Check for command substitution ("$(...)", backticks) and "${...}" expansion
        if "$(" in value or "`" in value or "${" in value:
            self.add_error(f"Command substitution not allowed in environment variable {name}")
            return False

        # Check for newlines (could be used to inject multiple commands)
        if "\n" in value or "\r" in value:
            self.add_error(f"Newlines not allowed in environment variable {name}")
            return False

        # Check for null bytes (could be used for string termination attacks)
        if "\x00" in value:
            self.add_error(f"Null bytes not allowed in environment variable {name}")
            return False

        # Check for a destructive command chained after ";". The pattern itself
        # requires the ";", so no separate metacharacter pre-check is needed.
        if re.search(r";\s*(rm|del|format|shutdown|reboot)", value, re.IGNORECASE):
            self.add_error(f"Dangerous command pattern in environment variable {name}")
            return False
        return True

    # Alias for test compatibility
    def validate_safe_env_var(self, value: str, name: str = "env_var") -> bool:
        """Alias for validate_safe_environment_variable for test compatibility."""
        return self.validate_safe_environment_variable(value, name)

    def _check_github_tokens(self, value: str, name: str) -> bool:
        """Check for GitHub token patterns.

        Args:
            value: The value to check
            name: The input name for error messages

        Returns:
            True if no GitHub tokens found, False otherwise
        """
        github_token_patterns = [
            r"ghp_[a-zA-Z0-9]{36}",  # GitHub personal access token
            r"gho_[a-zA-Z0-9]{36}",  # GitHub OAuth token
            r"ghu_[a-zA-Z0-9]{36}",  # GitHub user token
            r"ghs_[a-zA-Z0-9]{36}",  # GitHub server token
            r"ghr_[a-zA-Z0-9]{36}",  # GitHub refresh token
            r"github_pat_[a-zA-Z0-9_]{48,}",  # GitHub fine-grained PAT
        ]
        for pattern in github_token_patterns:
            if re.search(pattern, value):
                self.add_error(f"Potential GitHub token detected in {name}")
                return False
        return True

    def _check_api_keys(self, value: str, name: str) -> bool:
        """Check for API key patterns.

        Args:
            value: The value to check
            name: The input name for error messages

        Returns:
            True if no API keys found, False otherwise
        """
        api_key_patterns = [
            r"api[_-]?key\s*[:=]\s*['\"]?[a-zA-Z0-9]{20,}",  # Generic API key
            r"secret[_-]?key\s*[:=]\s*['\"]?[a-zA-Z0-9]{20,}",  # Secret key
            r"access[_-]?key\s*[:=]\s*['\"]?[a-zA-Z0-9]{20,}",  # Access key
        ]
        for pattern in api_key_patterns:
            if re.search(pattern, value, re.IGNORECASE):
                self.add_error(f"Potential API key detected in {name}")
                return False
        return True

    def _check_passwords(self, value: str, name: str) -> bool:
        """Check for password patterns.

        Args:
            value: The value to check
            name: The input name for error messages

        Returns:
            True if no passwords found, False otherwise
        """
        password_patterns = [
            r"password\s*[:=]\s*['\"]?[^\s'\"]{8,}",  # Password assignment
            r"passwd\s*[:=]\s*['\"]?[^\s'\"]{8,}",  # Passwd assignment
            r"pwd\s*[:=]\s*['\"]?[^\s'\"]{8,}",  # Pwd assignment
        ]
        for pattern in password_patterns:
            if re.search(pattern, value, re.IGNORECASE):
                self.add_error(f"Potential password detected in {name}")
                return False
        return True

    def _check_private_keys(self, value: str, name: str) -> bool:
        """Check for private key markers.

        Args:
            value: The value to check
            name: The input name for error messages

        Returns:
            True if no private keys found, False otherwise
        """
        private_key_markers = [
            "-----BEGIN RSA PRIVATE KEY-----",
            "-----BEGIN PRIVATE KEY-----",
            "-----BEGIN OPENSSH PRIVATE KEY-----",
            "-----BEGIN DSA PRIVATE KEY-----",
            "-----BEGIN EC PRIVATE KEY-----",
        ]
        for marker in private_key_markers:
            if marker in value:
                self.add_error(f"Private key detected in {name}")
                return False
        return True

    def _check_encoded_secrets(self, value: str, name: str) -> bool:
        """Check for Base64 encoded secrets.

        Args:
            value: The value to check
            name: The input name for error messages

        Returns:
            True if no encoded secrets found, False otherwise
        """
        # Flag any run of 40+ base64 characters (optionally "="-padded).
        # No keyword context (secret, key, token, password) is required, so long
        # hex digests such as 40-character commit SHAs also match.
        if re.search(r"[A-Za-z0-9+/]{40,}={0,2}", value):
            self.add_error(f"Potential encoded secret detected in {name}")
            return False
        return True

    def validate_no_secrets(self, value: str, name: str = "input") -> bool:
        """Validate that no secrets or sensitive data are present.

        Args:
            value: The value to check
            name: The input name for error messages

        Returns:
            True if no secrets detected, False otherwise
        """
        if not value or value.strip() == "":
            return True

        # Run all secret detection checks; evaluation stops at the first failure
        return (
            self._check_github_tokens(value, name)
            and self._check_api_keys(value, name)
            and self._check_passwords(value, name)
            and self._check_private_keys(value, name)
            and self._check_encoded_secrets(value, name)
        )

    def _check_command_injection_in_regex(self, pattern: str, name: str) -> bool:
        """Check for command injection patterns in regex.

        Args:
            pattern: The regex pattern to validate
            name: The input name for error messages

        Returns:
            True if safe, False if command injection detected
        """
        dangerous_cmd_patterns = [
            r";\s*(rm|del|cat|whoami|id|pwd|ls|curl|wget|nc|bash|sh|cmd)",
            r"&&\s*(rm|del|cat|whoami|id|pwd|ls|curl|wget|nc|bash|sh|cmd)",
            r"\|\s*(sh|bash|cmd)\b",
            r"`[^`]+`",
            r"\$\([^)]+\)",
        ]
        for cmd_pattern in dangerous_cmd_patterns:
            if re.search(cmd_pattern, pattern, re.IGNORECASE):
                self.add_error(f"Command injection detected in {name}")
                return False
        return True

    def _check_nested_quantifiers(self, pattern: str, name: str) -> bool:
        """Check for nested quantifiers that can cause ReDoS.

        Args:
            pattern: The regex pattern to validate
            name: The input name for error messages

        Returns:
            True if safe, False if nested quantifiers detected
        """
        nested_quantifier_patterns = [
            r"\([^)]*[+*]\)[+*{]",  # (x+)+ or (x*)* or (x+){n,m}
            r"\([^)]*\{[0-9,]+\}\)[+*{]",  # (x{n,m})+ or (x{n,m})*
            r"\([^)]*[+*]\)\{",  # (x+){n,m}
        ]
        for redos_pattern in nested_quantifier_patterns:
            if re.search(redos_pattern, pattern):
                self.add_error(
                    f"ReDoS risk detected in {name}: nested quantifiers can cause "
                    "catastrophic backtracking. Avoid patterns like (a+)+, (a*)*, or (a+){n,m}"
                )
                return False
        return True

    def _check_duplicate_alternatives(self, alt1: str, alt2: str, group: str, name: str) -> bool:
        """Check if two alternatives are exact duplicates.

        Args:
            alt1: First alternative
            alt2: Second alternative
            group: The full group string for error message
            name: The input name for error messages

        Returns:
            True if not duplicates, False if duplicates detected
        """
        if alt1 == alt2:
            self.add_error(
                f"ReDoS risk detected in {name}: duplicate alternatives "
                f"in repeating group '({group})' can cause "
                "catastrophic backtracking"
            )
            return False
        return True

    def _check_overlapping_alternatives(self, alt1: str, alt2: str, group: str, name: str) -> bool:
        """Check if two alternatives have prefix overlap.

        Args:
            alt1: First alternative
            alt2: Second alternative
            group: The full group string for error message
            name: The input name for error messages

        Returns:
            True if no overlap, False if overlap detected
        """
        if alt1.startswith(alt2) or alt2.startswith(alt1):
            self.add_error(
                f"ReDoS risk detected in {name}: overlapping alternatives "
                f"in repeating group '({group})' can cause "
                "catastrophic backtracking"
            )
            return False
        return True

    def _validate_alternative_pairs(self, alternatives: list[str], group: str, name: str) -> bool:
        """Validate all pairs of alternatives for duplicates and overlaps.

        Args:
            alternatives: List of alternatives to check
            group: The full group string for error message
            name: The input name for error messages

        Returns:
            True if all pairs are safe, False otherwise
        """
        for i, alt1 in enumerate(alternatives):
            for alt2 in alternatives[i + 1 :]:
                # Check for exact duplicates
                if not self._check_duplicate_alternatives(alt1, alt2, group, name):
                    return False
                # Check for prefix overlaps
                if not self._check_overlapping_alternatives(alt1, alt2, group, name):
                    return False
        return True

    def _check_alternation_repetition(self, pattern: str, name: str) -> bool:
        """Check for alternation with repetition that can cause ReDoS.

        Args:
            pattern: The regex pattern to validate
            name: The input name for error messages

        Returns:
            True if safe, False if problematic alternation detected
        """
        alternation_repetition = r"\([^)]*\|[^)]*\)[+*{]"
        if not re.search(alternation_repetition, pattern):
            return True

        # Check if alternatives overlap (basic heuristic)
        matches = re.finditer(r"\(([^)]*\|[^)]*)\)[+*{]", pattern)
        for match in matches:
            alternatives = match.group(1).split("|")
            # Validate all pairs of alternatives
            if not self._validate_alternative_pairs(alternatives, match.group(1), name):
                return False
        return True

    def _check_consecutive_quantifiers(self, pattern: str, name: str) -> bool:
        """Check for consecutive quantifiers that can cause ReDoS.

        Args:
            pattern: The regex pattern to validate
            name: The input name for error messages

        Returns:
            True if safe, False if consecutive quantifiers detected
        """
        # Matches ".", "+" or "*" immediately followed by "+", "*" or "{".
        # Because "." is in the first class, a plain ".*" or ".+" is flagged too.
        consecutive_quantifiers = r"[.+*][+*{]"
        if re.search(consecutive_quantifiers, pattern):
            self.add_error(
                f"ReDoS risk detected in {name}: consecutive quantifiers like .*.* or .*+ "
                "can cause catastrophic backtracking"
            )
            return False
        return True

    def _check_exponential_quantifiers(self, pattern: str, name: str) -> bool:
        """Check for exponential quantifier combinations that can cause ReDoS.

        Args:
            pattern: The regex pattern to validate
            name: The input name for error messages

        Returns:
            True if safe, False if exponential quantifiers detected
        """
        depth = 0
        max_depth = 0
        quantifier_depth_count = 0
        for i, char in enumerate(pattern):
            if char == "(":
                depth += 1
                max_depth = max(max_depth, depth)
                # Count the group if its closing paren is followed by a quantifier
                closing_idx = self._find_closing_paren(pattern, i)
                if closing_idx != -1 and closing_idx + 1 < len(pattern):
                    next_char = pattern[closing_idx + 1]
                    if next_char in "+*{":
                        quantifier_depth_count += 1
            elif char == ")":
                depth -= 1

        # Flag patterns nested at least 3 groups deep with 3+ quantified groups
        if max_depth > 2 and quantifier_depth_count >= 3:
            self.add_error(
                f"ReDoS risk detected in {name}: deeply nested groups with multiple "
                "quantifiers can cause catastrophic backtracking"
            )
            return False
        return True

    def validate_regex_pattern(self, pattern: str, name: str = "regex") -> bool:
        """Validate regex pattern for ReDoS vulnerabilities.

        Detects potentially dangerous regex patterns that could cause
        Regular Expression Denial of Service (ReDoS) through catastrophic
        backtracking. Command injection patterns embedded in the regex are
        rejected as well.

        Args:
            pattern: The regex pattern to validate
            name: The input name for error messages

        Returns:
            True if pattern appears safe, False if a risk is detected
        """
        if not pattern or pattern.strip() == "":
            return True

        # Allow GitHub expressions
        if self.is_github_expression(pattern):
            return True

        # Run the command injection check first, then the ReDoS checks;
        # return at the first failure so only one error is recorded.
        if not self._check_command_injection_in_regex(pattern, name):
            return False
        if not self._check_nested_quantifiers(pattern, name):
            return False
        if not self._check_alternation_repetition(pattern, name):
            return False
        if not self._check_consecutive_quantifiers(pattern, name):
            return False
        return self._check_exponential_quantifiers(pattern, name)

    def _find_closing_paren(self, pattern: str, start: int) -> int:
        """Find the closing parenthesis for an opening one.

        Args:
            pattern: The regex pattern
            start: The index of the opening parenthesis

        Returns:
            Index of the closing parenthesis, or -1 if not found
        """
        if start >= len(pattern) or pattern[start] != "(":
            return -1

        # Track nesting depth; the match is the ")" that brings depth back to 0
        depth = 1
        for i in range(start + 1, len(pattern)):
            if pattern[i] == "(":
                depth += 1
            elif pattern[i] == ")":
                depth -= 1
                if depth == 0:
                    return i
        return -1