gibidify/fileproc/registry.go
Ismo Vuorinen 95b7ef6dd3 chore: modernize workflows, security scanning, and linting configuration (#50)
* build: update Go 1.25, CI workflows, and build tooling

- Upgrade to Go 1.25
- Add benchmark targets to Makefile
- Implement parallel gosec execution
- Lock tool versions for reproducibility
- Add shellcheck directives to scripts
- Update CI workflows with improved caching

* refactor: migrate from golangci-lint to revive

- Replace golangci-lint with revive for linting
- Configure comprehensive revive rules
- Fix all EditorConfig violations
- Add yamllint and yamlfmt support
- Remove deprecated .golangci.yml

* refactor: rename utils to shared and deduplicate code

- Rename utils package to shared
- Add shared constants package
- Deduplicate constants across packages
- Address CodeRabbit review feedback

* fix: resolve SonarQube issues and add safety guards

- Fix all 73 SonarQube OPEN issues
- Add nil guards for resourceMonitor, backpressure, metricsCollector
- Implement io.Closer for headerFileReader
- Propagate errors from processing helpers
- Add metrics and templates packages
- Improve error handling across codebase

* test: improve test infrastructure and coverage

- Add benchmarks for cli, fileproc, metrics
- Improve test coverage for cli, fileproc, config
- Refactor tests with helper functions
- Add shared test constants
- Fix test function naming conventions
- Reduce cognitive complexity in benchmark tests

* docs: update documentation and configuration examples

- Update CLAUDE.md with current project state
- Refresh README with new features
- Add usage and configuration examples
- Add SonarQube project configuration
- Consolidate config.example.yaml

* fix: resolve shellcheck warnings in scripts

- Use ./*.go instead of *.go to prevent dash-prefixed filenames
  from being interpreted as options (SC2035)
- Remove unreachable return statement after exit (SC2317)
- Remove obsolete gibidiutils/ directory reference

* chore(deps): upgrade go dependencies

* chore(lint): megalinter fixes

* fix: improve test coverage and fix file descriptor leaks

- Add defer r.Close() to fix pipe file descriptor leaks in benchmark tests
- Refactor TestProcessorConfigureFileTypes with helper functions and assertions
- Refactor TestProcessorLogFinalStats with output capture and keyword verification
- Use shared constants instead of literal strings (TestFilePNG, FormatMarkdown, etc.)
- Reduce cognitive complexity by extracting helper functions

* fix: align test comments with function names

Remove underscores from test comments to match actual function names:
- benchmark/benchmark_test.go (2 fixes)
- fileproc/filetypes_config_test.go (4 fixes)
- fileproc/filetypes_registry_test.go (6 fixes)
- fileproc/processor_test.go (6 fixes)
- fileproc/resource_monitor_types_test.go (4 fixes)
- fileproc/writer_test.go (3 fixes)

* fix: various test improvements and bug fixes

- Remove duplicate maxCacheSize check in filetypes_registry_test.go
- Shorten long comment in processor_test.go to stay under 120 chars
- Remove flaky time.Sleep in collector_test.go, use >= 0 assertion
- Close pipe reader in benchmark_test.go to fix file descriptor leak
- Use ContinueOnError in flags_test.go to match ResetFlags behavior
- Add nil check for p.ui in processor_workers.go before UpdateProgress
- Fix resource_monitor_validation_test.go by setting hardMemoryLimitBytes directly

* chore(yaml): add missing document start markers

Add --- document start to YAML files to satisfy yamllint:
- .github/workflows/codeql.yml
- .github/workflows/build-test-publish.yml
- .github/workflows/security.yml
- .github/actions/setup/action.yml

* fix: guard nil resourceMonitor and fix test deadlock

- Guard resourceMonitor before CreateFileProcessingContext call
- Add ui.UpdateProgress on emergency stop and path error returns
- Fix potential deadlock in TestProcessFile using wg.Go with defer close
2025-12-10 19:07:11 +02:00


// Package fileproc provides file processing utilities.
package fileproc

import (
	"path/filepath"
	"strings"
	"sync"

	"github.com/ivuorinen/gibidify/shared"
)

const minExtensionLength = 2

var (
	registry     *FileTypeRegistry
	registryOnce sync.Once
)

// FileTypeRegistry manages file type detection and classification.
type FileTypeRegistry struct {
	imageExts   map[string]bool
	binaryExts  map[string]bool
	languageMap map[string]string

	// Cache for frequent lookups to avoid repeated string operations
	extCache     map[string]string         // filename -> normalized extension
	resultCache  map[string]FileTypeResult // extension -> cached result
	cacheMutex   sync.RWMutex
	maxCacheSize int

	// Performance statistics
	stats RegistryStats
}

// RegistryStats tracks performance metrics for the registry.
type RegistryStats struct {
	TotalLookups   uint64
	CacheHits      uint64
	CacheMisses    uint64
	CacheEvictions uint64
}

// FileTypeResult represents cached file type detection results.
type FileTypeResult struct {
	IsImage   bool
	IsBinary  bool
	Language  string
	Extension string
}

// initRegistry initializes the default file type registry with common extensions.
func initRegistry() *FileTypeRegistry {
	return &FileTypeRegistry{
		imageExts:    getImageExtensions(),
		binaryExts:   getBinaryExtensions(),
		languageMap:  getLanguageMap(),
		extCache:     make(map[string]string, shared.FileTypeRegistryMaxCacheSize),
		resultCache:  make(map[string]FileTypeResult, shared.FileTypeRegistryMaxCacheSize),
		maxCacheSize: shared.FileTypeRegistryMaxCacheSize,
	}
}

// getRegistry returns the singleton file type registry, creating it if necessary.
func getRegistry() *FileTypeRegistry {
	registryOnce.Do(func() {
		registry = initRegistry()
	})

	return registry
}

// DefaultRegistry returns the default file type registry.
func DefaultRegistry() *FileTypeRegistry {
	return getRegistry()
}

// Stats returns a copy of the current registry statistics.
func (r *FileTypeRegistry) Stats() RegistryStats {
	r.cacheMutex.RLock()
	defer r.cacheMutex.RUnlock()

	return r.stats
}

// CacheInfo returns current cache size information.
func (r *FileTypeRegistry) CacheInfo() (extCacheSize, resultCacheSize, maxCacheSize int) {
	r.cacheMutex.RLock()
	defer r.cacheMutex.RUnlock()

	return len(r.extCache), len(r.resultCache), r.maxCacheSize
}

// ResetRegistryForTesting resets the registry to its initial state.
// This function should only be used in tests.
func ResetRegistryForTesting() {
	registryOnce = sync.Once{}
	registry = nil
}

// normalizeExtension extracts and normalizes the file extension.
func normalizeExtension(filename string) string {
	return strings.ToLower(filepath.Ext(filename))
}
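
// isSpecialFile reports whether an extensionless filename (for example
// Makefile) is listed in extensions, matching on the lowercased base name.
// Dotfiles such as .DS_Store are not matched here: filepath.Ext returns the
// whole name as their extension, so the first check fails for them.
func isSpecialFile(filename string, extensions map[string]bool) bool {
	if filepath.Ext(filename) == "" {
		basename := strings.ToLower(filepath.Base(filename))

		return extensions[basename]
	}

	return false
}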
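
The following standalone sketch illustrates how the two helpers above treat different kinds of filenames. It copies normalizeExtension and isSpecialFile into a separate main package so it can run on its own; the `special` map and the sample filenames are hypothetical and are not taken from the registry's real extension tables.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// normalizeExtension is a copy of the helper in registry.go.
func normalizeExtension(filename string) string {
	return strings.ToLower(filepath.Ext(filename))
}

// isSpecialFile is a copy of the helper in registry.go.
func isSpecialFile(filename string, extensions map[string]bool) bool {
	if filepath.Ext(filename) == "" {
		return extensions[strings.ToLower(filepath.Base(filename))]
	}

	return false
}

func main() {
	// Hypothetical lookup table, for illustration only.
	special := map[string]bool{"makefile": true, ".ds_store": true}

	names := []string{"photo.PNG", "Makefile", ".DS_Store", "archive.tar.gz"}
	for _, name := range names {
		fmt.Printf("%q: ext=%q special=%v\n",
			name, normalizeExtension(name), isSpecialFile(name, special))
	}
}
```

In this sketch, only "Makefile" is reported as special. ".DS_Store" is not, even though its lowercased name is in the map, because filepath.Ext treats the entire dotfile name as the extension; such names would have to be matched through their normalized extension (".ds_store") instead.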