Roadmap to v1.0.0#
Current Version: 0.4.0-dev (Beta Track) Target: v1.0.0 Stable Release
This roadmap outlines the development path for TorchFX from the current beta state to a production-ready v1.0.0 release. The plan is organized into major epics, each containing specific deliverables and tasks.
Vision#
TorchFX v1.0.0 will be a production-ready, GPU-accelerated audio DSP library with:
Real-time processing capabilities for live audio (microphone/instrument input)
Modern CLI tool combining sox compatibility with GPU acceleration
Optimized performance through custom CUDA kernels
Professional documentation with comprehensive tutorials and API reference
>90% test coverage with integration and audio quality tests
Semantic versioning with backward compatibility guarantees
Current State#
Strengths#
✅ Solid core DSP architecture (~2000 LOC)
✅ GPU acceleration working
✅ 393 tests with >90% coverage
✅ Published research paper (arXiv:2504.08624)
✅ Clean API with pipe operator support
✅ Professional Sphinx documentation with tutorials
✅ Real-time audio processing with circular buffers
✅ Full-featured CLI with sox compatibility
✅ Interactive REPL with live performance mode
✅ Complete validation and logging infrastructure
✅ API stability guarantees with deprecation system
Gaps#
❌ No custom CUDA kernels yet
❌ Limited ML integration examples
❌ Missing some advanced effects (compressor, phaser, pitch shift)
❌ No VST3 wrapper
Estimated Completion: ~85% ready for v1.0.0
Epic 1: Core Library Stabilization#
Priority: Critical (Foundation) Goal: Complete essential features and stabilize the public API with semantic versioning guarantees.
1.1 Complete Missing Core Features#
Implement Wave.save() / to_file() method
✅ Support formats: WAV, FLAC (OGG/MP3/AAC require additional backend configuration)
✅ High bit-depth: 32-bit float, 64-bit float (8, 16, 24, 32, 64 bits supported)
✅ High sample rates: up to 192kHz+ (tested with 96kHz and 192kHz)
✅ Metadata preservation (automatic extraction and storage via torchaudio.info)
Implementation details:
Uses torchaudio.save() as backend
Automatic parent directory creation
Format inference from file extension
CPU tensor conversion for compatibility
Comprehensive test suite (16 tests, 1 skipped for OGG)
Complete LoShelving filter
✅ Implemented following HiShelving pattern
✅ Uses Audio EQ Cookbook formulas
✅ Supports both linear and dB gain scales
✅ Full test coverage (7 tests)
Add professional filters
✅ Parametric EQ (essential for music production)
Bell-shaped peaking filter with configurable Q and gain
Intuitive interface: frequency, Q, gain in dB
Perfect for surgical frequency adjustments
✅ Elliptic filters (HiElliptic, LoElliptic)
Sharpest transition for given order
Configurable passband ripple and stopband attenuation
Optimal for applications where phase is not critical
State variable filters (TPT) - deferred to future version
1.2 API Stabilization#
Audit and freeze public API
✅ Marked all public classes in
__all__exports✅ Created api stability with backward compatibility guarantees
✅ Implemented deprecation warning system with decorators (
@deprecated,@deprecated_parameter,DeprecatedAlias)✅ Full test coverage (9 tests) for deprecation utilities
Implement semantic versioning policy
✅ Documented policy: No breaking changes in minor versions (1.x.x)
✅ Deprecation warnings for at least one minor version before removal
✅ Created migration guide template with migration patterns
✅ Added versioning examples and guidelines
Parameter naming consistency
✅ Standardized naming conventions documented in style guide:
cutofffor lowpass/highpass/shelving filtersfrequencyfor ParametricEQ (center frequency)Q(uppercase) for Peaking, Notch, AllPass (mathematical convention)q(lowercase) for Shelving, ParametricEQ (industry convention)gainwithgain_scalefor units (“linear” or “db”)fsfor sampling frequency
✅ Style guide includes naming, units, code organization, and documentation standards
1.3 Error Handling & Validation#
Input validation layer
✅ Validate sample rates, tensor shapes, parameter ranges
✅ Custom exception hierarchy:
TorchFXError,InvalidParameterError,AudioProcessingErrorImplementation details:
New
torchfx.validationsubpackage with exceptions and validatorsException hierarchy:
TorchFXError(base),InvalidParameterError,InvalidSampleRateError,InvalidRangeError,InvalidShapeError,InvalidTypeError,AudioProcessingError,CoefficientComputationError,FilterInstabilityErrorValidators:
validate_sample_rate,validate_positive,validate_range,validate_in_set,validate_tensor_ndim,validate_audio_tensor,validate_type,validate_cutoff_frequency,validate_filter_order,validate_q_factorFull test coverage (76 tests)
Improved error messages
✅ Context-aware messages with actual vs. expected values
✅ Suggestions for fixes
Implementation details:
Built into the validation exception classes (parameter_name, actual_value, expected, suggestion fields)
All exceptions format messages with full context automatically
Logging infrastructure
✅ Structured logging with Python’s
loggingmodule✅ Log levels: DEBUG, INFO, WARNING, ERROR
✅ Performance logging (optional)
Implementation details:
New
torchfx.loggingsubpackageNullHandler by default (opt-in logging per Python guidelines)
Convenience functions:
enable_logging(),enable_debug_logging(),disable_logging(),get_logger()Performance utilities:
log_performance()context manager,LogPerformancedecoratorHierarchical loggers:
torchfx,torchfx.performance,torchfx.<module>Full test coverage (25 tests)
Epic 2: Real-Time Audio Processing#
Priority: Critical (Major Feature) Goal: Enable low-latency live audio processing with GPU acceleration.
2.1 Audio Backend Integration#
Abstract audio backend interface
✅
AudioBackendABC with lifecycle methods (open, start, stop, close)✅ Support input, output, duplex streams via
StreamConfigandStreamDirection✅ Callback-based and blocking APIs
Implementation details:
New
torchfx.realtimesubpackageStreamConfigfrozen dataclass with direction inference and latency calculationAudioCallbacktype alias forCallable[[Tensor, Tensor, int], None]Full test coverage (62 tests)
PortAudio backend (Priority 1)
✅
SoundDeviceBackendusingsounddevicelibrary✅ Cross-platform support (Linux, macOS, Windows)
✅ Buffer size: configurable (64-4096+ samples)
✅ Optional dependency — core library works without sounddevice
Implementation details:
Lazy import via
_compat.pymoduleNumpy-to-tensor zero-copy conversion in callback wrapper
Device enumeration and default device selection
sounddevicein optional dependency grouprealtime
PulseAudio/PipeWire backend (Priority 2) - deferred to future version
Native Linux desktop integration
JACK backend (Future) - deferred to future version
Professional Linux audio routing
2.2 Real-Time Processing Pipeline#
Ring buffer implementation
✅ Lock-free SPSC
TensorRingBufferon PyTorch tensors✅ GPU-compatible tensor buffers (configurable device)
✅ Overlap-add support via
peek()+advance_read()Implementation details:
Power-of-2 capacity with bitwise modular arithmetic
Pre-allocated
(channels, capacity)backing tensorSeparate read/write indices (SPSC model)
Wrap-around handling with split copy operations
Real-time processor class
✅
RealtimeProcessororchestrating backend + effect chain✅
start(),stop(),set_parameter(name, value)(thread-safe)✅ Automatic
fspropagation and coefficient computation✅
reset_state()for clearing filter statesImplementation details:
Double-buffered parameter updates (lock only on swap)
Audio callback processes effects in sequence
Mono-to-stereo expansion for channel mismatch
Ring buffers for input/output queuing
Latency optimization
✅ Target: <10ms total latency at 48kHz, 512 buffer (~10.7ms theoretical)
✅ Pre-allocated tensor buffers via ring buffer
✅ Lock-free audio path (parameters applied at buffer boundaries)
Stream processing for large files
✅
StreamProcessorwith chunk-based processing✅
process_file()for file-to-file processing✅
process_chunks()generator API for streaming pipelines✅ GPU acceleration support (device parameter)
Implementation details:
Uses
torchaudio.load(frame_offset, num_frames)for efficient chunk readingUses
soundfile.SoundFilefor append-mode writingConfigurable overlap for overlap-add processing
2.3 Real-Time Effect Adaptations#
Stateful filter management
✅
reset_state()method on RealtimeProcessor✅ Ring buffer clear on state reset
Thread-safe parameter updates
✅ Double-buffered parameter dict with lock-on-swap
✅ Parameters applied atomically at buffer boundaries
✅ Automatic coefficient recomputation for filter parameters
CPU/GPU hybrid processing
✅ StreamProcessor supports configurable device (“cpu” or “cuda”)
✅ Automatic CPU↔GPU tensor transfers in stream processing
✅ Real-time processor operates on CPU for low-latency callback
Epic 3: CLI Application#
Priority: High (Major Feature) ✅ COMPLETED Goal: Modern, GPU-accelerated CLI tool with sox compatibility and unique features.
3.1 Core CLI Architecture ✅#
CLI framework with Typer
✅ Commands:
process,info,play,record,convert,trim,concat,stats,preset,interactive,watch✅ Global options:
--device,--verbose,--config,--version✅ Rich output with progress bars and tables
Implementation details:
Typer with Rich markup mode
Global state management via callback
Lazy imports for heavy dependencies (torch, sounddevice)
71 CLI tests (356 total tests)
Subcommand structure
torchfx process input.wav output.wav --effect reverb torchfx info audio.flac torchfx play audio.wav torchfx record output.wav --duration 10 torchfx interactive # REPL mode torchfx watch ./input/ ./output/ --effect normalize
Configuration file support (TOML)
✅ Save/load effect chains from TOML files
✅ Preset management:
~/.config/torchfx/presets/✅ TOML-only (tomllib stdlib on 3.11+, tomli fallback for 3.10)
✅
[[effects]]format compatible with presets
3.2 Pipeline Processing & Sox Compatibility ✅#
Unix pipe support
✅ Read from stdin:
cat audio.wav | torchfx process - - -e normalize✅ Write to stdout:
torchfx process input.wav - -e normalize | aplay✅ WAV and raw format support for pipes
Batch processing
torchfx process "*.wav" --output-dir ./processed/ --effect normalize
✅ Glob pattern matching
✅ Rich progress bar with ETA
✅ Error handling per file
✅ GPU acceleration support
Sox-compatible commands
✅
convert— format/rate/channel conversion✅
trim— extract time ranges✅
concat— join multiple files✅
stats— signal statistics (peak/RMS dBFS, crest factor, DC offset)
GPU-accelerated batch processing
✅ Auto-batch via
--device cuda✅ StreamProcessor with chunked processing
✅ Progress bar with completion ETA
3.3 Interactive Mode (REPL) ✅#
Interactive shell
✅ prompt_toolkit with tab completion
✅ Syntax highlighting and Rich formatting
✅ Persistent command history (
~/.config/torchfx/repl_history)✅ Effect name and command completion
Live parameter tweaking
torchfx> load song.wav torchfx> add reverb:decay=0.5 torchfx> live ▶ Live playback started (2 ch, 44100 Hz, looping) torchfx> add normalize # ← Effect applies immediately during playback! torchfx> preset load mastering # ← Entire chain switches in real-time torchfx> live stop
✅ Lock-free circular buffer pattern
✅ Real-time effect hot-swapping
✅ Audio loops continuously
✅ Changes apply at buffer boundaries (~46ms latency)
Commands:
load,add,remove,list,effects,info,play,play raw,live,live stop,save,preset save/load/list,clear,help,exitReal-time visualization — deferred to v1.1
Waveform display
Spectrum analyzer
VU meters
Preset management
✅ Save/load/list/show/delete/apply presets
✅ TOML format compatible with
--config✅ Works in both CLI and REPL
3.4 Watch Mode & Automation ✅#
File system watcher
torchfx watch ./input/ --output ./processed/ --effect reverb torchfx watch ./bounces/ --preset mastering --recursive
✅ Watchdog-based file monitoring
✅ Auto-process new/modified audio files
✅ Recursive directory watching
✅ Process existing files on startup (–existing flag)
✅ Preset and config file support
DAW integration mode
✅ Monitor export folder
✅ Auto-apply mastering chain from preset
✅ Rich status display with live updates
Epic 4: Performance Optimization & CUDA#
Priority: Medium (Can be v1.1) Goal: Maximize throughput with custom CUDA kernels.
4.1 CUDA Development Infrastructure#
CUDA extension build system
PyTorch C++ extension API
Auto CUDA arch detection
Fallback to PyTorch if CUDA unavailable
Kernel development tools
CUDA profiling integration (nvprof, Nsight)
Unit tests for CUDA kernels
Benchmarking harness
4.2 IIR Filter CUDA Kernels (Priority 1)#
Parallel IIR implementation
Parallel prefix scan for state propagation
Target: 2-3x speedup for batch processing
Biquad cascade optimization
Fuse multiple biquad sections
Reduce memory traffic
Stability guarantees
Match scipy/PyTorch numerical behavior
4.3 Time-Domain Effects CUDA Kernels (Priority 2)#
Optimized delay line
Circular buffer with shared memory
Interpolation for fractional delays
Reverb optimization
Parallel all-pass filters
Fused feedback delay network
4.4 Batch Processing Optimizations (Priority 3)#
Multi-file batch processing
Process multiple files in single kernel launch
Maximize GPU occupancy
Operator fusion
Fuse multiple effects:
gain + filter + normalize→ single kernel
Memory optimization
Tensor memory pooling
In-place operations
4.5 Performance Benchmarking#
Comprehensive benchmark suite
PyTorch vs. CUDA vs. CPU comparison
Report throughput and latency
Performance regression testing
Automated benchmarks in CI
Alert on >5% regression
Profiling guides
Documentation for profiling pipelines
Epic 5: Comprehensive Documentation#
Priority: Critical (Continuous) Goal: Professional-grade documentation for v1.0.0 release.
5.1 API Reference Completion#
Complete all docstrings
Every public class, method, function
Parameters with types and ranges
Examples in docstrings
Mathematical formulas in LaTeX
Fix API documentation bugs
Remove non-existent method references
Update all code examples
Validate examples run
Auto-generated API reference
Sphinx autodoc with Napoleon
Type hints rendered
Cross-references
5.2 Tutorial & Guide Documentation#
Getting Started Tutorial (expanded)
✅ Installation
✅ First pipeline
✅ Wave class basics
✅ Saving output
CLI Guide ✅ NEW
✅ Complete CLI tutorial covering all commands
✅ Effect specifications format
✅ TOML configuration examples
✅ Preset management workflows
✅ Interactive REPL with live performance mode
✅ Watch mode for DAW integration
✅ Unix pipe examples
Advanced Tutorials
Real-time audio processing (partially covered in CLI guide)
Custom filter design
GPU optimization
PyTorch model integration
How-To Guides
Audio format conversion
Building EQ/filter bank
Mastering chain
Multi-channel processing
Guitar pedal simulator
ML model integration
5.3 Example Gallery#
Expand examples
Vocal processing chain
Mastering pipeline
Guitar effect pedal
Podcast cleanup
Music production effects
Real-time effects
ML model integration
Interactive examples
Jupyter notebooks with audio playback
Parameter sliders
5.4 Project Documentation#
Contributing Guide
Code style and standards
Git workflow
Testing requirements
Architecture Documentation
High-level overview
Design patterns
Extension points
Migration Guides
Upgrading from 0.x to 1.0
API changes
FAQ & Troubleshooting
Common errors
Performance issues
CUDA/GPU troubleshooting
Epic 6: Testing & Quality Assurance#
Priority: Critical (Parallel with Epic 1) Goal: Achieve >90% test coverage with comprehensive testing.
6.1 Expand Unit Test Coverage#
Complete Wave class tests
✅ File I/O for all formats
✅ Multi-channel audio
✅ Sample rate conversion
✅ Device transfers
✅ Edge cases
✅ 72 Wave tests
Complete filter tests
✅ All filter types (IIR, FIR, Biquad)
✅ Frequency/phase response validation
✅ Filter composition
✅ Edge cases
✅ 85+ filter tests
Complete effect tests
✅ All effects and parameters
✅ Error handling
✅ 43 effect tests
CLI tests ✅ NEW
✅ All CLI commands (process, info, play, record, convert, trim, concat, stats)
✅ Preset management (save, load, list, show, delete, apply)
✅ REPL commands (add, remove, list, clear, load, save)
✅ Watch mode (file monitoring)
✅ 71 CLI tests
✅ Total: 393 tests with >90% coverage
6.2 Integration Tests#
Complex pipeline tests
✅ Multi-stage effect chains
✅ GPU end-to-end processing
✅ File load → process → save
Real-time processing tests
✅ Mock audio backend
✅ Latency measurements
✅ Parameter updates during processing
✅ 62 realtime tests
CLI integration tests
✅ All CLI commands
✅ Pipe I/O
✅ Batch processing
✅ Config file loading
✅ Preset workflows
6.3 Audio Quality Tests#
Audio quality metrics
SNR, THD, frequency response error
Compare against scipy/reference
Regression tests
Golden output files
Detect quality degradation
Perceptual quality tests (optional)
PESQ, PEAQ
6.4 Performance & Memory Tests#
Memory leak detection
Long-running tests
GPU memory monitoring
Performance benchmarks as tests
Minimum speed requirements
Prevent regressions
6.5 CI/CD Improvements#
Coverage reporting
Codecov integration
90% coverage enforcement
Coverage badge
Multi-platform testing
Linux, macOS, Windows
Python 3.10-3.13
With/without CUDA
GPU CI runner
Self-hosted or cloud GPU
CUDA tests and benchmarks
Automated releases
PyPI publishing on tag
Changelog generation
Epic 7: Additional Effects#
Priority: Low (Can be v1.1+) Goal: Expand effect library for common production needs.
7.1 Dynamics Processing#
Compressor (threshold, ratio, attack, release, knee)
Limiter (brickwall, true peak, look-ahead)
Expander / Gate
7.2 Modulation Effects#
Chorus (multi-tap delay with LFO)
Flanger (short delay with feedback)
Phaser (all-pass cascade with LFO)
Tremolo / Vibrato
7.3 Distortion & Saturation#
Overdrive / Distortion (soft/hard clipping)
Waveshaping (custom transfer functions)
Bitcrusher (bit depth/sample rate reduction)
7.4 Pitch & Time Manipulation#
Pitch Shifting (phase vocoder)
Time Stretching (tempo change)
Formant Shifting
7.5 Spatial Audio#
Stereo Widening (mid-side, Haas effect)
Panning (constant power, 3D)
Binaural Audio (HRTF)
Implementation Phases#
Phase 1: Foundation ✅ COMPLETED#
Priority: Critical
Epic 1: Core Library Stabilization ✅
✅ Complete missing features
✅ API stabilization
✅ Error handling
Epic 6: Testing Infrastructure ✅
✅ Expand unit tests (393 tests, >90% coverage)
✅ CI improvements
Phase 2: Major Features ✅ COMPLETED#
Priority: Critical
Epic 2: Real-Time Audio Processing ✅
✅ Audio backends (SoundDevice)
✅ Real-time pipeline with circular buffers
✅ Thread-safe parameter updates
✅ Stream processor for large files
Epic 3: CLI Application ✅
✅ Core CLI with 11 commands
✅ Pipeline processing (batch, pipes, watch)
✅ Interactive mode with live performance
✅ Preset management
✅ Sox-compatible commands
Epic 5: Documentation ✅
✅ Complete API reference
✅ CLI guide
✅ Tutorials and examples
✅ Migration guide and API stability docs
Phase 3: Optimization & Polish (v1.0)#
Priority: Medium — NEXT
Epic 4: CUDA Kernels (can start early)
IIR kernels (priority)
Effect kernels
Epic 7: Additional Effects
Can be added incrementally in v1.1+
Success Metrics for v1.0.0#
✅ API Stability: No breaking changes after v1.0.0 without major version bump
✅ Implemented deprecation system
✅ API stability guarantees documented
✅ Migration guide template created
✅ Test Coverage: >90% code coverage
✅ 393 tests across all modules
✅ Unit, integration, and CLI tests
✅ Documentation: 100% of public API documented with examples
✅ Complete API reference
✅ CLI guide with comprehensive examples
✅ Tutorials and how-to guides
✅ Performance:
✅ Real-time: 48kHz, 2048 buffer, ~46ms latency (tested in REPL)
⚠️ Batch: >100x real-time on modern GPU (needs CUDA kernels for further optimization)
✅ Platform Support: Linux, macOS, Windows with Python 3.10-3.13
✅ CI testing on multiple platforms
✅ CLI Functionality: All core commands working
✅ 11 commands implemented (process, info, play, record, convert, trim, concat, stats, preset, interactive, watch)
✅ Batch processing, pipes, TOML config, presets
✅ Community: Contributing guide, issue templates, active CI
✅ Style guide documented
✅ Roadmap maintained
Status: 6.5/7 metrics achieved — ready for v1.0.0 RC
Code Quality Standards#
TorchFX follows SOLID and DRY principles:
Single Responsibility: Each class has one clear purpose
Open/Closed: Extensible through inheritance and composition
Liskov Substitution: Consistent interfaces across similar classes
Interface Segregation: Narrow, focused interfaces
Dependency Inversion: Depend on abstractions, not implementations
Don’t Repeat Yourself: Shared utilities, reusable components
Future Considerations (Post-v1.0)#
Plugin system for third-party effects
ONNX export for deployment
Model Hub integration (HuggingFace)
Audio ML integration helpers
VST3 plugin wrapper (complex, long-term)
Contributing#
We welcome contributions! See the style guide for guidelines.
Current focus: Phase 3 (CUDA Optimization) & additional effects
Phase 1 & 2: ✅ COMPLETED
Good first issues: Check GitHub issues tagged
good-first-issueCLI Extension Ideas: Real-time visualization, AB comparison mode, spectrum analyzer
Questions: Open a discussion on GitHub