Roadmap to v1.0.0#

Current Version: 0.4.0-dev (Beta Track) Target: v1.0.0 Stable Release

This roadmap outlines the development path for TorchFX from the current beta state to a production-ready v1.0.0 release. The plan is organized into major epics, each containing specific deliverables and tasks.

Vision#

TorchFX v1.0.0 will be a production-ready, GPU-accelerated audio DSP library with:

  • Real-time processing capabilities for live audio (microphone/instrument input)

  • Modern CLI tool combining sox compatibility with GPU acceleration

  • Optimized performance through custom CUDA kernels

  • Professional documentation with comprehensive tutorials and API reference

  • >90% test coverage with integration and audio quality tests

  • Semantic versioning with backward compatibility guarantees


Current State#

Strengths#

  • ✅ Solid core DSP architecture (~2000 LOC)

  • ✅ GPU acceleration working

  • ✅ 393 tests with >90% coverage

  • ✅ Published research paper (arXiv:2504.08624)

  • ✅ Clean API with pipe operator support

  • ✅ Professional Sphinx documentation with tutorials

  • ✅ Real-time audio processing with circular buffers

  • ✅ Full-featured CLI with sox compatibility

  • ✅ Interactive REPL with live performance mode

  • ✅ Complete validation and logging infrastructure

  • ✅ API stability guarantees with deprecation system

Gaps#

  • ❌ No custom CUDA kernels yet

  • ❌ Limited ML integration examples

  • ❌ Missing some advanced effects (compressor, phaser, pitch shift)

  • ❌ No VST3 wrapper

Estimated Completion: ~85% ready for v1.0.0


Epic 1: Core Library Stabilization#

Priority: Critical (Foundation) Goal: Complete essential features and stabilize the public API with semantic versioning guarantees.

1.1 Complete Missing Core Features#

  • Implement Wave.save() / to_file() method

    • ✅ Support formats: WAV, FLAC (OGG/MP3/AAC require additional backend configuration)

    • ✅ High bit-depth: 32-bit float, 64-bit float (8, 16, 24, 32, 64 bits supported)

    • ✅ High sample rates: up to 192kHz+ (tested with 96kHz and 192kHz)

    • ✅ Metadata preservation (automatic extraction and storage via torchaudio.info)

    • Implementation details:

      • Uses torchaudio.save() as backend

      • Automatic parent directory creation

      • Format inference from file extension

      • CPU tensor conversion for compatibility

      • Comprehensive test suite (16 tests, 1 skipped for OGG)

  • Complete LoShelving filter

    • ✅ Implemented following HiShelving pattern

    • ✅ Uses Audio EQ Cookbook formulas

    • ✅ Supports both linear and dB gain scales

    • ✅ Full test coverage (7 tests)

  • Add professional filters

    • Parametric EQ (essential for music production)

      • Bell-shaped peaking filter with configurable Q and gain

      • Intuitive interface: frequency, Q, gain in dB

      • Perfect for surgical frequency adjustments

    • Elliptic filters (HiElliptic, LoElliptic)

      • Sharpest transition for given order

      • Configurable passband ripple and stopband attenuation

      • Optimal for applications where phase is not critical

    • State variable filters (TPT) - deferred to future version

1.2 API Stabilization#

  • Audit and freeze public API

    • ✅ Marked all public classes in __all__ exports

    • ✅ Created api stability with backward compatibility guarantees

    • ✅ Implemented deprecation warning system with decorators (@deprecated, @deprecated_parameter, DeprecatedAlias)

    • ✅ Full test coverage (9 tests) for deprecation utilities

  • Implement semantic versioning policy

    • ✅ Documented policy: No breaking changes in minor versions (1.x.x)

    • ✅ Deprecation warnings for at least one minor version before removal

    • ✅ Created migration guide template with migration patterns

    • ✅ Added versioning examples and guidelines

  • Parameter naming consistency

    • ✅ Standardized naming conventions documented in style guide:

      • cutoff for lowpass/highpass/shelving filters

      • frequency for ParametricEQ (center frequency)

      • Q (uppercase) for Peaking, Notch, AllPass (mathematical convention)

      • q (lowercase) for Shelving, ParametricEQ (industry convention)

      • gain with gain_scale for units (“linear” or “db”)

      • fs for sampling frequency

    • ✅ Style guide includes naming, units, code organization, and documentation standards

1.3 Error Handling & Validation#

  • Input validation layer

    • ✅ Validate sample rates, tensor shapes, parameter ranges

    • ✅ Custom exception hierarchy: TorchFXError, InvalidParameterError, AudioProcessingError

    • Implementation details:

      • New torchfx.validation subpackage with exceptions and validators

      • Exception hierarchy: TorchFXError (base), InvalidParameterError, InvalidSampleRateError, InvalidRangeError, InvalidShapeError, InvalidTypeError, AudioProcessingError, CoefficientComputationError, FilterInstabilityError

      • Validators: validate_sample_rate, validate_positive, validate_range, validate_in_set, validate_tensor_ndim, validate_audio_tensor, validate_type, validate_cutoff_frequency, validate_filter_order, validate_q_factor

      • Full test coverage (76 tests)

  • Improved error messages

    • ✅ Context-aware messages with actual vs. expected values

    • ✅ Suggestions for fixes

    • Implementation details:

      • Built into the validation exception classes (parameter_name, actual_value, expected, suggestion fields)

      • All exceptions format messages with full context automatically

  • Logging infrastructure

    • ✅ Structured logging with Python’s logging module

    • ✅ Log levels: DEBUG, INFO, WARNING, ERROR

    • ✅ Performance logging (optional)

    • Implementation details:

      • New torchfx.logging subpackage

      • NullHandler by default (opt-in logging per Python guidelines)

      • Convenience functions: enable_logging(), enable_debug_logging(), disable_logging(), get_logger()

      • Performance utilities: log_performance() context manager, LogPerformance decorator

      • Hierarchical loggers: torchfx, torchfx.performance, torchfx.<module>

      • Full test coverage (25 tests)


Epic 2: Real-Time Audio Processing#

Priority: Critical (Major Feature) Goal: Enable low-latency live audio processing with GPU acceleration.

2.1 Audio Backend Integration#

  • Abstract audio backend interface

    • AudioBackend ABC with lifecycle methods (open, start, stop, close)

    • ✅ Support input, output, duplex streams via StreamConfig and StreamDirection

    • ✅ Callback-based and blocking APIs

    • Implementation details:

      • New torchfx.realtime subpackage

      • StreamConfig frozen dataclass with direction inference and latency calculation

      • AudioCallback type alias for Callable[[Tensor, Tensor, int], None]

      • Full test coverage (62 tests)

  • PortAudio backend (Priority 1)

    • SoundDeviceBackend using sounddevice library

    • ✅ Cross-platform support (Linux, macOS, Windows)

    • ✅ Buffer size: configurable (64-4096+ samples)

    • ✅ Optional dependency — core library works without sounddevice

    • Implementation details:

      • Lazy import via _compat.py module

      • Numpy-to-tensor zero-copy conversion in callback wrapper

      • Device enumeration and default device selection

      • sounddevice in optional dependency group realtime

  • PulseAudio/PipeWire backend (Priority 2) - deferred to future version

    • Native Linux desktop integration

  • JACK backend (Future) - deferred to future version

    • Professional Linux audio routing

2.2 Real-Time Processing Pipeline#

  • Ring buffer implementation

    • ✅ Lock-free SPSC TensorRingBuffer on PyTorch tensors

    • ✅ GPU-compatible tensor buffers (configurable device)

    • ✅ Overlap-add support via peek() + advance_read()

    • Implementation details:

      • Power-of-2 capacity with bitwise modular arithmetic

      • Pre-allocated (channels, capacity) backing tensor

      • Separate read/write indices (SPSC model)

      • Wrap-around handling with split copy operations

  • Real-time processor class

    • RealtimeProcessor orchestrating backend + effect chain

    • start(), stop(), set_parameter(name, value) (thread-safe)

    • ✅ Automatic fs propagation and coefficient computation

    • reset_state() for clearing filter states

    • Implementation details:

      • Double-buffered parameter updates (lock only on swap)

      • Audio callback processes effects in sequence

      • Mono-to-stereo expansion for channel mismatch

      • Ring buffers for input/output queuing

  • Latency optimization

    • ✅ Target: <10ms total latency at 48kHz, 512 buffer (~10.7ms theoretical)

    • ✅ Pre-allocated tensor buffers via ring buffer

    • ✅ Lock-free audio path (parameters applied at buffer boundaries)

  • Stream processing for large files

    • StreamProcessor with chunk-based processing

    • process_file() for file-to-file processing

    • process_chunks() generator API for streaming pipelines

    • ✅ GPU acceleration support (device parameter)

    • Implementation details:

      • Uses torchaudio.load(frame_offset, num_frames) for efficient chunk reading

      • Uses soundfile.SoundFile for append-mode writing

      • Configurable overlap for overlap-add processing

2.3 Real-Time Effect Adaptations#

  • Stateful filter management

    • reset_state() method on RealtimeProcessor

    • ✅ Ring buffer clear on state reset

  • Thread-safe parameter updates

    • ✅ Double-buffered parameter dict with lock-on-swap

    • ✅ Parameters applied atomically at buffer boundaries

    • ✅ Automatic coefficient recomputation for filter parameters

  • CPU/GPU hybrid processing

    • ✅ StreamProcessor supports configurable device (“cpu” or “cuda”)

    • ✅ Automatic CPU↔GPU tensor transfers in stream processing

    • ✅ Real-time processor operates on CPU for low-latency callback


Epic 3: CLI Application#

Priority: High (Major Feature) ✅ COMPLETED Goal: Modern, GPU-accelerated CLI tool with sox compatibility and unique features.

3.1 Core CLI Architecture ✅#

  • CLI framework with Typer

    • ✅ Commands: process, info, play, record, convert, trim, concat, stats, preset, interactive, watch

    • ✅ Global options: --device, --verbose, --config, --version

    • ✅ Rich output with progress bars and tables

    • Implementation details:

      • Typer with Rich markup mode

      • Global state management via callback

      • Lazy imports for heavy dependencies (torch, sounddevice)

      • 71 CLI tests (356 total tests)

  • Subcommand structure

    torchfx process input.wav output.wav --effect reverb
    torchfx info audio.flac
    torchfx play audio.wav
    torchfx record output.wav --duration 10
    torchfx interactive  # REPL mode
    torchfx watch ./input/ ./output/ --effect normalize
    
  • Configuration file support (TOML)

    • ✅ Save/load effect chains from TOML files

    • ✅ Preset management: ~/.config/torchfx/presets/

    • ✅ TOML-only (tomllib stdlib on 3.11+, tomli fallback for 3.10)

    • [[effects]] format compatible with presets

3.2 Pipeline Processing & Sox Compatibility ✅#

  • Unix pipe support

    • ✅ Read from stdin: cat audio.wav | torchfx process - - -e normalize

    • ✅ Write to stdout: torchfx process input.wav - -e normalize | aplay

    • ✅ WAV and raw format support for pipes

  • Batch processing

    torchfx process "*.wav" --output-dir ./processed/ --effect normalize
    
    • ✅ Glob pattern matching

    • ✅ Rich progress bar with ETA

    • ✅ Error handling per file

    • ✅ GPU acceleration support

  • Sox-compatible commands

    • convert — format/rate/channel conversion

    • trim — extract time ranges

    • concat — join multiple files

    • stats — signal statistics (peak/RMS dBFS, crest factor, DC offset)

  • GPU-accelerated batch processing

    • ✅ Auto-batch via --device cuda

    • ✅ StreamProcessor with chunked processing

    • ✅ Progress bar with completion ETA

3.3 Interactive Mode (REPL) ✅#

  • Interactive shell

    • ✅ prompt_toolkit with tab completion

    • ✅ Syntax highlighting and Rich formatting

    • ✅ Persistent command history (~/.config/torchfx/repl_history)

    • ✅ Effect name and command completion

  • Live parameter tweaking

    torchfx> load song.wav
    torchfx> add reverb:decay=0.5
    torchfx> live
     Live playback started (2 ch, 44100 Hz, looping)
    torchfx> add normalize
    # ← Effect applies immediately during playback!
    torchfx> preset load mastering
    # ← Entire chain switches in real-time
    torchfx> live stop
    
    • ✅ Lock-free circular buffer pattern

    • ✅ Real-time effect hot-swapping

    • ✅ Audio loops continuously

    • ✅ Changes apply at buffer boundaries (~46ms latency)

  • Commands: load, add, remove, list, effects, info, play, play raw, live, live stop, save, preset save/load/list, clear, help, exit

  • Real-time visualization — deferred to v1.1

    • Waveform display

    • Spectrum analyzer

    • VU meters

  • Preset management

    • ✅ Save/load/list/show/delete/apply presets

    • ✅ TOML format compatible with --config

    • ✅ Works in both CLI and REPL

3.4 Watch Mode & Automation ✅#

  • File system watcher

    torchfx watch ./input/ --output ./processed/ --effect reverb
    torchfx watch ./bounces/ --preset mastering --recursive
    
    • ✅ Watchdog-based file monitoring

    • ✅ Auto-process new/modified audio files

    • ✅ Recursive directory watching

    • ✅ Process existing files on startup (–existing flag)

    • ✅ Preset and config file support

  • DAW integration mode

    • ✅ Monitor export folder

    • ✅ Auto-apply mastering chain from preset

    • ✅ Rich status display with live updates


Epic 4: Performance Optimization & CUDA#

Priority: Medium (Can be v1.1) Goal: Maximize throughput with custom CUDA kernels.

4.1 CUDA Development Infrastructure#

  • CUDA extension build system

    • PyTorch C++ extension API

    • Auto CUDA arch detection

    • Fallback to PyTorch if CUDA unavailable

  • Kernel development tools

    • CUDA profiling integration (nvprof, Nsight)

    • Unit tests for CUDA kernels

    • Benchmarking harness

4.2 IIR Filter CUDA Kernels (Priority 1)#

  • Parallel IIR implementation

    • Parallel prefix scan for state propagation

    • Target: 2-3x speedup for batch processing

  • Biquad cascade optimization

    • Fuse multiple biquad sections

    • Reduce memory traffic

  • Stability guarantees

    • Match scipy/PyTorch numerical behavior

4.3 Time-Domain Effects CUDA Kernels (Priority 2)#

  • Optimized delay line

    • Circular buffer with shared memory

    • Interpolation for fractional delays

  • Reverb optimization

    • Parallel all-pass filters

    • Fused feedback delay network

4.4 Batch Processing Optimizations (Priority 3)#

  • Multi-file batch processing

    • Process multiple files in single kernel launch

    • Maximize GPU occupancy

  • Operator fusion

    • Fuse multiple effects: gain + filter + normalize → single kernel

  • Memory optimization

    • Tensor memory pooling

    • In-place operations

4.5 Performance Benchmarking#

  • Comprehensive benchmark suite

    • PyTorch vs. CUDA vs. CPU comparison

    • Report throughput and latency

  • Performance regression testing

    • Automated benchmarks in CI

    • Alert on >5% regression

  • Profiling guides

    • Documentation for profiling pipelines


Epic 5: Comprehensive Documentation#

Priority: Critical (Continuous) Goal: Professional-grade documentation for v1.0.0 release.

5.1 API Reference Completion#

  • Complete all docstrings

    • Every public class, method, function

    • Parameters with types and ranges

    • Examples in docstrings

    • Mathematical formulas in LaTeX

  • Fix API documentation bugs

    • Remove non-existent method references

    • Update all code examples

    • Validate examples run

  • Auto-generated API reference

    • Sphinx autodoc with Napoleon

    • Type hints rendered

    • Cross-references

5.2 Tutorial & Guide Documentation#

  • Getting Started Tutorial (expanded)

    • ✅ Installation

    • ✅ First pipeline

    • ✅ Wave class basics

    • ✅ Saving output

  • CLI GuideNEW

    • ✅ Complete CLI tutorial covering all commands

    • ✅ Effect specifications format

    • ✅ TOML configuration examples

    • ✅ Preset management workflows

    • ✅ Interactive REPL with live performance mode

    • ✅ Watch mode for DAW integration

    • ✅ Unix pipe examples

  • Advanced Tutorials

    • Real-time audio processing (partially covered in CLI guide)

    • Custom filter design

    • GPU optimization

    • PyTorch model integration

  • How-To Guides

    • Audio format conversion

    • Building EQ/filter bank

    • Mastering chain

    • Multi-channel processing

    • Guitar pedal simulator

    • ML model integration

5.4 Project Documentation#

  • Contributing Guide

    • Code style and standards

    • Git workflow

    • Testing requirements

  • Architecture Documentation

    • High-level overview

    • Design patterns

    • Extension points

  • Migration Guides

    • Upgrading from 0.x to 1.0

    • API changes

  • FAQ & Troubleshooting

    • Common errors

    • Performance issues

    • CUDA/GPU troubleshooting


Epic 6: Testing & Quality Assurance#

Priority: Critical (Parallel with Epic 1) Goal: Achieve >90% test coverage with comprehensive testing.

6.1 Expand Unit Test Coverage#

  • Complete Wave class tests

    • ✅ File I/O for all formats

    • ✅ Multi-channel audio

    • ✅ Sample rate conversion

    • ✅ Device transfers

    • ✅ Edge cases

    • ✅ 72 Wave tests

  • Complete filter tests

    • ✅ All filter types (IIR, FIR, Biquad)

    • ✅ Frequency/phase response validation

    • ✅ Filter composition

    • ✅ Edge cases

    • ✅ 85+ filter tests

  • Complete effect tests

    • ✅ All effects and parameters

    • ✅ Error handling

    • ✅ 43 effect tests

  • CLI testsNEW

    • ✅ All CLI commands (process, info, play, record, convert, trim, concat, stats)

    • ✅ Preset management (save, load, list, show, delete, apply)

    • ✅ REPL commands (add, remove, list, clear, load, save)

    • ✅ Watch mode (file monitoring)

    • ✅ 71 CLI tests

    • ✅ Total: 393 tests with >90% coverage

6.2 Integration Tests#

  • Complex pipeline tests

    • ✅ Multi-stage effect chains

    • ✅ GPU end-to-end processing

    • ✅ File load → process → save

  • Real-time processing tests

    • ✅ Mock audio backend

    • ✅ Latency measurements

    • ✅ Parameter updates during processing

    • ✅ 62 realtime tests

  • CLI integration tests

    • ✅ All CLI commands

    • ✅ Pipe I/O

    • ✅ Batch processing

    • ✅ Config file loading

    • ✅ Preset workflows

6.3 Audio Quality Tests#

  • Audio quality metrics

    • SNR, THD, frequency response error

    • Compare against scipy/reference

  • Regression tests

    • Golden output files

    • Detect quality degradation

  • Perceptual quality tests (optional)

    • PESQ, PEAQ

6.4 Performance & Memory Tests#

  • Memory leak detection

    • Long-running tests

    • GPU memory monitoring

  • Performance benchmarks as tests

    • Minimum speed requirements

    • Prevent regressions

6.5 CI/CD Improvements#

  • Coverage reporting

    • Codecov integration

    • 90% coverage enforcement

    • Coverage badge

  • Multi-platform testing

    • Linux, macOS, Windows

    • Python 3.10-3.13

    • With/without CUDA

  • GPU CI runner

    • Self-hosted or cloud GPU

    • CUDA tests and benchmarks

  • Automated releases

    • PyPI publishing on tag

    • Changelog generation


Epic 7: Additional Effects#

Priority: Low (Can be v1.1+) Goal: Expand effect library for common production needs.

7.1 Dynamics Processing#

  • Compressor (threshold, ratio, attack, release, knee)

  • Limiter (brickwall, true peak, look-ahead)

  • Expander / Gate

7.2 Modulation Effects#

  • Chorus (multi-tap delay with LFO)

  • Flanger (short delay with feedback)

  • Phaser (all-pass cascade with LFO)

  • Tremolo / Vibrato

7.3 Distortion & Saturation#

  • Overdrive / Distortion (soft/hard clipping)

  • Waveshaping (custom transfer functions)

  • Bitcrusher (bit depth/sample rate reduction)

7.4 Pitch & Time Manipulation#

  • Pitch Shifting (phase vocoder)

  • Time Stretching (tempo change)

  • Formant Shifting

7.5 Spatial Audio#

  • Stereo Widening (mid-side, Haas effect)

  • Panning (constant power, 3D)

  • Binaural Audio (HRTF)


Implementation Phases#

Phase 1: Foundation ✅ COMPLETED#

Priority: Critical

  • Epic 1: Core Library Stabilization

    • ✅ Complete missing features

    • ✅ API stabilization

    • ✅ Error handling

  • Epic 6: Testing Infrastructure

    • ✅ Expand unit tests (393 tests, >90% coverage)

    • ✅ CI improvements

Phase 2: Major Features ✅ COMPLETED#

Priority: Critical

  • Epic 2: Real-Time Audio Processing

    • ✅ Audio backends (SoundDevice)

    • ✅ Real-time pipeline with circular buffers

    • ✅ Thread-safe parameter updates

    • ✅ Stream processor for large files

  • Epic 3: CLI Application

    • ✅ Core CLI with 11 commands

    • ✅ Pipeline processing (batch, pipes, watch)

    • ✅ Interactive mode with live performance

    • ✅ Preset management

    • ✅ Sox-compatible commands

  • Epic 5: Documentation

    • ✅ Complete API reference

    • ✅ CLI guide

    • ✅ Tutorials and examples

    • ✅ Migration guide and API stability docs

Phase 3: Optimization & Polish (v1.0)#

Priority: Medium — NEXT

  • Epic 4: CUDA Kernels (can start early)

    • IIR kernels (priority)

    • Effect kernels

  • Epic 7: Additional Effects

    • Can be added incrementally in v1.1+


Success Metrics for v1.0.0#

  1. API Stability: No breaking changes after v1.0.0 without major version bump

    • ✅ Implemented deprecation system

    • ✅ API stability guarantees documented

    • ✅ Migration guide template created

  2. Test Coverage: >90% code coverage

    • ✅ 393 tests across all modules

    • ✅ Unit, integration, and CLI tests

  3. Documentation: 100% of public API documented with examples

    • ✅ Complete API reference

    • ✅ CLI guide with comprehensive examples

    • ✅ Tutorials and how-to guides

  4. Performance:

    • ✅ Real-time: 48kHz, 2048 buffer, ~46ms latency (tested in REPL)

    • ⚠️ Batch: >100x real-time on modern GPU (needs CUDA kernels for further optimization)

  5. Platform Support: Linux, macOS, Windows with Python 3.10-3.13

    • ✅ CI testing on multiple platforms

  6. CLI Functionality: All core commands working

    • ✅ 11 commands implemented (process, info, play, record, convert, trim, concat, stats, preset, interactive, watch)

    • ✅ Batch processing, pipes, TOML config, presets

  7. Community: Contributing guide, issue templates, active CI

    • ✅ Style guide documented

    • ✅ Roadmap maintained

Status: 6.5/7 metrics achieved — ready for v1.0.0 RC


Code Quality Standards#

TorchFX follows SOLID and DRY principles:

  • Single Responsibility: Each class has one clear purpose

  • Open/Closed: Extensible through inheritance and composition

  • Liskov Substitution: Consistent interfaces across similar classes

  • Interface Segregation: Narrow, focused interfaces

  • Dependency Inversion: Depend on abstractions, not implementations

  • Don’t Repeat Yourself: Shared utilities, reusable components


Future Considerations (Post-v1.0)#

  • Plugin system for third-party effects

  • ONNX export for deployment

  • Model Hub integration (HuggingFace)

  • Audio ML integration helpers

  • VST3 plugin wrapper (complex, long-term)


Contributing#

We welcome contributions! See the style guide for guidelines.

  • Current focus: Phase 3 (CUDA Optimization) & additional effects

  • Phase 1 & 2: ✅ COMPLETED

  • Good first issues: Check GitHub issues tagged good-first-issue

  • CLI Extension Ideas: Real-time visualization, AB comparison mode, spectrum analyzer

  • Questions: Open a discussion on GitHub