Getting Started#
This guide will help you get started with TorchFX, a GPU-accelerated audio DSP library built on top of PyTorch. You’ll learn the fundamental workflow from loading audio files to creating complex processing pipelines.
What You’ll Learn#
This guide demonstrates the fundamental TorchFX workflow:
Installing the library
Loading audio with Wave.from_file()
Chaining filters and effects using the pipeline operator |
Managing device placement (CPU/GPU)
Combining filters in series and parallel
Saving processed audio
For detailed installation options including platform-specific PyTorch configuration, see Installation. For in-depth explanations of the core concepts, see Core Concepts.
Installation#
Install TorchFX using pip:
pip install torchfx
This command installs TorchFX along with its dependencies: PyTorch, torchaudio, NumPy, SciPy, and soundfile. For advanced installation options, dependency management with uv, or platform-specific PyTorch builds (CPU vs CUDA), see Installation.
See also
Installation - Complete installation guide with GPU setup and development options
Basic Concepts#
TorchFX uses an object-oriented interface where audio signals are wrapped in a Wave object that holds both the audio samples and the sampling rate.
You can build audio processing pipelines by chaining operations using the pipe operator (|), thanks to Python operator overloading.
Key Components#
FX base class: Foundation for all audio effects and filters
Pipeline operator (|): Chains processing modules together
PyTorch integration: All modules inherit from torch.nn.Module
Your First Audio Processing Pipeline#
The following example demonstrates the core TorchFX workflow:
import torch
import torchfx as fx
# Load audio file
wave = fx.Wave.from_file("path_to_audio.wav")
# Apply processing pipeline
filtered_wave = (
    wave
    | fx.filter.LoButterworth(8000)
    | fx.filter.HiShelving(2000)
    | fx.effect.Reverb()
)
# Access the processed audio tensor
output_tensor = filtered_wave.ys
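This example creates a processing chain that applies a low-pass Butterworth filter at 8 kHz, then a high-shelving filter at 2 kHz, and finally a reverb effect.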
Wave Object and Pipeline Processing#
The following diagram illustrates how the Wave class and pipeline operator work together:
graph LR
AudioFile["Audio File<br/>(WAV/MP3/etc)"]
WaveFromFile["Wave.from_file()"]
WaveObj1["Wave object<br/>ys: Tensor<br/>fs: int"]
Filter1["fx.filter.LoButterworth<br/>torch.nn.Module"]
WaveObj2["Wave object<br/>(filtered)"]
Filter2["fx.filter.HiShelving<br/>torch.nn.Module"]
WaveObj3["Wave object<br/>(filtered + shaped)"]
Effect["fx.effect.Reverb<br/>torch.nn.Module"]
WaveObjFinal["Wave object<br/>(final output)"]
AudioFile -->|"load"| WaveFromFile
WaveFromFile --> WaveObj1
WaveObj1 -->|"| operator"| Filter1
Filter1 -->|"returns"| WaveObj2
WaveObj2 -->|"| operator"| Filter2
Filter2 -->|"returns"| WaveObj3
WaveObj3 -->|"| operator"| Effect
Effect -->|"returns"| WaveObjFinal
Each Wave object encapsulates:
ys: A PyTorch Tensor containing audio samples (shape: [channels, samples])
fs: An integer representing the sampling frequency in Hz
The pipe operator | is overloaded on the Wave class to enable functional chaining. Each filter or effect in the chain receives a Wave, processes its ys tensor, and returns a new Wave object.
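The chaining behaviour described above can be sketched with plain Python operator overloading. This is a toy model, not TorchFX's implementation: `ToyWave` and `ToyGain` are hypothetical names, and samples are stored in nested lists rather than tensors.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ToyWave:
    """Hypothetical stand-in for Wave: samples plus sampling rate."""

    ys: List[List[float]]  # shape [channels, samples]
    fs: int

    def __or__(self, stage: Callable[["ToyWave"], "ToyWave"]) -> "ToyWave":
        # `wave | stage` calls the stage and returns the Wave it produces.
        return stage(self)


class ToyGain:
    """Hypothetical effect: scales every sample by a constant factor."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def __call__(self, wave: ToyWave) -> ToyWave:
        scaled = [[s * self.factor for s in channel] for channel in wave.ys]
        # Build a new object and carry the sampling rate along unchanged.
        return ToyWave(scaled, wave.fs)


original = ToyWave([[1.0, -0.5], [0.25, 0.0]], fs=44100)
result = original | ToyGain(2.0) | ToyGain(0.5)
print(result)
```

Because each stage returns a fresh object, `original` is left untouched and intermediate results can be kept for inspection if needed.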
Working with the Wave Class#
To begin, import the library and load a waveform from file:
import torchfx as fx
# Load an audio file
wave = fx.Wave.from_file("path_to_audio.wav")
# Access the raw audio data and sampling rate
print(wave.ys.shape) # e.g., torch.Size([2, 44100])
print(wave.fs) # e.g., 44100
print(f"Duration: {wave.duration('sec')} seconds")
print(f"Channels: {wave.channels()}")
The Wave object automatically handles stereo or multichannel data and ensures that filters retain sample rate context.
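The relationship between the printed shape, the sampling rate, and the duration can be worked out by hand. The sketch below assumes only that ys has shape [channels, samples], as stated in this guide; `describe` is a hypothetical helper and not part of TorchFX.

```python
def describe(shape: tuple, fs: int) -> dict:
    """Derive channel count and duration from a [channels, samples] shape."""
    channels, samples = shape
    return {
        "channels": channels,
        "samples": samples,
        "duration_sec": samples / fs,  # samples divided by samples per second
    }


# A shape of [2, 44100] at 44100 Hz is one second of stereo audio.
info = describe((2, 44100), fs=44100)
print(info)
```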
See also
Wave - Digital Audio Representation - Complete guide to the Wave class with detailed examples
Complete Example with File I/O#
Here’s a complete example that loads a file, processes it, and saves the output:
import torch
import torchfx as fx
import torchaudio
# Load audio
signal = fx.Wave.from_file("input.wav")
# Optional: Move to GPU for acceleration
if torch.cuda.is_available():
    signal = signal.to("cuda")
# Apply processing pipeline
result = (
    signal
    | fx.filter.LoButterworth(100, order=2)
    | fx.filter.HiButterworth(2000, order=2)
    | fx.effect.Gain(gain=-6, gain_type="db")
)
# Save output (move back to CPU for I/O)
torchaudio.save("output.wav", result.ys.cpu(), result.fs)
This example demonstrates the complete workflow including:
Loading audio from disk
GPU acceleration for processing
Applying multiple filters in series
Saving the result back to disk
Processing Flow with Device Management#
The following sequence diagram shows the complete processing flow including device management:
sequenceDiagram
participant User
participant WaveFromFile as "Wave.from_file()"
participant WaveObj as "Wave object"
participant GPU as "GPU Device"
participant Filter as "Filter/Effect Module"
participant TorchAudio as "torchaudio.save()"
User->>WaveFromFile: "load audio file"
WaveFromFile->>WaveObj: "create Wave(ys, fs)"
Note over WaveObj: "ys is on CPU by default"
User->>WaveObj: "wave.to('cuda')"
WaveObj->>GPU: "move tensor to GPU"
GPU-->>WaveObj: "return GPU Wave"
User->>Filter: "wave | filter"
Filter->>Filter: "process ys tensor on GPU"
Filter-->>WaveObj: "return new Wave on GPU"
User->>WaveObj: "result.ys.cpu()"
WaveObj->>GPU: "move tensor to CPU"
GPU-->>WaveObj: "return CPU tensor"
User->>TorchAudio: "save(path, ys, fs)"
TorchAudio-->>User: "file written"
Key Points#
Wave.from_file() loads audio onto CPU by default
wave.to("cuda") moves the Wave and its ys tensor to GPU
Filters and effects process tensors on whatever device they reside
ys.cpu() moves the tensor back to CPU for file I/O
torchaudio.save() requires CPU tensors for writing to disk
See also
GPU Acceleration - GPU acceleration best practices and performance optimization
Applying Built-in Filters#
TorchFX provides a collection of IIR and FIR filters under the torchfx.filter module. All filters are implemented as subclasses of torch.nn.Module.
Here’s an example of chaining filters with the pipe operator:
import torchfx as fx
from torchfx import filter as fx_filter
# Apply a low-pass Butterworth filter at 8 kHz and a high-shelving filter at 2 kHz
filtered = (
    fx.Wave.from_file("example.wav")
    | fx_filter.LoButterworth(8000)
    | fx_filter.HiShelving(2000)
)
# Save the processed signal
filtered.save("filtered_output.wav")
You can also build pipelines using torch.nn.Sequential or define custom modules as in PyTorch.
Parallel Filter Combination#
TorchFX supports combining filters in parallel using the + operator:
result = (
    signal
    | fx.filter.LoButterworth(100, order=2)
    | fx.filter.HiButterworth(2000, order=4) + fx.filter.HiChebyshev1(2000, order=2)
)
This creates a parallel combination where the signal is split, processed by both filters independently, and then summed. The + operator creates a ParallelFilterCombination object that handles this routing automatically.
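The expression above needs no extra parentheses because Python's + binds more tightly than |, so the two high-pass filters are combined before the result is placed in the series chain. The toy sketch below illustrates both the precedence and the split-and-sum routing. `Stage` is a hypothetical class operating on plain lists; it is not TorchFX's ParallelFilterCombination.

```python
class Stage:
    """Hypothetical processing stage wrapping a function on a list of samples."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def __call__(self, samples):
        return self.fn(samples)

    def __add__(self, other):
        # Parallel: feed the same input to both stages, then sum the outputs.
        return Stage(
            f"({self.name} + {other.name})",
            lambda xs: [a + b for a, b in zip(self(xs), other(xs))],
        )

    def __or__(self, other):
        # Series: the output of this stage becomes the input of the next.
        return Stage(f"{self.name} | {other.name}", lambda xs: other(self(xs)))


double = Stage("double", lambda xs: [2 * x for x in xs])
negate = Stage("negate", lambda xs: [-x for x in xs])
offset = Stage("offset", lambda xs: [x + 1 for x in xs])

# `+` binds tighter than `|`, so this parses as double | (negate + offset).
chain = double | negate + offset
print(chain.name)
print(chain([1, 2, 3]))
```

Adding explicit parentheses around the parallel part is optional, but it can make the intended grouping easier to see in longer pipelines.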
Series and Parallel Filter Topology#
graph TB
Input["Input Wave"]
LoFilter["fx.filter.LoButterworth<br/>cutoff=100, order=2"]
Split["Split"]
HiButterworth["fx.filter.HiButterworth<br/>cutoff=2000, order=4"]
HiChebyshev["fx.filter.HiChebyshev1<br/>cutoff=2000, order=2"]
Sum["Sum (ParallelFilterCombination)"]
Output["Output Wave"]
Input -->|"| operator (series)"| LoFilter
LoFilter --> Split
Split -->|"+ operator (parallel)"| HiButterworth
Split -->|"+ operator (parallel)"| HiChebyshev
HiButterworth --> Sum
HiChebyshev --> Sum
Sum --> Output
For more details on parallel filter combinations, see Series and Parallel Filter Combinations.
See also
Series and Parallel Filter Combinations - Complete tutorial on series and parallel filter combinations
Complete Series and Parallel Example#
Here’s a complete example demonstrating mixed series/parallel processing:
import torchfx as fx
import torch
# Load audio
wave = fx.Wave.from_file("audio.wav")
# Move to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
wave = wave.to(device)
# Complex processing chain
processed = (
    wave
    # Stage 1: Low-pass Butterworth filter at 100 Hz (series)
    | fx.filter.LoButterworth(cutoff=100, order=2)
    # Stage 2: Two high-pass filters combined with + (parallel)
    | fx.filter.HiButterworth(2000, order=4) + fx.filter.HiChebyshev1(2000, order=2)
    # Stage 3: Reduce level (series)
    | fx.effect.Gain(gain=0.5, gain_type="amplitude")
)
# Save result (move to CPU for I/O)
processed.to("cpu").save("processed.wav")
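The final stage above halves the amplitude, which corresponds to a level change of roughly -6 dB. The conversion between the two scales uses the standard amplitude relation of 20·log10. The helper names below are hypothetical and are not TorchFX functions.

```python
import math


def db_to_amplitude(db: float) -> float:
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def amplitude_to_db(amplitude: float) -> float:
    """Convert a positive linear amplitude factor to decibels."""
    return 20.0 * math.log10(amplitude)


print(round(db_to_amplitude(-6.0), 3))  # 0.501
print(round(amplitude_to_db(0.5), 2))   # -6.02
```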
Creating Your Own Effect#
To create your own audio effect, subclass the FX class (a utility base class derived from torch.nn.Module):
from torchfx.core import FX
class Invert(FX):
    def forward(self, wave):
        return wave.new(-wave.ys)
This custom Invert effect simply negates the audio signal. You can now use it like any other TorchFX module:
inverted = wave | Invert()
# Listen or save the output
inverted.save("inverted.wav")
The FX base class ensures that your custom effect works seamlessly with the Wave class and supports the pipe operator.
See also
FX - Audio Effect Base Class - Understanding the FX base class architecture
Creating Custom Effects - Complete tutorial on creating custom effects
Available Filters and Effects#
The following table lists commonly used filters and effects available in TorchFX:
| Category | Class Name | Description |
|---|---|---|
| IIR Filters | LoButterworth | Low-pass Butterworth filter |
| | HiButterworth | High-pass Butterworth filter |
| | | Band-pass Butterworth filter |
| | | Low-pass Chebyshev Type I filter |
| | HiChebyshev1 | High-pass Chebyshev Type I filter |
| | | Low-frequency shelving filter |
| | HiShelving | High-frequency shelving filter |
| | | Peaking/notch filter |
| FIR Filters | | Finite impulse response filter |
| | | FIR with automatic coefficient design |
| Effects | Gain | Amplitude/dB gain control |
| | Normalize | Signal normalization |
| | Reverb | Reverb effect |
| | | BPM-synced delay effect |
For complete API documentation, see the API Reference.
Best Practices#
Use Multi-Line Pipelines for Readability#
# ✅ GOOD: Clear, readable pipeline
filtered = (
    wave
    | fx.filter.LoButterworth(8000)
    | fx.filter.HiShelving(2000)
    | fx.effect.Reverb()
)
# ❌ BAD: Hard to read single line
filtered = wave | fx.filter.LoButterworth(8000) | fx.filter.HiShelving(2000) | fx.effect.Reverb()
Device Management#
# ✅ GOOD: Explicit device management
wave = fx.Wave.from_file("audio.wav")
if torch.cuda.is_available():
    wave = wave.to("cuda")
processed = wave | effect_chain
# Move back to CPU for saving
processed.to("cpu").save("output.wav")
Reuse Filter Chains#
# ✅ GOOD: Define reusable processing chains
mastering_chain = (
    fx.filter.HiButterworth(30, order=2)
    | fx.filter.LoButterworth(18000, order=4)
    | fx.effect.Normalize()
)
# Apply to multiple files
for audio_file in audio_files:
    wave = fx.Wave.from_file(audio_file)
    processed = wave | mastering_chain
    processed.save(f"mastered_{audio_file}")
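The loop above builds the output name by prefixing the input string. That works for bare file names, but if the entries in audio_files contain directories, the prefix lands in front of the directory (for example, mastered_takes/verse.wav). One possible way to prefix only the file name is sketched below with the standard library's pathlib. `mastered_path` is a hypothetical helper, not part of TorchFX.

```python
from pathlib import Path


def mastered_path(audio_file: str, prefix: str = "mastered_") -> Path:
    """Return a sibling path whose file name carries the given prefix."""
    source = Path(audio_file)
    # with_name() swaps only the final path component, keeping the directory.
    return source.with_name(prefix + source.name)


print(mastered_path("takes/verse.wav").as_posix())  # takes/mastered_verse.wav
print(mastered_path("chorus.wav").as_posix())       # mastered_chorus.wav
```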
Common Pitfalls#
Forgetting to Move to CPU Before Saving#
# ❌ WRONG: Trying to save CUDA tensor
wave_gpu = fx.Wave.from_file("audio.wav").to("cuda")
processed = wave_gpu | effect_chain
processed.save("output.wav") # Error: can't save CUDA tensor
# ✅ CORRECT: Move to CPU before saving
processed.to("cpu").save("output.wav")
Incorrect Tensor Shape#
import torch
# ❌ WRONG: 1D tensor
mono = torch.randn(44100)
wave = fx.Wave(mono, fs=44100) # Error!
# ✅ CORRECT: 2D tensor with channel dimension
mono = torch.randn(1, 44100)
wave = fx.Wave(mono, fs=44100)
Next Steps#
After completing this quick start guide, explore these topics to deepen your understanding:
Core Concepts#
Wave - Digital Audio Representation - Learn about the Wave class in detail
FX - Audio Effect Base Class - Understand the FX base class and architecture
Pipeline Operator - Functional Composition - Deep dive into the pipeline operator
Type System - Musical Time and Audio Units - Time units and musical notation system
Tutorials#
Series and Parallel Filter Combinations - Master series and parallel filter combinations
Creating Custom Filters - Create your own custom filters
Creating Custom Effects - Design custom audio effects
ML Batch Processing - Process audio in ML pipelines
Advanced Topics#
GPU Acceleration - GPU acceleration and performance optimization
PyTorch Integration - Deep PyTorch integration patterns
Multi-Channel Processing - Working with multi-channel audio
Performance Optimization and Benchmarking - Performance tuning and optimization
Examples#
See the examples directory for more complete examples including:
Real-time audio processing
BPM-synchronized effects
Multi-band processing
Vocal processing chains
And more!
External Resources#
PyTorch Documentation - PyTorch framework documentation
torchaudio Documentation - Audio I/O and transformations
Digital Signal Processing on Wikipedia - DSP fundamentals