Getting Started#

This guide will help you get started with TorchFX, a GPU-accelerated audio DSP library built on top of PyTorch. You’ll learn the fundamental workflow from loading audio files to creating complex processing pipelines.

What You’ll Learn#

This guide demonstrates the fundamental TorchFX workflow:

  1. Installing the library

  2. Loading audio with Wave.from_file()

  3. Chaining filters and effects using the pipeline operator |

  4. Managing device placement (CPU/GPU)

  5. Combining filters in series and parallel

  6. Saving processed audio

For detailed installation options including platform-specific PyTorch configuration, see Installation. For in-depth explanations of the core concepts, see Core Concepts.

Installation#

Install TorchFX using pip:

pip install torchfx

This command installs TorchFX along with its dependencies: PyTorch, torchaudio, NumPy, SciPy, and soundfile. For advanced installation options, dependency management with uv, or platform-specific PyTorch builds (CPU vs CUDA), see Installation.
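
To confirm the install and check whether PyTorch can see a CUDA GPU, a quick sanity check works:

import torch
import torchfx  # succeeds once the package is installed

print(torch.__version__)          # the PyTorch build pulled in as a dependency
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is usable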

See also

Installation - Complete installation guide with GPU setup and development options

Basic Concepts#

TorchFX uses an object-oriented interface where audio signals are wrapped in a Wave object that holds both the audio samples and the sampling rate.

You can build audio processing pipelines by chaining operations using the pipe operator (|), thanks to Python operator overloading.
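
In its simplest form, a single chained operation looks like this (LoButterworth is one of the built-in filters covered below):

import torchfx as fx

wave = fx.Wave.from_file("path_to_audio.wav")
filtered = wave | fx.filter.LoButterworth(8000)  # low-pass at 8 kHz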

Key Components#

  • Wave class: Wraps audio data (ys) and sample rate (fs)

  • FX base class: Foundation for all audio effects and filters

  • Pipeline operator (|): Chains processing modules together

  • PyTorch integration: All modules inherit from torch.nn.Module

Your First Audio Processing Pipeline#

The following example demonstrates the core TorchFX workflow:

import torch
import torchfx as fx

# Load audio file
wave = fx.Wave.from_file("path_to_audio.wav")

# Apply processing pipeline
filtered_wave = (
    wave
    | fx.filter.LoButterworth(8000)
    | fx.filter.HiShelving(2000)
    | fx.effect.Reverb()
)

# Access the processed audio tensor
output_tensor = filtered_wave.ys

This example creates a processing chain that:

  1. Loads an audio file into a Wave object

  2. Applies a low-pass Butterworth filter at 8000 Hz

  3. Applies a high-shelving filter at 2000 Hz

  4. Applies a reverb effect

  5. Returns a new Wave object containing the processed audio

Wave Object and Pipeline Processing#

The following diagram illustrates how the Wave class and pipeline operator work together:

graph LR
    AudioFile["Audio File<br/>(WAV/MP3/etc)"]
    WaveFromFile["Wave.from_file()"]
    WaveObj1["Wave object<br/>ys: Tensor<br/>fs: int"]

    Filter1["fx.filter.LoButterworth<br/>torch.nn.Module"]
    WaveObj2["Wave object<br/>(filtered)"]

    Filter2["fx.filter.HiShelving<br/>torch.nn.Module"]
    WaveObj3["Wave object<br/>(filtered + shaped)"]

    Effect["fx.effect.Reverb<br/>torch.nn.Module"]
    WaveObjFinal["Wave object<br/>(final output)"]

    AudioFile -->|"load"| WaveFromFile
    WaveFromFile --> WaveObj1
    WaveObj1 -->|"| operator"| Filter1
    Filter1 -->|"returns"| WaveObj2
    WaveObj2 -->|"| operator"| Filter2
    Filter2 -->|"returns"| WaveObj3
    WaveObj3 -->|"| operator"| Effect
    Effect -->|"returns"| WaveObjFinal
    

Each Wave object encapsulates:

  • ys: A PyTorch Tensor containing audio samples (shape: [channels, samples])

  • fs: An integer representing the sampling frequency in Hz

The pipe operator | is overloaded on the Wave class to enable functional chaining. Each filter or effect in the chain receives a Wave, processes its ys tensor, and returns a new Wave object.
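
Conceptually, the overload can be pictured as in the minimal sketch below. This illustrates the mechanism just described; it is not the actual TorchFX source, and WaveSketch is a hypothetical name:

class WaveSketch:
    """Hypothetical, simplified picture of how Wave enables chaining."""

    def __init__(self, ys, fs):
        self.ys = ys  # audio samples, shape [channels, samples]
        self.fs = fs  # sampling rate in Hz

    def __or__(self, fx_module):
        # `wave | fx_module` calls the module with the Wave; the module
        # returns a new Wave, so chains read left to right.
        return fx_module(self)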

Working with the Wave Class#

To begin, import the library and load a waveform from file:

import torchfx as fx

# Load an audio file
wave = fx.Wave.from_file("path_to_audio.wav")

# Access the raw audio data and sampling rate
print(wave.ys.shape)   # e.g., torch.Size([2, 44100])
print(wave.fs)         # e.g., 44100
print(f"Duration: {wave.duration('sec')} seconds")
print(f"Channels: {wave.channels()}")

The Wave object handles mono, stereo, and multichannel data alike, and because it carries the sampling rate alongside the samples, every filter in a chain receives the correct fs automatically.
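
You can also construct a Wave directly from a tensor, provided it uses the [channels, samples] layout (the 440 Hz sine below is stand-in data):

import torch
import torchfx as fx

fs = 44100
t = torch.arange(fs) / fs                   # one second of sample times
sine = torch.sin(2 * torch.pi * 440.0 * t)  # 440 Hz test tone
wave = fx.Wave(sine.unsqueeze(0), fs=fs)    # add the channel dim: [1, 44100]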

See also

Wave - Digital Audio Representation - Complete guide to the Wave class with detailed examples

Complete Example with File I/O#

Here’s a complete example that loads a file, processes it, and saves the output:

import torch
import torchfx as fx
import torchaudio

# Load audio
signal = fx.Wave.from_file("input.wav")

# Optional: Move to GPU for acceleration
if torch.cuda.is_available():
    signal = signal.to("cuda")

# Apply processing pipeline
result = (
    signal
    | fx.filter.LoButterworth(100, order=2)
    | fx.filter.HiButterworth(2000, order=2)
    | fx.effect.Gain(db=-6)
)

# Save output (move back to CPU for I/O)
torchaudio.save("output.wav", result.ys.cpu(), result.fs)

This example demonstrates the complete workflow including:

  • Loading audio from disk

  • GPU acceleration for processing

  • Applying multiple filters in series

  • Saving the result back to disk

Processing Flow with Device Management#

The following sequence diagram shows the complete processing flow including device management:

sequenceDiagram
    participant User
    participant WaveFromFile as "Wave.from_file()"
    participant WaveObj as "Wave object"
    participant GPU as "GPU Device"
    participant Filter as "Filter/Effect Module"
    participant TorchAudio as "torchaudio.save()"

    User->>WaveFromFile: "load audio file"
    WaveFromFile->>WaveObj: "create Wave(ys, fs)"
    Note over WaveObj: "ys is on CPU by default"

    User->>WaveObj: "wave.to('cuda')"
    WaveObj->>GPU: "move tensor to GPU"
    GPU-->>WaveObj: "return GPU Wave"

    User->>Filter: "wave | filter"
    Filter->>Filter: "process ys tensor on GPU"
    Filter-->>WaveObj: "return new Wave on GPU"

    User->>WaveObj: "result.ys.cpu()"
    WaveObj->>GPU: "move tensor to CPU"
    GPU-->>WaveObj: "return CPU tensor"

    User->>TorchAudio: "save(path, ys, fs)"
    TorchAudio-->>User: "file written"
    

Key Points#

  1. Wave.from_file() loads audio onto CPU by default

  2. wave.to("cuda") moves the Wave and its ys tensor to GPU

  3. Filters and effects process tensors on whichever device the input Wave resides, so a pipeline runs entirely on the GPU once the Wave has been moved there

  4. ys.cpu() moves the tensor back to CPU for file I/O

  5. torchaudio.save() requires CPU tensors for writing to disk
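
Putting these points together, a small end-to-end helper looks like this (a sketch built only from the calls shown above; process_file is a hypothetical name):

import torch
import torchaudio
import torchfx as fx

def process_file(in_path, out_path, chain):
    wave = fx.Wave.from_file(in_path)   # 1. load on CPU
    if torch.cuda.is_available():
        wave = wave.to("cuda")          # 2. move to GPU when present
    result = wave | chain               # 3-4. process on the Wave's device
    torchaudio.save(out_path, result.ys.cpu(), result.fs)  # 5. CPU for I/O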

See also

GPU Acceleration - GPU acceleration best practices and performance optimization

Applying Built-in Filters#

TorchFX provides a collection of IIR and FIR filters under the torchfx.filter module. All filters are implemented as subclasses of torch.nn.Module.

Here’s an example of chaining filters with the pipe operator:

import torchfx as fx
from torchfx import filter as fx_filter

# Apply a low-pass Butterworth filter at 8 kHz and a high-shelving filter at 2 kHz
filtered = (
    fx.Wave.from_file("example.wav")
    | fx_filter.LoButterworth(8000)
    | fx_filter.HiShelving(2000)
)

# Save the processed signal
filtered.save("filtered_output.wav")

You can also build pipelines using torch.nn.Sequential or define custom modules as in PyTorch.
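
As a sketch of the torch.nn.Sequential route (this assumes each filter accepts a Wave and returns a Wave, exactly as the pipe operator does):

import torch.nn as nn
import torchfx as fx

chain = nn.Sequential(
    fx.filter.LoButterworth(8000),  # low-pass at 8 kHz
    fx.filter.HiShelving(2000),     # high-shelving at 2 kHz
)

filtered = chain(fx.Wave.from_file("example.wav"))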

Parallel Filter Combination#

TorchFX supports combining filters in parallel using the + operator:

result = (
    signal
    | fx.filter.LoButterworth(100, order=2)
    | fx.filter.HiButterworth(2000, order=4) + fx.filter.HiChebyshev1(2000, order=2)
)

This creates a parallel combination where the signal is split, processed by both filters independently, and then summed. The + operator creates a ParallelFilterCombination object that handles this routing automatically.
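
Conceptually, the parallel combination behaves like the simplified sketch below (ParallelSketch is a hypothetical stand-in; the real ParallelFilterCombination handles this routing internally):

class ParallelSketch:
    """Hypothetical, simplified model of a parallel combination."""

    def __init__(self, *branches):
        self.branches = branches

    def __call__(self, wave):
        # Every branch receives the same input Wave; the branch
        # outputs are then summed sample by sample.
        outputs = [branch(wave) for branch in self.branches]
        return wave.new(sum(out.ys for out in outputs))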

Series and Parallel Filter Topology#

graph TB
    Input["Input Wave"]
    LoFilter["fx.filter.LoButterworth<br/>cutoff=100, order=2"]

    Split["Split"]
    HiButterworth["fx.filter.HiButterworth<br/>cutoff=2000, order=4"]
    HiChebyshev["fx.filter.HiChebyshev1<br/>cutoff=2000, order=2"]
    Sum["Sum (ParallelFilterCombination)"]

    Output["Output Wave"]

    Input -->|"| operator (series)"| LoFilter
    LoFilter --> Split
    Split -->|"+ operator (parallel)"| HiButterworth
    Split -->|"+ operator (parallel)"| HiChebyshev
    HiButterworth --> Sum
    HiChebyshev --> Sum
    Sum --> Output
    

For more details on parallel filter combinations, see Series and Parallel Filter Combinations.

See also

Series and Parallel Filter Combinations - Complete tutorial on series and parallel filter combinations

Complete Series and Parallel Example#

Here’s a complete example demonstrating mixed series/parallel processing:

import torchfx as fx
import torch

# Load audio
wave = fx.Wave.from_file("audio.wav")

# Move to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
wave = wave.to(device)

# Complex processing chain
processed = (
    wave
    # Stage 1: Low-pass Butterworth filter at 100 Hz (series)
    | fx.filter.LoButterworth(cutoff=100, order=2)

    # Stage 2: Parallel high-pass filters (parallel)
    | fx.filter.HiButterworth(2000, order=4) + fx.filter.HiChebyshev1(2000, order=2)

    # Stage 3: Reduce level (series)
    | fx.effect.Gain(gain=0.5, gain_type="amplitude")
)

# Save result (move to CPU for I/O)
processed.to("cpu").save("processed.wav")

Creating Your Own Effect#

To create your own audio effect, subclass the FX class (a utility base class derived from torch.nn.Module):

from torchfx.core import FX

class Invert(FX):
    def forward(self, wave):
        return wave.new(-wave.ys)

This custom Invert effect simply negates the audio signal. You can now use it like any other TorchFX module:

inverted = wave | Invert()

# Listen or save the output
inverted.save("inverted.wav")

The FX base class ensures that your custom effect works seamlessly with the Wave class and supports the pipe operator.
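
The same pattern extends to effects with parameters. For example, a soft-clipping effect (SoftClip and its drive parameter are illustrative, not part of TorchFX):

import torch
from torchfx.core import FX

class SoftClip(FX):
    """Hypothetical distortion effect: tanh soft clipping."""

    def __init__(self, drive=2.0):
        super().__init__()
        self.drive = drive

    def forward(self, wave):
        # Saturate the samples and wrap them in a new Wave,
        # following the same pattern as Invert above.
        return wave.new(torch.tanh(self.drive * wave.ys))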

Available Filters and Effects#

The following table lists commonly used filters and effects available in TorchFX:

Category     Class Name       Description
-----------  ---------------  -------------------------------------
IIR Filters  LoButterworth    Low-pass Butterworth filter
             HiButterworth    High-pass Butterworth filter
             BandButterworth  Band-pass Butterworth filter
             LoChebyshev1     Low-pass Chebyshev Type I filter
             HiChebyshev1     High-pass Chebyshev Type I filter
             LoShelving       Low-frequency shelving filter
             HiShelving       High-frequency shelving filter
             Peaking          Peaking/notch filter
FIR Filters  FIR              Finite impulse response filter
             DesignableFIR    FIR with automatic coefficient design
Effects      Gain             Amplitude/dB gain control
             Normalize        Signal normalization
             Reverb           Reverb effect
             Delay            BPM-synced delay effect

For complete API documentation, see the API Reference.
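
For instance, the effects from the table compose like any other module (Gain(db=-6) and Normalize() follow the signatures used earlier in this guide; consult the API Reference for full parameter lists):

import torchfx as fx

leveled = (
    fx.Wave.from_file("input.wav")
    | fx.effect.Normalize()  # normalize the signal level
    | fx.effect.Gain(db=-6)  # then pull it down by 6 dB
)
leveled.save("leveled.wav")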

Best Practices#

Use Multi-Line Pipelines for Readability#

# ✅ GOOD: Clear, readable pipeline
filtered = (
    wave
    | fx.filter.LoButterworth(8000)
    | fx.filter.HiShelving(2000)
    | fx.effect.Reverb()
)

# ❌ BAD: Hard to read single line
filtered = wave | fx.filter.LoButterworth(8000) | fx.filter.HiShelving(2000) | fx.effect.Reverb()

Device Management#

# ✅ GOOD: Explicit device management
wave = fx.Wave.from_file("audio.wav")
if torch.cuda.is_available():
    wave = wave.to("cuda")

processed = wave | effect_chain

# Move back to CPU for saving
processed.to("cpu").save("output.wav")

Reuse Filter Chains#

# ✅ GOOD: Define reusable processing chains
mastering_chain = (
    fx.filter.HiButterworth(30, order=2)
    | fx.filter.LoButterworth(18000, order=4)
    | fx.effect.Normalize()
)

# Apply to multiple files
for audio_file in audio_files:
    wave = fx.Wave.from_file(audio_file)
    processed = wave | mastering_chain
    processed.save(f"mastered_{audio_file}")

Common Pitfalls#

Forgetting to Move to CPU Before Saving#

# ❌ WRONG: Trying to save CUDA tensor
wave_gpu = fx.Wave.from_file("audio.wav").to("cuda")
processed = wave_gpu | effect_chain
processed.save("output.wav")  # Error: can't save CUDA tensor

# ✅ CORRECT: Move to CPU before saving
processed.to("cpu").save("output.wav")

Incorrect Tensor Shape#

import torch
import torchfx as fx

# ❌ WRONG: 1D tensor
mono = torch.randn(44100)
wave = fx.Wave(mono, fs=44100)  # Error!

# ✅ CORRECT: 2D tensor with channel dimension
mono = torch.randn(1, 44100)
wave = fx.Wave(mono, fs=44100)
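
If you already have a 1D tensor, add the channel dimension with unsqueeze before wrapping it:

mono = torch.randn(44100).unsqueeze(0)  # shape becomes [1, 44100]
wave = fx.Wave(mono, fs=44100)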

Next Steps#

After completing this quick start guide, explore these topics to deepen your understanding:

  • Core Concepts

  • Tutorials

  • Advanced Topics

Examples#

See the examples directory for more complete examples including:

  • Real-time audio processing

  • BPM-synchronized effects

  • Multi-band processing

  • Vocal processing chains

  • And more!
