Effects#

TorchFX provides built-in audio effects for processing audio signals.

Gain and Normalization#

Gain#

class torchfx.effect.Gain(gain, gain_type='amplitude', clamp=False)[source]#

Bases: FX

Adjust volume of audio waveforms with multiple gain modes and optional clamping.

The Gain effect modifies waveform amplitude using three different gain representations: direct amplitude multiplication, decibel (dB) adjustment, or power scaling. An optional clamping parameter prevents clipping artifacts by limiting output values to [-1.0, 1.0].

Parameters:
  • gain (float) – The gain factor to apply to the waveform. Must be positive for "amplitude" and "power" gain types. Can be negative for "db".

  • gain_type ({"amplitude", "db", "power"}, default="amplitude") –

    How the gain value is interpreted:

    • "amplitude": direct multiplication by gain

    • "db": gain in decibels (output multiplied by 10 ** (gain/20))

    • "power": power ratio, converted to dB internally

  • clamp (bool, default=False) – If True, clamp the output waveform to [-1.0, 1.0] after applying the gain.

Raises:

ValueError – If gain is negative when gain_type is "amplitude" or "power".

See also

Normalize

Amplitude normalization with multiple strategies.

Notes

Gain Type Formulas:

  • Amplitude: \(y[n] = x[n] \cdot \text{gain}\)

  • Decibel: \(y[n] = x[n] \cdot 10^{\text{gain}/20}\)

  • Power: \(y[n] = x[n] \cdot 10^{(10 \log_{10}(\text{gain}))/20}\)

Clamping:

When clamp=True the output is constrained: \(y[n] = \text{clip}(y[n], -1.0, 1.0)\)

Coefficient formulas are adapted from torchaudio.transforms.Vol, BSD 2-Clause License (see licenses.torchaudio.BSD-2-Clause.txt).
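
The three gain types reduce to a single amplitude multiplier, which the following sketch makes explicit. It is plain Python for illustration only: `gain_to_multiplier` and `apply_gain` are hypothetical helpers, not TorchFX functions, a list of floats stands in for the waveform tensor, and the handling of a power gain of exactly 0 is an assumption.

```python
import math


def gain_to_multiplier(gain: float, gain_type: str = "amplitude") -> float:
    """Return the amplitude factor implied by the formulas above (hypothetical helper)."""
    if gain_type in ("amplitude", "power") and gain < 0:
        raise ValueError("gain must not be negative for 'amplitude' or 'power'")
    if gain_type == "amplitude":
        return gain
    if gain_type == "db":
        return 10.0 ** (gain / 20.0)
    if gain_type == "power":
        if gain == 0:
            return 0.0  # assumed: limit of the formula as gain approaches 0
        gain_db = 10.0 * math.log10(gain)  # power ratio expressed in dB
        return 10.0 ** (gain_db / 20.0)    # algebraically equal to sqrt(gain)
    raise ValueError(f"unknown gain_type: {gain_type!r}")


def apply_gain(samples, gain, gain_type="amplitude", clamp=False):
    """Scale a list of samples and optionally clip the result to [-1.0, 1.0]."""
    factor = gain_to_multiplier(gain, gain_type)
    scaled = [s * factor for s in samples]
    if clamp:
        scaled = [max(-1.0, min(1.0, s)) for s in scaled]
    return scaled
```

Because the power formula simplifies to the square root of the gain, a power gain of 4.0 and an amplitude gain of 2.0 yield the same multiplier.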

Examples

Basic amplitude gain to double volume:

>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> gain = fx.Gain(gain=2.0, gain_type="amplitude")
>>> louder = wave | gain

Increase volume by 6 dB with clamping:

>>> gain = fx.Gain(gain=6.0, gain_type="db", clamp=True)
>>> louder = wave | gain

Increase power by 4x (equivalent to +6 dB or 2x amplitude):

>>> gain = fx.Gain(gain=4.0, gain_type="power")
>>> louder = wave | gain

Reduce volume by 50% without clamping:

>>> gain = fx.Gain(gain=0.5, gain_type="amplitude")
>>> quieter = wave | gain

Direct tensor processing:

>>> import torch
>>> waveform = torch.randn(2, 44100)  # (channels, samples)
>>> gain = fx.Gain(gain=0.5, gain_type="amplitude", clamp=True)
>>> quieter = gain(waveform)

Negative dB for attenuation:

>>> gain = fx.Gain(gain=-3.0, gain_type="db")
>>> quieter = wave | gain

Chain with other effects:

>>> processed = wave | fx.Gain(2.0) | fx.Normalize(peak=0.8)

forward(waveform)[source]#

Parameters:

waveform (Tensor) – Tensor of audio of dimension (…, time).

Returns:

Tensor of audio of dimension (…, time).

Return type:

Tensor

Normalize#

class torchfx.effect.Normalize(peak=1.0, strategy=None)[source]#

Bases: FX

Normalize waveform amplitude to a target peak value using pluggable strategies.

The Normalize effect adjusts waveform amplitude to achieve a specified peak value using different normalization algorithms. The normalization strategy can be selected from built-in options (peak, RMS, percentile, per-channel) or provided as a custom callable function.

This effect uses the strategy pattern to support multiple normalization algorithms while maintaining a clean interface. If no strategy is specified, peak normalization is used by default.

Parameters:
  • peak (float, optional) – The target peak value to normalize to. Must be positive. Default is 1.0.

  • strategy (NormalizationStrategy or Callable[[Tensor, float], Tensor] or None, optional) –

    The normalization strategy to use. Can be:

    • None (default): Uses PeakNormalizationStrategy

    • NormalizationStrategy instance: Uses the specified strategy

    • Callable: Custom function wrapped in CustomNormalizationStrategy

    Built-in strategies:

    • PeakNormalizationStrategy: Normalize to absolute maximum value

    • RMSNormalizationStrategy: Normalize to RMS energy level

    • PercentileNormalizationStrategy: Normalize to a percentile threshold

    • PerChannelNormalizationStrategy: Normalize each channel independently

Raises:
  • AssertionError – If peak is not positive.

  • TypeError – If strategy is not None, a NormalizationStrategy instance, or a callable.

See also

PeakNormalizationStrategy

Normalize to absolute maximum value

RMSNormalizationStrategy

Normalize to RMS energy

PercentileNormalizationStrategy

Normalize to percentile threshold

PerChannelNormalizationStrategy

Independent per-channel normalization

CustomNormalizationStrategy

Wrapper for custom normalization functions

Gain

Volume adjustment with multiple gain modes

Notes

Strategy Pattern:

The Normalize effect delegates processing to a strategy object, allowing different normalization algorithms to be used without modifying the core effect implementation. This design pattern promotes extensibility and clean separation of concerns.

Automatic Strategy Wrapping:

If a callable function is passed as the strategy parameter, it is automatically wrapped in a CustomNormalizationStrategy instance. The function must have the signature: func(waveform: Tensor, peak: float) -> Tensor
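
The selection rules described above (None, strategy instance, or callable) can be sketched as a small dispatch function. This is a minimal illustration of the pattern, not TorchFX's source: `Strategy`, `PeakStrategy`, `CustomStrategy`, and `resolve_strategy` are hypothetical stand-ins, and lists of floats replace tensors.

```python
from abc import ABC, abstractmethod


class Strategy(ABC):
    """Stand-in for the NormalizationStrategy interface."""

    @abstractmethod
    def __call__(self, waveform, peak):
        ...


class PeakStrategy(Strategy):
    """Scale so the largest absolute sample equals peak; leave silence untouched."""

    def __call__(self, waveform, peak):
        max_abs = max((abs(s) for s in waveform), default=0.0)
        if max_abs > 0:
            return [s / max_abs * peak for s in waveform]
        return list(waveform)


class CustomStrategy(Strategy):
    """Adapt a plain function func(waveform, peak) to the Strategy interface."""

    def __init__(self, func):
        assert callable(func), "func must be callable"
        self.func = func

    def __call__(self, waveform, peak):
        return self.func(waveform, peak)


def resolve_strategy(strategy=None):
    """Pick the strategy object following the rules documented above."""
    if strategy is None:
        return PeakStrategy()            # default: peak normalization
    if isinstance(strategy, Strategy):
        return strategy                  # use the instance as given
    if callable(strategy):
        return CustomStrategy(strategy)  # wrap plain functions automatically
    raise TypeError("strategy must be None, a Strategy instance, or a callable")
```

The instance check comes before the callable check because strategy objects are themselves callable; reversing the order would wrap them a second time.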

Processing with @torch.no_grad():

The forward method is decorated with @torch.no_grad() for efficient inference-only operation. If gradients are needed for training, subclass this effect and remove the decorator.

Examples

Basic peak normalization to default peak of 1.0:

>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> normalize = fx.Normalize()
>>> normalized = wave | normalize

Normalize to a specific peak value:

>>> normalize = fx.Normalize(peak=0.8)
>>> normalized = wave | normalize

Use RMS normalization strategy:

>>> from torchfx.effect import RMSNormalizationStrategy
>>> normalize = fx.Normalize(peak=0.7, strategy=RMSNormalizationStrategy())
>>> normalized = wave | normalize

Use percentile normalization (99th percentile):

>>> from torchfx.effect import PercentileNormalizationStrategy
>>> normalize = fx.Normalize(peak=1.0, strategy=PercentileNormalizationStrategy(percentile=99.0))
>>> normalized = wave | normalize

Per-channel normalization for stereo audio:

>>> from torchfx.effect import PerChannelNormalizationStrategy
>>> normalize = fx.Normalize(peak=0.9, strategy=PerChannelNormalizationStrategy())
>>> normalized = wave | normalize

Custom normalization with a callable function:

>>> def custom_normalize(waveform, peak):
...     # Normalize based on standard deviation
...     std = waveform.std()
...     return (waveform / std * peak) if std > 0 else waveform
>>> normalize = fx.Normalize(peak=0.8, strategy=custom_normalize)
>>> normalized = wave | normalize

Direct tensor processing:

>>> import torch
>>> waveform = torch.randn(2, 44100)  # (channels, samples)
>>> normalize = fx.Normalize(peak=0.5)
>>> normalized = normalize(waveform)

Chain with other effects:

>>> result = wave | fx.Gain(2.0) | fx.Normalize(peak=0.8)

References

For detailed information about creating custom normalization strategies and the strategy pattern, see wiki page “3.5 Creating Custom Effects”.

forward(waveform)[source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:

waveform (Tensor)

Return type:

Tensor

Normalization Strategies#

class torchfx.effect.NormalizationStrategy[source]#

Bases: ABC

Abstract base class for normalization strategies.

NormalizationStrategy defines the interface for all normalization algorithms used by the Normalize effect. Concrete implementations must implement the __call__ method to provide specific normalization logic.

This class is part of the strategy pattern implementation, allowing the Normalize effect to support multiple normalization algorithms without modifying its core implementation.

__call__(waveform: Tensor, peak: float) → Tensor[source]#

Normalize the waveform to the given peak value using the strategy’s specific algorithm.

See also

Normalize

The effect that uses normalization strategies

PeakNormalizationStrategy

Normalize to absolute maximum value

RMSNormalizationStrategy

Normalize to RMS energy

PercentileNormalizationStrategy

Normalize to percentile threshold

PerChannelNormalizationStrategy

Independent per-channel normalization

CustomNormalizationStrategy

Wrapper for custom functions

Notes

When implementing a custom normalization strategy, ensure that:

  1. The __call__ method handles edge cases (e.g., silent audio)

  2. The returned tensor has the same shape and dtype as the input

  3. The strategy preserves the device of the input tensor

Examples

Implement a custom normalization strategy:

>>> from torchfx.effect import NormalizationStrategy
>>> import torch
>>>
>>> class MedianNormalizationStrategy(NormalizationStrategy):
...     def __call__(self, waveform: torch.Tensor, peak: float) -> torch.Tensor:
...         median = torch.median(torch.abs(waveform))
...         return waveform / median * peak if median > 0 else waveform

Use the custom strategy:

>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> normalize = fx.Normalize(peak=0.8, strategy=MedianNormalizationStrategy())
>>> normalized = wave | normalize

References

For more information about the strategy pattern and creating custom strategies, see wiki page “3.5 Creating Custom Effects”.

class torchfx.effect.PeakNormalizationStrategy[source]#

Bases: NormalizationStrategy

Normalization to the absolute peak value.

\[\begin{split}y[n] = \begin{cases} \frac{x[n]}{max(|x[n]|)} \cdot peak, & \text{if } max(|x[n]|) > 0 \\ x[n], & \text{otherwise} \end{cases}\end{split}\]
where:
  • \(x[n]\) is the input signal,

  • \(y[n]\) is the output signal,

  • \(peak\) is the target peak value.

class torchfx.effect.RMSNormalizationStrategy[source]#

Bases: NormalizationStrategy

Normalization to Root Mean Square (RMS) energy.

\[\begin{split}y[n] = \begin{cases} \frac{x[n]}{RMS(x[n])} \cdot peak, & \text{if } RMS(x[n]) > 0 \\ x[n], & \text{otherwise} \end{cases}\end{split}\]
where:
  • \(x[n]\) is the input signal,

  • \(y[n]\) is the output signal,

  • \(RMS(x[n])\) is the root mean square of the signal,

  • \(peak\) is the target peak value.
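
As a worked illustration of the formula, the sketch below applies it to a list of floats in plain Python. The helper names `rms` and `rms_normalize` are hypothetical and are not part of TorchFX.

```python
import math


def rms(samples):
    """Root mean square of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_normalize(samples, peak=1.0):
    """Scale samples so their RMS equals peak; silent input is returned unchanged."""
    if not samples:
        return []
    level = rms(samples)
    if level > 0:
        return [s / level * peak for s in samples]
    return list(samples)
```

Under this formula, peak sets the RMS level of the output rather than its largest sample, so individual samples can exceed peak. For example, `rms_normalize([1.0, -1.0, 0.0, 0.0], peak=0.5)` has an RMS of 0.5 but a largest absolute sample of about 0.707.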

class torchfx.effect.PercentileNormalizationStrategy(percentile=99.0)[source]#

Bases: NormalizationStrategy

Normalization using a percentile of absolute values.

\[\begin{split}y[n] = \begin{cases} \frac{x[n]}{P_p(|x[n]|)} \cdot peak, & \text{if } P_p(|x[n]|) > 0 \\ x[n], & \text{otherwise} \end{cases}\end{split}\]
where:
  • \(x[n]\) is the input signal,

  • \(y[n]\) is the output signal,

  • \(P_p(|x[n]|)\) is the p-th percentile of the absolute values of the signal,

  • \(peak\) is the target peak value,

  • \(p\) is the specified percentile (\(0 < p \leqslant 100\)).

Parameters:

percentile (float)

percentile#

The percentile \(p\) to use for normalization (\(0 < p \leqslant 100\)). Default is 99.0.

Type:

float
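
The following plain-Python sketch illustrates the percentile formula on a list of floats. It assumes linear interpolation between ranks when the percentile falls between two samples; the documentation above does not state which interpolation TorchFX uses, so treat that choice, and the helper names `percentile_abs` and `percentile_normalize`, as assumptions.

```python
def percentile_abs(samples, p):
    """p-th percentile of |samples|, linearly interpolated between ranks (assumed)."""
    if not 0 < p <= 100:
        raise ValueError("percentile must satisfy 0 < p <= 100")
    ordered = sorted(abs(s) for s in samples)
    position = (len(ordered) - 1) * p / 100.0
    lower = int(position)                      # floor, since position >= 0
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def percentile_normalize(samples, peak=1.0, percentile=99.0):
    """Scale samples so the chosen percentile of |samples| maps to peak."""
    if not samples:
        return []
    threshold = percentile_abs(samples, percentile)
    if threshold > 0:
        return [s / threshold * peak for s in samples]
    return list(samples)
```

Samples whose magnitude lies above the chosen percentile end up larger than peak, which is why this strategy is less sensitive to isolated spikes than peak normalization.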

class torchfx.effect.PerChannelNormalizationStrategy[source]#

Bases: NormalizationStrategy

Normalize each channel independently to its own peak.

\[\begin{split}y_c[n] = \begin{cases} \frac{x_c[n]}{max(|x_c[n]|)} \cdot peak, & \text{if } max(|x_c[n]|) > 0 \\ x_c[n], & \text{otherwise} \end{cases}\end{split}\]
where:
  • \(x_c[n]\) is the input signal for channel c,

  • \(y_c[n]\) is the output signal for channel c,

  • \(peak\) is the target peak value.
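
A minimal plain-Python sketch of the per-channel formula follows, with each channel given as a list of floats. `per_channel_normalize` is a hypothetical helper written for this illustration, not a TorchFX function.

```python
def per_channel_normalize(channels, peak=1.0):
    """Normalize each channel to peak using that channel's own maximum."""
    normalized = []
    for channel in channels:
        max_abs = max((abs(s) for s in channel), default=0.0)
        if max_abs > 0:
            normalized.append([s / max_abs * peak for s in channel])
        else:
            normalized.append(list(channel))  # silent channel stays silent
    return normalized
```

Because every channel gets its own scale factor, a quiet channel is raised to the same peak as a loud one, which changes the level balance between channels.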

class torchfx.effect.CustomNormalizationStrategy(func)[source]#

Bases: NormalizationStrategy

Normalization using a custom user-provided function.

This strategy wraps a user-provided callable function to make it compatible with the NormalizationStrategy interface. It is automatically used when a callable is passed to the Normalize effect’s strategy parameter.

Parameters:

func (Callable[[Tensor, float], Tensor]) – Custom normalization function with signature: func(waveform: Tensor, peak: float) -> Tensor

Raises:

AssertionError – If func is not callable.

See also

Normalize

Effect that uses this strategy wrapper

NormalizationStrategy

Abstract base class for strategies

Notes

The custom function must:

  • Accept two parameters: waveform (Tensor) and peak (float)

  • Return a normalized Tensor with the same shape and dtype as input

  • Preserve the device of the input tensor

  • Handle edge cases (e.g., silent audio with all zeros)

Examples

Define a custom normalization function:

>>> import torch
>>> def std_normalize(waveform, peak):
...     std = waveform.std()
...     return (waveform / std * peak) if std > 0 else waveform

Use directly with Normalize (automatically wrapped):

>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> normalize = fx.Normalize(peak=0.8, strategy=std_normalize)
>>> normalized = wave | normalize

Or explicitly instantiate the strategy:

>>> from torchfx.effect import CustomNormalizationStrategy
>>> strategy = CustomNormalizationStrategy(std_normalize)
>>> normalize = fx.Normalize(peak=0.8, strategy=strategy)

Time-based Effects#

Reverb#

class torchfx.effect.Reverb(room_size=0.5, damping=0.5, mix=0.3, fs=None)[source]#

Bases: FX

Freeverb-style algorithmic reverb (parallel combs + series all-passes).

This effect replaces the original single-comb reverb with the classic Schroeder/Moorer structure: for each channel, 8 parallel low-pass-feedback comb filters are summed and fed through 4 series all-pass diffusers, producing a dense, natural decay. The comb and all-pass delay tunings scale with the sampling rate, so the character stays consistent across sampling rates. The network runs in a native per-channel C++/CUDA kernel.

Parameters:
  • room_size (float, default=0.5) – Apparent room size in [0, 1] — sets the comb feedback (decay length). Larger values give a longer reverb tail.

  • damping (float, default=0.5) – High-frequency damping in [0, 1]. Larger values absorb highs faster, for a warmer/darker tail.

  • mix (float, default=0.3) – Wet/dry balance in [0, 1] (0 = dry, 1 = fully wet).

  • fs (int or None, default=None) – Sampling rate in Hz, used to scale the delay tunings. May be left None and supplied lazily by a Wave pipeline (wave | reverb).

Returns:

The reverberated waveform, same shape and dtype as the input.

Return type:

Tensor

Raises:

ValueError – If room_size/damping/mix are outside [0, 1], fs is non-positive, or forward is called before fs is known.

See also

Delay

Multi-tap delay effect with BPM synchronisation.

Notes

Each channel is processed independently with identical tunings; there is no stereo-width spread, which is a possible follow-up. The comb and all-pass delay-line state is carried across forward calls, so block-wise streaming is state-continuous: the tail flows across chunk boundaries. Call reset_state() to cut the tail when switching sources. The constructor signature is a breaking change from the pre-0.7 Reverb(delay, decay, mix) API.
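
To illustrate what state-continuous streaming means, the sketch below implements one low-pass-feedback comb filter of the Freeverb kind in plain Python and keeps its delay line between calls. It is a simplified assumption-laden model: the class `LowpassFeedbackComb`, its `process` method, and the parameter values are hypothetical, and the actual effect combines several combs and all-passes in a native kernel whose internals are not shown on this page.

```python
class LowpassFeedbackComb:
    """Single comb filter with a one-pole low-pass in its feedback path."""

    def __init__(self, delay, feedback, damping):
        self.delay = delay          # delay-line length in samples
        self.feedback = feedback    # decay amount (related to room size)
        self.damping = damping      # high-frequency absorption in [0, 1]
        self.reset_state()

    def reset_state(self):
        """Clear the delay line and filter memory (cuts the tail)."""
        self.buffer = [0.0] * self.delay
        self.index = 0
        self.lowpass = 0.0

    def process(self, block):
        """Filter one block; state persists so the next block continues the tail."""
        output = []
        for x in block:
            y = self.buffer[self.index]
            # One-pole low-pass on the signal that is fed back
            self.lowpass = y * (1.0 - self.damping) + self.lowpass * self.damping
            self.buffer[self.index] = x + self.lowpass * self.feedback
            self.index = (self.index + 1) % self.delay
            output.append(y)
        return output
```

Processing a signal in one call or in consecutive chunks gives the same samples, because the delay line and low-pass memory survive between calls; calling `reset_state()` restores the initial silent state.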

Examples

>>> import torch
>>> from torchfx.effect import Reverb
>>> x = torch.randn(2, 48000)
>>> y = Reverb(room_size=0.7, damping=0.4, mix=0.3, fs=48000)(x)
>>> y.shape
torch.Size([2, 48000])

In a Wave pipeline (fs supplied automatically):

>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> processed = wave | fx.Reverb(room_size=0.8, mix=0.25)

forward(waveform)[source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:

waveform (Tensor)

Return type:

Tensor

reset_state()[source]#

Clear the accumulated delay-line state (cut the reverb tail).

Call when switching audio sources or seeking; Wave materialisation and StreamProcessor file boundaries call this automatically.

Return type:

None

Delay#

class torchfx.effect.Delay(delay_samples=None, bpm=None, delay_time='1/8', fs=None, feedback=0.3, mix=0.2, taps=3, strategy=None)[source]#

Bases: FX

Apply a delay effect with BPM-synced musical time divisions.

The delay effect creates echoes of the input signal with configurable feedback. Supports BPM-synced delay times for musical applications.

The delay effect is computed as:

\[\begin{split}delayed[n] &= \sum_{i=1}^{taps} feedback^{i-1} \cdot x[n - i \cdot delay] \\ y[n] &= (1 - mix) \cdot x[n] + mix \cdot delayed[n]\end{split}\]
where:
  • x[n] is the input signal,

  • y[n] is the output signal,

  • delay is the delay time in samples,

  • feedback is the feedback amount (0-0.95) affecting taps 2 and beyond,

  • taps is the number of delay taps,

  • mix is the wet/dry mix parameter.
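
The two equations can be traced by hand with the plain-Python sketch below. `multi_tap_delay` is a hypothetical helper operating on a list of floats, not the library's implementation, and it assumes the dry signal is zero-padded to the extended output length before mixing.

```python
def multi_tap_delay(x, delay, taps=3, feedback=0.3, mix=0.2):
    """Evaluate delayed[n] and y[n] from the formulas above for a mono signal."""
    out_len = len(x) + delay * taps          # room for the last echo
    delayed = [0.0] * out_len
    for i in range(1, taps + 1):
        amplitude = feedback ** (i - 1)      # tap 1 has amplitude 1.0
        offset = i * delay
        for n, sample in enumerate(x):
            delayed[offset + n] += amplitude * sample
    # Assumed: the dry signal is padded with zeros to match the output length
    dry = list(x) + [0.0] * (out_len - len(x))
    return [(1.0 - mix) * d + mix * w for d, w in zip(dry, delayed)]
```

For an impulse `[1.0, 0.0]` with `delay=2`, `taps=3`, `feedback=0.5`, and `mix=0.5`, the echoes land at indices 2, 4, and 6 with wet amplitudes 1.0, 0.5, and 0.25, each then scaled by the mix.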

Parameters:
  • delay_samples (int) – Delay time in samples. If provided, this is used directly. Default is None (requires bpm and delay_time).

  • bpm (float) – Beats per minute for BPM-synced delay. Required if delay_samples is None.

  • delay_time (str) –

    Musical time division for BPM-synced delay. Should be a string in the format n/d[modifier], where:

    • n/d represents the note division (e.g., 1/4 for quarter note).

    • modifier is optional and can be d for dotted notes or t for triplets.

    Valid examples include:

    • 1/4: Quarter note

    • 1/8: Eighth note

    • 1/16: Sixteenth note

    • 1/8d: Dotted eighth note

    • 1/4d: Dotted quarter note

    • 1/8t: Eighth note triplet

    Default is 1/8.

  • fs (int | None) – Sampling rate in Hz. Required for BPM-synced delay outside a Wave pipeline. When None (default), fs is inferred from the Wave object when the effect is used with the pipeline operator (wave | delay). Must be positive if provided.

  • feedback (float) – Feedback amount (0-0.95). Controls amplitude of taps 2 and beyond. First tap always has amplitude 1.0. Higher values create more prominent echoes. Default is 0.3.

  • mix (float) – Wet/dry mix. 0 = dry (original signal only), 1 = wet (delayed echoes only). Default is 0.2.

  • taps (int) – Number of delay taps (echoes). Each tap is delayed by delay_samples * tap_number. Default is 3.

  • strategy (DelayStrategy | None) – Delay processing strategy. If None, defaults to MonoDelayStrategy. Use PingPongDelayStrategy for stereo ping-pong effect, or provide a custom strategy extending DelayStrategy. Default is None.
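
For BPM-synced operation, bpm, delay_time, and fs together determine a delay in samples. The sketch below shows one conventional way to do that conversion; it is not taken from the TorchFX source. It assumes a quarter note equals one beat, a dotted note lasts 1.5 times its base value, a triplet lasts 2/3 of it, and the result is rounded to the nearest sample. `delay_time_to_samples` is a hypothetical helper name.

```python
import re


def delay_time_to_samples(bpm, delay_time="1/8", fs=44100):
    """Convert a musical division such as '1/8d' to a delay in samples (assumed rules)."""
    match = re.fullmatch(r"(\d+)/(\d+)([dt]?)", delay_time)
    if match is None:
        raise ValueError(f"invalid delay_time: {delay_time!r}")
    numerator, denominator, modifier = match.groups()
    if int(denominator) == 0:
        raise ValueError("note division denominator must not be zero")
    if bpm <= 0 or fs <= 0:
        raise ValueError("bpm and fs must be positive")

    beats = 4.0 * int(numerator) / int(denominator)  # quarter note = 1 beat
    if modifier == "d":
        beats *= 1.5          # dotted note
    elif modifier == "t":
        beats *= 2.0 / 3.0    # triplet
    seconds = beats * 60.0 / bpm
    return round(seconds * fs)
```

At 120 BPM and 48 kHz, for example, a quarter note lasts 0.5 s, so `delay_time_to_samples(120, "1/4", 48000)` gives 24000 samples under these assumptions.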

Examples

>>> import torchfx as fx
>>> import torch
>>>
>>> # BPM-synced delay with auto fs inference from Wave
>>> wave = fx.Wave.from_file("audio.wav")
>>> delay = fx.effect.Delay(bpm=128, delay_time='1/8', feedback=0.3, mix=0.2)
>>> delayed = wave | delay  # fs automatically inferred from wave
>>>
>>> # BPM-synced delay with explicit fs
>>> waveform = torch.randn(2, 44100)  # (channels, samples)
>>> delay = fx.effect.Delay(bpm=128, delay_time='1/8', fs=44100, feedback=0.3, mix=0.2)
>>> delayed = delay(waveform)
>>>
>>> # Direct delay in samples (no fs needed)
>>> delay = fx.effect.Delay(delay_samples=2205, feedback=0.4, mix=0.3)
>>> delayed = delay(waveform)
>>>
>>> # Ping-pong delay with strategy
>>> delay = fx.effect.Delay(
...     bpm=128, delay_time='1/4', fs=44100,
...     feedback=0.5, mix=0.4, strategy=fx.effect.PingPongDelayStrategy()
... )
>>> delayed = delay(waveform)

Author#

Uzef <@itsuzef>

forward(waveform)[source]#

Parameters:

waveform (Tensor) – Tensor of audio of dimension (…, time) or (channels, time).

Returns:

Tensor of delayed audio. Output length is extended to accommodate delayed echoes. The output will be longer than the input by up to delay_samples * taps samples.

Return type:

Tensor

Delay Strategies#

class torchfx.effect.DelayStrategy[source]#

Bases: ABC

Abstract base class for delay processing strategies.

DelayStrategy defines the interface for different delay processing behaviors used by the Delay effect. Concrete implementations provide specific delay algorithms such as mono delay (uniform across all channels) or ping-pong delay (alternating between stereo channels).

This class is part of the strategy pattern implementation, allowing the Delay effect to support multiple processing behaviors without modifying its core implementation.

apply_delay(waveform, delay_samples, taps, feedback) → Tensor[source]#

Apply the delay effect to the waveform using the strategy’s specific algorithm.

Return type:

Tensor

See also

Delay

The effect that uses delay strategies

MonoDelayStrategy

Uniform delay for all channels

PingPongDelayStrategy

Alternating stereo delay

Notes

When implementing a custom delay strategy:

  1. The output length should be extended to accommodate all delayed taps: output_length = input_length + (delay_samples * taps)

  2. The first tap always has amplitude 1.0, subsequent taps use feedback scaling: feedback^(tap-1)

  3. The returned tensor should preserve the device and dtype of the input

  4. Handle different tensor dimensions: 1D (mono), 2D (multi-channel), and higher dimensions

Examples

Implement a custom delay strategy:

>>> from torchfx.effect import DelayStrategy
>>> import torch
>>>
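>>> class CrossChannelDelayStrategy(DelayStrategy):
...     '''Apply delay from each channel to all other channels.'''
...     def apply_delay(self, waveform, delay_samples, taps, feedback):
...         # Custom cross-channel delay logic
...         original_length = waveform.size(-1)
...         output_length = original_length + delay_samples * taps
...         # Output buffer on the same device and with the same dtype as the input
...         delayed_waveform = waveform.new_zeros([*waveform.shape[:-1], output_length])
...         # ... implementation: add the delayed, scaled taps into delayed_waveform ...
...         return delayed_waveform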

Use with Delay effect:

>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> delay = fx.Delay(delay_samples=2205, taps=3, feedback=0.4, mix=0.3,
...                  strategy=CrossChannelDelayStrategy())
>>> processed = wave | delay

References

For more information about the strategy pattern and creating custom strategies, see wiki page “3.5 Creating Custom Effects”.

abstract apply_delay(waveform, delay_samples, taps, feedback)[source]#

Apply delay processing to the waveform.

Parameters:
  • waveform (Tensor) – Input audio tensor of shape (…, time) or (channels, time).

  • delay_samples (int) – Delay time in samples for each tap.

  • taps (int) – Number of delay taps (echoes). Each tap is delayed by delay_samples * tap_number.

  • feedback (float) – Feedback amount in range [0, 0.95]. Controls the amplitude of taps 2 and beyond. First tap always has amplitude 1.0, subsequent taps use feedback^(tap-1).

Returns:

Delayed audio with extended length to accommodate all taps. Output length is: input_length + (delay_samples * taps).

Return type:

Tensor

class torchfx.effect.MonoDelayStrategy[source]#

Bases: DelayStrategy

Apply uniform delay to all channels with multiple taps and feedback.

MonoDelayStrategy applies the same delay pattern to all audio channels, creating identical echoes across the stereo field. This is the default delay strategy used by the Delay effect.

The strategy creates multiple delay taps (echoes), each delayed by an integer multiple of the base delay time. The first tap has full amplitude, and subsequent taps decay exponentially based on the feedback parameter.

See also

DelayStrategy

Abstract base class for delay strategies

PingPongDelayStrategy

Alternating stereo delay

Delay

The effect that uses this strategy

Notes

Output Length:

The output is extended to accommodate all delayed taps: output_length = input_length + (delay_samples * taps)

Tap Amplitude:

  • Tap 1: amplitude = 1.0

  • Tap n (n > 1): amplitude = feedback^(n-1)

Multi-dimensional Support:

The strategy handles tensors of various shapes:

  • 1D: (time,) - Mono audio

  • 2D: (channels, time) - Multi-channel audio

  • Higher dimensions: (…, time) - Batched or complex audio

Examples

Use mono delay strategy explicitly:

>>> import torchfx as fx
>>> from torchfx.effect import MonoDelayStrategy
>>> wave = fx.Wave.from_file("audio.wav")
>>> delay = fx.Delay(delay_samples=2205, taps=4, feedback=0.5, mix=0.3,
...                  strategy=MonoDelayStrategy())
>>> processed = wave | delay

MonoDelayStrategy is the default, so this is equivalent:

>>> delay = fx.Delay(delay_samples=2205, taps=4, feedback=0.5, mix=0.3)
>>> processed = wave | delay

apply_delay(waveform, delay_samples, taps, feedback)[source]#

Apply mono delay with multiple taps and feedback.

Output length is extended to accommodate all delayed taps.

Parameters:
  • waveform (Tensor) – Input audio tensor of shape (time,), (channels, time), or (…, time).

  • delay_samples (int) – Delay time in samples for each tap.

  • taps (int) – Number of delay taps (echoes).

  • feedback (float) – Feedback amount for taps 2 and beyond.

Returns:

Delayed audio with shape matching input except extended time dimension.

Return type:

Tensor

class torchfx.effect.PingPongDelayStrategy[source]#

Bases: DelayStrategy

Apply ping-pong delay alternating between left and right channels.

PingPongDelayStrategy creates a stereo delay effect where echoes alternate between the left and right channels, producing a “bouncing” or “ping-pong” spatial effect. This is commonly used in music production for creating wide, spacious delay effects.

The strategy requires stereo (2-channel) input. For non-stereo audio, it automatically falls back to MonoDelayStrategy.

See also

DelayStrategy

Abstract base class for delay strategies

MonoDelayStrategy

Uniform delay for all channels

Delay

The effect that uses this strategy

Notes

Ping-Pong Pattern:

  • Odd taps (1, 3, 5, …): Left channel → Right channel

  • Even taps (2, 4, 6, …): Right channel → Left channel

This creates the characteristic bouncing effect where the delay appears to move back and forth between the left and right speakers.
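
The routing pattern can be sketched in plain Python as follows. This is an illustrative reading of the pattern above rather than the library's code: `ping_pong_delay` is a hypothetical helper, the two channels are equal-length lists of floats, and the function is assumed to return only the echoes (the wet signal), with dry/wet mixing left to the caller.

```python
def ping_pong_delay(left, right, delay, taps, feedback):
    """Return (wet_left, wet_right) with echoes alternating between channels."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    out_len = len(left) + delay * taps
    wet_left = [0.0] * out_len
    wet_right = [0.0] * out_len
    for tap in range(1, taps + 1):
        amplitude = feedback ** (tap - 1)     # tap 1 has amplitude 1.0
        offset = tap * delay
        if tap % 2 == 1:
            source, target = left, wet_right  # odd taps: left feeds right
        else:
            source, target = right, wet_left  # even taps: right feeds left
        for n, sample in enumerate(source):
            target[offset + n] += amplitude * sample
    return wet_left, wet_right
```

With `left=[1.0]`, `right=[2.0]`, `delay=1`, `taps=3`, and `feedback=0.5`, the first and third echoes of the left input appear in the right output and the second echo of the right input appears in the left output.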

Fallback Behavior:

If the input is not stereo (2 channels), the strategy automatically falls back to MonoDelayStrategy to process the audio.

Output Length:

The output is extended to accommodate all delayed taps: output_length = input_length + (delay_samples * taps)

Tap Amplitude:

Same as MonoDelayStrategy:

  • Tap 1: amplitude = 1.0

  • Tap n (n > 1): amplitude = feedback^(n-1)

Examples

Create ping-pong delay effect:

>>> import torchfx as fx
>>> from torchfx.effect import PingPongDelayStrategy
>>> wave = fx.Wave.from_file("stereo_audio.wav")  # Must be stereo
>>> delay = fx.Delay(delay_samples=2205, taps=6, feedback=0.5, mix=0.4,
...                  strategy=PingPongDelayStrategy())
>>> processed = wave | delay

BPM-synced ping-pong delay:

>>> delay = fx.Delay(bpm=120, delay_time="1/8", taps=8, feedback=0.6, mix=0.3,
...                  strategy=PingPongDelayStrategy())
>>> processed = wave | delay

Combine with reverb for spacious effect:

>>> reverb = fx.Reverb(room_size=0.6, damping=0.5, mix=0.2)
>>> delay = fx.Delay(bpm=128, delay_time="1/4", taps=4, feedback=0.5, mix=0.3,
...                  strategy=PingPongDelayStrategy())
>>> processed = wave | reverb | delay

apply_delay(waveform, delay_samples, taps, feedback)[source]#

Apply ping-pong delay (alternates between channels).

Output length is extended to accommodate all delayed taps.

Parameters:
  • waveform (Tensor) – Input audio tensor. Should be stereo with shape (2, time) or (…, 2, time). For non-stereo input, falls back to MonoDelayStrategy.

  • delay_samples (int) – Delay time in samples for each tap.

  • taps (int) – Number of delay taps (echoes).

  • feedback (float) – Feedback amount for taps 2 and beyond.

Returns:

Delayed audio with ping-pong effect. Shape matches input except extended time dimension.

Return type:

Tensor