Effects
TorchFX provides built-in audio effects for processing audio signals.
Gain and Normalization
Gain
- class torchfx.effect.Gain(gain, gain_type='amplitude', clamp=False)
Bases: FX
Adjust volume of audio waveforms with multiple gain modes and optional clamping.
The Gain effect modifies waveform amplitude using three different gain representations: direct amplitude multiplication, decibel (dB) adjustment, or power scaling. An optional clamping parameter prevents clipping artifacts by limiting output values to [-1.0, 1.0].
This effect extends torchaudio.transforms.Vol by adding the clamp parameter for better control over output dynamic range.
- Parameters:
gain (float) – The gain factor to apply to the waveform. Must be positive for “amplitude” and “power” gain types. Can be negative for “db” type.
gain_type (str, optional) –
The type of gain to apply. Default is “amplitude”.
“amplitude”: Direct multiplication by gain factor
“db”: Decibel-based gain using torchaudio.functional.gain
“power”: Power-based gain converted to dB internally
clamp (bool, optional) – If True, clamps the output waveform to the range [-1.0, 1.0] to prevent clipping. Default is False.
- Raises:
ValueError – If gain is negative when gain_type is “amplitude” or “power”.
See also
torchaudio.transforms.Vol: Original transform this effect is based on
Normalize: Amplitude normalization with multiple strategies
torchaudio.functional.gain: Function used for dB and power gain
Notes
Gain Type Formulas:
Amplitude: \(y[n] = x[n] \cdot \text{gain}\)
Decibel: \(y[n] = x[n] \cdot 10^{\text{gain}/20}\)
Power: \(y[n] = x[n] \cdot 10^{(10 \log_{10}(\text{gain}))/20}\)
Clamping:
When clamp=True, the final output is constrained: \(y[n] = \text{clip}(y[n], -1.0, 1.0)\)
The @torch.no_grad() decorator disables gradient computation for efficiency during inference-only operations.
This class is based on torchaudio.transforms.Vol, licensed under the BSD 2-Clause License. See licenses.torchaudio.BSD-2-Clause.txt for details.
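The gain type formulas and the clamping rule from the notes above can be checked with a small plain-Python sketch. It works on a list of floats rather than a tensor, and the helper names `gain_scale` and `apply_gain` are hypothetical (they are not part of TorchFX); it illustrates the arithmetic only and is not the library's implementation.

```python
import math


def gain_scale(gain: float, gain_type: str = "amplitude") -> float:
    """Return the linear factor applied to every sample (hypothetical helper)."""
    if gain_type == "amplitude":
        return gain
    if gain_type == "db":
        return 10.0 ** (gain / 20.0)
    if gain_type == "power":
        # Power ratio -> dB -> amplitude ratio; this equals sqrt(gain).
        return 10.0 ** (10.0 * math.log10(gain) / 20.0)
    raise ValueError(f"unknown gain_type: {gain_type!r}")


def apply_gain(samples, gain, gain_type="amplitude", clamp=False):
    """Scale samples and optionally clip the result to [-1.0, 1.0]."""
    scale = gain_scale(gain, gain_type)
    out = [s * scale for s in samples]
    if clamp:
        out = [max(-1.0, min(1.0, s)) for s in out]
    return out


print(apply_gain([0.25, -0.75], 2.0, clamp=True))  # [0.5, -1.0]
```

With these formulas a power gain of 4 gives exactly a 2x amplitude factor, while +6 dB gives a factor of about 1.995.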
Examples
Basic amplitude gain to double volume:
>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> gain = fx.Gain(gain=2.0, gain_type="amplitude")
>>> louder = wave | gain
Increase volume by 6 dB with clamping:
>>> gain = fx.Gain(gain=6.0, gain_type="db", clamp=True)
>>> louder = wave | gain
Increase power by 4x (equivalent to 2x amplitude, approximately +6 dB):
>>> gain = fx.Gain(gain=4.0, gain_type="power")
>>> louder = wave | gain
Reduce volume by 50% without clamping:
>>> gain = fx.Gain(gain=0.5, gain_type="amplitude")
>>> quieter = wave | gain
Direct tensor processing:
>>> import torch
>>> waveform = torch.randn(2, 44100)  # (channels, samples)
>>> gain = fx.Gain(gain=0.5, gain_type="amplitude", clamp=True)
>>> quieter = gain(waveform)
Negative dB for attenuation:
>>> gain = fx.Gain(gain=-3.0, gain_type="db")
>>> quieter = wave | gain
Chain with other effects:
>>> processed = wave | fx.Gain(2.0) | fx.Normalize(peak=0.8)
Normalize
- class torchfx.effect.Normalize(peak=1.0, strategy=None)
Bases: FX
Normalize waveform amplitude to a target peak value using pluggable strategies.
The Normalize effect adjusts waveform amplitude to achieve a specified peak value using different normalization algorithms. The normalization strategy can be selected from built-in options (peak, RMS, percentile, per-channel) or provided as a custom callable function.
This effect uses the strategy pattern to support multiple normalization algorithms while maintaining a clean interface. If no strategy is specified, peak normalization is used by default.
- Parameters:
peak (float, optional) – The target peak value to normalize to. Must be positive. Default is 1.0.
strategy (NormalizationStrategy or Callable[[Tensor, float], Tensor] or None, optional) –
The normalization strategy to use. Can be:
None (default): Uses PeakNormalizationStrategy
NormalizationStrategy instance: Uses the specified strategy
Callable: Custom function wrapped in CustomNormalizationStrategy
Built-in strategies:
PeakNormalizationStrategy: Normalize to absolute maximum value
RMSNormalizationStrategy: Normalize to RMS energy level
PercentileNormalizationStrategy: Normalize to a percentile threshold
PerChannelNormalizationStrategy: Normalize each channel independently
- Raises:
AssertionError – If peak is not positive.
TypeError – If strategy is not an instance of NormalizationStrategy.
See also
PeakNormalizationStrategy: Normalize to absolute maximum value
RMSNormalizationStrategy: Normalize to RMS energy
PercentileNormalizationStrategy: Normalize to percentile threshold
PerChannelNormalizationStrategy: Independent per-channel normalization
CustomNormalizationStrategy: Wrapper for custom normalization functions
Gain: Volume adjustment with multiple gain modes
Notes
Strategy Pattern:
The Normalize effect delegates processing to a strategy object, allowing different normalization algorithms to be used without modifying the core effect implementation. This design pattern promotes extensibility and clean separation of concerns.
Automatic Strategy Wrapping:
If a callable function is passed as the strategy parameter, it is automatically wrapped in a CustomNormalizationStrategy instance. The function must have the signature:
func(waveform: Tensor, peak: float) -> Tensor
Processing with @torch.no_grad():
The forward method is decorated with @torch.no_grad() for efficient inference-only operation. If gradients are needed for training, subclass this effect and remove the decorator.
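The strategy selection described in these notes (None falls back to peak normalization, a strategy instance is used as given, a bare callable is wrapped) can be sketched as follows. All class and function names here are hypothetical stand-ins written for illustration; the sketch operates on lists of floats and does not reproduce the TorchFX source.

```python
from typing import List

Samples = List[float]


class Strategy:
    """Hypothetical stand-in for NormalizationStrategy."""

    def __call__(self, waveform: Samples, peak: float) -> Samples:
        raise NotImplementedError


class PeakStrategy(Strategy):
    """Hypothetical stand-in for PeakNormalizationStrategy."""

    def __call__(self, waveform, peak):
        top = max((abs(s) for s in waveform), default=0.0)
        return [s / top * peak for s in waveform] if top > 0 else list(waveform)


class WrappedStrategy(Strategy):
    """Hypothetical stand-in for CustomNormalizationStrategy."""

    def __init__(self, func):
        assert callable(func), "func must be callable"
        self.func = func

    def __call__(self, waveform, peak):
        return self.func(waveform, peak)


def resolve_strategy(strategy=None) -> Strategy:
    """Pick the strategy object the effect would delegate to."""
    if strategy is None:
        return PeakStrategy()             # default: peak normalization
    if isinstance(strategy, Strategy):
        return strategy                   # already a strategy, use as given
    if callable(strategy):
        return WrappedStrategy(strategy)  # wrap a plain function
    raise TypeError("strategy must be a Strategy, a callable, or None")
```

The instance check comes before the callable check because strategy objects are themselves callable and should not be wrapped a second time.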
Examples
Basic peak normalization to default peak of 1.0:
>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> normalize = fx.Normalize()
>>> normalized = wave | normalize
Normalize to a specific peak value:
>>> normalize = fx.Normalize(peak=0.8)
>>> normalized = wave | normalize
Use RMS normalization strategy:
>>> from torchfx.effect import RMSNormalizationStrategy
>>> normalize = fx.Normalize(peak=0.7, strategy=RMSNormalizationStrategy())
>>> normalized = wave | normalize
Use percentile normalization (99th percentile):
>>> from torchfx.effect import PercentileNormalizationStrategy
>>> normalize = fx.Normalize(peak=1.0, strategy=PercentileNormalizationStrategy(percentile=99.0))
>>> normalized = wave | normalize
Per-channel normalization for stereo audio:
>>> from torchfx.effect import PerChannelNormalizationStrategy
>>> normalize = fx.Normalize(peak=0.9, strategy=PerChannelNormalizationStrategy())
>>> normalized = wave | normalize
Custom normalization with a callable function:
>>> def custom_normalize(waveform, peak):
...     # Normalize based on standard deviation
...     std = waveform.std()
...     return (waveform / std * peak) if std > 0 else waveform
>>> normalize = fx.Normalize(peak=0.8, strategy=custom_normalize)
>>> normalized = wave | normalize
Direct tensor processing:
>>> import torch
>>> waveform = torch.randn(2, 44100)  # (channels, samples)
>>> normalize = fx.Normalize(peak=0.5)
>>> normalized = normalize(waveform)
Chain with other effects:
>>> result = wave | fx.Gain(2.0) | fx.Normalize(peak=0.8)
References
For detailed information about creating custom normalization strategies and the strategy pattern, see wiki page “3.5 Creating Custom Effects”.
- forward(waveform)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
Normalization Strategies
- class torchfx.effect.NormalizationStrategy
Bases: ABC
Abstract base class for normalization strategies.
NormalizationStrategy defines the interface for all normalization algorithms used by the Normalize effect. Concrete implementations must implement the __call__ method to provide specific normalization logic.
This class is part of the strategy pattern implementation, allowing the Normalize effect to support multiple normalization algorithms without modifying its core implementation.
- __call__(waveform: Tensor, peak: float) -> Tensor
Normalize the waveform to the given peak value using the strategy’s specific algorithm.
See also
Normalize: The effect that uses normalization strategies
PeakNormalizationStrategy: Normalize to absolute maximum value
RMSNormalizationStrategy: Normalize to RMS energy
PercentileNormalizationStrategy: Normalize to percentile threshold
PerChannelNormalizationStrategy: Independent per-channel normalization
CustomNormalizationStrategy: Wrapper for custom functions
Notes
When implementing a custom normalization strategy, ensure that:
The __call__ method handles edge cases (e.g., silent audio)
The returned tensor has the same shape and dtype as the input
The strategy preserves the device of the input tensor
Examples
Implement a custom normalization strategy:
>>> from torchfx.effect import NormalizationStrategy
>>> import torch
>>>
>>> class MedianNormalizationStrategy(NormalizationStrategy):
...     def __call__(self, waveform: torch.Tensor, peak: float) -> torch.Tensor:
...         median = torch.median(torch.abs(waveform))
...         return waveform / median * peak if median > 0 else waveform
Use the custom strategy:
>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> normalize = fx.Normalize(peak=0.8, strategy=MedianNormalizationStrategy())
>>> normalized = wave | normalize
References
For more information about the strategy pattern and creating custom strategies, see wiki page “3.5 Creating Custom Effects”.
- class torchfx.effect.PeakNormalizationStrategy
Bases: NormalizationStrategy
Normalization to the absolute peak value.
\[\begin{split}y[n] = \begin{cases} \frac{x[n]}{\max(|x[n]|)} \cdot peak, & \text{if } \max(|x[n]|) > 0 \\ x[n], & \text{otherwise} \end{cases}\end{split}\]
where:
\(x[n]\) is the input signal,
\(y[n]\) is the output signal,
\(peak\) is the target peak value.
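As a minimal sketch of the piecewise formula above, the following plain-Python function applies peak normalization to a list of floats. The function name `peak_normalize` is hypothetical and the code is an illustration of the formula, not the library's tensor implementation.

```python
def peak_normalize(samples, peak=1.0):
    """Scale samples so the largest absolute value equals `peak`."""
    top = max((abs(s) for s in samples), default=0.0)
    if top == 0:
        # Silent (or empty) input: return it unchanged, as in the formula.
        return list(samples)
    return [s / top * peak for s in samples]


normalized = peak_normalize([0.1, -0.4, 0.2], peak=0.8)
print(max(abs(s) for s in normalized))
```

The silent-input branch avoids a division by zero.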
- class torchfx.effect.RMSNormalizationStrategy
Bases: NormalizationStrategy
Normalization to Root Mean Square (RMS) energy.
\[\begin{split}y[n] = \begin{cases} \frac{x[n]}{RMS(x[n])} \cdot peak, & \text{if } RMS(x[n]) > 0 \\ x[n], & \text{otherwise} \end{cases}\end{split}\]
where:
\(x[n]\) is the input signal,
\(y[n]\) is the output signal,
\(RMS(x[n])\) is the root mean square of the signal,
\(peak\) is the target peak value.
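The RMS formula above can be sketched in plain Python as shown here. The function name `rms_normalize` is hypothetical, and the RMS is taken over all samples of a flat list; how the library reduces across channels is not stated in this page, so that part is an assumption.

```python
import math


def rms_normalize(samples, peak=1.0):
    """Scale samples so that their RMS level equals `peak`."""
    if not samples:
        return []
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return list(samples)  # silent input is returned unchanged
    return [s / rms * peak for s in samples]


print(rms_normalize([0.5, -0.5, 0.5, -0.5], peak=1.0))  # [1.0, -1.0, 1.0, -1.0]
```

Because the formula divides by the RMS rather than the maximum, individual samples of the result can exceed `peak` in magnitude whenever the signal's peak is larger than its RMS.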
- class torchfx.effect.PercentileNormalizationStrategy(percentile=99.0)
Bases: NormalizationStrategy
Normalization using a percentile of absolute values.
\[\begin{split}y[n] = \begin{cases} \frac{x[n]}{P_p(|x[n]|)} \cdot peak, & \text{if } P_p(|x[n]|) > 0 \\ x[n], & \text{otherwise} \end{cases}\end{split}\]
where:
\(x[n]\) is the input signal,
\(y[n]\) is the output signal,
\(P_p(|x[n]|)\) is the p-th percentile of the absolute values of the signal,
\(peak\) is the target peak value,
\(p\) is the specified percentile (\(0 < p \leqslant 100\)).
- Parameters:
percentile (float)
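A plain-Python sketch of percentile normalization is given below. The percentile is computed with linear interpolation between sorted values; that interpolation rule, and the helper names `percentile_of` and `percentile_normalize`, are assumptions made for this illustration and may differ from what the library does internally.

```python
def percentile_of(values, p):
    """p-th percentile (0 < p <= 100) using linear interpolation (assumed rule)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100.0
    lo = int(pos)                       # pos >= 0, so int() floors
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_normalize(samples, peak=1.0, percentile=99.0):
    """Scale samples so the chosen percentile of |x| maps to `peak`."""
    if not samples:
        return []
    ref = percentile_of([abs(s) for s in samples], percentile)
    if ref == 0:
        return list(samples)  # threshold is zero: return input unchanged
    return [s / ref * peak for s in samples]


print(percentile_normalize([0.0, 0.5, -1.0], peak=1.0, percentile=50.0))  # [0.0, 1.0, -2.0]
```

Values above the chosen percentile end up larger than `peak`, which is the intended trade-off: isolated spikes no longer dictate the overall level.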
- class torchfx.effect.PerChannelNormalizationStrategy
Bases: NormalizationStrategy
Normalize each channel independently to its own peak.
\[\begin{split}y_c[n] = \begin{cases} \frac{x_c[n]}{\max(|x_c[n]|)} \cdot peak, & \text{if } \max(|x_c[n]|) > 0 \\ x_c[n], & \text{otherwise} \end{cases}\end{split}\]
where:
\(x_c[n]\) is the input signal for channel c,
\(y_c[n]\) is the output signal for channel c,
\(peak\) is the target peak value.
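The per-channel formula differs from plain peak normalization only in that the maximum is taken separately for each channel. A minimal sketch, assuming the audio is a list of channels (each a list of floats) and using the hypothetical name `per_channel_normalize`:

```python
def per_channel_normalize(channels, peak=1.0):
    """Normalize every channel to `peak` using that channel's own maximum."""
    out = []
    for channel in channels:
        top = max((abs(s) for s in channel), default=0.0)
        if top > 0:
            out.append([s / top * peak for s in channel])
        else:
            out.append(list(channel))  # silent channel stays unchanged
    return out


stereo = [[0.5, -0.25], [0.0, 0.0]]
print(per_channel_normalize(stereo))  # [[1.0, -0.5], [0.0, 0.0]]
```

Since each channel is scaled by its own factor, the level balance between channels is not preserved; global peak normalization keeps that balance.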
- class torchfx.effect.CustomNormalizationStrategy(func)
Bases: NormalizationStrategy
Normalization using a custom user-provided function.
This strategy wraps a user-provided callable function to make it compatible with the NormalizationStrategy interface. It is automatically used when a callable is passed to the Normalize effect’s strategy parameter.
- Parameters:
func (Callable[[Tensor, float], Tensor]) – Custom normalization function with signature: func(waveform: Tensor, peak: float) -> Tensor
- Raises:
AssertionError – If func is not callable.
See also
Normalize: Effect that uses this strategy wrapper
NormalizationStrategy: Abstract base class for strategies
Notes
The custom function must:
Accept two parameters: waveform (Tensor) and peak (float)
Return a normalized Tensor with the same shape and dtype as input
Preserve the device of the input tensor
Handle edge cases (e.g., silent audio with all zeros)
Examples
Define a custom normalization function:
>>> import torch
>>> def std_normalize(waveform, peak):
...     std = waveform.std()
...     return (waveform / std * peak) if std > 0 else waveform
Use directly with Normalize (automatically wrapped):
>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> normalize = fx.Normalize(peak=0.8, strategy=std_normalize)
>>> normalized = wave | normalize
Or explicitly instantiate the strategy:
>>> from torchfx.effect import CustomNormalizationStrategy
>>> strategy = CustomNormalizationStrategy(std_normalize)
>>> normalize = fx.Normalize(peak=0.8, strategy=strategy)
Time-based Effects
Reverb
- class torchfx.effect.Reverb(delay=4410, decay=0.5, mix=0.5)
Bases: FX
Apply reverb effect using a feedback delay network for spatial ambiance.
The Reverb effect creates spatial ambiance by simulating sound reflections in an acoustic space. It uses a simple feedback comb filter (feedback delay network) to produce reverb-like effects with controllable decay time and wet/dry mix.
This is a basic reverb implementation suitable for adding spatial depth to audio signals. For more complex reverb algorithms, consider using convolution reverbs with impulse responses.
- Parameters:
delay (int, optional) – Delay time in samples for the feedback comb filter. Determines the apparent size of the simulated space. Default is 4410 samples, which corresponds to 100 ms at a 44.1 kHz sample rate.
decay (float, optional) – Feedback decay factor controlling how quickly the reverb tail fades. Must be in the range (0, 1). Higher values create longer reverb tails. Default is 0.5.
mix (float, optional) –
Wet/dry mix controlling the balance between processed (wet) and original (dry) signals. Range is [0, 1] where:
0.0 = fully dry (no reverb)
1.0 = fully wet (only reverb)
0.5 = equal mix
Default is 0.5.
- Raises:
AssertionError – If delay is not positive, decay is not in (0, 1), or mix is not in [0, 1].
Notes
Algorithm:
The reverb is computed using a feedback comb filter:
\[y[n] = (1 - mix) \cdot x[n] + mix \cdot (x[n] + decay \cdot x[n - delay])\]
where:
\(x[n]\) is the input signal
\(y[n]\) is the output signal
\(delay\) is the delay time in samples
\(decay\) is the feedback decay factor
\(mix\) is the wet/dry mix parameter
Processing Details:
If the input waveform is shorter than the delay time, the input is returned unchanged.
The effect processes tensors of arbitrary shape (…, time).
Uses @torch.no_grad() decorator for efficient inference-only operation.
Padding is applied using torch.nn.functional.pad for the delay buffer.
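To make the formula and the processing details concrete, here is a plain-Python sketch that evaluates the equation sample by sample on a list of floats. The function name `reverb_sketch` is hypothetical; samples before the start of the signal are treated as zero, which mirrors the zero padding mentioned above. It is a reading of the documented formula, not the library's code.

```python
def reverb_sketch(samples, delay=4410, decay=0.5, mix=0.5):
    """Evaluate y[n] = (1 - mix) * x[n] + mix * (x[n] + decay * x[n - delay])."""
    if len(samples) < delay:
        return list(samples)  # input shorter than the delay: unchanged
    out = []
    for n, x in enumerate(samples):
        delayed = samples[n - delay] if n >= delay else 0.0  # zero before start
        wet = x + decay * delayed
        out.append((1.0 - mix) * x + mix * wet)
    return out


# A unit impulse shows the echo directly.
print(reverb_sketch([1.0, 0.0, 0.0, 0.0], delay=2, decay=0.5, mix=0.5))
# [1.0, 0.0, 0.25, 0.0]
```

As written, the formula reads the delayed input x[n - delay] rather than the delayed output, so an impulse produces a single echo of height mix * decay.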
Delay Time Calculation:
To convert time in milliseconds to samples:
\[delay_{samples} = \frac{time_{ms}}{1000} \cdot sample\_rate\]
For example, at 44.1kHz:
50ms = 2205 samples
100ms = 4410 samples (default)
200ms = 8820 samples
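The conversion can be wrapped in a one-line helper. The name `ms_to_samples` is hypothetical, and rounding to the nearest integer is an assumption made here because the delay parameter is an int.

```python
def ms_to_samples(time_ms: float, sample_rate: int) -> int:
    """Convert a delay time in milliseconds to a whole number of samples."""
    return round(time_ms / 1000.0 * sample_rate)


for ms in (50, 100, 200):
    print(ms, "ms ->", ms_to_samples(ms, 44100), "samples")
```

At 48 kHz the same 100 ms delay is 4800 samples, so a delay given in samples should be recomputed when the sample rate changes.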
Examples
Basic reverb with default parameters:
>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> reverb = fx.Reverb()
>>> processed = wave | reverb
Short room reverb (50ms delay):
>>> reverb = fx.Reverb(delay=2205, decay=0.4, mix=0.3)
>>> processed = wave | reverb
Long hall reverb (200ms delay):
>>> reverb = fx.Reverb(delay=8820, decay=0.7, mix=0.4)
>>> processed = wave | reverb
Subtle reverb with low mix:
>>> reverb = fx.Reverb(delay=4410, decay=0.5, mix=0.2)
>>> processed = wave | reverb
Direct tensor processing:
>>> import torch
>>> waveform = torch.randn(2, 44100)  # (channels, samples)
>>> reverb = fx.Reverb(delay=4410, decay=0.6, mix=0.3)
>>> reverberated = reverb(waveform)
Chain with other effects:
>>> processed = wave | fx.Gain(0.8) | fx.Reverb(delay=4410, decay=0.5, mix=0.3)
GPU processing:
>>> wave = wave.to("cuda")
>>> reverb = fx.Reverb(delay=4410, decay=0.6, mix=0.3).to("cuda")
>>> processed = wave | reverb
- forward(waveform)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
Delay
- class torchfx.effect.Delay(delay_samples=None, bpm=None, delay_time='1/8', fs=None, feedback=0.3, mix=0.2, taps=3, strategy=None)
Bases: FX
Apply a delay effect with BPM-synced musical time divisions.
The delay effect creates echoes of the input signal with configurable feedback. Supports BPM-synced delay times for musical applications.
The delay effect is computed as:
\[delayed[n] = \sum_{i=1}^{taps} feedback^{i-1} \cdot x[n - i \cdot delay]\]
\[y[n] = (1 - mix) \cdot x[n] + mix \cdot delayed[n]\]
where:
x[n] is the input signal,
y[n] is the output signal,
delay is the delay time in samples,
feedback is the feedback amount (0-0.95) affecting taps 2 and beyond,
taps is the number of delay taps,
mix is the wet/dry mix parameter.
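The two equations above can be evaluated directly on a mono list of floats, as in the sketch below. The function name `multi_tap_delay` is hypothetical. Extending the output by delay * taps samples follows the strategy notes later on this page; zero-padding the dry signal to that length before mixing is an assumption of this sketch.

```python
def multi_tap_delay(samples, delay, taps=3, feedback=0.3, mix=0.2):
    """Sum `taps` echoes spaced `delay` samples apart, then mix with the dry signal."""
    out_len = len(samples) + delay * taps
    delayed = [0.0] * out_len
    for i in range(1, taps + 1):
        amp = feedback ** (i - 1)        # tap 1 has amplitude 1.0
        offset = i * delay
        for n, x in enumerate(samples):
            delayed[offset + n] += amp * x
    # Assumed: the dry signal is zero-padded to the extended length.
    dry = list(samples) + [0.0] * (out_len - len(samples))
    return [(1.0 - mix) * d + mix * w for d, w in zip(dry, delayed)]


# Unit impulse, two taps: echoes at n=2 (amp 1.0) and n=4 (amp 0.5).
print(multi_tap_delay([1.0], delay=2, taps=2, feedback=0.5, mix=0.5))
# [0.5, 0.0, 0.5, 0.0, 0.25]
```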
- Parameters:
delay_samples (int) – Delay time in samples. If provided, this is used directly. Default is None (requires bpm and delay_time).
bpm (float) – Beats per minute for BPM-synced delay. Required if delay_samples is None.
delay_time (str) – Musical time division for BPM-synced delay. Should be a string in the format n/d[modifier], where:
n/d represents the note division (e.g., 1/4 for quarter note).
modifier is optional and can be d for dotted notes or t for triplets.
Valid examples include:
1/4: Quarter note
1/8: Eighth note
1/16: Sixteenth note
1/8d: Dotted eighth note
1/4d: Dotted quarter note
1/8t: Eighth note triplet
Default is 1/8.
fs (int | None) – Sample frequency (sample rate) in Hz. Required if using BPM-synced delay without Wave pipeline. When None (default), fs will be automatically inferred from the Wave object when used with the pipeline operator (wave | delay). Must be positive if provided. Default is None.
feedback (float) – Feedback amount (0-0.95). Controls amplitude of taps 2 and beyond. First tap always has amplitude 1.0. Higher values create more prominent echoes. Default is 0.3.
mix (float) – Wet/dry mix. 0 = dry (original signal only), 1 = wet (delayed echoes only). Default is 0.2.
taps (int) – Number of delay taps (echoes). Each tap is delayed by delay_samples * tap_number. Default is 3.
strategy (DelayStrategy | None) – Delay processing strategy. If None, defaults to MonoDelayStrategy. Use PingPongDelayStrategy for stereo ping-pong effect, or provide a custom strategy extending DelayStrategy. Default is None.
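The relationship between bpm, delay_time, and fs can be illustrated with a small converter. This sketch rests on assumptions that this page does not state: a quarter note lasts one beat (60 / bpm seconds), a dotted note lasts 1.5 times its base value, a triplet lasts 2/3 of it, and the result is rounded to the nearest sample. The function name `delay_time_to_samples` is hypothetical and the library's own conversion may differ.

```python
import re


def delay_time_to_samples(bpm: float, delay_time: str, fs: int) -> int:
    """Convert a note division such as '1/8d' to a delay in samples (assumed rules)."""
    match = re.fullmatch(r"(\d+)/(\d+)([dt]?)", delay_time)
    if match is None:
        raise ValueError(f"invalid delay_time: {delay_time!r}")
    numerator, denominator = int(match.group(1)), int(match.group(2))
    modifier = match.group(3)

    beats = 4.0 * numerator / denominator   # assumed: 1/4 note == one beat
    if modifier == "d":
        beats *= 1.5                        # dotted: one and a half times as long
    elif modifier == "t":
        beats *= 2.0 / 3.0                  # triplet: two thirds as long

    seconds = beats * 60.0 / bpm
    return round(seconds * fs)


print(delay_time_to_samples(120, "1/4", 44100))  # 22050
print(delay_time_to_samples(120, "1/8", 44100))  # 11025
```

Under these assumptions a quarter note at 120 BPM lasts half a second, which is 22050 samples at 44.1 kHz.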
Examples
>>> import torchfx as fx
>>> import torch
>>>
>>> # BPM-synced delay with auto fs inference from Wave
>>> wave = fx.Wave.from_file("audio.wav")
>>> delay = fx.effect.Delay(bpm=128, delay_time='1/8', feedback=0.3, mix=0.2)
>>> delayed = wave | delay  # fs automatically inferred from wave
>>>
>>> # BPM-synced delay with explicit fs
>>> waveform = torch.randn(2, 44100)  # (channels, samples)
>>> delay = fx.effect.Delay(bpm=128, delay_time='1/8', fs=44100, feedback=0.3, mix=0.2)
>>> delayed = delay(waveform)
>>>
>>> # Direct delay in samples (no fs needed)
>>> delay = fx.effect.Delay(delay_samples=2205, feedback=0.4, mix=0.3)
>>> delayed = delay(waveform)
>>>
>>> # Ping-pong delay with strategy
>>> delay = fx.effect.Delay(
...     bpm=128, delay_time='1/4', fs=44100,
...     feedback=0.5, mix=0.4, strategy=fx.effect.PingPongDelayStrategy()
... )
>>> delayed = delay(waveform)
Delay Strategies
- class torchfx.effect.DelayStrategy
Bases: ABC
Abstract base class for delay processing strategies.
DelayStrategy defines the interface for different delay processing behaviors used by the Delay effect. Concrete implementations provide specific delay algorithms such as mono delay (uniform across all channels) or ping-pong delay (alternating between stereo channels).
This class is part of the strategy pattern implementation, allowing the Delay effect to support multiple processing behaviors without modifying its core implementation.
- apply_delay(waveform, delay_samples, taps, feedback) -> Tensor
Apply the delay effect to the waveform using the strategy’s specific algorithm.
See also
Delay: The effect that uses delay strategies
MonoDelayStrategy: Uniform delay for all channels
PingPongDelayStrategy: Alternating stereo delay
Notes
When implementing a custom delay strategy:
The output length should be extended to accommodate all delayed taps: output_length = input_length + (delay_samples * taps)
The first tap always has amplitude 1.0, subsequent taps use feedback scaling: feedback^(tap-1)
The returned tensor should preserve the device and dtype of the input
Handle different tensor dimensions: 1D (mono), 2D (multi-channel), and higher dimensions
Examples
Implement a custom delay strategy:
>>> from torchfx.effect import DelayStrategy
>>> import torch
>>>
>>> class CrossChannelDelayStrategy(DelayStrategy):
...     '''Apply delay from each channel to all other channels.'''
...     def apply_delay(self, waveform, delay_samples, taps, feedback):
...         # Custom cross-channel delay logic
...         original_length = waveform.size(-1)
...         output_length = original_length + delay_samples * taps
...         # ... implementation ...
...         return delayed_waveform
Use with Delay effect:
>>> import torchfx as fx
>>> wave = fx.Wave.from_file("audio.wav")
>>> delay = fx.Delay(delay_samples=2205, taps=3, feedback=0.4, mix=0.3,
...                  strategy=CrossChannelDelayStrategy())
>>> processed = wave | delay
References
For more information about the strategy pattern and creating custom strategies, see wiki page “3.5 Creating Custom Effects”.
- abstract apply_delay(waveform, delay_samples, taps, feedback)
Apply delay processing to the waveform.
- Parameters:
waveform (Tensor) – Input audio tensor of shape (…, time) or (channels, time).
delay_samples (int) – Delay time in samples for each tap.
taps (int) – Number of delay taps (echoes). Each tap is delayed by delay_samples * tap_number.
feedback (float) – Feedback amount in range [0, 0.95]. Controls the amplitude of taps 2 and beyond. First tap always has amplitude 1.0, subsequent taps use feedback^(tap-1).
- Returns:
Delayed audio with extended length to accommodate all taps. Output length is: input_length + (delay_samples * taps).
- Return type:
Tensor
- class torchfx.effect.MonoDelayStrategy
Bases: DelayStrategy
Apply uniform delay to all channels with multiple taps and feedback.
MonoDelayStrategy applies the same delay pattern to all audio channels, creating identical echoes across the stereo field. This is the default delay strategy used by the Delay effect.
The strategy creates multiple delay taps (echoes), each delayed by an integer multiple of the base delay time. The first tap has full amplitude, and subsequent taps decay exponentially based on the feedback parameter.
See also
DelayStrategy: Abstract base class for delay strategies
PingPongDelayStrategy: Alternating stereo delay
Delay: The effect that uses this strategy
Notes
Output Length:
The output is extended to accommodate all delayed taps:
output_length = input_length + (delay_samples * taps)
Tap Amplitude:
Tap 1: amplitude = 1.0
Tap n (n > 1): amplitude = feedback^(n-1)
Multi-dimensional Support:
The strategy handles tensors of various shapes:
1D: (time,) - Mono audio
2D: (channels, time) - Multi-channel audio
Higher dimensions: (…, time) - Batched or complex audio
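The output length and tap amplitude rules in these notes reduce to simple arithmetic, shown here as a small helper. The name `tap_schedule` is hypothetical; it only tabulates the numbers and performs no audio processing.

```python
def tap_schedule(input_length, delay_samples, taps, feedback):
    """Return the extended output length and a (tap, offset, amplitude) list."""
    output_length = input_length + delay_samples * taps
    schedule = [
        (tap, tap * delay_samples, feedback ** (tap - 1))
        for tap in range(1, taps + 1)
    ]
    return output_length, schedule


length, schedule = tap_schedule(44100, 2205, taps=4, feedback=0.5)
print(length)  # 52920
for tap, offset, amplitude in schedule:
    print(tap, offset, amplitude)
```

For one second of audio at 44.1 kHz with four 50 ms taps, the output grows by 8820 samples and the last tap plays at one eighth of the first tap's amplitude.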
Examples
Use mono delay strategy explicitly:
>>> import torchfx as fx
>>> from torchfx.effect import MonoDelayStrategy
>>> wave = fx.Wave.from_file("audio.wav")
>>> delay = fx.Delay(delay_samples=2205, taps=4, feedback=0.5, mix=0.3,
...                  strategy=MonoDelayStrategy())
>>> processed = wave | delay
MonoDelayStrategy is the default, so this is equivalent:
>>> delay = fx.Delay(delay_samples=2205, taps=4, feedback=0.5, mix=0.3)
>>> processed = wave | delay
- class torchfx.effect.PingPongDelayStrategy
Bases: DelayStrategy
Apply ping-pong delay alternating between left and right channels.
PingPongDelayStrategy creates a stereo delay effect where echoes alternate between the left and right channels, producing a “bouncing” or “ping-pong” spatial effect. This is commonly used in music production for creating wide, spacious delay effects.
The strategy requires stereo (2-channel) input. For non-stereo audio, it automatically falls back to MonoDelayStrategy.
See also
DelayStrategy: Abstract base class for delay strategies
MonoDelayStrategy: Uniform delay for all channels
Delay: The effect that uses this strategy
Notes
Ping-Pong Pattern:
Odd taps (1, 3, 5, …): Left channel → Right channel
Even taps (2, 4, 6, …): Right channel → Left channel
This creates the characteristic bouncing effect where the delay appears to move back and forth between the left and right speakers.
Fallback Behavior:
If the input is not stereo (2 channels), the strategy automatically falls back to MonoDelayStrategy to process the audio.
Output Length:
The output is extended to accommodate all delayed taps:
output_length = input_length + (delay_samples * taps)
Tap Amplitude:
Same as MonoDelayStrategy:
Tap 1: amplitude = 1.0
Tap n (n > 1): amplitude = feedback^(n-1)
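One possible reading of the ping-pong pattern above is sketched below for two channels given as lists of floats: odd taps copy the delayed left input into the right output, even taps copy the delayed right input into the left output. The function name `ping_pong_sketch` is hypothetical, the result contains only the delayed (wet) signal, and the actual routing inside the library may differ.

```python
def ping_pong_sketch(left, right, delay_samples, taps, feedback):
    """Route odd taps left -> right and even taps right -> left (wet signal only)."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    out_len = len(left) + delay_samples * taps
    out_left = [0.0] * out_len
    out_right = [0.0] * out_len
    for tap in range(1, taps + 1):
        amp = feedback ** (tap - 1)       # tap 1 has amplitude 1.0
        offset = tap * delay_samples
        if tap % 2 == 1:
            source, target = left, out_right   # odd tap: left -> right
        else:
            source, target = right, out_left   # even tap: right -> left
        for i, x in enumerate(source):
            target[offset + i] += amp * x
    return out_left, out_right


out_l, out_r = ping_pong_sketch([1.0], [1.0], delay_samples=1, taps=2, feedback=0.5)
print(out_l)  # [0.0, 0.0, 0.5]
print(out_r)  # [0.0, 1.0, 0.0]
```

With a unit impulse on both channels, the first echo appears only on the right and the second, quieter echo only on the left, which is the bouncing effect described above.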
Examples
Create ping-pong delay effect:
>>> import torchfx as fx
>>> from torchfx.effect import PingPongDelayStrategy
>>> wave = fx.Wave.from_file("stereo_audio.wav")  # Must be stereo
>>> delay = fx.Delay(delay_samples=2205, taps=6, feedback=0.5, mix=0.4,
...                  strategy=PingPongDelayStrategy())
>>> processed = wave | delay
BPM-synced ping-pong delay:
>>> delay = fx.Delay(bpm=120, delay_time="1/8", taps=8, feedback=0.6, mix=0.3,
...                  strategy=PingPongDelayStrategy())
>>> processed = wave | delay
Combine with reverb for spacious effect:
>>> reverb = fx.Reverb(delay=4410, decay=0.6, mix=0.2)
>>> delay = fx.Delay(bpm=128, delay_time="1/4", taps=4, feedback=0.5, mix=0.3,
...                  strategy=PingPongDelayStrategy())
>>> processed = wave | reverb | delay
- apply_delay(waveform, delay_samples, taps, feedback)
Apply ping-pong delay (alternates between channels).
Output length is extended to accommodate all delayed taps.
- Parameters:
waveform (Tensor) – Input audio tensor. Should be stereo with shape (2, time) or (…, 2, time). For non-stereo input, falls back to MonoDelayStrategy.
delay_samples (int) – Delay time in samples for each tap.
taps (int) – Number of delay taps (echoes).
feedback (float) – Feedback amount for taps 2 and beyond.
- Returns:
Delayed audio with ping-pong effect. Shape matches input except extended time dimension.
- Return type:
Tensor