Type System - Musical Time and Audio Units#

TorchFX provides a rich type system designed specifically for audio processing. This system enables you to express time in musical terms (BPM, note divisions) rather than just samples or seconds, making effects more intuitive for musical applications.

Overview#

The type system includes:

  • Musical Time: BPM-synchronized time divisions (quarter notes, eighth notes, triplets, dotted notes)

  • Time Units: Seconds, milliseconds, samples

  • Audio Units: Decibels, bit rates

  • Device Types: CPU and CUDA device specifications

  • Window Types: Window functions for spectral analysis

graph TB
    subgraph "TorchFX Type System"
        Musical["Musical Time<br/>MusicalTime class"]
        Time["Time Units<br/>Second, Millisecond"]
        Audio["Audio Units<br/>Decibel, BitRate"]
        Device["Device Types<br/>CPU, CUDA"]
        Window["Window Types<br/>Hann, Hamming, etc."]
    end

    subgraph "Applications"
        Delay["BPM-synced Delay"]
        Filter["Filters with cutoff"]
        Save["Audio file saving"]
        GPU["GPU acceleration"]
    end

    Musical --> Delay
    Time --> Filter
    Audio --> Save
    Device --> GPU
    Window --> Spectral["Spectral analysis"]

    style Musical fill:#e1f5ff
    style Time fill:#fff5e1
    style Audio fill:#ffe1e1
    style Device fill:#e8f5e1
    style Window fill:#f5e1ff
    

Musical Time#

The MusicalTime class represents musical time divisions as fractions of a bar (measure).

Basic Concept#

In music production, delays and rhythmic effects are often synchronized to the tempo (BPM). Instead of specifying delay times in milliseconds, you specify them as note divisions:

  • 1/4: Quarter note (one beat in 4/4 time)

  • 1/8: Eighth note (half a beat)

  • 1/16: Sixteenth note (quarter of a beat)

  • 1/2: Half note (two beats)

Creating Musical Times#

From Strings#

from torchfx.typing import MusicalTime

# Basic note divisions
quarter = MusicalTime.from_string("1/4")
eighth = MusicalTime.from_string("1/8")
sixteenth = MusicalTime.from_string("1/16")

# Dotted notes (1.5x duration)
dotted_quarter = MusicalTime.from_string("1/4d")
dotted_eighth = MusicalTime.from_string("1/8d")

# Triplets (2/3x duration)
eighth_triplet = MusicalTime.from_string("1/8t")
quarter_triplet = MusicalTime.from_string("1/4t")

Direct Construction#

from torchfx.typing import MusicalTime

# Create directly
quarter = MusicalTime(numerator=1, denominator=4)
dotted_eighth = MusicalTime(numerator=1, denominator=8, modifier="d")
eighth_triplet = MusicalTime(numerator=1, denominator=8, modifier="t")

Converting to Time#

Convert musical time to seconds based on BPM:

from torchfx.typing import MusicalTime

quarter = MusicalTime.from_string("1/4")

# At 120 BPM in 4/4 time
duration = quarter.duration_seconds(bpm=120, beats_per_bar=4)
print(f"Duration: {duration} seconds")  # 0.5 seconds

# At 140 BPM
duration = quarter.duration_seconds(bpm=140)
print(f"Duration: {duration} seconds")  # ~0.428 seconds

Using with Effects#

The Delay effect uses MusicalTime for BPM-synced delays:

import torchfx as fx

wave = fx.Wave.from_file("audio.wav")

# Delay synced to 128 BPM, eighth note
delay = fx.effect.Delay(bpm=128, delay_time="1/8", feedback=0.4, mix=0.3)
delayed = wave | delay

# Dotted quarter note delay (classic rock delay)
delay = fx.effect.Delay(bpm=120, delay_time="1/4d", feedback=0.5, mix=0.25)
delayed = wave | delay

# Triplet delay
delay = fx.effect.Delay(bpm=140, delay_time="1/8t", feedback=0.3, mix=0.2)
delayed = wave | delay

Mathematical Representation#

The duration of a musical time division is calculated as:

\[ t = \frac{n}{d} \times m \times \frac{60}{BPM} \times \text{beats\_per\_bar} \]

where:

  • \(n\) is the numerator (e.g., 1 in “1/4”)

  • \(d\) is the denominator (e.g., 4 in “1/4”)

  • \(m\) is the modifier coefficient:

    • \(m = 1.0\) for normal notes

    • \(m = 1.5\) for dotted notes (d)

    • \(m = \frac{2}{3}\) for triplets (t)

  • \(BPM\) is beats per minute

  • \(\text{beats\_per\_bar}\) is the number of beats in a bar (default 4 for 4/4 time)
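
As a quick sanity check, the same calculation can be reproduced in plain Python. The helper below is a minimal illustrative sketch, not part of the TorchFX API:

def note_duration(numerator, denominator, bpm, modifier=None, beats_per_bar=4):
    """Duration in seconds of a note division at a given BPM (illustrative helper)."""
    m = {"d": 1.5, "t": 2 / 3}.get(modifier, 1.0)  # dotted = 1.5x, triplet = 2/3x
    return (numerator / denominator) * m * (60 / bpm) * beats_per_bar

print(note_duration(1, 4, bpm=120))                # 0.5    (quarter note)
print(note_duration(1, 8, bpm=120, modifier="d"))  # 0.375  (dotted eighth)
print(note_duration(1, 8, bpm=120, modifier="t"))  # ~0.167 (eighth triplet)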

Note Division Reference#

Common note divisions and their durations at 120 BPM in 4/4 time:

| Division | Name             | Modifier | Duration (seconds) | Duration (ms) |
|----------|------------------|----------|--------------------|---------------|
| 1/1      | Whole note       | -        | 2.000              | 2000          |
| 1/2      | Half note        | -        | 1.000              | 1000          |
| 1/4      | Quarter note     | -        | 0.500              | 500           |
| 1/4d     | Dotted quarter   | Dotted   | 0.750              | 750           |
| 1/8      | Eighth note      | -        | 0.250              | 250           |
| 1/8d     | Dotted eighth    | Dotted   | 0.375              | 375           |
| 1/8t     | Eighth triplet   | Triplet  | 0.167              | 167           |
| 1/16     | Sixteenth note   | -        | 0.125              | 125           |
| 1/16d    | Dotted sixteenth | Dotted   | 0.188              | 188           |

See also

Note Value on Wikipedia - Understanding musical time divisions

Time Units#

Second and Millisecond#

Type-annotated aliases for time values:

from torchfx.typing import Second, Millisecond

# These are annotated types indicating non-negative values
duration_sec: Second = 1.5  # 1.5 seconds
duration_ms: Millisecond = 1500.0  # 1500 milliseconds

Used in Wave duration methods:

import torchfx as fx

wave = fx.Wave.from_file("audio.wav")

# Get duration in seconds
duration_sec = wave.duration("sec")  # Returns Second type

# Get duration in milliseconds
duration_ms = wave.duration("ms")  # Returns Millisecond type

Sample-Based Time#

Convert between samples and time units:

import torchfx as fx

wave = fx.Wave.from_file("audio.wav")
fs = wave.fs  # Sample rate

# Samples to seconds
num_samples = 44100
duration_sec = num_samples / fs  # 1.0 second at 44100 Hz

# Seconds to samples
duration_sec = 0.5
num_samples = int(duration_sec * fs)  # 22050 samples at 44100 Hz

# Milliseconds to samples
duration_ms = 100.0  # 100 ms
num_samples = int((duration_ms / 1000.0) * fs)  # 4410 samples at 44100 Hz

Audio Units#

Decibels#

Type-annotated alias for decibel values (≤ 0):

from torchfx.typing import Decibel

# Decibel values must be non-positive (≤ 0)
gain_db: Decibel = -6.0  # -6 dB attenuation
reference_db: Decibel = 0.0  # 0 dB (unity gain)

Used in effects like Gain:

import torchfx as fx

wave = fx.Wave.from_file("audio.wav")

# Gain in decibels
gained = wave | fx.effect.Gain(gain=-3.0, gain_type="db")

# Convert between amplitude and dB
import math

amplitude = 0.5
db = 20 * math.log10(amplitude)  # -6.02 dB

db_value = -6.0
amplitude = 10 ** (db_value / 20)  # 0.501

Bit Rate#

Audio bit depth specification:

from torchfx.typing import BitRate

# Valid bit rates: 8, 16, 24, 32
bit_depth: BitRate = 24

Used when saving audio files:

import torchfx as fx

wave = fx.Wave.from_file("audio.wav")

# Save as 24-bit audio
wave.save("output.wav", bits_per_sample=24, encoding="PCM_S")

# Save as 16-bit audio (CD quality)
wave.save("output.wav", bits_per_sample=16, encoding="PCM_S")

Device Types#

CPU and CUDA#

Specify where computations should run:

from torchfx.typing import Device
import torch

# Type can be string literal or torch.device
device1: Device = "cpu"
device2: Device = "cuda"
device3: Device = torch.device("cuda:0")  # Specific GPU

Used in Wave for device management:

import torchfx as fx
import torch

wave = fx.Wave.from_file("audio.wav")

# Move to GPU
wave.to("cuda")

# Check device
print(wave.device)  # "cuda"

# Create wave directly on GPU
if torch.cuda.is_available():
    tensor = torch.randn(2, 44100, device="cuda")
    wave = fx.Wave(tensor, fs=44100, device="cuda")

See also

PyTorch CUDA Semantics - Device management in PyTorch

Window Types#

Window Functions#

For spectral analysis and filter design:

from torchfx.typing import WindowType

# Available window types
valid_windows: list[WindowType] = [
    "hann",
    "hamming",
    "blackman",
    "kaiser",
    "boxcar",
    "bartlett",
    "flattop",
    "parzen",
    "bohman",
    "nuttall",
    "barthann",
]

Used in spectral analysis and filter design:

import torch
from torchfx.typing import WindowType

# Create window for FFT
window_type: WindowType = "hann"
window_length = 2048

# Using torch directly
window = torch.hann_window(window_length)

# Or using scipy.signal.get_window (if needed)
from scipy import signal
window = signal.get_window(window_type, window_length)

Window Characteristics#

Different windows have different characteristics:

| Window   | Main Lobe Width | Side Lobe Level   | Use Case                           |
|----------|-----------------|-------------------|------------------------------------|
| boxcar   | Narrow          | High (-13 dB)     | Fast transitions, high resolution  |
| hann     | Medium          | Medium (-31 dB)   | General purpose, balanced          |
| hamming  | Medium          | Low (-43 dB)      | Low side lobes, smooth             |
| blackman | Wide            | Very low (-58 dB) | Minimal leakage, wide transitions  |
| kaiser   | Adjustable      | Adjustable        | Configurable trade-offs            |
| flattop  | Very wide       | Very low          | Amplitude accuracy                 |
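
These trade-offs can be checked empirically. The sketch below is an illustrative example (not a TorchFX API) that estimates the peak side lobe level of a window from its zero-padded FFT:

import numpy as np
from scipy import signal

def peak_sidelobe_db(window_name, length=2048, nfft=65536):
    """Rough estimate of the highest side lobe level relative to the main lobe peak."""
    w = signal.get_window(window_name, length)
    spectrum = np.abs(np.fft.rfft(w, nfft))
    spectrum /= spectrum.max()                     # normalize the main lobe to 0 dB
    first_null = np.argmax(np.diff(spectrum) > 0)  # index just past the first null (magnitude starts rising)
    return 20 * np.log10(spectrum[first_null:].max())

for name in ("boxcar", "hann", "hamming", "blackman"):
    print(name, round(peak_sidelobe_db(name), 1))  # roughly -13, -31, -43, -58 dB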

See also

Window Function on Wikipedia - Detailed window function analysis

Filter-Specific Types#

Filter Type#

Specify filter mode:

from torchfx.typing import FilterType

# Valid filter types
mode1: FilterType = "low"   # Low-pass
mode2: FilterType = "high"  # High-pass

Filter Order Scale#

For displaying filter order:

from torchfx.typing import FilterOrderScale

# Display order in dB/octave or linear units
scale1: FilterOrderScale = "db"       # dB/octave (e.g., -40 dB/oct)
scale2: FilterOrderScale = "linear"   # Linear order (e.g., order=4)
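
For Butterworth-style filters, the asymptotic roll-off is roughly 6 dB per octave per order, so converting between the two representations is straightforward. The helpers below are an illustrative sketch under that assumption, not part of TorchFX:

import math

def order_to_db_per_octave(order):
    """Asymptotic roll-off of a Butterworth-style filter: about 6 dB/octave per order."""
    return -6.0 * order

def db_per_octave_to_order(slope_db):
    """Smallest integer order whose roll-off reaches the requested slope."""
    return math.ceil(abs(slope_db) / 6.0)

print(order_to_db_per_octave(4))      # -24.0 dB/oct
print(db_per_octave_to_order(-40.0))  # 7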

Spectral Types#

Spectrogram Scale#

Specify frequency scale for spectrograms:

from torchfx.typing import SpecScale

# Valid spectrogram scales
scale1: SpecScale = "mel"    # Mel scale (perceptual)
scale2: SpecScale = "lin"    # Linear scale
scale3: SpecScale = "log"    # Logarithmic scale
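
As an illustration of the mel and linear scales, spectrograms can be computed with torchaudio's transforms directly (this sketch uses torchaudio, not a TorchFX-specific API):

import torch
import torchaudio

signal = torch.randn(1, 44100)  # one second of noise, mono, 44.1 kHz

# Mel-scale spectrogram (perceptual frequency axis)
mel_spec = torchaudio.transforms.MelSpectrogram(
    sample_rate=44100, n_fft=2048, hop_length=512, n_mels=128
)(signal)  # shape: (1, 128, num_frames)

# Linear-scale spectrogram for comparison
lin_spec = torchaudio.transforms.Spectrogram(n_fft=2048, hop_length=512)(signal)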

Type Annotations in Custom Effects#

Use TorchFX types for better type safety:

from torchfx.effect import FX
from torchfx.typing import Second, Millisecond, Device, Decibel
from torch import Tensor
import torch

class CustomEffect(FX):
    """Custom effect with type-annotated parameters."""

    def __init__(
        self,
        duration_ms: Millisecond,
        gain_db: Decibel,
        fs: int | None = None,
        device: Device = "cpu",
    ) -> None:
        super().__init__()
        assert duration_ms >= 0, "Duration must be non-negative"
        assert gain_db <= 0, "Gain in dB must be non-positive"

        self.duration_ms = duration_ms
        self.gain_db = gain_db
        self.fs = fs
        self.device = device

    @torch.no_grad()
    def forward(self, x: Tensor) -> Tensor:
        # Convert dB to linear gain
        linear_gain = 10 ** (self.gain_db / 20)

        # Convert ms to samples (if fs is set)
        if self.fs is not None:
            duration_samples = int((self.duration_ms / 1000.0) * self.fs)
            # Use duration_samples for processing

        return x * linear_gain
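
A custom effect defined this way can be used like any built-in effect. The snippet below is a minimal usage sketch (it assumes an audio.wav file is available):

import torchfx as fx

wave = fx.Wave.from_file("audio.wav")

# Apply the custom effect: 100 ms duration parameter, -6 dB of gain
effect = CustomEffect(duration_ms=100.0, gain_db=-6.0, fs=wave.fs)
processed = wave | effect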

Best Practices#

Use Musical Time for Rhythmic Effects#

# ✅ GOOD: Musical time for delay
delay = fx.effect.Delay(bpm=128, delay_time="1/8", feedback=0.4)

# ❌ LESS GOOD: Hardcoded samples (not tempo-aware)
delay = fx.effect.Delay(delay_samples=5512)  # What BPM is this?

Validate Type Constraints#

from torchfx.typing import Decibel

def set_gain(gain_db: Decibel) -> None:
    # Decibel type hints that gain_db should be ≤ 0
    assert gain_db <= 0, "Gain in dB must be non-positive"
    # ... implementation

# Incorrect usage: a positive dB value fails the runtime assertion
# (and may be flagged by a type checker)
set_gain(6.0)

# Correct usage
set_gain(-6.0)  # Attenuation

Document Units in Docstrings#

class MyEffect(FX):
    """Custom effect with time-based parameter.

    Parameters
    ----------
    delay_ms : Millisecond
        Delay time in milliseconds. Must be non-negative.
    gain_db : Decibel
        Gain in decibels. Must be non-positive (≤ 0).
    fs : int, optional
        Sample rate in Hz. If None, inferred from Wave.
    """

Common Conversions#

BPM to Time#

from torchfx.typing import MusicalTime

bpm = 120
time_div = "1/4"  # Quarter note

musical_time = MusicalTime.from_string(time_div)
duration_sec = musical_time.duration_seconds(bpm=bpm)

print(f"At {bpm} BPM, a {time_div} note lasts {duration_sec} seconds")
# At 120 BPM, a 1/4 note lasts 0.5 seconds

Samples to Time#

fs = 44100  # Sample rate

# Samples to seconds
samples = 22050
seconds = samples / fs  # 0.5 seconds

# Samples to milliseconds
milliseconds = (samples / fs) * 1000  # 500 ms

Amplitude to Decibels#

import math

# Amplitude to dB
amplitude = 0.5
db = 20 * math.log10(amplitude)  # -6.02 dB

# dB to amplitude
db = -3.0
amplitude = 10 ** (db / 20)  # 0.708

External Resources#

References#