librosa

Python library for audio and music analysis, built on NumPy for feature extraction and signal processing. Core functions: librosa.load() (audio loading with resampling), librosa.feature.melspectrogram(), librosa.feature.mfcc(), librosa.feature.chroma_stft(), librosa.beat.beat_track() (tempo and beats), librosa.onset.onset_detect(), librosa.effects.pitch_shift() and librosa.effects.time_stretch(), librosa.stft() and librosa.istft(), librosa.display for visualization, harmonic/percussive source separation, and 50+ audio feature functions. Outputs are NumPy arrays, so they integrate with matplotlib for visualization and scikit-learn for ML. It is the standard audio analysis library for music information retrieval and audio ML feature extraction.

Evaluated Mar 06, 2026 (v0.10.x)
Category: AI & Machine Learning
Tags: python, librosa, audio, music, signal-processing, mel-spectrogram, beat-tracking, mfcc
⚙ Agent Friendliness: 67 / 100 (Can an agent use this?)
🔒 Security: 92 / 100 (Is it safe for agents?)
⚡ Reliability: 82 / 100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: --
Documentation: 85
Error Messages: 82
Auth Simplicity: 98
Rate Limits: 98

🔒 Security

TLS Enforcement: 95
Auth Strength: 95
Scope Granularity: 90
Dep. Hygiene: 85
Secret Handling: 95

Local audio analysis: no network access and no data exfiltration path. Audio files are decoded through soundfile or audioread, so validate file sources in agent pipelines that handle user-uploaded content. No known security concerns beyond standard Python dependency hygiene.

⚡ Reliability

Uptime/SLA: 85
Version Stability: 82
Breaking Changes: 80
Error Recovery: 82

Best When

Extracting audio features for ML model training (MFCC, mel spectrograms, chroma, beat features) or analyzing music/audio with NumPy-based pipelines — librosa is the standard Python audio analysis library with the richest feature extraction API.

Avoid When

You need GPU acceleration (use torchaudio), real-time processing, or are training PyTorch models (use torchaudio for better DataLoader integration).

Use Cases

  • Agent audio feature extraction — y, sr = librosa.load('audio.wav', sr=22050); mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, fmax=8000); mel_db = librosa.power_to_db(mel, ref=np.max) — mel spectrogram in dB for agent audio classifier; standard preprocessing for music genre classification
  • Agent MFCC features — mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13); mfcc_delta = librosa.feature.delta(mfcc) — 13 MFCC coefficients + deltas for agent speech/audio classification; standard features for keyword spotting and audio event detection
  • Agent beat analysis — tempo, beats = librosa.beat.beat_track(y=y, sr=sr); beat_times = librosa.frames_to_time(beats, sr=sr) — detect tempo (BPM) and beat positions; agent music analysis pipeline extracts rhythmic structure; music-synchronized agent actions
  • Agent pitch shifting — shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=2) — shift audio pitch by 2 semitones; agent audio augmentation for voice/instrument training data; librosa.effects.time_stretch(y, rate=1.2) changes speed without pitch change
  • Agent harmonic separation — y_harmonic, y_percussive = librosa.effects.hpss(y) — separate harmonic (melodic) and percussive (rhythm) components; agent music analysis isolates melody from rhythm; harmonic component used for pitch/chord analysis, percussive for beat tracking

Not For

  • Real-time audio processing — librosa is offline analysis; for real-time use sounddevice or torchaudio.io.StreamReader
  • GPU-accelerated processing — librosa is CPU/NumPy only; for GPU audio transforms use torchaudio
  • Professional audio production — librosa is analysis-focused; for production audio editing use soundfile, pydub, or DAW software

Interface

REST API: No
GraphQL: No
gRPC: No
MCP Server: No
SDK: Yes
Webhooks: No

Authentication

Methods: none
OAuth: No Scopes: No

No auth — local audio processing library.

Pricing

Model: open_source
Free tier: Yes
Requires CC: No

librosa is ISC licensed. Free for all use.

Agent Metadata

Pagination: none
Idempotent: Full
Retry Guidance: Not documented

Known Gotchas

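  • librosa.load returns float32 samples normalized to [-1, 1], not integer PCM. Readers that keep the file's native type, such as scipy.io.wavfile.read or soundfile.read(..., dtype='int16'), return int16 values in [-32768, 32767]; soundfile.read itself defaults to float64. Agent code written for the int16 range will be off by a factor of 32768 on librosa output, so don't threshold or rescale as if the samples were integers, and convert explicitly if a downstream consumer needs int16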
  • Default sr=22050 resamples audio — librosa.load('audio.wav') resamples to 22050 Hz regardless of source; pass sr=None to preserve original sample rate: y, sr = librosa.load('audio.wav', sr=None); agent code comparing features from different-rate audio must use consistent sr
  • Mono conversion is default — librosa.load('stereo.wav') returns mono by default (averaged channels); librosa.load('audio.wav', mono=False) returns (2, samples) for stereo; agent code expecting stereo gets mono; be explicit about mono=True/False for agent audio pipelines
  • STFT frame/sample unit confusion — librosa.feature.melspectrogram returns (n_mels, n_frames); n_frames depends on hop_length; librosa.frames_to_time(frames, sr=sr, hop_length=512) converts frames to seconds; agent code mixing frame and sample indices in time calculations gets wrong timestamps
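  • Loading MP3 depends on the installed backends: librosa.load tries soundfile first, and soundfile can decode MP3 only when its underlying libsndfile is version 1.1.0 or newer. Otherwise librosa falls back to audioread, which needs a system decoder such as ffmpeg, and that fallback is deprecated as of librosa 0.10. Agent environments with an older libsndfile and no ffmpeg cannot load MP3; upgrade soundfile, install ffmpeg, or convert to WAV before agent processing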
  • For visualization, power_to_db should use ref=np.max, not ref=1.0: librosa.power_to_db(mel, ref=np.max) scales relative to the peak value, giving a 0 to -80 dB range (the floor comes from the default top_db=80.0); ref=1.0 gives absolute dB values, often large negative numbers; agent mel spectrogram visualizations should use ref=np.max for an interpretable display

Scores are editorial opinions as of 2026-03-06.