MIR: Music Information Retrieval

What is MIR and how is it used to categorize, manipulate and even create music?

MIR is the interdisciplinary science of retrieving information from music.  MIR is a small but growing field of research with many real-world applications.  Those involved in MIR may have a background in musicology, psychoacoustics, psychology, academic music study, signal processing, informatics, machine learning, optical music recognition, computational intelligence or some combination of these.

Applications include recommender systems, track separation and instrument recognition, automatic music transcription, automatic categorization, and music generation.

Methods Used:

Data Sources: Scores, MIDI files, and digital audio formats such as WAV, MP3 and Ogg.  Increasingly, metadata mined from the web is incorporated into MIR for a more rounded understanding of the music within its cultural context; recently this has included analysis of social tags (folksonomy) for music.

Folksonomy is a classification system in which end users apply public tags to online items, typically to make those items easier for themselves or others to find later.
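As a rough illustration of how social tags might be turned into something an MIR system can use, the sketch below counts how often each tag was applied to each track. The user names, track IDs, tags and the build_tag_profiles helper are all made up for this example; a real system would pull tag data from a web service.

```python
from collections import Counter, defaultdict

# Hypothetical (user, track, tag) events; real data would come from a tagging site.
tag_events = [
    ("alice", "track_1", "Jazz"),
    ("bob", "track_1", "jazz"),
    ("carol", "track_1", "late night"),
    ("alice", "track_2", "metal"),
    ("bob", "track_2", "Metal "),
    ("carol", "track_2", "jazz"),
]

def build_tag_profiles(events):
    """Count, per track, how many times each (normalized) tag was applied."""
    profiles = defaultdict(Counter)
    for _user, track, tag in events:
        # Users type tags freely, so normalize case and surrounding whitespace.
        profiles[track][tag.strip().lower()] += 1
    return profiles

profiles = build_tag_profiles(tag_events)
for track, counts in sorted(profiles.items()):
    print(track, counts.most_common(2))
```

The most frequent tags per track can then serve as a crude, crowd-sourced description of that track.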

Feature Extraction involves reducing the amount of data required to describe a large data set, such as raw audio, to a smaller set of informative values.  Many machine learning practitioners believe that properly optimized feature extraction is the key to effective model construction.  Extracted features may be used to represent the key, chords, harmonies, melody, main pitch, beats per minute or rhythm of a piece.
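To make the idea concrete, here is a minimal sketch that reduces thousands of audio samples to two numbers: RMS energy (a rough loudness measure) and zero-crossing rate (a rough brightness or pitch cue). It uses only the Python standard library and synthetic sine waves rather than real recordings; the function names and the 8000 Hz sample rate are choices made for this example, not part of any toolbox listed here.

```python
import math

def sine_wave(freq_hz, sample_rate=8000, duration_s=1.0):
    """Synthesize a pure tone as a list of float samples in [-1, 1]."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def rms_energy(samples):
    """Root-mean-square amplitude of the signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs where the signal changes sign."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)

def extract_features(samples):
    """Reduce a whole signal to a small dictionary of summary features."""
    return {"rms": rms_energy(samples), "zcr": zero_crossing_rate(samples)}

low = extract_features(sine_wave(110.0))   # low-pitched tone
high = extract_features(sine_wave(880.0))  # high-pitched tone
print(low)
print(high)
```

Both tones have the same loudness, so their RMS values match, while the higher tone crosses zero more often. Real toolboxes compute many more features, but the principle of summarizing a signal with a few descriptive numbers is the same.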

Existing Feature Extraction Toolboxes: There are a number of audio feature extraction toolboxes available, delivered to the community in differing formats, but usually in at least one of the following: stand-alone applications, plug-ins for a host application, or software function libraries.  To support delivery of these tools, some APIs have been constructed that allow feature extraction plug-ins to be developed.

Some extraction tools:

Essentia: Full function workflow environment for high and low level features, facilitating audio input, preprocessing and statistical analysis of output.

Librosa: API for feature extraction, for processing data in Python.

LibXtract: Low level feature extraction tool

Marsyas: Full real time audio processing standalone framework for data flow audio processing with GUI and CLI.

Meyda: Web Audio API based low level feature extraction tool, written in JavaScript. Designed for efficient real time processing in the web browser.

MIR Toolbox: Audio processing API for offline extraction of high and low level audio features in MATLAB.  Includes preprocessing, classification and clustering functionality, along with audio similarity and distance metrics.

Timbre Toolbox: A Matlab toolbox for offline high and low level feature extraction specifically made efficient for identifying timbre and to fulfill the Cuidado standards.

YAAFE: Low level feature extraction library designed for computational efficiency and batch processing by utilizing data flow graphs.


Here are some examples.

From Adrian Holovaty: Extracting musical information from sound

[audio missing from first 3 min] Music Information Retrieval technology has gotten good enough that you can extract musical metadata from your sound files with some degree of accuracy. Find out how to use Python (along with third-party AP)


From Valerio Velardo – The Sound of AI

In this video, you can learn about sound power, intensity, and loudness. It also delves into timbre, introducing key concepts like amplitude envelope, harmonic content, and amplitude/frequency modulation.
