$Id: musicmatch.txt,v 1.4 2000/09/09 23:02:37 eldamitri Exp $
MusicMatch (TM) tag format description
Status of this document
This document is a description of a deprecated tagging format. The
information contained herein is not a specification; its intent is to
interpret the format based on hundreds of examples. It also relies heavily
on information obtained by others who have done similar investigations. It
is not based on any official documentation of the format, as such
documentation is not publicly available. Therefore the contents of this
document may change to adjust for newly-discovered information, but the
format itself is unlikely to change due to its deprecation.
Distribution of this document is unlimited.
Abstract
This document describes the MusicMatch tagging format present in some
digital audio files. This format, like other tagging specifications,
provides a method for storing information about an audio file within itself
to document its contents. This format was developed by MusicMatch and
used exclusively by older versions of Jukebox, their popular, "all-in-one"
MP3 application.
1. Table of contents
Status of this document
Abstract
1. Table of contents
2. Introduction
3. Conventions in this document
4. Tagging format
4.1. Header
4.2. Image extension
4.3. Image binary
4.4. Unused
4.5. Version information
4.6. Audio meta-data
4.6.1. Single-line text fields
4.6.2. Non-text fields
4.6.3. Multi-line text fields
4.6.4. Internet addresses
4.6.5. Padding
4.7. Data offsets
4.8. Footer
5. Identifying and parsing a MusicMatch tag
6. Converting to ID3v2
7. Copyright
8. References
9. Author's Address
2. Introduction
The following document describes the structure of the tagging format used
by MusicMatch (TM) Jukebox, prior to version 4.0 of that application. This
program is a so-called "All-In-One" MP3 program and provides a CD-Ripper,
WAV-to-MP3 converter, database, and MP3 player.
The MusicMatch tagging format has gone through several incremental
revisions, although its basic structure has remained fairly constant
throughout its history. Each revision of the MusicMatch tag has been
tightly coupled with the version of Jukebox that created it. As such, this
document will refer to the Jukebox version and tagging format version
interchangeably.
As of version 4.0, MusicMatch has deprecated the use of this format in
their own Jukebox application, transitioning instead to ID3v2, an open
standard for tagging digital audio. Unfortunately, despite repeated
requests, MusicMatch has not provided to the public any documents
describing this format, and the MusicMatch Jukebox is the only
widely-distributed software application that can read and write these tags.
As such, this text may not be completely accurate and is surely incomplete,
but it covers enough of the format to enable one to write robust software
to find and parse tags in this format. For example, the id3lib tagging
library's MusicMatch parsing routines were written solely based on the
information found in this document. However, the authors cannot be held
responsible for any inaccuracies or any harm caused by using this
information. One can assume that the specification is unlikely to change,
given MusicMatch's own abandonment of the format. It should also be noted
that incorporating functionality into applications to write tags in this
format is discouraged, as the format has been officially deprecated by
MusicMatch themselves.
3. Conventions in this document
This document borrows heavily from specifications written by Martin
Nilsson, author of the ID3v2 tagging standard. Much of the structure,
formatting, and other such conventions used in the ID3v2 specifications are
carried over into this document.
Text within "" is a text string exactly as it appears in a tag. Numbers
preceded with $ are hexadecimal and numbers preceded with % are binary. $xx
is used to indicate a byte with unknown content.
4. Tag overview
The MusicMatch Tagging Format was designed to store specific types of audio
meta-data inside the audio file itself. As the format was used exclusively
by the MusicMatch Jukebox application, it is used only with MPEG-1/2 layer
III files encoded with that program. However, its tagging format is not
inherently exclusive of other audio formats, and could conceivably be
used with other types of encodings.
MusicMatch tags were originally designed to come at the very end of MP3
files, after all of the MP3 audio frames. Starting with Jukebox version
3.1, the application became more ID3-friendly and started placing ID3v1
tags after the MusicMatch tag as well. In practice, since very few
applications outside of the MusicMatch Jukebox are capable of reading and
understanding this format, it is not unusual to find MusicMatch tags
"buried" within mp3 files, coming before other types of tagging formats in
a file, such as Lyrics3 or ID3v2.4.0. Such "relocations" are not uncommon,
and therefore any software application that intends to find, read, and
parse MusicMatch tags should be flexible in this endeavor, despite the
apparent intentions of the original specification.
Although various sections of a MusicMatch tag are fixed in length, other
sections are not, and so tag lengths can vary from one file to another. A
valid MusicMatch tag will be at least 8 kilobytes (8192 bytes) in length.
Those tags with image data will often be much larger.
The byte-order in 4-byte pointers and multibyte numbers for MusicMatch tags
is least-significant byte (LSB) first, also known as "little endian". For
example, $12345678 is encoded as $78 56 34 12.
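As a small illustration of this byte order, the following Python sketch
decodes and re-encodes the example value with the standard struct module.
The variable names are illustrative only and are not part of the format.

```python
import struct

# The four bytes as they would appear in a tag: $78 56 34 12
raw = bytes([0x78, 0x56, 0x34, 0x12])

# "<I" selects a little-endian, unsigned 32-bit integer
(value,) = struct.unpack("<I", raw)

# Packing the number again yields the original byte sequence
encoded = struct.pack("<I", 0x12345678)
```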
Overall tag structure:
+-----------------------------+
| Header |
| (256 bytes, OPTIONAL) |
+-----------------------------+
| Image extension (4 bytes) |
+-----------------------------+
| Image binary |
| (var. length >= 4 bytes) |
+-----------------------------+
| Unused (4 bytes) |
+-----------------------------+
| Version info (256 bytes) |
+-----------------------------+
| Audio meta-data |
| (var. length >= 7868 bytes) |
+-----------------------------+
| Data offsets (20 bytes) |
+-----------------------------+
| Footer (48 bytes) |
+-----------------------------+
This document will describe the various sections of the tag in the order
listed above (that is, in the sequential order that they appear when
reading the tag from beginning to end). However, due to the nature of the
tag's format, in practice the tag's sections will often be parsed in the
reverse order. A robust parsing algorithm will be suggested and described
later in the document.
4.1. Header
An optional tag header often precedes the tag data in a MusicMatch tag.
Although the rules that determine this header's required presence are
unknown, the header is usually found in tag versions up to and including
2.50, and is usually lacking otherwise. Luckily, its format is rigid and
therefore its presence is easy to determine. The data in the header are
not vital to the correct parsing of the rest of the tag and can thus be
discarded. The header is the only optional section in a MusicMatch tag.
All other sections are required for the tag to be considered valid.
The header section is always 256 bytes in length. It begins with three
10-byte subsections, and ends with 226 bytes of space ($20) padding. Each
of the first three subsections contains an 8-byte ASCII text string
followed by two bytes of null ($00) padding.
The first subsection serves as a sync string: its 8-byte string is always
"18273645".
The second subsection's 8-byte string is the version of the Xing encoder
used to encode the mp3 file. The last four bytes of this string are
usually '0' ($30). An example of this string is "1.010000".
The third and final 10-byte subsection is the version of the MusicMatch
Jukebox used to encode the mp3 file. The last four bytes of this string
are usually '0' ($30). An example of this string is "2.120000".
Sync string "18273645"
Null padding $00 00
Xing encoder version <8-byte numerical ASCII string>
Null padding $00 00
MusicMatch version <8-byte numerical ASCII string>
Null padding $00 00
Space padding 226 * $20
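Because the layout is rigid, a 256-byte block can be tested for this
structure directly. The following Python sketch is one way to do so; the
function name parse_version_block and the decision to reject any block
whose padding deviates from the layout above are assumptions of this
example, not requirements of the format.

```python
SYNC = b"18273645"
BLOCK_SIZE = 256


def parse_version_block(block):
    """Return (xing_version, musicmatch_version), or None if the block
    does not follow the 256-byte layout described above."""
    if len(block) != BLOCK_SIZE:
        return None
    # Sync string followed by two null bytes
    if block[0:8] != SYNC or block[8:10] != b"\x00\x00":
        return None
    # Each version string is also followed by two null bytes
    if block[18:20] != b"\x00\x00" or block[28:30] != b"\x00\x00":
        return None
    # The remaining 226 bytes are space padding
    if block[30:] != b" " * 226:
        return None
    xing = block[10:18].decode("ascii", errors="replace")
    musicmatch = block[20:28].decode("ascii", errors="replace")
    return xing, musicmatch


# A block built from the example strings given in the text
example = (SYNC + b"\x00\x00" + b"1.010000" + b"\x00\x00"
           + b"2.120000" + b"\x00\x00" + b" " * 226)
versions = parse_version_block(example)
```

A more lenient reader might check only the sync string and ignore the
padding; the strict checks here are a design choice of the sketch.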
4.2. Image extension
MusicMatch tags can contain at most one image. This first required section
is the extension of the image when saved as a file (for example, "jpg" or
"bmp"). This section is 4 bytes in length, and the data is padded with
spaces ($20) if the extension doesn't use all 4 bytes (in practice, 3-byte
extensions are the most prevalent). Likewise, tags without images have all
spaces for this section (4 * $20).
Picture extension $xx xx xx xx
4.3. Image binary
When an image is present in the tag, the image binary section consists of
two fields. The first field is the size of the image data, in bytes. The
second is the actual image data.
Image size $xx xx xx xx
Image data <binary data>
If no image is present, the image binary section consists of exactly four
null bytes ($00 00 00 00).
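The image extension and image binary sections can be read together. The
Python sketch below assumes the image size field is stored LSB first like
the other multibyte numbers in the tag; the function name parse_image and
its return shape are hypothetical.

```python
import struct


def parse_image(data, pos=0):
    """Read the image extension and image binary sections starting at pos.

    Returns (extension, image_bytes, next_pos). A tag without an image
    yields an empty extension and empty image data.
    """
    # 4-byte extension, padded on the right with spaces
    extension = data[pos:pos + 4].decode("ascii", errors="replace").rstrip(" ")
    # 4-byte image size, assumed little-endian
    (size,) = struct.unpack_from("<I", data, pos + 4)
    start = pos + 8
    end = start + size
    if end > len(data):
        raise ValueError("image size exceeds the available data")
    return extension, data[start:end], end


# A three-byte stand-in for real image data, followed by the unused section
with_image = b"jpg " + struct.pack("<I", 3) + b"\xff\xd8\xff" + b"\x00" * 4
# No image: four spaces, then four null bytes
without_image = b"    " + b"\x00\x00\x00\x00"
```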
4.4. Unused
This section is never used, to the best of the author's knowledge. It is
always 4 null ($00) bytes.
Null padding $00 00 00 00
4.5. Version information
This section of the tag has the exact same format as the header. Unlike
the header, this section is required for the tag to be considered valid.
Sync string "18273645"
Null padding $00 00
Xing encoder version <8-byte numerical ASCII string>
Null padding $00 00
MusicMatch version <8-byte numerical ASCII string>
Null padding $00 00
Space padding 226 * $20
4.6. Audio meta-data
The audio meta-data is the heart of the MusicMatch tag. It contains most
of the pertinent information found in other tagging formats (song title,
album title, artist, etc.) and some that are unique to this format (mood,
preference, situation).
In all versions of the MusicMatch format up to and including 3.00, this
section is always 7868 bytes in length. All subsequent versions allowed
three possible lengths for this section: 7936, 8004, and 8132 bytes. The
conditions under which a particular length from these three possibilities
was used is unknown. In all cases, this section is padded with dashes
($2D) to achieve this constant size.
Due to the great number of fields in this portion of the tag, they are
divided amongst the next four sections of the document: single-line text
fields, non-text fields, multi-line text fields, and internet addresses.
This classification is somewhat arbitrary and somewhat inaccurate (some of
the fields described as "non-text" are indeed ASCII strings). However, the
classification does allow for easier description of the meta-data as a
whole. At any rate, the actual fields in this section of the tag appear
sequentially in the order presented.
4.6.1. Single-line text fields
The first group of entries in this section of the tag are variable-length
ASCII text strings. Each of these strings is preceded by a two-byte field
describing the size of the following string (again, in LSB order).
Multiple entries in a text field are separated by a semicolon ($3B). An
empty (and nonexistent) text field is indicated by a size field of 0 ($00
00).
The first three of these entries are fairly self-explanatory: song title,
album title, and artist name.
The final five entries are a little less common: Genre, Tempo, Mood,
Situation, and Preference. These fields can contain any information, but
due to the interface and default set-up of the Jukebox application, they
are typically limited to a subset of possibilities.
The Genre entry differs from the ID3v1 tagging format in that it allows
a full-text genre description, whereas ID3v1 maps a number to a list of
genres. Again, the genre description could be anything, but the interface
in Jukebox typically limited most users to the standard ID3v1 genres.
The Tempo entry is intended to describe the general tempo of the song. The
Jukebox application provided the following defaults: None, Fast, Pretty
fast, Moderate, Pretty slow, and Slow.
The Mood entry describes what type of mood the audio establishes. Typical
values include the following: None, Wild, Upbeat, Morose, Mellow, Tranquil,
and Comatose.
The Situation entry describes in which situation this music is best played.
Expect the following: None, Dance, Party, Romantic, Dinner, Background,
Seasonal, Rave, and Drunken Brawl.
The Preference entry allows the user to rate the song. Possible values
include the following: None, Excellent, Very Good, Good, Fair, Poor, and
Bad Taste.
Song title length $xx xx
Song title <ASCII string>
Album title length $xx xx
Album title <ASCII string>
Artist name length $xx xx
Artist name <ASCII string>
Genre length $xx xx
Genre <ASCII string>
Tempo length $xx xx
Tempo <ASCII string>
Mood length $xx xx
Mood <ASCII string>
Situation length $xx xx
Situation <ASCII string>
Preference length $xx xx
Preference <ASCII string>
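The eight size/text pairs above can be read with a single helper. The
following Python sketch is illustrative: the function and key names are
hypothetical, and decoding as Latin-1 (a superset of ASCII) is an
assumption made so that stray high bytes do not abort parsing.

```python
import struct

# Field order as listed above; the key names are this example's own
SINGLE_LINE_FIELDS = ("title", "album", "artist", "genre",
                      "tempo", "mood", "situation", "preference")


def read_text_field(data, pos):
    """Read one two-byte LSB size followed by that many bytes of text."""
    (size,) = struct.unpack_from("<H", data, pos)
    start = pos + 2
    end = start + size
    if end > len(data):
        raise ValueError("text field runs past the end of the data")
    return data[start:end].decode("latin-1"), end


def read_single_line_fields(data, pos=0):
    """Return ({name: [entries]}, next_pos) for the eight fields."""
    fields = {}
    for name in SINGLE_LINE_FIELDS:
        text, pos = read_text_field(data, pos)
        # Multiple entries are separated by semicolons; a size of 0
        # means the field is empty
        fields[name] = text.split(";") if text else []
    return fields, pos


def _field(text):
    # Helper for building sample data: size prefix plus text
    return struct.pack("<H", len(text)) + text


sample = b"".join(_field(t) for t in (
    b"Song", b"Album", b"A;B", b"Rock", b"Fast", b"", b"Party", b"Good"))
fields, end = read_single_line_fields(sample)
```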
4.6.2. Non-text fields
The next group of fields is described here as "non-text". They are
probably better described as entries that are auto-created (i.e., not
entered in by a user), although this isn't entirely accurate, either, as
the track number field is determined by user input. At any rate, they've
been separated to clarify the presentation of the material.
The "Song duration" entry consists of two fields: a size and text. The
text is formatted as "minutes:seconds" (for example, "3:45"), and thus the
size field is typically 4 ($04 00).
The only field that is neither a string nor an LSB numerical value is the
creation date. It is an 8-byte floating-point value. It can be interpreted
as a TDateTime in the Delphi programming language, where the integral
portion is the number of elapsed days since 1899-12-30, and the fractional
portion represents the elapsed fraction of that day, where .0 would be
midnight, .5 would be noon, and .99999... would be just before midnight
of the next day. In practice, this field is typically unused and will be
filled with 8 null ($00) bytes.
The next field is the play counter, presumably maintained by the Jukebox
application. Most of the time this field is unused, and is typically 0
($00 00 00 00).
The next entry is a size/text combo and represents the original filename
and path. As these tags were created almost universally on Windows
machines, the entries are typically in the form of "C:\path\to\file.mp3".
The next size/text entry is the album serial number fetched from the online
CDDB when a track is ripped with MusicMatch.
The final field is the track number, usually entered automatically when
ripping, encoding, and tagging the audio from a CD using CDDB.
Song duration length $xx xx
Song duration <ASCII string>
Creation date <8-byte IEEE-64 float>
Play counter $xx xx xx xx
Original filename length $xx xx
Original filename <ASCII string>
Serial number length $xx xx
Serial number <ASCII string>
Track number $xx xx
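The fields above, including the TDateTime conversion, can be sketched in
Python as follows. It is assumed here that the floating-point value is
stored in little-endian byte order, as would be natural for a Windows
application, and that a value of 0.0 (8 null bytes) means "unused"; all
function and key names are hypothetical.

```python
import struct
from datetime import datetime, timedelta

# Day zero of the Delphi TDateTime type
DELPHI_EPOCH = datetime(1899, 12, 30)


def tdatetime_to_datetime(value):
    """Convert a TDateTime (days since 1899-12-30) to a datetime.
    Returns None for 0.0, which this sketch treats as an unused field."""
    if value == 0.0:
        return None
    return DELPHI_EPOCH + timedelta(days=value)


def read_sized_text(data, pos):
    """Read a two-byte LSB size followed by that many bytes of text."""
    (size,) = struct.unpack_from("<H", data, pos)
    end = pos + 2 + size
    if end > len(data):
        raise ValueError("text field runs past the end of the data")
    return data[pos + 2:end].decode("latin-1"), end


def read_non_text_fields(data, pos=0):
    """Return (fields, next_pos) for the non-text group."""
    duration, pos = read_sized_text(data, pos)
    (created,) = struct.unpack_from("<d", data, pos)      # 8-byte float
    pos += 8
    (play_count,) = struct.unpack_from("<I", data, pos)   # 4-byte counter
    pos += 4
    filename, pos = read_sized_text(data, pos)
    serial, pos = read_sized_text(data, pos)
    (track,) = struct.unpack_from("<H", data, pos)        # 2-byte number
    pos += 2
    fields = {
        "duration": duration,
        "created": tdatetime_to_datetime(created),
        "play_count": play_count,
        "original_filename": filename,
        "serial_number": serial,
        "track": track,
    }
    return fields, pos


# Sample data: 36526.5 is noon on 2000-01-01 in TDateTime terms
sample = (struct.pack("<H", 4) + b"3:45"
          + struct.pack("<d", 36526.5)
          + struct.pack("<I", 7)
          + struct.pack("<H", 8) + b"C:\\a.mp3"
          + struct.pack("<H", 0)
          + struct.pack("<H", 5))
fields, end = read_non_text_fields(sample)
```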
4.6.3. Multi-line text fields
The next three entries are typically multi-line entries. All line
separators use the Windows-standard carriage return and line feed pair
($0D 0A). As with the single-line text entries, the text fields are
preceded by LSB size fields which indicate their length.
Notes length $xx xx
Notes <ASCII string>
Artist bio length $xx xx
Artist bio <ASCII string>
Lyrics length $xx xx
Lyrics <ASCII string>
4.6.4. Internet addresses
The final group of meta-data are internet addresses. As with other text
entries, the text fields are preceded by LSB size fields.
Artist URL length $xx xx
Artist URL <ASCII string>
"Buy CD" URL length $xx xx
"Buy CD" URL <ASCII string>
Artist email length $xx xx
Artist email <ASCII string>
4.6.5. Padding
The data fields are then followed by 16 null ($00) bytes. Presumably these
were intended for (up to 8) future text fields.
The remainder of this section is padded with '-' ($2D) characters.
4.7. Data offsets
This section of the tag was intended to give offsets into the file for each
of the five major required sections of the tag. The offsets, however, are
off by 1: when searching a file where the first position is offset 0, the
offset given here must be reduced by 1. In practice, however, these
offsets can often be invalid, since the data that comes before the tag may
have grown or shrunk (such as when an ID3v2 tag is added to the file).
Therefore these offsets are best used to calculate the size of the sections
by finding the difference of two consecutive offsets. As no offset follows
that of the audio meta-data section, its size must be calculated in a
different manner.
Image extension offset $xx xx xx xx
Image binary offset $xx xx xx xx
Unused offset $xx xx xx xx
Version info offset $xx xx xx xx
Audio meta-data offset $xx xx xx xx
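Computing section sizes from consecutive offsets can be sketched in Python
as shown below. The function name section_sizes and the dictionary keys
are hypothetical; only the differences between offsets are used, so the
off-by-1 issue and any relocation of the tag do not matter here.

```python
import struct

# Names for the five offsets, in the order they are stored
SECTION_NAMES = ("image_extension", "image_binary", "unused",
                 "version_info", "audio_metadata")


def section_sizes(offset_block):
    """Return the sizes of the first four sections, computed as the
    difference between consecutive offsets."""
    if len(offset_block) != 20:
        raise ValueError("the data offsets section is exactly 20 bytes")
    offsets = struct.unpack("<5I", offset_block)
    sizes = {}
    for i in range(4):
        sizes[SECTION_NAMES[i]] = offsets[i + 1] - offsets[i]
    return sizes


# Offsets for a tag without an image whose first section begins at
# (1-based) position 1001
example = struct.pack("<5I", 1001, 1005, 1009, 1013, 1269)
sizes = section_sizes(example)
```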
4.8. Footer
Unlike the header, the footer is a required section of any MusicMatch tag,
and checking for its existence is an easy way to determine if a file has a
MusicMatch tag. It is always 48 bytes in length. The first 19 bytes are
the company name "Brava Software Inc." (Note: it seems that the company
name has officially changed to MusicMatch, as "Brava Software" is not
mentioned anywhere on their website), followed by 13 bytes of space ($20)
padding. The next 4 bytes are the tag version as a numerical ASCII string
(e.g., "3.05"), and should match the version string found in the Version
section and the (optional) header. This is followed by 12 bytes of space
($20) padding.
Signature "Brava Software Inc."
Space padding 13 * $20
Tag version <4-byte numerical ASCII string>
Space padding 12 * $20
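A footer check along these lines can be sketched in Python. The function
name parse_footer is hypothetical, and rejecting footers whose padding
deviates from the layout above is a choice made for this example.

```python
FOOTER_SIZE = 48
SIGNATURE = b"Brava Software Inc."   # 19 bytes


def parse_footer(footer):
    """Return the 4-character tag version, or None if the 48 bytes do
    not match the footer layout described above."""
    if len(footer) != FOOTER_SIZE:
        return None
    if footer[0:19] != SIGNATURE:
        return None
    # 13 bytes of space padding, then the version, then 12 more spaces
    if footer[19:32] != b" " * 13 or footer[36:48] != b" " * 12:
        return None
    return footer[32:36].decode("ascii", errors="replace")


example = SIGNATURE + b" " * 13 + b"3.05" + b" " * 12
version = parse_footer(example)
```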
5. Identifying and parsing a MusicMatch tag
Finding and parsing a MusicMatch tag is not difficult to do, but due to
lack of foresight and questionable design decisions by MusicMatch, care
must be taken to ensure it is done correctly.
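One possible approach, pieced together from the section layouts in section
4 rather than taken from any official algorithm, is to work backwards from
the footer. The Python sketch below assumes the whole file is held in
memory; the function name locate_tag and its result keys are hypothetical.
It searches for the footer signature instead of assuming the tag sits at
the very end of the file, tries each known audio meta-data length until
the version block's sync string lines up, and uses the stored offsets only
for their differences.

```python
import struct

SIGNATURE = b"Brava Software Inc."
SYNC = b"18273645"
METADATA_SIZES = (7868, 7936, 8004, 8132)


def locate_tag(data):
    """Return 0-based positions of the tag's sections, or None."""
    footer = data.rfind(SIGNATURE)
    if footer < 0 or footer + 48 > len(data):
        return None
    offsets_pos = footer - 20
    if offsets_pos < 0:
        return None
    # The meta-data length is not stored; test each known length by
    # looking for the sync string of the version block before it
    for size in METADATA_SIZES:
        metadata = offsets_pos - size
        version = metadata - 256
        if version >= 0 and data[version:version + 8] == SYNC:
            break
    else:
        return None
    # Only the difference between stored offsets is trusted
    offsets = struct.unpack_from("<5I", data, offsets_pos)
    image_len = offsets[2] - offsets[1]
    unused = version - 4
    image_binary = unused - image_len
    image_ext = image_binary - 4
    if image_len < 4 or image_ext < 0:
        return None
    # Cross-check against the size stored in the image binary section
    (stored,) = struct.unpack_from("<I", data, image_binary)
    if stored + 4 != image_len:
        return None
    # The optional header, if present, starts with the same sync string
    header = image_ext - 256
    has_header = header >= 0 and data[header:header + 8] == SYNC
    return {
        "start": header if has_header else image_ext,
        "image_extension": image_ext,
        "image_binary": image_binary,
        "unused": unused,
        "version_info": version,
        "audio_metadata": metadata,
        "data_offsets": offsets_pos,
        "footer": footer,
        "end": footer + 48,
    }
```

Searching for the signature from the end of the data also copes with tags
that are followed by other tagging formats, at the cost of reading more of
the file than a fixed-position check would.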
<unfinished />
6. Converting to ID3v2
As of Jukebox 4.0, MusicMatch has abandoned the MusicMatch tagging format
in favor of the open standard ID3v2. The Jukebox application will convert
old tags to ID3v2 upon request, but as this is a closed application that
serves a limited number of platforms (currently only Windows and Macintosh),
having a public specification for performing this mapping is necessary. As
ID3v2 can encapsulate all of the information found in the original
MusicMatch format while being infinitely more flexible, the decision to
convert shouldn't be a difficult one.
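As a starting point only, the Python sketch below pairs MusicMatch fields
with ID3v2.3 frame identifiers that appear to serve the same purpose. The
mapping is an assumption of this example and not an established
convention; the field names, the "MusicMatch_" description prefix, and the
fallback to user-defined text (TXXX) frames are all hypothetical choices.

```python
# Assumed field-to-frame pairing; the frame identifiers are from ID3v2.3
DIRECT_FRAMES = {
    "title": "TIT2",
    "album": "TALB",
    "artist": "TPE1",
    "genre": "TCON",
    "track": "TRCK",
    "duration": "TLEN",
    "lyrics": "USLT",
    "notes": "COMM",
    "artist_url": "WOAR",
    "buy_cd_url": "WCOM",
    "play_count": "PCNT",
    "original_filename": "TOFN",
}


def duration_to_milliseconds(text):
    """Convert "minutes:seconds" to milliseconds, as TLEN expects."""
    minutes, seconds = text.split(":")
    return (int(minutes) * 60 + int(seconds)) * 1000


def plan_frames(fields):
    """Return a list of (frame_id, description, value) tuples."""
    frames = []
    for name, value in fields.items():
        if value in ("", None):
            continue  # empty fields produce no frame
        if name == "duration":
            value = str(duration_to_milliseconds(value))
        frame_id = DIRECT_FRAMES.get(name)
        if frame_id is None:
            # No obvious counterpart: keep as a user-defined text frame
            frames.append(("TXXX", "MusicMatch_" + name, value))
        else:
            frames.append((frame_id, "", value))
    return frames


plan = plan_frames({"title": "Song", "duration": "3:45",
                    "mood": "Mellow", "tempo": ""})
```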
<unfinished />
7. Copyright
Copyright (C) Scott Thomas Haug 2000. All Rights Reserved.
This document and translations of it may be copied and furnished to others,
and derivative works that comment on or otherwise explain it or assist in
its implementation may be prepared, copied, published and distributed, in
whole or in part, without restriction of any kind, provided that a
reference to this document is included on all such copies and derivative
works. However, this document itself may not be modified in any way and
reissued as the original document.
The limited permissions granted above are perpetual and will not be
revoked.
This document and the information contained herein is provided on an 'AS
IS' basis and THE AUTHORS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
8. References
[MMTrailer] Peter "The Videoripper" Luijer,
'Description of the MusicMatch trailer in MP3 files'
<url:http://members.xoom.com/videoripper/warez/mmtrailer.txt>
[ID3v2] Martin Nilsson, 'ID3v2 informal standard'.
<url:http://www.id3.org/id3v2.3.0.txt>
[id3lib] Scott Thomas Haug, 'The ID3v1/ID3v2 Tagging Library'
<url:http://www.id3lib.org>
[ISO-8859-1] ISO/IEC DIS 8859-1.
'8-bit single-byte coded graphic character sets, Part 1: Latin
alphabet No. 1.' Technical committee / subcommittee: JTC 1 / SC 2
[JFIF] 'JPEG File Interchange Format, version 1.02'
<url:http://www.w3.org/Graphics/JPEG/jfif.txt>
[MPEG] ISO/IEC 11172-3:1993.
'Coding of moving pictures and associated audio for digital storage
media at up to about 1,5 Mbit/s, Part 3: Audio.'
Technical committee / subcommittee: JTC 1 / SC 29
and
ISO/IEC 13818-3:1995
'Generic coding of moving pictures and associated audio information,
Part 3: Audio.'
Technical committee / subcommittee: JTC 1 / SC 29
and
ISO/IEC DIS 13818-3
'Generic coding of moving pictures and associated audio information,
Part 3: Audio (Revision of ISO/IEC 13818-3:1995)'
[URL] T. Berners-Lee, L. Masinter & M. McCahill, 'Uniform Resource
Locators (URL)', RFC 1738, December 1994.
<url:ftp://ftp.isi.edu/in-notes/rfc1738.txt>
[UTF-8] F. Yergeau, 'UTF-8, a transformation format of ISO 10646',
RFC 2279, January 1998.
<url:ftp://ftp.isi.edu/in-notes/rfc2279.txt>
9. Author's Address
Written by
Scott Thomas Haug
Seattle, WA
USA