| $Id: musicmatch.txt,v 1.4 2000/09/09 23:02:37 eldamitri Exp $ |
| |
| |
| MusicMatch (TM) tag format description |
| |
| Status of this document |
| |
| This document is a description of a deprecated tagging format. The |
| information contained herein is not a specification; its intent is to |
| interpret the format based on hundreds of examples. It also relies heavily |
| on information obtained by others who have done similar investigations. It |
| is not based on any official documentation of the format, as such |
| documentation is not publicly available. Therefore the contents of this |
| document may change to adjust for newly-discovered information, but the |
| format itself is unlikely to change due to its deprecation. |
| |
| Distribution of this document is unlimited. |
| |
| |
| Abstract |
| |
| This document describes the MusicMatch tagging format present in some |
| digital audio files. This format, like other tagging specifications, |
| provides a method for storing information about an audio file within itself |
| to document its contents. This format was developed by MusicMatch and |
| used exclusively by older versions of Jukebox, their popular, "all-in-one" |
| MP3 application. |
| |
| 1. Table of contents |
| |
| Status of this document |
| Abstract |
| 1. Table of contents |
| 2. Introduction |
| 3. Conventions in this document |
| 4. Tagging format |
| 4.1. Header |
| 4.2. Image extension |
| 4.3. Image binary |
| 4.4. Unused |
| 4.5. Version information |
| 4.6. Audio meta-data |
| 4.6.1. Single-line text fields |
| 4.6.2. Non-text fields |
| 4.6.3. Multi-line text fields |
| 4.6.4. Internet addresses |
| 4.6.5. Padding |
| 4.7. Data offsets |
| 4.8. Footer |
| 5. Identifying and parsing a MusicMatch tag |
| 6. Converting to ID3v2 |
| 7. Copyright |
| 8. References |
| 9. Author's Address |
| |
| |
| 2. Introduction |
| |
| The following document describes the structure of the tagging format used |
| by MusicMatch (TM) Jukebox, prior to version 4.0 of that application. This |
| program is a so-called "All-In-One" MP3 program and provides a CD-Ripper, |
| WAV-to-MP3 converter, database, and MP3 player. |
| |
| The MusicMatch tagging format has gone through several incremental |
| iterations in its format, although the basic structure has remained fairly |
| constant throughout its history. The various formats of the MusicMatch tag |
| have been tightly coupled with the version of Jukebox that created it. As |
| such, this document will refer to the Jukebox version and tagging format |
| version interchangeably. |
| |
| As of version 4.0, MusicMatch has deprecated the use of this format in |
| their own Jukebox application, transitioning instead to ID3v2, an open |
| standard for tagging digital audio. Unfortunately, despite repeated |
| requests, MusicMatch has not provided to the public any documents |
| describing this format, and the MusicMatch Jukebox is the only |
| widely-distributed software application that can read and write these tags. |
| |
| As such, this text may not be completely accurate and is surely incomplete, |
| but it covers enough to the format to enable one to write robust software |
| to find and parse tags in this format. For example, the id3lib tagging |
| library's MusicMatch parsing routines were written solely based on the |
| information found in this document. However, the authors cannot be held |
| responsible for any inaccuracies or any harm caused by using this |
| information. One can assume that the specifition is unlikely to change, |
| given MusicMatch's own abandonment of the format. It should also be noted |
| that incoporating functionality into applications to write tags in this |
| format is discouraged, as the format has been officially deprecated by |
| MusicMatch themselves. |
| |
| 3. Conventions in this document |
| |
| This document borrows heavily from specifications written by Martin |
| Nillson, author of the ID3v2 tagging standard. Much of the structure, |
| formatting, and other such conventions used in the ID3v2 specifications are |
| carried over into this document. |
| |
| Text within "" is a text string exactly as it appears in a tag. Numbers |
| preceded with $ are hexadecimal and numbers preceded with % are binary. $xx |
| is used to indicate a byte with unknown content. |
| |
| 4. Tag overview |
| |
| The MusicMatch Tagging Format was designed to store specific types of audio |
| meta-data inside the audio file itself. As the format was used exclusively |
| by the MusicMatch Jukebox application, it is used only with MPEG-1/2 layer |
| III files encoded with that program. However, its tagging format is not |
| inherently exclusive of other audio formats, and could conceivably be |
| used with other types of encodings. |
| |
| MusicMatch tags were originally designed to come at the very end of MP3 |
| files, after all of the MP3 audio frames. Starting with Jukebox version |
| 3.1, the application became more ID3-friendly and started placing ID3v1 |
| tags after the MusicMatch tag as well. In practice, since very few |
| applications outside of the MusicMatch Jukebox are capable of reading and |
| understanding this format, it is not unusual to find MusicMatch tags |
| "buried" within mp3 files, coming before other types of tagging formats in |
| a file, such as Lyrics3 or ID3v2.4.0. Such "relocations" are not uncommon, |
| and therefore any software application that intends to find, read, and |
| parse MusicMatch tags should be flexible in this endeavor, despite the |
| apparent intentions of the original specification. |
| |
| Although various sections of a MusicMatch tag are fixed in length, other |
| sections are not, and so tag lengths can vary from one file to another. A |
| valid MusicMatch tag will be at least 8 kilobytes (8192 bytes) in length. |
| Those tags with image data will often be much larger. |
| |
| The byte-order in 4-byte pointers and multibyte numbers for MusicMatch tags |
| is least-significant byte (LSB) first, also known as "little endian". For |
| example, $12345678 is encoded as $78 56 34 12. |
| |
| Overall tag structure: |
| |
| +-----------------------------+ |
| | Header | |
| | (256 bytes, OPTIONAL) | |
| +-----------------------------+ |
| | Image extension (4 bytes) | |
| +-----------------------------+ |
| | Image binary | |
| | (var. length >= 4 bytes) | |
| +-----------------------------+ |
| | Unused (4 bytes) | |
| +-----------------------------+ |
| | Version info (256 bytes) | |
| +-----------------------------+ |
| | Audio meta-data | |
| | (var. length >= 7868 bytes) | |
| +-----------------------------+ |
| | Data offsets (20 bytes) | |
| +-----------------------------+ |
| | Footer (48 bytes) | |
| +-----------------------------+ |
| |
| This document will describe the various sections of the tag in the order |
| listed above (that is, in the sequential order that they appear when |
| reading the tag from beginning to end). However, due to the nature of the |
| tag's format, in practice the tag's sections will often be parsed in the |
| reverse order. A robust parsing algorithm will be suggested and described |
| later in the document. |
| |
| 4.1. Header |
| |
| An optional tag header often precedes the tag data in a MusicMatch tag. |
| Although the rules that determine this header's required presence are |
| unknown, the header is usually found in tag versions up to and including |
| 2.50, and is usually lacking otherwise. Luckily, its format is rigid and |
| therefore its presence is easy to determine. The data in the header are |
| not vital to the correct parsing of the rest of the tag and can thus be |
| discarded. The header is the only optional section in a MusicMatch tag. |
| All other sections are required to consider the tag valid. |
| |
| The header section is always 256 bytes in length. It begins with three |
| 10-byte subsections, and ends with 226 bytes of space ($20) padding. Each |
| of the first three subsections contains an 8-byte ASCII text string |
| followed by two bytes of null ($00) padding. |
| |
| The first subsection serves as a sync string: its 8-byte string is always |
| "18273645". |
| |
| The second subsection's 8-byte string is the version of the Xing encoder |
| used to encode the mp3 file. The last four bytes of this string are |
| usually '0' ($30). An example of this string is "1.010000". |
| |
| The third and final 10-byte subsection is the version of the MusicMatch |
| Jukebox used to encode the mp3 file. The last four bytes of this string |
| are usually '0' ($30). An example of this string is "2.120000". |
| |
| Sync string "18273645" |
| Null padding $00 00 |
| Xing encoder version <8-byte numerical ASCII string> |
| Null padding $00 00 |
| MusicMatch version <8-byte numerical ASCII string> |
| Null padding $00 00 |
| Space padding 226 * $20 |
| |
| 4.2. Image extension |
| |
| MusicMatch tags can contain at most one image. This first required section |
| is the extension of the image when saved as a file (for example, "jpg" or |
| "bmp"). This section is 4 bytes in length, and the data is padded with |
| spaces ($20) if the extension doesn't use all 4 bytes (in practice, 3-byte |
| extensions are the most prevalent). Likewise, tags without images have all |
| spaces for this section (4 * $20). |
| |
| Picture extension $xx xx xx xx |
| |
| 4.3. Image binary |
| |
| When an image is present in the tag, the image binary section consists of |
| two fields. The first field is the size of the image data, in bytes. The |
| second is the actual image data. |
| |
| Image size $xx xx xx xx |
| Image data <binary data> |
| |
| If no image is present, the image binary section consists of exactly four |
| null bytes ($00 00 00 00). |
| |
| 4.4. Unused |
| |
| This section is never used, to the best of the author's knowledge. It is |
| always 4 null ($00) bytes. |
| |
| Null padding $00 00 00 00 |
| |
| 4.5. Version information |
| |
| This section of the tag has the exact same format as the header. Unlike |
| the header, this section is required for the tag to be considered valid. |
| |
| Sync string "18273645" |
| Null padding $00 00 |
| Xing encoder version <8-byte numerical ASCII string> |
| Null padding $00 00 |
| MusicMatch version <8-byte numerical ASCII string> |
| Null padding $00 00 |
| Space padding 226 * $20 |
| |
| 4.6. Audio meta-data |
| |
| The audio meta-data is the heart of the MusicMatch tag. It contains most |
| of the pertinent information found in other tagging formats (song title, |
| album title, artist, etc.) and some that are unique to this format (mood, |
| prefernce, situation). |
| |
| In all versions of the MusicMatch format up to and including 3.00, this |
| section is always 7868 bytes in length. All subsequent versions allowed |
| three possible lengths for this section: 7936, 8004, and 8132 bytes. The |
| conditions under which a particular length from these three possibilities |
| was used is unknown. In all cases, this section is padded with dashes |
| ($2D) to achieve this constant size. |
| |
| Due to the great number of fields in this portion of the tag, they are |
| divided amongst the next four sections of the document: single-line text |
| fields, non-text fields, multi-line text fields, and internet addresses. |
| This clarification is somewhat arbitrary and somewhat inaccurate (some of |
| the fields described as "non-text" are indeed ASCII strings). However, the |
| clarification does allow for easier description of the meta-data as a |
| whole. At any rate, the actual fields in this section of the tag appear |
| sequentially in the order presented. |
| |
| |
| 4.6.1. Single-line text fields |
| |
| The first group entries in this section of the tag are variable-length |
| ASCII text strings. Each of these strings are preceded by a two-byte field |
| describing the size of the following string (again, in LSB order). |
| Multiple entries in a text field are separated by a semicolon ($3B). An |
| empty (and non-existant) text field is indicated by a size field of 0 ($00 |
| 00). |
| |
| The first three of these entries are fairly-self explanatory: song title, |
| album title, and artist name. |
| |
| The final five entries are a little less common: Genre, Tempo, Mood, |
| Situation, and Preference. These fields can contain any information, but |
| do to the interface and default set-up for the Jukebox application, they |
| typically are limited to a subset of possibilities. |
| |
| The Genre entry differs from the ID3v1 tagging format in that it allows |
| a full-text genre description, whereas ID3v1 maps a number to a list of |
| genres. Again, the genre description could be anything, but the interface |
| in Jukebox typically limited most users to the standard ID3v1 genres. |
| |
| The Tempo entry is intended to describe the general tempo of the song. The |
| Jukebox application provided the following defaults: None, Fast, Pretty |
| fast, Moderate, Pretty slow, and Slow. |
| |
| The Mood entry describes what type of mood the audio establishes: Typical |
| values include the following: None, Wild, Upbeat, Morose, Mellow, Tranquil, |
| and Comatose. |
| |
| The Situation entry describes in which situation this music is best played. |
| Expect the following: None, Dance, Party, Romantic, Dinner, Background, |
| Seasonal, Rave, and Drunken Brawl. |
| |
| The Preference entry allows the user to rate the song. Possible values |
| include the following: None, Excellent, Very Good, Good, Fair, Poor, and |
| Bad Taste. |
| |
| Song title length $xx xx |
| Song title <ASCII string> |
| Album title length $xx xx |
| Album title <ASCII string> |
| Artist name length $xx xx |
| Artist name <ASCII string> |
| Genre length $xx xx |
| Genre <ASCII string> |
| Tempo length $xx xx |
| Tempo <ASCII string> |
| Mood length $xx xx |
| Mood <ASCII string> |
| Situation length $xx xx |
| Situation <ASCII string> |
| Preference length $xx xx |
| Preference <ASCII string> |
| |
| 4.6.2. Non-text fields |
| |
| The next group of fields is described here as "non-text". They are |
| probably better described as entries that are auto-created (i.e., not |
| entered in by a user), although this isn't entirely accurate, either, as |
| the track number field is determined by user input. At any rate, they've |
| been separated to clarify the presentation of the material. |
| |
| The "Song duration" entry consists of two fields: a size and text. The |
| text is formatted as "minutes:seconds", and thus the size field is |
| typically 4 ($04 00). |
| |
| The only field that is neither a string nor a LSB numerical value is the |
| creation date. It is 8-byte floating-point value. It can be interpreted |
| as a TDateTime in the Delphi programming language, where the integral |
| portion is the number of elapsed days since 1899-12-30, and the mantissa |
| portion represents the fractional portion of that day, where .0 would be |
| midnight, .5 would be noon, and .99999... would be just before midnight |
| of the next day. In practice, this field is typically unused and will be |
| filled with 8 null ($00) bytes. |
| |
| The next field is the play counter, presumably maintained by the Jukebox |
| application. Most of the time this field is unused, and is typically 0 |
| ($00 00 00 00). |
| |
| The next entry is a size/text combo and represents the original filename |
| and path. As these tags were created almost universally on Windows |
| machines, the entries are typically in the form of "C:\path\to\file.mp3". |
| |
| The next size/text entry is the album serial number fetched from the online |
| CDDB when a track is ripped with MusicMatch. |
| |
| The final field is the track number, usually entered automatically when |
| ripping, encoding, and tagging the audio off from a CD using CDDB. |
| |
| Song duration length $xx xx |
| Song duration <ASCII string> |
| Creation date <8-byte IEEE-64 float> |
| Play counter $xx xx xx xx |
| Original filename length $xx xx |
| Original filename <ASCII string> |
| Serial number length $xx xx |
| Serial number <ASCII string> |
| Track number $xx xx |
| |
| 4.6.3. Multi-line text fields |
| |
| The next three entries are typically multi-line entries. All line |
| separators use the Windows-standard carriage return ($0D 0A). As with the |
| single-line text entries, the text fields are preceded by LSB size fields |
| which indicate their length. |
| |
| Notes length $xx xx |
| Notes <ASCII string> |
| Artist bio length $xx xx |
| Artist bio <ASCII string> |
| Lyrics length $xx xx |
| Lyrics <ASCII string> |
| |
| 4.6.4. Internet addresses |
| |
| The final group of meta-data are internet addresses. As with other text |
| entries, the text fields are preceded by LSB size fields. |
| |
| Artist URL length $xx xx |
| Artist URL <ASCII string> |
| "Buy CD" URL length $xx xx |
| "Buy CD" URL <ASCII string> |
| Artist email length $xx xx |
| Artist email <ASCII string> |
| |
| 4.6.5. Padding |
| |
| The data fields are then followed by 16 null ($00) bytes. Presumably these |
| were intended for (up to 8) future text fields. |
| |
| The remainder of this section is padded with '-' ($2D) characters. |
| |
| 4.7. Data offsets |
| |
| This section of the tag was intended to give offsets into the file for each |
| of the five major required sections of the tag. The offsets, however, are |
| off by 1; for searching a file where the first position is offset 0, the |
| offset given here must be reduced by 1. In practice, however, these |
| offsets can often be invalid, since the data that comes before may be |
| increased or reduces (such as when an ID3v2 tag is appended to the file). |
| Therefore these offsets are best used to calculate the size of the sections |
| by finding the difference of two consecutive offsets. Obviously, the size |
| of the audio meta-data section must be calculated in a different manner. |
| |
| Image extension offset $xx xx xx xx |
| Image binary offset $xx xx xx xx |
| Unused offset $xx xx xx xx |
| Version info offset $xx xx xx xx |
| Audio meta-data offset $xx xx xx xx |
| |
| 4.8. Footer |
| |
| Unlike the header, the footer is a required section of any MusicMatch tag, |
| and checking for its existance is an easy way to determine if a file has a |
| MusicMatch tag. It is always 48 bytes in length. The first 19 bytes is |
| the company name "Brava Software Inc." (Note: it seems that the company |
| name has officially changed to MusicMatch, as "Brava Software" is not |
| mentioned anywhere on their website), followed by 13 bytes of space ($20) |
| padding. The next 4 bytes is the tag version as a numerical ASCII string |
| (e.g., "3.05"), and should match the version string found in the Version |
| section and the (optional) header. This is followed by 12 bytes of space |
| ($20) padding. |
| |
| Signature "Brava Software Inc." |
| Space padding 13 * $20 |
| Tag version <4-byte numerical ASCII string> |
| Space padding 12 * $20 |
| |
| 5. Identifying and parsing a MusicMatch tag |
| |
| Finding and parsing a MusicMatch tag is not difficult to do, but due to |
| lack of foresight and questionable design decisions by MusicMatch, care |
| must be taken to ensure it is done correctly. |
| |
| <unfinished /> |
| |
| 6. Converting to ID3v2 |
| |
| As of Jukebox 4.0, MusicMatch has abandoned the MusicMatch tagging format |
| in favor of the open standard ID3v2. The Jukebox application will convert |
| old tags to ID3v2 upon request, but as this is a closed application that |
| serves a limited number of platforms (currently on Windows and Macintosh), |
| having a public specification for performing this mapping is necessary. As |
| ID3v2 can encapsulate all of the information found in the original |
| MusicMatch format while being infinitely more flexible, the decision to |
| convert shouldn't be a difficult one. |
| |
| <unfinished /> |
| |
| 7. Copyright |
| |
| Copyright (C) Scott Thomas Haug 2000. All Rights Reserved. |
| |
| This document and translations of it may be copied and furnished to others, |
| and derivative works that comment on or otherwise explain it or assist in |
| its implementation may be prepared, copied, published and distributed, in |
| whole or in part, without restriction of any kind, provided that a |
| reference to this document is included on all such copies and derivative |
| works. However, this document itself may not be modified in any way and |
| reissued as the original document. |
| |
| The limited permissions granted above are perpetual and will not be |
| revoked. |
| |
| This document and the information contained herein is provided on an 'AS |
| IS' basis and THE AUTHORS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, |
| INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION |
| HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF |
| MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. |
| |
| |
| 8. References |
| |
| [MMTrailer] Peter "The Videoripper" Luijer, |
| 'Description of the MusicMatch trailer in MP3 files' |
| |
| <url:http://members.xoom.com/videoripper/warez/mmtrailer.txt> |
| |
| [ID3v2] Martin Nilsson, 'ID3v2 informal standard'. |
| |
| <url:http://www.id3.org/id3v2.3.0.txt> |
| |
| [id3lib] Scott Thomas Haug, 'The ID3v1/ID3v2 Tagging Library' |
| |
| <url:http://www.id3lib.org> |
| |
| [ISO-8859-1] ISO/IEC DIS 8859-1. |
| '8-bit single-byte coded graphic character sets, Part 1: Latin |
| alphabet No. 1.' Technical committee / subcommittee: JTC 1 / SC 2 |
| |
| [JFIF] 'JPEG File Interchange Format, version 1.02' |
| |
| <url:http://www.w3.org/Graphics/JPEG/jfif.txt> |
| |
| [MPEG] ISO/IEC 11172-3:1993. |
| 'Coding of moving pictures and associated audio for digital storage |
| media at up to about 1,5 Mbit/s, Part 3: Audio.' |
| Technical committee / subcommittee: JTC 1 / SC 29 |
| and |
| ISO/IEC 13818-3:1995 |
| 'Generic coding of moving pictures and associated audio information, |
| Part 3: Audio.' |
| Technical committee / subcommittee: JTC 1 / SC 29 |
| and |
| ISO/IEC DIS 13818-3 |
| 'Generic coding of moving pictures and associated audio information, |
| Part 3: Audio (Revision of ISO/IEC 13818-3:1995)' |
| |
| [URL] T. Berners-Lee, L. Masinter & M. McCahill, 'Uniform Resource |
| Locators (URL)', RFC 1738, December 1994. |
| |
| <url:ftp://ftp.isi.edu/in-notes/rfc1738.txt> |
| |
| [UTF-8] F. Yergeau, 'UTF-8, a transformation format of ISO 10646', |
| RFC 2279, January 1998. |
| |
| <url:ftp://ftp.isi.edu/in-notes/rfc2279.txt> |
| |
| |
| 9. Author's Address |
| |
| Written by |
| |
| Scott Thomas Haug |
| Seattle, WA |
| USA |