The program handin33 will only submit files in the cs33/lab/09 directory. (You should run update33 first to set up the directory and create any necessary files.)
This program will probably take you longer than the average program in the class. There is a lot of text to read below, and the program has lots of little details that you can mess up. Start early and consider working with a partner. You will have 10 days to complete the lab, but this includes the 4 days of Thanksgiving break, so plan accordingly.
Your program must follow these guidelines:
The goal of this week's assignment is to read in an MP3 file and display the contents of its ID3 tag. For those of you who are unfamiliar, MP3 files are files that contain compressed audio and may optionally contain information specifying the artist, song title, album name, etc. This additional non-audio data comprises the ID3 tag.
In 1996, the first specification for ID3 tags, ID3v1, was created. While this specification was simple, there were complaints about the format. One problem with the specification was the small amount of space (only 128 bytes) that was made available for storing non-audio data. Artist names, song titles, and album names all had to be 30 characters or less. This meant that you couldn't store "Elvis Costello and the Attractions" as the artist for their album "This Year's Model". Since the maximum length of the artist name was 30 characters, you'd only be able to store "Elvis Costello and the Attract".
In 1998, the ID3v2 specification was created. This allowed much more non-audio data (256 megabytes instead of 128 bytes) to be stored, without the unreasonably tight restrictions on the number of characters that could be stored for any individual piece of data. (The new format allows each piece of data to be 16 megabytes; plenty big for the artist's name.) One new piece of non-audio data that could be stored was an image (of the album cover, for example). Along with information such as artist and album name, the program you write will be able to extract this image.
The most popular version of the ID3 specification is ID3v2.3. However, if you import a CD in iTunes, iTunes uses the older ID3v2.2 specification. iTunes can read the new specification, and can even convert your MP3 to use any one of 5 different specifications; but, if you don't do anything to your MP3s after you've imported them, you've got ID3v2.2 tags. (At least, this was true for me using iTunes 10.5.1 on a Mac, and I'm fairly certain I haven't messed with any settings. To convert a song from one ID3 format to another, control-click/right-click on the file name in iTunes and choose "Convert ID3 Tags".)
Since specifications have important differences between them, and since it doesn't make sense for you to write a data extractor for each of the 5 available ID3 versions, you're going to write the extractor for the ID3v2.2 specification. This means that if you use iTunes to import CDs, you've already got a bunch of MP3s which you can try this on. If you don't use iTunes (or you don't use the MP3 format), you're not out of luck: I have provided each of you with seven MP3 files to play with. Since these songs are copyrighted, and I'd like to avoid the wrath of the RIAA, I have snipped them down to 30 seconds each.
You can find the sample music files in the following directory, where username is your username on the CS machines: /scratch/richardw/music/username/. The files are located there (instead of via update33) to avoid issues with your quota, and they will be deleted shortly after the assignment is due.
You can work out the details of how to do the extraction by reading the ID3v2.2 reference. But, you'd probably do much better reading my distilled version below.
Your program will be called id3.c. To run your program, you will type ./id3 <filename.mp3> on the command line, substituting <filename.mp3> with the actual name of the MP3 file. When you run the program, you will print out the information in the ID3 tags, skipping over some fields which you won't worry about, and write the image file to the disk for later viewing.
You will need to store lots of variable-sized strings and byte-arrays in this assignment. All strings and arrays should be dynamically allocated. There are almost no cases in the assignment where using a statically allocated array makes sense, and even if you think it does, use a dynamically allocated array anyway (for practice). You won't have to allocate any multi-dimensional arrays.
Here are the new C things you'll need to know for the lab:
int main(int argc, char **argv) {
    ...
    return 0;
}
Your program will exit with an informative error message if the user runs the program without specifying an MP3 file as an argument.
for (i = 0; i < 5; i++) {
    fprintf(stderr, "Message %d sent to standard error.\n", i);
}
for (i = 0; i < 5; i++) {
    fprintf(stderr, "Message %d sent to standard error.\n", i);
    if (i == 3) {
        exit(1); /* Exit, indicating a failure */
    }
}
FILE *fptr; /* fptr is a file pointer */
The file pointer contains information such as how to find the file on disk, and where in the file the next byte you'll read will come from.
unsigned char uc;
You use unsigned characters just as you would regular characters.
fptr = fopen(filename, "rb"); /* filename is a string storing the name of a file */
If fptr equals NULL after calling fopen, there was an error. If that happens, print an informative error message and exit. The string "rb" indicates that the file will be opened for reading (r) and that the contents of the file should be interpreted as binary data (b), not text data.
uc = fgetc(fptr);
If fgetc(fptr) reaches the end of the file, the return value is a constant called EOF. For example:
#include <stdio.h>

int main() {
    FILE *fptr;
    char c;
    fptr = fopen("test.c", "r");
    while ((c = fgetc(fptr)) != EOF) {
        printf("%c", c);
    }
    return 0;
}
Unfortunately, EOF is signed and its value in C is -1. To ensure that you match EOF properly while reading into an unsigned char, you might want to replace the while line above as follows, where uc is an unsigned char:
while ((uc = fgetc(fptr)) != (unsigned char)EOF)
Be careful with this form: (unsigned char)EOF is 255, so a data byte of 0xFF also ends the loop. Storing the return value of fgetc in an int and comparing it with EOF before copying it into uc avoids the ambiguity.
Remember that each time you call fgetc, the file pointer moves one byte forward in the file.
fclose(fptr);
The fclose function tells the operating system that the file is no longer in use and, if the file was being written to, ensures that the write completes before the memory is freed.
int position;
position = ftell(fptr); /* number of bytes read from the start */
unsigned char uc;
int result; /* fputc returns an int so that EOF can be detected */
FILE *inptr, *outptr;
int i;
inptr = fopen("first", "rb");
outptr = fopen("second", "wb"); /* "w" == "write" */
for (i = 0; i < 10; i++) {
    uc = fgetc(inptr);
    result = fputc(uc, outptr); /* if result == EOF, there was an error */
}
fclose(inptr);
fclose(outptr);
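The pieces above (checking the arguments, opening the file, reading bytes, and allocating dynamically) can be combined roughly as follows. This is only a sketch: the helper names open_input and read_bytes are hypothetical, and returning NULL instead of calling exit directly is a design choice made here so the caller decides when to exit. Note that the result of fgetc is kept in an int until it has been compared with EOF, so a data byte of 0xFF is never mistaken for the end of the file.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: validate the command line and open the MP3.
 * Returns NULL (after printing to stderr) so the caller can exit(1). */
FILE *open_input(int argc, char **argv)
{
    FILE *fptr;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <filename.mp3>\n", argv[0]);
        return NULL;
    }
    fptr = fopen(argv[1], "rb");
    if (fptr == NULL) {
        fprintf(stderr, "error: could not open %s\n", argv[1]);
    }
    return fptr;
}

/* Hypothetical helper: read exactly n bytes (n > 0) into a malloc'd buffer.
 * Returns NULL if the file ends early or malloc fails; the caller frees. */
unsigned char *read_bytes(FILE *fptr, size_t n)
{
    unsigned char *buf = malloc(n);
    size_t i;
    int c; /* int, not unsigned char: EOF must stay distinct from 0xFF */

    if (buf == NULL) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        c = fgetc(fptr);
        if (c == EOF) {
            free(buf);
            return NULL;
        }
        buf[i] = (unsigned char)c;
    }
    return buf;
}
```

In main, you would call open_input(argc, argv), call exit(1) if it returns NULL, and later free every buffer returned by read_bytes and fclose the file.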
The letters a-g are each one byte (bytes 3-9, counting from 0) and represent the following:
       d        e        f        g
    0x00     0x00     0x02     0x01   (HEX)
00000000 00000000 00000010 00000001   (BINARY)
(--> removing topmost bit from each byte)
_0000000 _0000000 _0000010 _0000001
(--> rewriting)
    0000 00000000 00000001 00000001 = 256 + 1 = 257
(HINT: Use left shift and addition. This should take no more than a few lines to convert.)
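One way to carry out the conversion in the worked example is sketched below. The function name decode_size is hypothetical, and the sketch assumes the four size bytes d, e, f, g have already been read into an array in that order.

```c
/* Hypothetical helper: bytes[0..3] are the size bytes d, e, f, g. */
unsigned long decode_size(const unsigned char bytes[4])
{
    unsigned long size = 0;
    int i;

    for (i = 0; i < 4; i++) {
        /* make room for 7 more bits, then add the low 7 bits of this byte */
        size = (size << 7) + (bytes[i] & 0x7F);
    }
    return size;
}
```

With the bytes 0x00 0x00 0x02 0x01 from the example, this returns 257. Masking with 0x7F drops the topmost bit of each byte, and the shift by 7 (rather than 8) is what closes the gaps left by the removed bits.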
Here is how you will deal with the frames:
int numbytes = 513;
fseek(fptr, numbytes, SEEK_CUR); /* move 513 bytes from the current (SEEK_CUR) location */
TP1: Elvis Costello and The Attractions
You should not do any post-processing of the frame data (such as changing TP1 into "Lead Artist" as stated in the ID3v2 specification).
There is one small complication. If the text frame's tag is TXX, it's a "User defined text information frame". In this case, the data stored in the frame includes both the kind of information being stored and the contents of the frame, separated by '\0'. For example, in the Second-Song.mp3 file, there are two TXX frames. The first contains the string "Encoded by\0iTunes 10.0.1.22". For TXX frames, you'll want to print out the kind of information, followed by a colon, followed by the contents. In this example, you'd print out:
Encoded by: iTunes 10.0.1.22
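A possible way to split and print a TXX frame is sketched here. The helper name print_txx is hypothetical, and the sketch assumes data points at exactly the "kind\0contents" bytes described above, with len giving their total length; the contents are not assumed to end with their own '\0'.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print "kind: contents" for a TXX frame.
 * Returns 0 on success, -1 if no '\0' separator is found. */
int print_txx(FILE *out, const unsigned char *data, size_t len)
{
    const unsigned char *sep = memchr(data, '\0', len);
    size_t kind_len, value_len;

    if (sep == NULL) {
        return -1;
    }
    kind_len = (size_t)(sep - data);
    value_len = len - kind_len - 1; /* bytes after the separator */

    /* %.*s prints at most the given number of characters */
    fprintf(out, "%.*s: %.*s\n",
            (int)kind_len, (const char *)data,
            (int)value_len, (const char *)(sep + 1));
    return 0;
}
```

Calling print_txx(stdout, data, len) on the first TXX frame of Second-Song.mp3 would print the line shown above. Using memchr with an explicit length, rather than strlen, keeps the search inside the frame even if the separator is missing.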
When you're done with memory, be sure you free it. When you're done with a file, be sure you fclose it.
NOTE: If you run the program from your cs33/labs/09 directory on an MP3 in the /scratch/richardw/music/username/ directory, the output image will also be saved in that directory. To verify that your image saved properly, you can run display filename from the command prompt.
To view the actual binary contents of the MP3 file, you can use the program hexedit. For example, this command will open the specified MP3 file:
hexedit Pump-It-Up.mp3
Use page-up and page-down to look around the file, and Ctrl-C to quit. Despite the message, pressing F1 does not seem to provide Help.