Protocol Design Document Due: Monday, Nov 3rd, 11:59 PM
Due: Monday, Nov 17th, 11:59 PM
1. Overview
For this lab, you will be designing your very own music-streaming application! Since the protocol is your own, this time you will be developing the functionality of both the client and the server. The protocol you design will support a few basic operations that the user can request from a client, including retrieving a list of songs, getting information about a song, playing a song, and stopping playback.
1.1. Lab 4 Goals
-
Design and implement your own application-layer client-server Jukebox protocol.
-
Develop both the client and server protocol communication formats.
-
Develop a persistent server to connect with multiple interactive clients using event-based concurrency.
-
Write a client that uses a producer-consumer threading model to receive and play song data.
1.2. Weekly Deliverables
Week 1
By the end of the first week, you should have written down:
-
A specification for your protocol, including header formats, field sizes, and anything else someone might need to know to implement your protocol.
-
A list of the states (or actions) you intend to perform as the server and what each state represents.
These documents don’t have to be super long or formatted like RFCs, but they should provide enough detail for you to refer to as you work on your implementation. They also aren’t set in stone: you may have to change the design slightly as you encounter implementation challenges, but my expectation is that they won’t change too drastically.
I would like to have a brief (15-minute) protocol specification and progress meeting with each group on or before November 4, before labs start. Please sign up for a time. For this meeting, you should come prepared to discuss the details of your protocol and the per-client state you intend to keep at the server.
Week 2
By the end of the second week, you should:
-
Be able to accept clients at the server.
-
Be able to exchange text (list and info) requests and replies between multiple concurrent clients and the server.
-
Be able to exchange music with one client, initially, and then multiple clients.
Week 3
-
Be able to interleave music and text inputs.
1.3. Handy references
-
Demo sign-up sheet (TBD)
-
Server side:
-
select() manual
-
select() tutorial manual page
-
Beej’s select() info.
-
byte order in C (for binary protocols)
-
-
Client side:
2. Requirements
# run your server on a different machine from the client
$ ssh lab_machine_1
$ ./server 5000 /home/rware/music/
$ #Song Listing#
# Display on the server side:
New connection (4)!
Client 4 request file list.
Client 4 disconnected: recv: Success
New connection (5)!
Client 5 now playing Cowboy Junkies - Sweet Jane.mp3
Client 5 requested stop
Ctrl+C # kill the server
# separately ssh into a client. PLEASE BE AT THIS LAB COMPUTER IF YOU PLAN TO
# PLAY SONG DATA
$ ssh lab_machine_2
$ python client.py lab_machine_1.cs.swarthmore.edu 5000
$ >> list #enter command here
$ >> exit
$ python client.py lab_machine_1.cs.swarthmore.edu 5000
$ >>
$ >> play 1
$ >> stop
2.1. Protocol Design & Server State
Thus far, we’ve seen a few different types of application-layer protocols that follow the client-server architecture. Having discussed various protocol design trade-offs, consider at least the following when designing your protocol:
-
Header information: what information must be in the header of each message to ensure correctness?
-
Field format and size: for each piece of information that you exchange in a message, how large will it be and how will the receiver know the size? How will it be formatted (e.g., text vs. binary, byte order, etc.)?
-
Message delimiters: how will each side of the communication recognize where one message ends and the next one begins?
-
Per-client state: what does the server need to maintain to keep track of the state of a client? What variables does it need to store and what do they mean?
Common Gotchas
Students often neglect to consider these cases:
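As one hypothetical illustration of the header, field-size, and delimiter questions above (not a required design), the sketch below assumes a 5-byte header: a 1-byte message type followed by a 4-byte body length in network byte order. The names `HEADER_LEN` and `complete_message_len` are made up for this example. Given the bytes received so far, the function reports whether a full message has arrived.

```c
#include <arpa/inet.h>  /* ntohl */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout (an assumption for this sketch):
 * [type: 1 byte][body length: 4 bytes, network byte order] */
#define HEADER_LEN 5

/* Returns the total size (header + body) of the first message in buf if it
 * has fully arrived, or 0 if more bytes are still needed. */
size_t complete_message_len(const unsigned char *buf, size_t have) {
    uint32_t body_len_net;
    if (have < HEADER_LEN) {
        return 0;  /* the header itself is incomplete */
    }
    /* Copy the length field out of the buffer before converting it. */
    memcpy(&body_len_net, buf + 1, sizeof(body_len_net));
    size_t total = HEADER_LEN + (size_t) ntohl(body_len_net);
    return (have >= total) ? total : 0;
}
```

A length-prefixed design like this lets the receiver find message boundaries without scanning for a delimiter, which matters when the body is binary song data.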
2.2. Server Requirements: Event-Driven Concurrency
-
You must use C to implement your server, and it must use select() (rather than threading) to support multiple concurrent clients.
-
To simplify the file I/O, it’s fine for the server to keep its data, including the audio files, in memory, but the client should not store data that it isn’t actively using.
-
Your server will receive two command line arguments: a port number on which to listen for incoming connections and the name of a directory with music files. For example:
$ ./server [port] /home/rware/music/
Found 15 songs.
-
See reference material: Event-based programming and select()
2.2.1. Workflow
-
Start up and read .mp3 and .info file data into in-memory storage (e.g., a global array). You may also want to build a string to handle list messages, since that won’t change.
-
Begin listening for incoming client connections.
-
Infinitely execute the main event loop, which uses the select function to decide:
-
Is there a new client that is connected for us to accept? If so, accept the new socket and initialize the state associated with that new client. (This is true when select says that the server socket is ready for reading.)
-
Have any clients sent data to the server? If so, for each client, receive the data and update the client’s associated state to reflect what they’ve just requested.
-
Do any clients that the server would like to send a message to (i.e., the state of the client indicates that it has messages pending) have space available in the outgoing socket buffer? If so, for each client, look at the state of the client and send the appropriate data to it.
When sending song data, it’s helpful to send relatively large chunks at a time (e.g., 16 or 32 kilobytes). In my solution, I initially sent 4-kilobyte chunks, and song playback stuttered at the start. Switching to 16-kilobyte chunks seems to work much better.
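The set-up step at the top of each pass through the event loop can be sketched as follows. This is only one possible shape: `struct client_state`, its fields, and `build_fd_sets` are hypothetical names invented for this example, and your own per-client state will likely hold more.

```c
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical per-client state (an assumption for this sketch). */
struct client_state {
    int in_use;            /* 1 if a client is connected on this descriptor */
    size_t bytes_pending;  /* bytes the server still wants to send */
};

/* Fills rfds/wfds for one pass of the event loop and returns the nfds
 * argument for select(). clients[] is indexed by socket descriptor, and
 * max_clients must not exceed FD_SETSIZE. */
int build_fd_sets(int serv_sock, const struct client_state *clients,
                  int max_clients, fd_set *rfds, fd_set *wfds) {
    int max_fd = serv_sock;
    FD_ZERO(rfds);
    FD_ZERO(wfds);
    FD_SET(serv_sock, rfds);  /* readable server socket means accept() is safe */
    for (int fd = 0; fd < max_clients; fd++) {
        if (!clients[fd].in_use) {
            continue;
        }
        FD_SET(fd, rfds);  /* always willing to hear a new request */
        if (clients[fd].bytes_pending > 0) {
            FD_SET(fd, wfds);  /* watch for buffer space only when needed */
        }
        if (fd > max_fd) {
            max_fd = fd;
        }
    }
    return max_fd + 1;
}
```

Adding a client to the write set only when it has pending data keeps select() from returning immediately on every pass, since an idle socket almost always has space in its outgoing buffer.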
2.2.2. Expectations
-
After it starts up and begins serving clients, your server should never block on any call other than select. Blocking on a send, recv, or accept is a bug!
-
If a client disconnects from your server, for any reason, your server should not crash. Instead, it should reset the disconnected client’s state such that it will ignore that socket until some other client comes along that happens to use the same socket descriptor number. You might detect a client leaving in two ways:
-
When a client disconnects, if you’re blocked on a call to select, the select call will return and the disconnected client’s socket will show up as readable. Calling recv will return 0 to indicate "end-of-file". If you see a return value of 0 from a client socket, that client is gone.
-
When you call send on a client socket, if the client has disconnected, your call to send will fail and generate a SIGPIPE signal, which by default will terminate your entire server process. If you pass the MSG_NOSIGNAL flag to send, it will not deliver the signal and instead return -1. If you see a return value of -1 from send with errno set to EPIPE, the client is gone.
-
When using a language without garbage collection (e.g., your server in C), you should generate no valgrind warnings. Because the server is still active when you hit CTRL-C, it’s fine for memory to be still reachable, but none should be lost.
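The first detection path described above (recv returning 0) might be wrapped in a small helper like the one below. This is a sketch: the `recv_status` enum and `recv_once` are hypothetical names, not part of any starter code.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch. */
enum recv_status { RECV_DATA, RECV_CLOSED, RECV_ERROR };

/* Performs a single recv() (call it only after select() marks sock readable)
 * and classifies the outcome. *got holds the byte count on RECV_DATA. */
enum recv_status recv_once(int sock, char *buf, size_t cap, size_t *got) {
    ssize_t n = recv(sock, buf, cap, 0);
    if (n > 0) {
        *got = (size_t) n;
        return RECV_DATA;
    }
    *got = 0;
    /* 0 means the peer closed the connection; -1 means an error (see errno). */
    return (n == 0) ? RECV_CLOSED : RECV_ERROR;
}
```

On RECV_CLOSED the caller would close the socket and reset that client's state so the descriptor number can be reused by a later connection.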
2.3. Client
$ ./client.py [hostname].cs.swarthmore.edu [port]
>>
Your client should be interactive, with users typing any of the following commands to request jukebox behavior at any time:
-
list: retrieve a list of the songs available at the server.
-
info <song number>: retrieve text information about the specified song number.
-
play <song number>: retrieve the bytes of the specified song file (in MP3 format) for the client to play the song.
-
stop: end the song data file transfer (if data is being transferred) and stop playing the current song (if one is playing).
-
exit: disconnect the client.
The client should not cache data. In other words, every time a user enters a command, the client needs to send a request to the server for the corresponding data.
2.3.1. Workflow
When you start the client, it will create two additional threads (for a total of three threads):
-
The main thread, which reads user input commands and, when the user enters a command, sends the corresponding request to the server. Its sequence is:
-
Create the other two threads.
-
Enter an infinite loop that waits for user input. When the user inputs a command, build a request and send it to the server.
-
-
The receiving thread, which always tries to receive and process data from the server. Its sequence is (in an infinite loop):
-
Receive a message header.
-
Receive a message body.
-
If the message contains a text response (e.g., in response to list or info), print it.
-
If the message contains song data, add the new song data to any other song data you’ve already buffered and inform the playing thread that new data is available. Because the song data is shared with the playing thread, you need to protect access to it so that both threads don’t attempt to access it at the same time.
-
If your protocol specifies other message types, handle those as needed. (You may find it helpful to get stop messages too.)
-
-
The playing thread, which waits for song data and plays it when available. Its sequence is (in an infinite loop):
-
Wait for song data to become available. When data arrives, process it and play it.
-
The receiving thread and playing thread have a producer/consumer relationship. As the receiving thread receives song data, it produces data by appending it to the outstanding data that needs to be played. The playing thread consumes the data by working through it from start to end. This relationship means you need to guard against two potential problems using thread synchronization: both threads accessing the shared song data at the same time, and the playing thread trying to consume data before any is available.
In Python, a condition variable implicitly contains a mutex lock. You should create one condition variable and share it with both threads. Threads can call acquire() and release() on it to lock and unlock the mutex. While holding the lock, a thread can call wait() to block until another thread signals it. While holding the lock, a thread can call notify() to wake up a waiting thread.
2.3.2. Expectations
-
The client should not cache data. In other words, every time a client would like to either play/list/info, the item needs to be requested from the server. Each retrieved item should not be stored on the client side and repeated on subsequent requests!
-
A user should be able to request that a song start playing and then immediately (while the song data is still transferring) be able to request text via list or info without having to wait for the file transfer to complete.
-
The protocol you design (and your implementation of it) must allow list and info messages to be interleaved with song data. That is, while the server is sending data in response to a play command, the user should still be able to request the list of songs or information about songs without needing to wait for the entire song file to transfer first.
2.3.3. Sending requests to the server
You can use the Python readline module to get user-friendly command line behavior for reading commands from the user. (Done for you in provided client example.)
2.3.4. Python ao and mad audio libraries
If you want your client to begin playing a new file, you need to create a new MadFile object. This tells mad (our audio library) to interpret the next bytes as the beginning of a new file (which includes some metadata) rather than the middle of the previously playing file.
2.4. Assumptions / Simplifications
-
To simplify the file I/O, it’s fine for the server to keep its data, including the audio files, in memory, but the client should not store data that it isn’t actively using.
-
You may assume that the client has infinite buffering capacity for song data. That is, unlike "real" streaming protocols, you don’t have to keep track of how much data the client has processed before sending them more data. Once you start sending song data, just keep sending more until you reach the end (unless the client requests something else in the meantime).
-
One of the parameters to your server is a path to a directory that contains audio files and their corresponding information. Within this directory, you may assume that any file ending in .mp3 is an mp3 audio file. For each mp3 file, there will be a corresponding information file whose name is identical to the mp3 file’s name with .info tacked on to the end. For example, if there were a file in the directory named song1.mp3, there will also be a file named song1.mp3.info containing human-readable plain text information about that song. This info file is what should be supplied when the client issues an info command. I’ve made my music directory publicly available at /home/rware/music/, and I’ve provided code to read the names of these files. Feel free to use that or your own mp3 files for testing, but please do not submit audio files to GitHub!
2.5. Sending and Receiving Data
-
Use send() and recv() in as few places as possible. Never call either one on a socket unless select has told you that it’s safe to do so; otherwise, you’ll block (and potentially deadlock!).
-
Set your sockets to non-blocking mode for debugging: You can do this by calling set_non_blocking(int sock) in your starter code. This is not required (you should never block regardless because you should never try to send, recv, or accept unless select tells you it’s ok), but it will prevent deadlocking during your testing. Check the return values of send, recv, and accept, and check errno for EAGAIN/EWOULDBLOCK, which indicate that you made a syscall that you shouldn’t have. That’s send/recv/accept's way of telling you "I would have blocked on this call had this socket not been set for non-blocking mode."
-
Client closes a connection just before your server sends data: In this case, by default, two things will happen:
-
(1) your process will receive the SIGPIPE signal, which by default kills your process (you don’t want this to occur, since you still want to service other clients)
-
(2) send will return an error and set errno to EPIPE to indicate the connection (pipe) was broken.
-
Luckily, we can easily prevent the SIGPIPE signal by using the extra flags parameter in send that we’ve been ignoring thus far. By setting the flags to MSG_NOSIGNAL, the kernel will only do (2), which is a much more convenient way for you to detect and handle a client disconnection.
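Putting the points above together, a single guarded send might look like the sketch below. The `send_status` enum and `send_once` are hypothetical names for illustration. Treating ECONNRESET the same as EPIPE is an assumption of this sketch rather than something the lab text requires.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch. */
enum send_status { SEND_OK, SEND_WOULD_BLOCK, SEND_CLIENT_GONE, SEND_ERROR };

/* One send() attempt that never raises SIGPIPE. On SEND_OK, *sent holds the
 * number of bytes the kernel accepted, which may be fewer than len. */
enum send_status send_once(int sock, const char *data, size_t len, size_t *sent) {
    ssize_t n = send(sock, data, len, MSG_NOSIGNAL);
    *sent = 0;
    if (n >= 0) {
        *sent = (size_t) n;
        return SEND_OK;
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK) {
        /* Non-blocking socket: we called send when select had not said to. */
        return SEND_WOULD_BLOCK;
    }
    if (errno == EPIPE || errno == ECONNRESET) {
        return SEND_CLIENT_GONE;  /* the peer closed the connection */
    }
    return SEND_ERROR;
}
```

Because send may accept only part of the buffer, the caller should advance its per-client offset by *sent and try the remainder on a later pass through the event loop.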
3. Tips / FAQ
-
START EARLY! The earlier you start, the sooner you can ask questions if you get stuck. Test your code in small increments. It’s much easier to localize a bug when you’ve only changed a few lines.
-
Wireshark will not help you for this lab, since you’re designing the protocol this time. Wireshark knows nothing about how to decode your protocol!
3.1. Testing: Slowing Down the Server
To test message interleaving (handling text messages for a client that is also receiving song data), it helps to have a server that’s sending slower than our typical gigabit network. I’ve set up the machine staryu.cs.swarthmore.edu to artificially slow down its sending rate when sending from ports in the range 5001 - 5099.
Execute your server on staryu and bind to your favorite number within that port range. Try playing a song and then immediately making a list or info request. How long do you have to wait? If you’re properly interleaving messages, the delay should be short!
Sending from those ports, you’ll get a maximum throughput of approximately 256 kilobytes per second, so it will take about 20 seconds to fully transfer a 5-megabyte file. The files in my public music directory range from 2.5 MB to 13 MB, so that should give you plenty of time to test a play command followed by list or info.
4. Other Reference Materials
4.1. Event-based programming and select()
Event-based concurrency uses:
-
a single thread for all client requests.
-
select() to multiplex across clients. Using select(), the program is only notified when, across the list of available clients, a socket has data ready to receive or has space in its socket buffer to send.
Look up the man page on select() and take a look at the input parameters:
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
-
nfds: The first argument to the select function is (the largest numerical socket descriptor value in any of the subsequent FD_SET parameters) plus one. For example, if you’re populating your FD_SETs with socket descriptors 7, 9, and 10, your first argument to select should be 11 (10 + 1).
-
fd_set: select() takes as input variables that by convention are named rfds and wfds, of type fd_set. These are sets of file or socket descriptors (fd stands for file descriptor). As far as the OS is concerned, file and socket descriptors are equivalent.
-
FD_ZERO(input_var): clears out the set (input_var is usually either rfds - read/recv file descriptors or wfds - write/send file descriptors).
-
FD_SET: allows you to put a socket in the file descriptor set. For example, FD_SET(serv_sock, &rfds) puts the server socket (the one on which we will call bind, listen, and accept) into the read set. We will always put the server socket in this set.
-
When select() returns, make sure that as you check the FD_SETs (with FD_ISSET), you differentiate between your server socket (that you accept connections on) and client sockets. If select tells you that your server socket is ready for reading, it means you can safely call accept.
-
When select() leaves a socket descriptor in your read/write FD_SET after it returns, it means that you can safely (without blocking) recv/send on that socket once, but not more than once.
-
If a client closes a connection while your server is blocked on select(), select will return that client’s socket in the set of file descriptors that are available for reading (usually named rfds). Then, when your server goes to recv() on that socket, it will get a return value of 0, which as we saw in lab 1, is recv()'s way of telling us that the connection is closed and no more data is coming (we’ve reached the EOF).
-
Thus, as I’ve been harping on all semester, you should always check the return value of your system calls to check for these types of conditions so that your server can detect and account for disconnecting clients.
4.2. Packing/Unpacking in C
To pack your data, you can create a buffer like you normally would: char data[] or char * data = malloc(#num bytes).
-
To pack or unpack a single byte you can just index into the array using data[idx].
-
To pack or unpack more than one byte the byte-ordering functions are as follows. Look up their man pages for more info.
htons() - convert a short (2-byte int) from host byte order to network byte order
ntohs() - convert a short (2-byte int) from network byte order to host byte order
htonl() - convert a long (4-byte int) from host byte order to network byte order
ntohl() - convert a long (4-byte int) from network byte order to host byte order
-
For example, if you know that you have a short contained in positions 3 and 4 of a character array, you can call ntohs() on it as:
short result = ntohs(*((short *) &buf[3]));
Which says: take the address of the third offset in buf and cast the next two bytes to be a short. Then, dereference the pointer to the short data, to get a short value unpacked - i.e., 2 byte value starting at position 3.
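The pointer cast in the example above reads a short from an odd offset, which works on many machines but is not guaranteed to be portable because the access is unaligned. As an alternative sketch, the hypothetical helpers `pack_u16` and `unpack_u16` below copy the bytes with memcpy and use the same byte-ordering functions.

```c
#include <arpa/inet.h>  /* htons, ntohs */
#include <stdint.h>
#include <string.h>

/* Writes a 16-bit value into buf at the given offset in network byte order. */
void pack_u16(unsigned char *buf, size_t offset, uint16_t value) {
    uint16_t net = htons(value);
    memcpy(buf + offset, &net, sizeof(net));
}

/* Reads a 16-bit network-order value from buf at the given offset. */
uint16_t unpack_u16(const unsigned char *buf, size_t offset) {
    uint16_t net;
    memcpy(&net, buf + offset, sizeof(net));
    return ntohs(net);
}
```

The same pattern extends to 4-byte fields by swapping in uint32_t with htonl() and ntohl().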
5. Grading Rubric
This assignment is worth 10 points.
-
1 point for completing the protocol design and meeting.
-
1 point for designing a reasonable protocol to solve the problem as described in the protocol design document.
-
1 point for accepting clients
-
1 point for successfully replying to list or info requests.
-
2 points for successfully streaming an audio file over the network.
-
1.5 points for the server handling multiple concurrent connections, without blocking, via select(). Your server should NOT use threading. Using threads in your client is fine.
-
1.5 points for interleaving different messages (play, list, info, stop) between the client and server. That is, a client should be able to request and receive info while it is currently playing a song.
-
1 point for server resiliency - the server should be able to cope with clients joining and departing at any time. Be sure to check that your server does not crash when trying to both receive from and send to a disconnected client.
6. Submitting
Please remove any debugging output prior to submitting.
Please do not submit audio files to GitHub!
To submit your code, simply commit your changes locally using git add and git commit. Then run git push while in your lab directory.