CS43 Lab 3: Talking to DNS

Due: Tuesday, October 8, 11:59 PM

Handy references

The DNS RFC (Sections 3 and 4 are probably the most helpful.)
Python Sockets.
Python Struct (byte ordering).
The dig utility.
Wireshark documentation.
Data Representation.

Lab 3 Goals

Implement an iterative DNS client.
Understand how to parse a binary protocol.
Use Wireshark to walk through packet headers.
Use UDP sockets to send and receive data.

Overview:

For this lab, you’ll be implementing the DNS protocol to build your very own iterative name resolver!

Figure 1. The host cs.swarthmore.edu performs an iterative DNS lookup for the IP address of gaia.cs.umass.edu. The host first queries the root servers, which map the .edu portion of the host name to the IP addresses of the Top Level Domain (TLD) servers. The host then sends the same query to the TLD servers, which resolve .umass.edu to the IP address of the authoritative UMass DNS servers. Finally, the host queries the UMass DNS server dns.cs.umass.edu, which returns the IP address for gaia.cs.umass.edu.

Lab Requirements

You may use any programming language you want for this lab.
- I encourage you to try Python so that you can see the similarities and differences between high- and low-level languages with respect to network programming.
- Regardless of which language you choose, you must NOT use any libraries that simplify DNS or hide the details of socket programming! Don’t make any calls to gethostbyname() / getaddrinfo() or the equivalent functions in the language you choose. If you have any doubt about which functions you may use, please ask!
Query Timeout: If you attempt to query a server and get no response after waiting a short time (approximately 5 seconds), your program should move on to the next server and attempt to query that instead.
Query Types: Your program should query for host name to IP address mappings (Type A, decimal value 1) unless given the -m flag, in which case it should query for mail exchanges (Type MX, decimal value 15).
Query status messages: Your program should print short status messages regarding its intermediate steps as it traverses the DNS hierarchy. For each request you make, you should output the server you’re querying and a brief summary of the response you got back. If you didn’t get a response (because you timed out), say so. You should print:
1. who you’re querying (IP address, or name + IP if available).
2. result of the query (success, failure, timeout, etc.).
3. If successful and the final query, print the final result.
Invalid names: If asked to resolve an invalid name, your program should print an error message.
No recursion: You should never ask a DNS server to perform a recursive query for you.

You may assume that the additional records section will contain the A records for any server names listed in the NS records of the authority record section.

Running your program

Your client program will receive one optional flag (-m) and one argument: the host name we’d like to resolve. If the flag is absent, you’re being asked to resolve a host name’s IP address. If the flag is present, you’re being asked to find the mail exchange for a domain. For example:

$ ./lab3 demo.cs.swarthmore.edu
-Querying 198.41.0.4 (root server) to look up demo.cs.swarthmore.edu (MX:False)
-Querying 192.54.112.30 (h.edu-servers.net) to look up demo.cs.swarthmore.edu (MX:False)
-Querying 130.58.64.20 (ibext.its.swarthmore.edu) to look up demo.cs.swarthmore.edu (MX:False)
The name demo.cs.swarthmore.edu resolves to: 130.58.68.26

You should also be able to look up the mail server that a machine should use, e.g.,

$ ./lab3 -m cs.swarthmore.edu
-Querying 198.41.0.4 (root server) to look up cs.swarthmore.edu (MX:True)
-Querying 192.5.6.30 (a.edu-servers.net) to look up cs.swarthmore.edu (MX:True)
-Querying 130.58.64.20 (ibext.its.swarthmore.edu) to look up cs.swarthmore.edu (MX:True)
---MX Answer: 0, cs.swarthmore.edu
-Querying 198.41.0.4 (root server) to look up cs.swarthmore.edu (MX:False)
-Querying 192.5.6.30 (a.edu-servers.net) to look up cs.swarthmore.edu (MX:False)
-Querying 130.58.64.20 (ibext.its.swarthmore.edu) to look up cs.swarthmore.edu (MX:False)
Answer: cs.swarthmore.edu resolves to 130.58.68.9

Here, you’ll get an MX answer telling you that cs.swarthmore.edu is the name of the mail server. You’ll then need to do an additional query to resolve its name to an A record of 130.58.68.9.
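
The command-line interface described in this section (one optional -m flag and one name argument) could be handled with Python's standard argparse module. This is only a sketch: the function name parse_args and the attribute names mx and name are illustrative choices, not part of the lab specification.

```python
import argparse

def parse_args(argv=None):
    """Parse '[-m] NAME'. Passing argv=None makes argparse read sys.argv."""
    parser = argparse.ArgumentParser(prog="lab3")
    # -m switches the lookup from an A record to an MX record.
    parser.add_argument("-m", dest="mx", action="store_true",
                        help="look up the mail exchange (MX) instead of an A record")
    # The single positional argument is the name to resolve.
    parser.add_argument("name", help="host or domain name to resolve")
    return parser.parse_args(argv)
```

With this sketch, parse_args(["-m", "cs.swarthmore.edu"]) yields an object whose mx attribute is True and whose name attribute is "cs.swarthmore.edu".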

You should assume that there will be a file named root-servers.txt in your program’s current working directory and that it contains a list of IP addresses for root DNS servers. Your program must use this file to find a root server. It should iteratively work its way down the DNS hierarchy, querying the root, then the TLD, then the authoritative server(s) until it resolves the requested host name.
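
One possible way to read root-servers.txt is sketched below. It assumes the file holds one IP address per line; skipping blank lines and lines that start with # is an extra precaution, not something the lab promises about the file. The function name load_root_servers is hypothetical.

```python
from pathlib import Path

def load_root_servers(path="root-servers.txt"):
    """Return the IP address strings listed in the file, in file order."""
    servers = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines carry no address.
        if line and not line.startswith("#"):
            servers.append(line)
    return servers
```

Keeping the addresses in a list makes it easy to move on to the next root server when a query times out.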

Workflow of your program

Roughly, your program should follow this sequence:

Check the arguments to determine if it’s being invoked for an A or MX lookup.
1. Populate a collection of root DNS server IP addresses from root-servers.txt.
Build a query.
1. Build your DNS request, according to RFC 1035, Section 4.
2. Pack your header and message query using struct.pack().
UDP Socket Calls
1. After your request is built you will need to use s.sendto() and s.recvfrom() to send and receive from the server. DNS uses UDP port number 53.
2. Send that query to a root server and wait for a response. If you wait too long, move to the next root.
3. Your request will start at the root server, but note that the server you send to will change depending on which level of the DNS hierarchy you are at!
Receive responses per request from the DNS server hierarchy
1. When you receive your response from the call to recvfrom(), you will have to unpack the response using struct.unpack().
2. When you are unpacking results from struct.unpack(), use one of the following formats. Note the comma after result in the first form: struct.unpack() always returns a tuple, even when it unpacks a single value.
  result, = struct.unpack()
  result1, result2, result3 = struct.unpack()
  Hint: Since the header is always the first 12 bytes [0:12] of your response, you might want to unpack the header first by calling struct.unpack().
Parse response message: Now that you have your response headers, you can parse the rest of the Resource Record in the response, to figure out whether you have received an NS record, an A record, an AAAA record, or an MX record.
Continue this process as you work your way down the hierarchy, only instead of using the root servers for subsequent queries, use the NS record results from the previous query’s response.
Returning the resolved IP address for an A record: Once you’ve made it down to the final authoritative server, inform the user of the result and exit.
1. Once you receive an A record: you can use socket.inet_ntoa() at the offset of your response where the IP address is located to return the IP address of the hostname.
You may find that structuring your program with recursion is helpful. For example, in processing one lookup, you might need to start another. If your code can call itself again, it’ll be easier!
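
The build, unpack, and inet_ntoa steps in the workflow above can be sketched as follows. This is a minimal illustration, not a complete resolver: the helper names (encode_name, build_query, parse_header, rdata_to_ip) and the default query ID are made up for the example, and the sketch does not parse resource records or follow NS referrals. The header layout (six 16-bit fields) and the length-prefixed label encoding follow RFC 1035, Section 4.1.

```python
import socket
import struct

TYPE_A = 1     # host address
TYPE_MX = 15   # mail exchange
CLASS_IN = 1   # Internet class

def encode_name(name):
    """Encode 'a.b.c' as length-prefixed labels terminated by a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += struct.pack("!B", len(raw)) + raw
    return out + b"\x00"

def build_query(name, qtype=TYPE_A, query_id=0x1234):
    """Return the bytes of a query with one question and no other records."""
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    # Flags are 0: a standard query with the RD bit clear (no recursion).
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, CLASS_IN)
    return header + question

def parse_header(response):
    """Unpack the fixed 12-byte header into a tuple of six integers."""
    return struct.unpack("!HHHHHH", response[0:12])

def rdata_to_ip(response, offset):
    """Turn the 4 bytes of A-record data at offset into dotted-quad text."""
    return socket.inet_ntoa(response[offset:offset + 4])
```

The "!" prefix in each format string selects network (big-endian) byte order, so no separate byte-swapping call is needed. For instance, parse_header(build_query("cs.swarthmore.edu", TYPE_MX)) returns (0x1234, 0, 1, 0, 0, 0).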

Weekly Deliverables

For week 1 your DNS client should be able to successfully deliver part 1 of the lab requirements, i.e., hostname to IP resolution.
For week 2 your DNS client should be able to resolve hostname to mailserver, and mailserver to IP.

`dig` queries

To get an idea of how your DNS client is supposed to function, try out a dig query on the terminal! dig is a command-line utility that allows you to construct DNS queries. Use man dig to find out what input parameters it takes.

dig @8.8.8.8 demo.cs.swarthmore.edu

8.8.8.8: Google’s DNS server
demo.cs.swarthmore.edu: hostname for which we want to find the IP address.

Wireshark and tshark

You can run Wireshark from the terminal by typing wireshark. It is a packet capture utility to observe the packets that are incoming and outgoing from your machine. The steps you need to run Wireshark are:

On the terminal type in wireshark
In wireshark specify the interface eth0
Next, in the bar on top, type in dns to filter only for dns packets
Run the dig query above, and observe these packets being captured by wireshark.

Using tshark

You can also use tshark, the command-line equivalent of Wireshark. To set up tshark to capture DNS packets, run the following command, then run the dig query above.
```
tshark -i eth0 -f "port 53" -O dns -x -T jsonraw -J "dns" > dns.json
```
You can find a whole list of tshark commands in your github folders as well.
After a few seconds type in Ctrl+C to kill tshark. Open dns.json to view the output.

UDP Socket programming

DNS uses UDP rather than TCP, so there are no guarantees about delivery, and we are not going to establish a connection in advance at all this time.
Rather than explicitly connecting a socket to one particular destination, this is a socket you can send to any destination!
We will use sendto() and recvfrom() socket calls. Look at their definitions in Python Sockets. What other inputs other than the buffer you are sending do these socket calls take as input?
Unlike TCP, UDP will not do partial sends and receives. If you call send on a message and the socket buffer is full, UDP will drop the message!
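
A minimal sketch of one send and receive on a UDP socket, with the timeout behavior from the lab requirements, might look like the following. The function name send_query, its parameters, and the 4096-byte receive buffer are assumptions made for illustration. The caller is expected to create the socket once with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) and reuse it.

```python
import socket

def send_query(sock, query, server_ip, timeout=5.0, port=53):
    """Send one datagram to (server_ip, port); return the reply or None on timeout."""
    # Without a timeout, recvfrom() would block forever if the reply is lost.
    sock.settimeout(timeout)
    # sendto() takes the destination as an (address, port) tuple on every call.
    sock.sendto(query, (server_ip, port))
    try:
        data, _addr = sock.recvfrom(4096)
    except socket.timeout:
        return None
    return data
```

Returning None on a timeout lets the caller move on to the next server in its list. Because the socket is reused, a late reply from an earlier server could show up on a later call, so a real client would probably also compare the ID in the reply against the ID it sent.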

Grading Rubric

This assignment is worth five points.

0.5 point for completing weekly-lab questions and lab assessment.
1 point for sending a request to and correctly parsing a response from an authoritative server (e.g., sending a query directly to our local department’s server for a *.cs.swarthmore.edu host name).
1 point for traversing the DNS hierarchy down from the root to an authoritative server and letting me know which servers you’re querying and what they’re telling you along the way.
0.5 point for timing out and moving on to the next server in your list when you do not receive a response.
1 point for correctly detecting invalid host names and printing a reasonable error message.
1 point for resolving MX records.

When submitting, please provide a small executable script named lab3 along with your program. This script should take the same arguments as your program (described above) and it should call your program with those arguments. This helps me to account for various ways of invoking programs in different languages when grading your assignments.

FAQs

Try to structure your program in a modular way. You’ll have a much better time if you create one function, that you can call whenever necessary, to handle a task that comes up repeatedly (e.g., interpreting a DNS response message). Duplicating code leads to more difficult debugging!
The DNS protocol uses UDP rather than TCP. This means you only need to create one socket (make sure to use SOCK_DGRAM rather than SOCK_STREAM!), and you don’t need to connect() it to anything. Instead, you specify the destination every time you want to send, using a variant of the send() call named sendto(), which takes additional arguments to specify the destination. (Python: socket.sendto(), C: sendto())
In your queries, you can expect to encounter resource records of type A, MX, and NS. You’re likely to also come across CNAME (in the case of a name alias), SOA (if you’re asked to resolve a name that doesn’t exist), and AAAA (IPv6 answer). You don’t need to handle the first two in a special way, just print what you got and exit. When you get an AAAA response, look to see if you got other answer records of type A.
Unlike previous labs, this lab will require you to send binary integer values, which means you need to worry about byte ordering. In C, the functions htonl() and ntohl() (32-bit integers) and htons() and ntohs() (16-bit integers) will help you convert back and forth between host (your local machine’s integer format) and network (the general standard for integers transmitted over the network) byte orderings. In Python, you’ll want to use the struct module.
When waiting for a response (while blocked on recvfrom()), you’ll need to tell the OS that you don’t want to block indefinitely, otherwise your program might hang forever if the reply never arrives. Python makes this easy with the settimeout() socket method. In C, you can set the SO_RECVTIMEO option with setsockopt().
Since DNS is not a text-based protocol, Wireshark is a very useful tool for interpreting the data that you’re sending and receiving.
If you need to check for the presence of a single bit or set a single bit in a larger integer field, recall the bitwise operations you learned at the beginning of CS 31. If you bitwise and (&) a variable with a value that has the bit you want to test, you’ll get either 0 (it wasn’t set) or the value (it was set). With bitwise or (|), if you do variable = variable | value, you will set any of the bits that are 1’s in value.
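
The byte-ordering and bitwise tips above can be tried out on a made-up flags value, as in the sketch below. The mask constants follow the bit layout of the 16-bit flags field in RFC 1035, Section 4.1.1; the constant names and the sample value 0x8403 are chosen for the example.

```python
import struct

QR_MASK = 0x8000     # top bit: 0 = query, 1 = response
AA_MASK = 0x0400     # authoritative answer
RD_MASK = 0x0100     # recursion desired
RCODE_MASK = 0x000F  # low four bits: response code

# Network byte order ("!") puts the most significant byte first.
port_bytes = struct.pack("!H", 53)      # b"\x00\x35"
port_value, = struct.unpack("!H", port_bytes)

flags = 0x8403  # sample value: response, authoritative, RCODE 3 (name error)

# Bitwise AND tests bits: the result is nonzero only if the bit is set.
is_response = (flags & QR_MASK) != 0
is_authoritative = (flags & AA_MASK) != 0
recursion_desired = (flags & RD_MASK) != 0
rcode = flags & RCODE_MASK

# Bitwise OR sets bits: start from zero and turn on the AA bit.
only_aa = 0x0000 | AA_MASK
```

Here is_response and is_authoritative are True, recursion_desired is False, and rcode is 3, which is the code a server uses to say the name does not exist.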

Test your code in small increments. It’s much easier to localize a bug when you’ve only changed a few lines.

Submitting

Please remove any debugging output prior to submitting.

To submit your code, simply commit your changes locally using git add and git commit. Then run git push while in your lab directory.