Data Structures and Algorithms

Lab 11: Social Meatia

Due on Sunday, May 1st at 11:59 PM. This is a team lab. You and your assigned lab partner(s) will complete this lab together. Make sure that you are familiar with the Partner Etiquette guidelines. You may discuss the concepts of this lab with other classmates, but you may not share your code with anyone other than course staff and your lab partner(s). Do not look at solutions written by students other than your team. If your team needs help, please post on the Piazza forum or contact the instructor or ninjas. If you have any doubts about what is okay and what is not, it’s much safer to ask than to risk violating the Academic Integrity Policy.

Overview

In this lab, you will write an implementation of the graph ADT using the data structures we discussed previously in this class. You will then write a collection of algorithms for that data structure. These algorithms will be used by several small programs to perform social-media-related tasks for the people of the fictional town of Hampton.

Note that this lab relies upon many of the previous data structures we have discussed in class. The algorithms at hand are not overly complex, but this lab may require a greater debugging effort than previous labs and will test your knowledge of how to use the data structures we have discussed thus far. You can find your starting code in the usual place.

Starting Code

Pig

Your starting code involves the usual data structure template and support files. It also includes implementations of several previous data structures as well as several small programs which make use of the graph ADT. Your task is to implement the graph data structure and its corresponding algorithms so that these small programs produce the expected output.

As usual, the starting code appars in your team’s repository. Notable files from that repository appear below. Those files which you may need to edit are bolded.

You will not have to edit any files other than the definition of the Graph-related classes and algorithms. You may not need to edit their definitions (if you are comfortable working with the fields and methods that are already present on the graph data structure), but you are permitted to do so.

C++ Standard Library

For this assignment, we are not using the STL. It is valuable to be familiar with the C++ STL, but the STL is quite complex and somewhat differently structured than the ADTs we’ve discussed in class. These differences are significant for engineering purposes only; the STL data structures’ theories are the same. The burden of learning this new interface, however, is significant and not the focus of this lab.

Step 1: Writing the Graph

Pig

Your first step should be to implement the basic features of the graph data structure: adding vertices and edges, finding the neighbors of a given vertex, and so on. Think carefully about memory management in this stage and make sure to test your data structure thoroughly! Any bugs in your graph are likely to cost you significant effort in the future.

The starter code uses two fields for the DirectedGraph class: one to store all of the vertices and the other to store all of the edges. We store the edges as a mapping from each source vertex to a pointer to a list of Edge objects from that source vertex; the Edge class is defined in directed_graph.h alongside the DirectedGraph declaration. You are not required to keep these fields – you may change them if you like – but your graph must have significantly better performance than a linear search through a list of edges. We recommend that you use the fields that have been defined for you.

Debugging with GraphViz

In addition to the usual testing tools, this lab provides a utility file called graph_io.h. That file contains a function render_graph which is likely to be extremely helpful to you. Given a Graph object, the render_graph function can render that graph into a file written in the language DOT. DOT is a language used by GraphViz, a graph visualization software package, to draw graphs into file formats like PNG or PDF.

For example, suppose we wanted to create a graph with two vertices and a single edge between them. We might write the following test:

Example Graph
TEST(create_single_edge_directed_graph) {
    Graph<int,float>* graph = new DirectedGraph<int,float>();
    graph->add_vertex(1);
    graph->add_vertex(2);
    graph->add_edge(1,2,0.5);
    render_graph<int,float,string>(
        "create_single_edge_directed_graph.dot", // filename
        graph, // the graph object to render
        true); // because this is a directed graph
    delete graph;
}

Once this test runs, a file named create_single_edge_directed_graph.dot will be created in the root directory of the project. This file can then be converted to a PDF using the following command:

dot -Tpdf <create_single_edge_directed_graph.dot >create_single_edge_directed_graph.pdf

Here, the -T tells dot which file format to use. The < tells your shell to give the program the contents of create_single_edge_directed_graph.dot as its input (instead of taking input from your terminal). The > tells your shell to write its output to the PDF file rather than to your terminal. The resulting PDF file will look like the figure to the right of the code above.

The code above gives explicit type parameters to render_graph, which is defined as a template function. To call render_graph, you should give type arguments in the following order:

Further Features

The render_graph function provides additional behavior to name and highlight vertices as well as highlight paths. This behavior occurs when you pass additional parameters to the render_graph function. For instance, the following test creates a three vertex graph and then highlights one of the edges it contains. It also provides names for the vertices.

Example Graph
TEST(create_three_vertex_directed_graph) {
    Graph<int,float>* graph = new DirectedGraph<int,float>();
    graph->add_vertex(1);
    graph->add_vertex(2);
    graph->add_vertex(3);
    graph->add_edge(1,2,0.5);
    graph->add_edge(1,3,0.75);
    graph->add_edge(2,3,0.1);

    Dictionary<int,string>* vertex_data = new HashTable<int,string>();
    vertex_data->put(1, "A");
    vertex_data->put(2, "B");
    vertex_data->put(3, "C");

    List<int>* highlight_vertices = new CircularArrayList<int>();

    List< Edge<int,float> >* highlight_edges =
            new CircularArrayList< Edge<int,float> >();
    highlight_edges->insert_at_back(Edge<int,float>(1,2,0.5));
    highlight_edges->insert_at_back(Edge<int,float>(2,3,0.1));

    render_graph<int,float,string>(
        "create_three_vertex_directed_graph.dot", // filename
        graph, // the graph object to render
        true, // because this is a directed graph
        vertex_data, // names for the vertices
        highlight_vertices, // the vertices to highlight (here, none)
        highlight_edges, // the edges to highlight
        true); // additionally, highlight vertices on the highlighted edges

    delete graph;
    delete vertex_data;
    delete highlight_vertices;
    delete highlight_edges;
}

Generating these graph images takes a not-insigificant amount of code but is ultimately worthwhile: it’s usually easier to spot a problem in a picture than it is in text. You should also note that the main functions for each of the programs you will be working on generates these data structures in the course of its execution, so you can easily modify those functions to generate debugging output for you!

UndirectedGraph

We stated in class that an “undirected graph” is a graph in which edges have no particular direction: an edge from vertex A to vertex B is equivalent to an edge from vertex B to vertex A. We can approximate that behavior defining a subclass of DirectedGraph called UndirectedGraph. UndirectedGraph overrides the methods used to add edges to the graph, giving them different behavior. Specifically, whenever a caller adds an edge from A to B, we simply add both edges to our DirectedGraph data structure. The code for add_edge has been written for you. You must write the code for get_edges that makes the UndirectedGraph object seem as if it only has one such edge (even though the DirectedGraph implementation actually has two).

You are also responsible for understanding these concepts of polymorphism and overriding. The graph algorithms we write will operate on any Graph* as long as the object maintains the invariants of the Graph interface. This means that we can write algorithms (such as depth-first search) without knowing whether the graph is directed or not.

Step 2: Depth-First Searching for Oinked In

Pig

With implementations of the Graph ADT complete, we can begin writing the graph algorithms that will make our programs work.

Those familiar with the social networking site Oinked In know that it primarily provides the service of guessing whether or not two people know each other. You will write a simple depth-first search algorithm which, given a graph, will be able to answer this pressing question.

The graph algorithm is_reachable is called by the program oined_in. You can build oined_in either by running make as usual or by compiling it specifically (make oinked_in). That program expects to receive four command-line arguments:

The oinked_in program expects to receive a social network graph; several such graphs can be found in the test_data directory with the prefix social_network. Social network graphs are undirected. Although their edges technically have integer weights (so that they fit into our framework), those weights are always 1. You should begin by testing your program on some very small social networks to see if your depth-first search is working correctly.

This simple tip may help you run the program from the command-line more effectively: instead of providing a normal filename to contain the results of the search, give /dev/stdout as the filename. This will cause the results to be printed to the terminal instead of a file, allowing you to skip the intermediate file step for one-off tests.

Using Dictionaries as Sets

Remember that a depth-first search on a graph must contend correct with cycles. This is especially important here, since the graph is undirected (and so every edge represents a cycle!). In order to avoid revisiting a node, you’ll need some way of keeping track of the nodes you’ve visited before.

Fortunately, dictionary data structures can be employed as sets! To keep track of a set of V, for instance, you can create a dictionary mapping V to any other type (e.g. Dictionary<V,bool>). Whenever an element is added to the set, you put that element with an arbitrary boolean. When an element is removed from the set, you remove that element.

A better idea would be to define a HashSet class which uses a HashTable field to perform these tasks. Then, you could write a series of operations such as add, remove, and even union so that the code which uses the Set would be more readable. You may do this, but you are not required to do so; just using a dictionary will work for this assignment.

Step 3: Breadth-First Searching and the Degrees of Kevin Bacon

Kevin Bacon

The Six Degrees of Kevin Bacon is a simple parlor game in which, given an entertainment personality, one must use six or fewer steps to connect that person to Kevin Bacon, a particular actor. This problem can be solved by representing the relationships between people as a social network graph and then performing a breadth-first search to find the path between them. The graph algorithm shortest_path will fulfill this requirement.

Once you have written shortest_path, you can utilize it by running the program bacon. This program expects three command-line arguments:

Kevin Bacon appears in all of the social network graphs that you have been provided. His ID is always 0. As with the oinked_in program above, bacon expects to receive a social network graph.

Step 4: Network Latency Testing via Dijkstra’s Algorithm for Instaham

Instaham is a social network designed primarily for the purpose of distributing pictures of food. A user in this network subscribes to a feed and receives these images when they are posted by the feed’s owner. In order to maximize food image distribution potential, the network engineers of Hampton need to determine the speed with which an image can be sent from one point in town to another.

Your test_data contains a collection of graphs which describe the town’s network. Town network graphs are undirected graphs weighted by floats describing the delay of the connection. Given a point in town, the network engineers need a list of other town locations and the minimum time to reach that location. To accomplish this task, you will implement the single_source_shortest_path_costs graph algorithm using Dijkstra’s algorithm.

Once you have implemented that function, you can use the program instaham to test it. This program expects three command-line arguments:

Step 5: Software Compilation for Facepork Using Topological Sort

Piglet

In many programming languages, source code must be compiled in a particular order: a source file cannot be compiled until the libraries that it uses have been compiled. You will assist the developers of Facepork, a social networking site, in compiling their newest rendition of their software. (They may have recently discovered some minor security flaws.) Given a graph describing the dependencies of various source files on other source files, you will produce an order in which these files can be compiled.

The topological_sort graph algorithm can complete this task. You should implement the topological sort using the recursive depth-first search algorithm we discussed in class. Topological sort only works on graphs without cycles, so you will test your algorithm on software project graphs, which appear in your test_data with the prefix software_project. Once you’ve implemented topological_sort, you can test it using the facepork program. This program only expects two arguments – the input graph and the output file – since a topological sort is a graph-wide operation which starts from an arbitrary position.

Note that the topological sort test results are generated using the approach we described in class, looping over the vertices in order from first to last and exploring each one. If you perform your operations in a different order, it is possible for you to produce a valid topological sort which does not match the provided test data.

Step 6: Network Design for Stynet Using Minimum Spanning Trees

For your final task, you must assist Hampton in designing Stynet, a new fiber-optic network for their town. The network should be able to reach all places in the town but should be as cheap as possible. You will load a town map (as in Instaham above) and take the weights of the graph to represent the cost of laying the cable between two points. Your objective will be to identify how to connect the nodes in the cheapest way possible.

Thankfully, you can complete this task by implementing the minimum_spanning_tree graph algorithm. You may use any MST algorithm, although it’s probably easiest to use Prim’s algorithm from lecture. Once you’ve completed this task, you can test your implementation by running the stynet program. As with facepork, this program only requires an input graph file and the file to which results should be written.

Testing

In previous assignments, you were provided a single RUN_MAIN routine. For this assignment, you have been provided one such routine for each of the above programs: RUN_OINKED_IN_MAIN, RUN_BACON_MAIN, etc. You’ve also been provided CHECK_FILES_EQUAL, as usual.

Make sure you write thorough tests for your Graph implementations before writing the graph algorithms; it will be extremely difficult to debug graph logic errors and your graph algorithms simultaneously! While you are not required to test the main functions for each program, you are required to find some way to test your graph algorithms; running the main functions in numerous different ways is sufficient to accomplish this (and is fairly terse).

Memory

Your program is required to run without any memory leaks or errors; you must free up all of the memory that you allocate and you must only use memory that you have allocated and initialized. We will use valgrind to determine if you have done this correctly, so you should as well. You should also consider using valgrind to find bugs in your program as you proceed; it’s one of two very powerful C++ debugging tools that we’ve discussed.

Coding Style Requirements

You are required to observe some good coding practices:

Well-dressed people (style)

Peer Review

As this is a team assignment, you are required to complete a peer review of your lab partner or partners. You must do this even if everything is fine. You should complete this peer review after you have turned in your assignment. The peer review comes in the form of a simple Google Form and can be accessed here; in most cases, it will likely take less than a minute to complete. If you have trouble accessing this peer review, please make sure that you are logged into Google using your Swarthmore credentials.

Your peer review will be kept private; only course staff will have access to this information. You can also update your peer review after you have submitted it. However, if you do not submit a peer review, you will lose points on your lab. Please don’t forget!

Summary of Requirements

Sleeping Pig
Source: Pixabay.com

When you are finished, you should have