mpigraph -- MPI communication diagnosis project


The more-or-less current source code can be downloaded here, or browsed here.
(Note: this is a work in progress.)

mpigraph is a project for graphically visualizing aspects of an MPI system's topology. The primary focus is inter-node and intra-node latency, though the program could easily be extended to cover other factors, such as bandwidth, with only a few modifications.

The Main Idea

The main idea behind the program is to build an NxN matrix (for a system with N nodes), where each entry (i, j) holds the latency (or bandwidth) between nodes i and j. These points are then laid out on an x,y coordinate plane at more or less random positions and put through a process of simulated annealing, in which the lowest possible energy state is reached when the pairwise distances between points match the values stored in the NxN matrix.
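To make the "energy state" concrete, here is a minimal sketch of such an energy function in C. The point type, the fixed N, and the squared-error form are my assumptions for illustration, not necessarily what the original program uses:

    #include <math.h>   /* link with -lm */

    #define N 8   /* number of nodes/tasks; a placeholder value */

    /* A point being annealed, one per node/task. */
    typedef struct { double x, y; } point;

    /* Total "energy": the sum over all pairs of the squared difference
     * between the pair's current Euclidean distance and its target
     * value in the N x N latency matrix. Zero energy means every
     * distance matches its measured latency exactly. */
    double total_energy(const point p[N], const double target[N][N])
    {
        double e = 0.0;
        for (int i = 0; i < N; i++) {
            for (int j = i + 1; j < N; j++) {
                double dx = p[i].x - p[j].x;
                double dy = p[i].y - p[j].y;
                double diff = sqrt(dx * dx + dy * dy) - target[i][j];
                e += diff * diff;
            }
        }
        return e;
    }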

How It Works (ideally ;p )

First, an MPI program is run that measures the latencies and/or bandwidths between tasks and records them to a file. For the sake of simplicity, and due to my relative newness to MPI, I wrote a separate, non-MPI program to handle the analysis and shaping of the data from that point on. The simulated annealing process follows more or less these steps (sketches of both the measurement and the annealing loop follow the list):

  1. Points (one per node or processor) are placed randomly in an x,y grid
  2. The "total energy state" is calculated
  3. A small random change is made
    1. If the change results in a lower energy state, take it
    2. Else, if a certain random probability is met, take it anyway (this is what lets the system climb out of local minima)
    3. Else, don't take the change
  4. Repeat the process many, many, many times
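As an illustration of the measurement step, here is a minimal sketch assuming a simple ping-pong between rank 0 and each other rank, with latency taken as half the average round-trip time. To fill the full NxN matrix the real program would measure every pair; the one-byte message and the repetition count here are arbitrary choices:

    #include <mpi.h>
    #include <stdio.h>

    #define REPS 1000  /* round trips per pair; assumed value */

    int main(int argc, char **argv)
    {
        int rank, size;
        char byte = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        for (int peer = 1; peer < size; peer++) {
            MPI_Barrier(MPI_COMM_WORLD);
            if (rank == 0) {
                double t0 = MPI_Wtime();
                for (int r = 0; r < REPS; r++) {
                    MPI_Send(&byte, 1, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
                    MPI_Recv(&byte, 1, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                }
                /* one-way latency = half the average round-trip time */
                printf("0 %d %g\n", peer, (MPI_Wtime() - t0) / (2.0 * REPS));
            } else if (rank == peer) {
                for (int r = 0; r < REPS; r++) {
                    MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                    MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
                }
            }
        }

        MPI_Finalize();
        return 0;
    }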
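And here, continuing the earlier sketch (reusing point, N, and total_energy()), is roughly what steps 2 through 4 might look like. The nudge size, the cooling schedule, and the exponential acceptance rule (the classic Metropolis criterion) are my assumptions about how the "certain random probability" could be implemented:

    #include <stdlib.h>
    #include <math.h>

    /* Anneal: repeatedly nudge one random point and keep the move if
     * it lowers the energy, or occasionally even if it raises it. */
    void anneal(point p[N], const double target[N][N], long iterations)
    {
        double temp = 1.0;               /* starting "temperature" (assumed) */
        double e = total_energy(p, target);

        for (long it = 0; it < iterations; it++) {
            int i = rand() % N;
            point old = p[i];

            /* small random change: nudge one point */
            p[i].x += ((double)rand() / RAND_MAX - 0.5) * temp;
            p[i].y += ((double)rand() / RAND_MAX - 0.5) * temp;

            double e_new = total_energy(p, target);
            double accept = exp((e - e_new) / temp);  /* Metropolis rule */

            if (e_new < e || (double)rand() / RAND_MAX < accept)
                e = e_new;       /* take the change */
            else
                p[i] = old;      /* don't take the change */

            temp *= 0.999999;    /* cool slowly (assumed schedule) */
        }
    }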

If the locations of the points are recorded before and after being run through the process described above and plotted with gnuplot, with the calculations based on the latencies of both inter-node and intra-node tasks in an MPI program, the following can be witnessed:


Various Simulated Annealing Test Runs
[Before/after plots for three runs: 4 nodes with 2 tasks per node; 3 nodes with 6 tasks per node; 4 nodes with 5 tasks per node]
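The before and after snapshots are just text files of x y pairs, which gnuplot can plot directly (e.g. plot 'before.dat'). A minimal sketch of the dump, reusing the point type and N from the sketches above:

    #include <stdio.h>

    /* Write one "x y" line per point, in the whitespace-separated
     * format gnuplot reads with: plot 'before.dat' */
    void dump_points(const char *path, const point p[N])
    {
        FILE *f = fopen(path, "w");
        if (!f)
            return;
        for (int i = 0; i < N; i++)
            fprintf(f, "%g %g\n", p[i].x, p[i].y);
        fclose(f);
    }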

To Be Improved, Enhancements, etc.

There's still a lot that I'd add if I had the time. A few ideas: