CS31: Lab 4 Extra

As an extra challenge, add these two features to your Part 1 solution:

Support for variable-width histogram buckets.
Support for a scaled histogram display.

If you do the extra challenge, copy your grades.c to a new file, extra.c, and add the extras there.

cp grades.c extra.c

Add extra.c to your git repo:

git add extra.c
git commit -m "Starting extra challenge"
git push

Compile extra.c with:

gcc -g -Wall -o extra extra.c readfile.c -lm

You could try to modify the Makefile to add this command, but you do not have to. Here is some information about writing Makefiles if you want to try.

Implement your extra challenge solutions in a modular and generic way, i.e. they should work correctly for a data set of any size and for any bucket width.

Variable-Width Buckets

Modify your program so that after reading in the file, it prompts the user for a bucket width and then reads it in as an int (you can use scanf). Use this bucket width value to change how your program prints the histogram. Dynamically allocate a histogram array based on the bucket width and the min and max values in the data set. When you print the histogram, start from the first bucket that actually has values, and stop printing buckets after reaching the bucket that contains the largest exam value. For example, if the bucket width is 20, and the set of grade values is (47.5, 51.0, 53.5, 56.0, 59.5, 67.25, 78.0), then your histogram when printed out would look something like this:

 40-59: *****
 60-79: **

If you run your program on the same data set with a bucket width of 5, your histogram will look something like:

45-49: *
50-54: **
55-59: **
60-64:
65-69: *
70-74:
75-79: *

If one bucket's range needs to be smaller than the others, it should contain the largest grade values, e.g. [..., 80-87, 88-95, 96-100].
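
The bucket arithmetic described above can be sketched as follows. This is only one possible approach, not the required solution: the names MAXGRADE, bucket_index, bucket_high, build_histogram, and print_histogram are made up for illustration, and the sketch assumes grades are non-negative, that 100 is the highest possible grade, and that buckets are aligned to multiples of the bucket width (which matches the examples above).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAXGRADE 100 /* assumption: highest possible grade */

/* Index of the bucket a grade falls in, counting buckets from grade 0. */
int bucket_index(float grade, int width) {
    assert(width > 0 && grade >= 0);
    return (int)grade / width;
}

/* Largest grade covered by bucket idx; the last bucket is cut off at MAXGRADE. */
int bucket_high(int idx, int width) {
    int hi = idx * width + width - 1;
    return hi > MAXGRADE ? MAXGRADE : hi;
}

/* Allocate and fill a histogram that spans only the buckets between the
 * smallest and largest grade. Stores the index of the first bucket in *first
 * and the number of buckets in *count. Returns NULL if allocation fails;
 * the caller must free the returned array. */
int *build_histogram(const float *grades, int n, int width,
                     int *first, int *count) {
    assert(n > 0 && width > 0);
    int lo = bucket_index(grades[0], width);
    int hi = lo;
    for (int i = 1; i < n; i++) {
        int b = bucket_index(grades[i], width);
        if (b < lo) lo = b;
        if (b > hi) hi = b;
    }
    int *hist = calloc(hi - lo + 1, sizeof(int)); /* zero-initialized counts */
    if (hist == NULL) {
        return NULL;
    }
    for (int i = 0; i < n; i++) {
        hist[bucket_index(grades[i], width) - lo]++;
    }
    *first = lo;
    *count = hi - lo + 1;
    return hist;
}

/* Print one line per bucket, one star per grade. */
void print_histogram(const int *hist, int first, int count, int width) {
    for (int b = 0; b < count; b++) {
        printf("%3d-%d: ", (first + b) * width, bucket_high(first + b, width));
        for (int s = 0; s < hist[b]; s++) {
            putchar('*');
        }
        putchar('\n');
    }
}
```

Storing only the buckets from the first occupied one to the last keeps the array small and makes the "start at the first bucket with values" rule fall out of the allocation itself; empty buckets in between are still printed because they sit inside that span.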

Scaled Histogram

Modify your histogram printing function so that if the data set is large, each histogram bucket can still be represented in one line. You should define a constant, e.g.

#define HISTWIDTH 50

for the maximum number of * characters allowed. Then, scale the number of stars for each bucket by the smallest power of 10 that keeps each line's length at or below that constant (rounding the number of stars up to ensure that all grades are represented). You should also print the scale so that users can interpret your histogram correctly. If the file contains 1000 grade values in the 40-59 range, 1101 in the 60-79 range, and 25 in the 80-99 range, your histogram might look like:

 40-59: **********
 60-79: ************
 80-99: *
scale: * = 100 grades
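
The scaling rule can be sketched as below. Again, this is a hedged illustration rather than the expected solution: hist_scale, stars_for, and print_scale are hypothetical names, and the sketch treats HISTWIDTH as an inclusive limit, so a line of exactly HISTWIDTH stars is accepted.

```c
#include <assert.h>
#include <stdio.h>

#define HISTWIDTH 50

/* Number of stars for a bucket holding n grades, rounding up so that any
 * non-empty bucket gets at least one star. */
int stars_for(int n, int scale) {
    assert(n >= 0 && scale > 0);
    return (n + scale - 1) / scale; /* integer ceiling of n / scale */
}

/* Smallest power of 10 for which the fullest bucket needs at most
 * HISTWIDTH stars. */
int hist_scale(const int *hist, int count) {
    int max = 0;
    for (int i = 0; i < count; i++) {
        if (hist[i] > max) max = hist[i];
    }
    int scale = 1;
    while (stars_for(max, scale) > HISTWIDTH) {
        scale *= 10;
    }
    return scale;
}

/* Print the legend line that tells the reader what one star stands for. */
void print_scale(int scale) {
    printf("scale: * = %d grade%s\n", scale, scale == 1 ? "" : "s");
}
```

With bucket counts of 1000, 1101, and 25, the scales 1 and 10 would need 1101 and 111 stars for the fullest bucket, so the sketch settles on 100 and yields 10, 12, and 1 stars, matching the example above.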

Generating Large Input Files

You can copy a Python program that generates a large number of grade values:

cp ~mauskop/public/cs31/makegrades.py .

This program takes one command line argument specifying the number of values to generate. Here are two runs, one to generate 10 values, the other to generate 1,000 values:

python makegrades.py 10
python makegrades.py 1000

You can redirect the output to a file and then use this file as input to extra.c (or grades.c).

python makegrades.py 10 > file_10.txt
python makegrades.py 100 > file_100.txt
python makegrades.py 1000 > file_1000.txt

Submit

The extra challenge is due at the lab 4 submission deadline. If you are submitting extra.c as a complete solution to the challenge part, add a CHALLENGE file to your git repo (so we can distinguish an attempted extra.c from a submitted one).

echo "we did it" > CHALLENGE
git add CHALLENGE extra.c
git commit -m "Completed challenge"
git push
