Introduction to C Programming for CS31 Students

Part 1: variables, functions, arrays, strings


Contents:

  1. Getting Started: simple program, compiling
  2. Variables
  3. Input/Output
  4. Branching
  5. Loops
  6. Functions
  7. Arrays
  8. Strings

Part 2 contains information about structs and pointers (it will be covered later in the semester).

Overview and Resources

This page includes a brief overview of C programming for students who have taken CS21 or an equivalent introductory CS course. We will start with some of the C basics, which make up much of the C programming language, and then will add more C programming features as the semester progresses. As you are implementing C programs for lab assignments, make use of:

  1. The information on this page
  2. My C programming Documentation and C Programming Resources. This contains all kinds of C programming documentation including:
    • How to compile and run C programs on our system
    • The C Code Style Guide (read this and follow it)
    • Documentation about different C types and C data structures (char, strings, file I/O, pointers, arrays, linked lists, ...).
    • Links to C programming tutorials and C language documentation.
    • How to use C debugging tools: gdb and valgrind
  3. Example C code that I give you in class and weekly lab.
Code examples on this page can be copied over from my public/cs31/C_examples directory:
   # if you don't have a cs31 subdirectory, create one first:
   mkdir cs31
   # copy over my C example files into your cs31 subdirectory:
   cd cs31
   cp -r /home/newhall/public/cs31/C_examples  .
   # cd into your copy, run make to compile
   cd C_examples
   ls
   make

Getting Started Programming in C
Below is the hello world program in C with a lot of comments. I would put it in a file named hello.c (.c is the suffix convention used for C source code files).
/* 
  The Hello World Program in C
  (this is also an example of a multi-line comment)
*/
#include <stdio.h>   // include the C standard I/O library

// Any executable program must have exactly one function called main
int main() {
  printf("Hello World\n");
  return 0;
}
Note the following features of the basic program:

To run a program, we must first save the code using vim or another editor on our system, then compile the source to an executable form and run the executable form of our program. The syntax for compiling is

 $ gcc -o <output_executable_file> <input_source_file> 
For example, the following command compiles hello.c into an executable file named hello:
 $ gcc -o hello hello.c
We run the executable program using ./hello:
 $ ./hello
If we change the source (hello.c file), we must recompile with gcc before running ./hello. If there are any errors, the ./hello file will not be created/recreated (but beware, an older version of the file from a previous successful compile may still exist). If you do not include the -o outputfile, gcc creates the executable in a file named a.out.

Variables
Variables are named containers for holding data. In C all variables must be declared before use. To declare a variable, use the following syntax:
type_name variable_name;
A variable can only have a single type. Valid basic types include int, float, double, and char. Examples of declaring variables are shown below. In older versions of C (C89), variables must be declared at the beginning of their scope (top of a { } block) before any C statements in that scope. C99 and later versions relax this rule, as C++ does, but if you are coming from CS35, be sure to follow the C variable declaration convention of declaring variables at the top of the block.
{
 /* DECLARE ALL VARIABLES OF THIS SCOPE AT THE TOP OF THE BLOCK { */
 int x;         // declaring x to be an int type variable
 int i,j,k;     // can declare multiple variables of the same type on one line
 char letter;   // a char stores a single ASCII value 
                // a char in C is a different type than a string in C
 float winpct;  // winpct is declared to be a float type 
 double pi;     // the double type is more precise than float

 /* AFTER DECLARING ALL VARIABLES YOU CAN USE THEM IN C STATEMENTS */
 x = 7;         // x stores 7, initialize all variables before using them
 k = x + 2;     // use x's value in an expression

 letter = 'A';      // a single quote is used for single character value
 letter = letter+1; // letter stores 'B' (its ascii value is one more than 'A's)

 pi = 3.1415926; // pi was declared above as a double, here it is assigned a value

 winpct = 11/2.0;  // winpct gets 5.5: one operand is not an int, so no truncation
 j = 11/2;       // j gets 5: int division truncates anything after the decimal
 x = k%2;        // % is C's mod operator, so x gets 9 mod 2 (1)
}
Note the semicolons galore. C expects one after every statement. You'll forget them. gcc almost never says "You missed a semicolon" even though that might be the only thing wrong with your program. As you program more in C, you will learn to translate gcc errors to the error in your program.

On most variable types, you may use the following operators. Some may not apply depending on the operand type.



Input/Output (printf and scanf)
C uses the printf function for printing to standard out (the terminal), and scanf is one function for reading in values (usually from the keyboard). scanf is similar to printf, and it is the first way we will do program input. However, it is not very resilient to users entering bad values, so later we will learn better ways to read in values.

printf and scanf are part of the stdio.h library that needs to be #included at the top of the .c file using them.

printf is very similar to formatted print statements in Python, where you provide a format string to print and then values to fill the placeholders in the format string. Here are some printf examples:

  int x = 5, y = 10;
  float pi = 3.14;

  // print the values of x and y followed by a newline character:
  printf("x is %d and y is %d\n", x, y);  

  // print a float value (%g) a string value (%s) and an int value (%d)
  // separated by tab characters (\t) followed by a new line character (\n):
  printf("%g \t %s \t %d\n", pi, "hello", y); 
The following are the placeholders for different types:
------------------------------------------------------
 %g, %f: placeholders for a float or double value
 %d:     placeholder for an int value
 %c:     placeholder for a single character
 %s:     placeholder for a string value, in C a string is any chars between 
           double quotes (e.g.  "hello there"  is a string literal)

The following are special formatting characters:
-----------------------------------------------
\t: print a tab character
\n: print a newline character

You can also specify field width for the values:
------------------------------------------------
%5.3f: print float value in space 5 chars wide, with 3 places beyond decimal
%20s:  print the string value in a field of 20 chars wide, right justified 
%-20s: print the string value in a field of 20 chars wide, left justified 
%8d:   print the int value in a field of 8 chars wide, right justified 
%-8d:  print the int value in a field of 8 chars wide, left justified 

Here is an example full program using a lot of formatting:
#include <stdio.h> // library needed for printf

int main() {
  float x=4.50001;
  float y=5.199999;
  char ch = 'a';
  printf("%.1f %.1f\n", x, y); // prints out x and y with one digit after the decimal point
  // nice tabular output
  printf("%6.1f \t %6.1f \t %c\n", x, y, ch);  
  printf("%6.1f \t %6.1f \t %c\n", x+1, y+1, ch+1);  
  printf("%6.1f \t %6.1f \t %c\n", x*20, y*20, ch+2);  
  return 0;
}

scanf

scanf is one way in which your program can read in input values entered by a user. It is very picky about the exact format in which the user enters data, which makes it not very robust to badly formed user input. For now we will use it; later we will use a more robust way of reading input values from the user. Just remember that if your program gets into an infinite loop due to badly formed user input, you can always type CTRL-C to kill it.

A scanf call looks a lot like a printf call: it has a format string followed by the variable locations into which the values read in should be stored. To specify the location of a variable, you need to use the & operator, which evaluates to "the memory location (or address) of the associated variable". Here are some examples:

  int x;
  float pi;

  // read in an int value followed by a float value ("%d%g")
  // store the int value at the memory location of x (&x)
  // store the float value at the memory location of pi (&pi)
  scanf("%d%g", &x, &pi);
scanf skips over leading whitespace characters (e.g. ' ', '\t', '\n') before each numeric value, and stops reading a value at the first character that cannot be part of it. Thus, a user could enter the values 8 and 3.4 in any of the three ways listed below and the call to scanf above would assign 8 to x and 3.4 to pi:
8 3.4
         8             3.4
8
3.4
The format string for scanf is a bit different than for printf in that you often do not need to specify white space chars in the format string for reading in consecutive numeric values:
// reads in an int and a float separated by at least one white space character
  scanf("%d%g",&x, &pi);  
scanf can seem to behave very strangely for format strings with different type placeholders, so if you get some odd behavior, play around with the format string a bit and try different types. My documentation about file I/O has some example scanf format strings.

Branching with if/else
The syntax for branching in C is very similar to that in Python. The main difference is that where Python uses indenting to indicate "body" statements, C uses curly braces (but you should also use good indenting in your C code). Here is the basic if-else syntax (the else part is optional):
//a one way branch
if ( <Boolean expression> ){
  <true body>
}

// a two way branch
if ( <Boolean expression> ){
  <true body>
}
else{
  <false body>
}

// a multibranch:
if ( <Boolean expression 1> ){
  <true body>
}
else if( <Boolean expression  2>){
  //first expression is false, second is true
  <true 2 body>
}
// can have more else if's here 
// ...
else{
  // if all previous expressions are false
  <false body>
}

Boolean Values in C

C does not have a Boolean type with true or false values. Instead, int values are used to represent true or false in conditional statements: zero (0) is false, and any non-zero value is true.

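The operators you can use in constructing boolean expressions are the following (listed in precedence order, highest first):

  1. !  (logical not)
  2. <, <=, >, >=  (relational operators)
  3. ==, !=  (equality operators)
  4. &&  (logical and)
  5. ||  (logical or)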

Here is an example conditional statement in C (it is always good to use parens around complex boolean expressions to make them easy to read):
if (y == 10) {
  printf("y is 10\n");
} else if((x > 10) && (y > x)) {
  printf("y is bigger than x and 10\n");
  x = 13;
} else if ((x == 10) || (y > x+20)) {
  printf("y might be bigger than x\n");
  x = y*x;
} else {
  printf("I have no idea what the relationship between x and y is\n");
}
Loops

for loops

For loops are different in C than they are in Python. In Python, for loops are iterations over sequences; in C, for loops are more general looping constructs. The C for loop syntax is:
for( <initialization>; <boolean expression>; <step> ){
 <body>
}
The rules for evaluation are:
  1. Evaluate the initialization one time, when the for loop is first reached.
  2. Evaluate the boolean expression. If it is false (0), drop out of the for loop (you are done repeating the loop body statements).
  3. Evaluate the statements inside the loop body.
  4. Evaluate the step expression.
  5. Go to step (2).

Here is a simple example for loop to print out the values 0 through 9:

int i;   // declared at the top of the enclosing block
for (i=0; i<10; i++){
   printf("%d\n", i);
}
See forLoop1.c and forLoop2.c for more examples.

while loops

While loop syntax in C is similar to in Python, and is evaluated similarly:
while ( <Boolean expression> ){
  <true body>
}
The while loop checks the Boolean expression first and executes the body if true. A similar do-while loop executes the body first, then checks a condition and runs the loop again if the condition is true:
do{
  <body>
} while ( <Boolean expression> );
In C, for loops and while loops are equivalent in power (this is not true in Python), thus C would only need to provide one of these looping constructs. However, for loops tend to be a more natural language construct for definite loops (like iterating over values in a list), and while loops tend to be a more natural language construct for indefinite loops (like repeating until the user enters an even number). Therefore, C provides both.

See whileLoop1.c and whileLoop2.c for examples.

Functions
Use functions to break code into manageable pieces and reduce code duplication. Functions may take parameters as input and return a single value of a specific type. A function declaration specifies the function's name, return type, and the parameter list (the number and type of all parameters). A function definition includes the code to be executed when the function is called. All functions in C must be declared before they are called. This is typically done using a function prototype, but it can also be accomplished by having the function definition appear before it is called in a file.
function definition format:
---------------------------
<return type> <function name> (<parameter list>)
{
  <function body>
}

parameter list format:
---------------------
<type> <parm1 name>, <type> <parm2 name>, ...,  <type> <last parm name> 

A function that does not return a value has a void return type.

Arguments are passed to C functions by value. Thus a copy of the variable's value is made before the body of the function executes. Any modifications to the parameters in the function are not visible to the caller.

Here is an example function definition followed by a call to it:

#include <stdio.h>   // needed for printf and scanf

// returns the larger of x and y
int max(int x, int y) {
  int bigger;
  bigger = x;
  if(y > x) {
    bigger = y;
  }
  return bigger; 
}
int main() {
   int a, b;
   printf("Enter two integer values: ");
   scanf("%d%d", &a, &b);
   printf("The larger value is %d\n", max(a,b));
   return 0;
}

See function1.c for this and another example.

Exercise: Implement and test a power function (for positive integer exponents only).

Arrays
Arrays are like C's version of lists. Python provides a high-level list interface to the programmer that hides much of the low-level implementation details. In C, however, the programmer has to implement this low-level list functionality; arrays are just the low-level data storage without higher-level functionality like size, insert, append, etc.

Arrays can store multiple items of the same type. For now, we will use only statically declared arrays, meaning we must know the total capacity (number of buckets) of the array at compile time, and we declare the array to be of that capacity. We cannot shrink or grow the array at run time (at least not yet).

To declare an array, specify its type, name and total capacity (number of buckets):

int  arr[10];  // an array of 10 ints
char str[20]; // an array of 20 char...could be a C-style string
Individual array elements may be accessed by indexing:
int i, num;

num = 5;
for(i=0; i < num; i++) {  // initialize the first 5 buckets of arr
   arr[i] = i;
} 
arr[5] = 100;
num++;
Notice that we declared the array to have 10 buckets, but we are only using 6 of them (our current list is of size 6 not 10). It is often the case when using statically declared arrays that there is unused capacity. Thus, we need to have a program variable that keeps track of the actual size of the list (num in this example).

Arrays and Functions

To declare an array function parameter we must use the syntax int a[] (or int *a, but we will use this syntax later). Note we do not specify the capacity of the array parameter in the parameter list (the function can accept an int array of any capacity). Arrays also do not know their size, so if we want the function to know how many buckets are in use, we should also pass the size value as a parameter. For example:

 
void printArray(int a[], int size) { 
  int i;
  for(i=0; i < size; i++) {
      printf("%d\n", a[i]);
  }
}
To call a function with an array parameter, pass only the name of the array as the argument, omitting the brackets. For example:
printArray(arr, num);
The name of the array variable is equivalent to the base address of the array (the memory location of its 0th bucket). This means that the argument's array buckets are NOT passed by value to the function (i.e. the function's parameter DOES NOT get a copy of every array bucket of its argument). Instead, the parameter gets the value of the memory location of the first bucket in the argument array (the base address of the array). The implication is that when array buckets are modified inside the called function (e.g. a[2] = 8), the contents of the corresponding bucket in the argument are modified too (i.e. arr[2] is now 8). This is because the parameter REFERS TO the same array storage locations as its argument.

Question: What happens if you go beyond the bounds of an array in C?

int array[10];
array[10] = 100;  // 10 is not a valid index into the array of 10 int buckets
Answer: Unexpected program behavior. It could lead to your program crashing, it could change another variable's value, or it could have no effect on your program's behavior; it is a program bug that may or may not show up as buggy program behavior. It is up to the C programmer to ensure that index values are valid and to avoid accessing array buckets beyond the bounds of an array.

The files array1.c and array2.c have some example uses of arrays.

Exercise: complete and test the function minimum in array2.c.

Strings
Strings in C are just arrays of characters terminated by a special null character value '\0'. Not every array of char is used as a C string, but every string is an array of char. C has a string library that contains functions for manipulating C strings. One thing to keep in mind as you use the string library is that you are responsible for allocating the space for the underlying char array, and that the terminating '\0' character needs to be included in that space. For example, to store the string "hi", you need an array of at least 3 chars (one to store 'h', one to store 'i', and one to store '\0'). The string library functions determine the end of a string by searching for the '\0' character. They also add that character to the end of any string they initialize for you (e.g. strcpy will null terminate the destination string). Here is a very simple example:
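#include <stdio.h>   // needed for printf
#include <string.h>  // needed for strlen and strcpy

int main() {
  char str1[10];
  char str2[10];  // must have room for the copied chars plus the '\0'
  str1[0] = 'h';
  str1[1] = 'i';
  str1[2] = '\0';
  printf("%s %d\n", str1, (int)strlen(str1));  // prints hi 2 to stdout
  strcpy(str2, str1);    // strcpy copies the contents of str1 to str2
  printf("%s\n", str2);  // prints hi to stdout
  return 0;
}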
See my Strings in C documentation for more string and string library examples. In particular, look at the string library functions strlen, strcpy and strcmp. (Note: some of the example code there uses dynamically allocated strings, which we have not yet learned.)

lvalues

An lvalue is an expression that can appear on the left hand side of an assignment statement. In C, single variables or array buckets are lvalues, but arrays are not. The following example illustrates valid and invalid C assignment statements based on lvalue status:
int x;
char arr[10], ch;
x = 10;                 // valid: x is an lvalue
ch = 'm';               // valid: ch is an lvalue
arr[3] = ch;            // valid: arr[3] is an lvalue
x + 1 = 8;              // invalid: x+1 is not an lvalue
arr = "hello there";    // invalid: arr is not an lvalue
ch = "h";               // invalid: ch is an lvalue, but "h" is not a char value
ch = 'h';               // valid: ch is an lvalue, 'h' is a char value