CS 45 Lab 5: Building a File System

Checkpoint: Monday, April 9 (3-minute demo during lab)
Due: Wednesday, April 18 @ 11:59 PM

Handy References:


Lab Audio, Week 1
Lab Audio, Week 2
Lab Audio, Week 3

Lab 5 Goals:

Overview

You've probably taken file systems for granted your whole life: you click save, and the file is safely stored on the disk for you to retrieve later. For this lab, you'll develop your very own file system so you can see what goes into making that 'save' work. Unlike a real FS, though, we'll focus only on correctness and not on performance. It's difficult enough to make the FS reliable!

To interface with the OS, we'll use the File System in Userspace (FUSE) library, which allows you to write a userspace program that manages files and integrates them into the OS's file structure. That is, when your userspace program is running, you'll be able to interact with your file system from the terminal using standard commands (ls, mkdir, rm, etc.). FUSE is the same mechanism that SSHFS uses to provide files over an SSH connection.

To mount your file system, you'll need to choose a location in the system's file tree at which to "graft" your files. Because of the way FUSE works, you cannot mount on top of an NFS file system (unless you're the root user, which you aren't). Thus, you will not be able to mount within your home directory. Luckily, you still have /local available to you. I suggest making a new empty directory in /local, such as /local/fuse-username, to use for testing. To be clear, you can still store, edit, build, and run your swatfs code from your home directory. When you execute it, you just need to tell it to mount your files in a non-NFS location like /local.

You can mount the FS with this command:

# The -d and -s are FUSE parameters:
# -d: debug mode - provide output about the operations that are happening
# -s: single-threaded mode - you probably don't want to worry about concurrency right now...
./swatfs -d -s /local/fuse-username [disk image]

Replace the "fuse-username" above with the directory you created in /local for testing. The disk image is the path to a file that represents a "disk" formatted with the swatfs format. I've provided an example disk image, named test-disk.img, that is pre-populated with a few directories and files. Use this to test your FS's read functionality before you have writing implemented. You can create a new disk image with a combination of dd and swatfs_mkfs:

# Create a 100-block disk image file, named disk.img, with a block size of 4096 bytes.
dd if=/dev/zero of=disk.img bs=4096 count=100

# Format the file system for use with swatfs:
./swatfs_mkfs disk.img

If you break a file system image (e.g. you attempt to write and something goes wrong), you can always reformat it to bring it back to a clean slate (no files, all blocks free, etc.).
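
Because the disk image is just a regular file, block N starts at byte offset N times the block size. The starter code already provides functions for retrieving and storing blocks, so you shouldn't need to write this yourself; the sketch below only illustrates the idea. The name read_block and the BLOCK_SIZE constant are hypothetical stand-ins, not the starter code's actual names.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the block size; the lab's real constant lives in swatfs_types.h. */
#define BLOCK_SIZE 4096

/* Hypothetical helper: read block number block_num from an open disk image
 * into buf, which must hold at least BLOCK_SIZE bytes.
 * Returns 0 on success and -1 if the block cannot be read in full. */
int read_block(FILE *disk, long block_num, unsigned char *buf) {
    assert(disk != NULL && buf != NULL);
    if (block_num < 0) {
        return -1;
    }
    /* Block N begins N * BLOCK_SIZE bytes into the image file. */
    if (fseek(disk, block_num * BLOCK_SIZE, SEEK_SET) != 0) {
        return -1;
    }
    if (fread(buf, 1, BLOCK_SIZE, disk) != BLOCK_SIZE) {
        return -1;
    }
    return 0;
}
```

With the 100-block image created above, block numbers 0 through 99 would be readable, and asking for block 100 would fail because it lies past the end of the file.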

To unmount your file system, you want to cleanly remove it from the system's file tree. Don't just Ctrl-C your swatfs program. Instead, run fusermount -u [mount point]. That should terminate your userspace process and unmount the FS cleanly.

Requirements

Warning: The amount of code you'll need to write for this lab is likely to be larger than previous labs in this course. You should start this lab early.
  1. Your file system should support creating, reading, writing, and removing both regular files and directories. It should correctly report file attributes (e.g., if the user calls stat(), which triggers your FUSE getattr()). To make all this happen, you should provide implementations for all of the empty swatfs_ functions in swatfs.c. Ultimately, typical commands like ls, cd, rm, rmdir, I/O redirection (< and >), and text editors should all be able to interact with the files and directories in your file system.

    When dealing with file attributes, you only need to worry about those that exist in the inode struct defined in swatfs_types.h. For example, your FS should properly update a file's modification time, but it does not need to worry about access time, since there is no access time field in your inode struct.

  2. Your file system should detect errors when they happen, and when they do, set errno and return an appropriate error value. It's not feasible for me to dictate every single possible error that might occur, so please choose error constants that describe the error condition as closely as possible. You can see a list of error constants using errno --list or search for specific words using errno --search [word]. Try to be as specific as possible. For example, if the user asks to remove a directory that isn't empty, don't use EIO (I/O error), even though this is an I/O related problem, because ENOTEMPTY is a much better fit.

  3. You should use the data structures defined in swatfs_types.h without modifying them. They're designed to be a particular size that evenly divides blocks. The function prototypes for the swatfs_ functions in swatfs.c are dictated by the FUSE library, so your code will fail to compile if you change those. Otherwise, I don't care too much how you structure things or where you put code. I would strongly suggest factoring out common functionality into helper functions though. There are several places where functions have major overlap.

  4. Your implementation should use the #defined constants from swatfs_types.h when dealing with numerical constants, and it should continue to function properly if those constants change.

    You may always assume:
    • The root directory (path "/") will always be described by the inode whose number is 0.

    • There will not be multiple hard links for a file. That is, for every inode in the system (other than inode 0, which is special), the links field will be either 0 (inode is free) or 1 (inode is being used). It will not be larger than 1.

    • No path parameter given to your swatfs_ functions will be longer than SWATFS_NAMELEN.

    • There will be no gaps in the files you are asked to write. For example, if you have an existing file that is 100 bytes long (bytes 0 through 99), the user can ask to do a write starting at any offset from 0 through 100, where "100" means "add new data to the end of the file". They would not be allowed to request a write to an offset >=101 without first writing to 100.

    • Directories will not need to use indirect block pointers. That is, there will not be more than 64 entries in a directory when using a 4096-byte block size.

    • The size of a struct inode and the size of a struct dirent will evenly divide the size of a block.

  5. Your file system implementation should get a clean bill of health from valgrind: no invalid memory accesses, uninitialized variables, or memory leaks.
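
To illustrate requirement 2 and the no-gaps assumption, here is a minimal sketch of two validation helpers. The helper names and their parameters are hypothetical (they aren't part of the starter code). The sketch sets errno and also returns the negated error constant, which is the form FUSE's high-level API expects from its callbacks; check the comments in the starter code for the exact convention it wants. EINVAL for an out-of-range write offset is just one reasonable choice.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: a directory may be removed only when it has no
 * entries left. ENOTEMPTY describes the problem far better than EIO. */
int check_dir_removable(size_t live_entries) {
    if (live_entries > 0) {
        errno = ENOTEMPTY;
        return -ENOTEMPTY;
    }
    return 0;
}

/* Hypothetical helper: with no gaps allowed, a write may start anywhere
 * from offset 0 through file_size (inclusive), where file_size itself
 * means "append to the end of the file". */
int check_write_offset(long offset, size_t file_size) {
    if (offset < 0 || (size_t)offset > file_size) {
        errno = EINVAL;
        return -EINVAL;
    }
    return 0;
}
```

For the 100-byte file in the example above, check_write_offset accepts offsets 0 through 100 and rejects 101. Since you may assume gaps never occur, a check like this is purely defensive.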

Checkpoint

For the checkpoint, your file system should be able to read directories (e.g., ls should work) and read regular files. Your GitHub repository contains a disk image named test-disk.img that I've pre-populated with a few directories and files of varying sizes for testing purposes. You should be able to list and read all of those. For the .jpg image file, you can use the eog application ("eye of GNOME") to open it from the command line.
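
Reading files of varying sizes means a single read request may start in the middle of one block and end in another. Here is a minimal sketch of how a request can be clamped to the file's size and split into per-block copies. It is not the starter code's API: read_span and BLOCK_SIZE are hypothetical names, and the blocks array is an in-memory stand-in for the data blocks that your real code would locate through the inode's block pointers and fetch from the disk image.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the block size; the lab's real constant lives in swatfs_types.h. */
#define BLOCK_SIZE 4096

/* Hypothetical helper: copy up to size bytes, starting at offset, from a file
 * whose data is stored in blocks[0], blocks[1], ... (each BLOCK_SIZE bytes).
 * Returns the number of bytes copied, which is 0 at or past end of file. */
size_t read_span(const unsigned char *const *blocks, size_t file_size,
                 size_t offset, size_t size, unsigned char *out) {
    assert(blocks != NULL && out != NULL);
    if (offset >= file_size) {
        return 0;
    }
    /* Never read past the end of the file. */
    if (size > file_size - offset) {
        size = file_size - offset;
    }
    size_t done = 0;
    while (done < size) {
        size_t pos = offset + done;
        size_t block_index = pos / BLOCK_SIZE;  /* which block holds pos */
        size_t within = pos % BLOCK_SIZE;       /* where pos falls in that block */
        size_t chunk = BLOCK_SIZE - within;     /* bytes left in this block */
        if (chunk > size - done) {
            chunk = size - done;
        }
        memcpy(out + done, blocks[block_index] + within, chunk);
        done += chunk;
    }
    return done;
}
```

Writing the block arithmetic in terms of the constant, and not a literal 4096, is what keeps code like this working if the block size changes.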

Starter Code Map

This lab uses several source files to divide up the functionality across logical boundaries. With the exception of the structs in swatfs_types.h, you're allowed to modify any code you'd like. Most of your changes are expected to go in swatfs.c and swatfs_directory.c:

While you're allowed to modify the other files, I don't think you'll need to. They're there to provide you with basic functionality like retrieving/storing data to the disk. The corresponding headers for each of these has more details about the functions, their arguments, and their return values:

With all the default settings, the block layout and inode block pointers will look like these figures:

Figure: The layout of block numbers in SwatFS.
Figure: The structure of the block pointers in a SwatFS inode.

Tips

Submitting

Please remove any excessive debugging output prior to submitting.

To submit your code, commit your changes locally using git add and git commit. Then run git push while in your lab directory.