Week 4: 3D Space/LookAt

Reading

The Immersive Linear Algebra website has good introductory material on linear algebra topics related to computer graphics:

  • Chapter 3: Dot Product (Sections 3.1-3.3, Ex 3.6, 3.7)

  • Chapter 4: Vector Product (Sections 4.1-4.4, Ex 4.2, 4.5)

  • Chapter 1: Introduction

  • Chapter 2: Vectors (Sections 2.1-2.5)

  • Chapter 6: The Matrix (6.1-6.4, 6.8)


Quiz Friday

The first quiz will be this Friday, 26 September in class. Please bring something to write with.

  • You should be familiar with basic geometric concepts of

    • Point

    • Vector

    • Triangle/Square shapes

    • How to combine points/vectors

  • How to use (but not necessarily compute) the dot product and cross product

    • Derive distance to line

    • Derive leftOf operation

  • The basics of shaders

    • difference between fragment/vertex shader

    • in/out/uniform qualifiers

• where shaders get their input and where they send their output

  • An overview of the OpenGL pipeline. What are the primary steps in drawing a single triangle?

    • Some basic frames: clip space, screen space, grid space.

Transforms

In 3D, we will manipulate geometry, including points and vectors, primarily using 4x4 matrix multiplication. Some common transformations are listed below.

Translation

Translation in three dimensions can be expressed as a matrix-vector (4x1 matrix) multiplication using 4D homogeneous coordinates. What happens if you apply a translation to a geometric vector with this transform?

\[T(t_x, t_y, t_z) \cdot P = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \cdot \begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \\ \end{pmatrix} = \begin{pmatrix} p_x + t_x \\ p_y + t_y \\ p_z + t_z \\ 1 \\ \end{pmatrix}\]
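
As a quick check, here is a minimal sketch using the m4 helpers from twgl.js (the library our demos use). m4.transformPoint treats its argument as a point with \(w = 1\), while m4.transformDirection ignores the translation column, just as a \(w = 0\) vector would.

```js
import * as twgl from 'twgl.js';
const m4 = twgl.m4;

const T = m4.translation([2, 3, 4]);   // the matrix T(2, 3, 4) above

// A point (w = 1) is moved by the translation...
m4.transformPoint(T, [1, 0, 0]);       // -> [3, 3, 4]

// ...but a geometric vector (w = 0) is unchanged, since the translation
// column is multiplied by its w component of 0.
m4.transformDirection(T, [1, 0, 0]);   // -> [1, 0, 0]
```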

Scale

\[S(s_x, s_y, s_z) \cdot P = \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \cdot \begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \\ \end{pmatrix} = \begin{pmatrix} s_x \cdot p_x \\ s_y \cdot p_y \\ s_z \cdot p_z \\ 1 \\ \end{pmatrix}\]

Rotation

There are several ways to do rotation. We will primarily focus on methods that rotate about one of the axes of the basis. For example, a rotation around the \(z\)-axis is helpful, even for 2D applications, as it only modifies the \(x\) and \(y\) coordinates.

\[R_z(\theta) \cdot P = \begin{bmatrix} \cos \theta & -\sin \theta & 0 & 0 \\ \sin \theta & \cos \theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \cdot \begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \\ \end{pmatrix} = \begin{pmatrix} p_x \cdot \cos \theta - p_y \cdot \sin \theta \\ p_x \cdot \sin \theta + p_y \cdot \cos \theta \\ p_z \\ 1 \\ \end{pmatrix}\]

Consider looking down the \(+z\)-axis. Is the rotation clockwise or counter-clockwise? How can you tell?

The matrices for \(R_x\) and \(R_y\) are given below.

\[R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta & 0 \\ 0 & \sin \theta & \cos \theta & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \ \ R_y (\theta) = \begin{bmatrix} \cos \theta & 0 & \sin \theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin \theta & 0 & \cos \theta & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}\]
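
If the direction of rotation is unclear, one way to check is to transform a basis vector and see where it lands. A small sketch with the twgl.js m4 helpers:

```js
import * as twgl from 'twgl.js';
const m4 = twgl.m4;

// 90 degree rotation about the z-axis (angles are in radians).
const Rz = m4.rotationZ(Math.PI / 2);

// +x maps to +y, matching the matrix above with cos 90 = 0 and sin 90 = 1.
m4.transformPoint(Rz, [1, 0, 0]);      // -> approximately [0, 1, 0]
```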

Using matrix transforms

The good news for this course is that you will almost never have to create one of these matrices by hand on either the CPU or the GPU. Qt provides helper functions to create and manipulate 4x4 matrices and common graphics transforms.

What you should be able to do in this course is:

  • Describe what effect a transform has on some geometry. Sketch a small scene before and after the transform.

• Understand that the order in which you apply transforms matters. Matrix multiplication is not commutative, i.e., \(AB \neq BA\) in general for matrices \(A\) and \(B\) (see the sketch after this list).

• Know what frame/basis you are working in and how to convert between frames/bases.
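
As a quick numeric illustration of the second point, here is a sketch using the twgl.js m4 helpers (the values are just for illustration):

```js
import * as twgl from 'twgl.js';
const m4 = twgl.m4;

const T = m4.translation([2, 0, 0]);
const R = m4.rotationZ(Math.PI / 2);

// m4.multiply(a, b) puts a on the left, so the right-hand factor acts first.
m4.transformPoint(m4.multiply(T, R), [1, 0, 0]); // rotate, then translate: ~[2, 1, 0]
m4.transformPoint(m4.multiply(R, T), [1, 0, 0]); // translate, then rotate: ~[0, 3, 0]
```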

Moving to 3D

Given the work we have done with the OpenGL pipeline so far, moving to 3D in some ways is relatively simple: we just add an extra coordinate. Creating buffers, setting up shader programs, and setting uniforms is the same in 3D as it is in 2D. I posted a small rotating sphere demo (uses JavaScript/TWGL) online and in the GHE w04-cube repo.

In 3D, the geometry tends to get more complex. To draw a sphere, we still need to create an array of points and render the sphere as a set of triangles, but now we will need many more triangles.

In our demo, we divide the sphere into 20 longitudinal strips and 20 latitudinal stacks to create 400 quads. Each of these is divided into two triangles. TWGL will automatically create the buffer of points for us with the label position. Additionally, TWGL will create texture coordinates for each vertex so that we can wrap an image around the geometry, as is shown with the Earth texture. TWGL will also create normal vectors at each vertex, which are unit-length vectors perpendicular to the surface. While not used in this demo, normal vectors will be important once we talk about lighting.
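
The sphere setup in the demo might look roughly like this sketch (assuming a WebGL context gl is already created):

```js
import * as twgl from 'twgl.js';

// Radius 1, 20 subdivisions around the axis (strips) and 20 from pole to
// pole (stacks). This fills buffers named position, normal, and texcoord,
// along with an index buffer.
const sphereBufferInfo = twgl.primitives.createSphereBufferInfo(gl, 1, 20, 20);

// Later, attach the buffers to a compiled program and draw:
// twgl.setBuffersAndAttributes(gl, programInfo, sphereBufferInfo);
// twgl.drawBufferInfo(gl, sphereBufferInfo);
```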

Indexed drawing

For complex geometries, it is sometimes difficult to express the shape as a compact array that can be drawn using glDrawArrays with a GL_TRIANGLE_STRIP or GL_TRIANGLE_FAN. And repeating vertices to use GL_TRIANGLES could dramatically increase the size of our buffers.

OpenGL supports another drawing method called glDrawElements where we specify each vertex once, but provide a separate array of indices that index into our vertex array to draw the geometry. We can repeat an index multiple times to refer to the position coordinates for a particular vertex. For shapes like a sphere where a single vertex is part of multiple triangles, repeating a single integer index is more space efficient than repeating the geometry.
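
A small sketch of the idea for a single quad, using TWGL's helpers (again assuming a WebGL context gl): four unique vertices, six indices.

```js
import * as twgl from 'twgl.js';

// Vertices 0 and 2 are shared by both triangles, so they appear twice in the
// index list but only once in the position buffer.
const arrays = {
  position: { numComponents: 2, data: [0, 0,  1, 0,  1, 1,  0, 1] },
  indices:  [0, 1, 2,  0, 2, 3],
};
const quadBufferInfo = twgl.createBufferInfoFromArrays(gl, arrays);

// With an index buffer present, twgl.drawBufferInfo calls gl.drawElements
// under the hood rather than gl.drawArrays.
twgl.drawBufferInfo(gl, quadBufferInfo);
```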

Projection and Model Transforms

Much like lab3, we don’t want to have to model everything in clip coordinates. Our 3D demo uses an orthographic projection matrix to define a rectangular space that is convenient for us. I just picked 10x10x10 as a reasonable default. I also scale the x-direction by the aspect ratio of the canvas so the scale looks uniform in the final image. Without the correction, spheres look squished depending on the canvas size.

With an orthographic projection matrix, we can model our scene in more convenient coordinates, and then project in the vertex shader down to clip space.
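
A sketch of how such a projection might be built with twgl's m4.ortho, using the 10x10x10 box and the aspect correction described above:

```js
import * as twgl from 'twgl.js';
const m4 = twgl.m4;

// Widen the x extent by the canvas aspect ratio so spheres stay round.
const aspect = gl.canvas.clientWidth / gl.canvas.clientHeight;
const projection = m4.ortho(
  -5 * aspect, 5 * aspect,   // left, right
  -5, 5,                     // bottom, top
  -5, 5);                    // near, far
```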

It is also common in 3D to apply individual model transforms to objects to position them in the world. In the demo, we apply a time-varying rotation matrix to spin the globe, but we will explore model transforms more on Friday and in lab4.

Model Transforms

Earlier, we introduced twgl primitives to construct 3D geometric models, and we used an orthographic projection to map our world into clip space. Most third-party libraries construct primitives in a standard object frame, e.g., for the sphere, the coordinates are chosen such that the center of the sphere is always at the origin. The primitive objects are then transformed to different locations in the world through model transforms. These are typically compositions of the translations, rotations, and scales we discussed earlier. We refer to the matrix that transforms the object from the standard object coordinate frame to the world frame as the model matrix. It is usually the first matrix to be applied to the geometry in the vertex shader. In our current demo, we apply the model matrix first, and then the orthographic projection matrix to transform the world coordinates to clip space.

\[\mathbf{v} = \mathbf{PMp}\]

where \(\mathbf{v}\) is the output vertex geometry (in clip coordinates), \(\mathbf{P}\) is the projection matrix, \(\mathbf{M}\) is the model matrix, and \(\mathbf{p}\) is the original vertex geometry of the sphere in object coordinates. Note that when we say the model matrix is applied first, we mean it is closest to \(\mathbf{p}\). Reading from left to right, we see the projection matrix first, but this matrix is applied to the result of the multiplication of \(\mathbf{M}\) and \(\mathbf{p}\), so its effect happens later.
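
In code, the same composition might look like this sketch (the translation in M is just an example value):

```js
import * as twgl from 'twgl.js';
const m4 = twgl.m4;

const M = m4.translation([3, 0, 0]);        // example model matrix
const P = m4.ortho(-5, 5, -5, 5, -5, 5);    // 10x10x10 orthographic box

// P is on the left, M on the right, so M acts on the geometry first.
const PM = m4.multiply(P, M);
m4.transformPoint(PM, [0, 0, 0]);           // sphere center -> [0.6, 0, 0]
```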

This order of matrix operations is important. Consider a rotation matrix and a translation matrix. In the rotating sphere demo, I have modified the code to add some slider controls. Try setting the speed to 0 for a moment and adjusting the rotate and translate amounts. The translation slider translates the sphere in the current \(\vec{x}\) direction by building a translation matrix \(\mathbf{T}\). The rotation slider rotates the sphere around the current \(\vec{y}\) direction by building a rotation matrix \(\mathbf{R}\). If the rotate_first checkbox is checked, the model matrix is \(\mathbf{M} = \mathbf{TR}\). Otherwise, the demo computes \(\mathbf{M} = \mathbf{RT}\). Again, first refers to which transform is applied first to the object geometry.

Try adjusting the sliders to get a feel for how the \(\mathbf{TR}\) and \(\mathbf{RT}\) transforms are different. You can check the mini_earth box to put a small static globe at the origin that is unaffected by either transform. You can also adjust the speed while tweaking the translate and rotate_first fields. This will animate the rotation amount. With a speed of 0, you can set a fixed rotation in degrees.
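
The checkbox logic might read roughly like this sketch (variable names are illustrative, not the demo's actual code):

```js
import * as twgl from 'twgl.js';
const m4 = twgl.m4;

const translateAmount = 2.0;        // x translation from the slider (example)
const rotateAmount = Math.PI / 4;   // y rotation from the slider, in radians
const rotateFirst = true;           // state of the rotate_first checkbox

const T = m4.translation([translateAmount, 0, 0]);
const R = m4.rotationY(rotateAmount);

// "First" means applied first to the geometry, i.e., the right-hand factor.
const model = rotateFirst
  ? m4.multiply(T, R)   // M = TR: rotate, then translate
  : m4.multiply(R, T);  // M = RT: translate, then rotate
```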

Occlusion testing and z-buffers

For 3D modeling, we still project our final output image onto a 2D screen. During this final conversion, we only retain color information about what is visible, losing information about where the original objects were in 3D space. To make scenes look realistic, we must find ways to convey the depth of the scene in a 2D image. We will explore multiple ways of doing this. Perspective transformations and lighting will be important steps in the next few weeks. But another key feature that is already implemented for us in the OpenGL pipeline is occlusion detection. If we enable the mini Earth in our demo and allow our larger model to rotate around the center of the scene, we note that sometimes the larger Earth is in front, occluding the mini Earth, while at other times, the larger Earth is behind its smaller version and partially occluded.

Because both the vertex shader and fragment shader are highly parallel, we don't know when a particular vertex or fragment will be processed in the pipeline, and we cannot always order the geometry in a correct back-to-front rendering order. To detect which object we should draw when multiple objects overlap the same pixel, we use an auxiliary depth buffer, or z-buffer, to keep track of the depth of the most recent fragment written to the color buffer at each pixel location.

When it is time to process a fragment, we compute its output color, but then check the z-buffer to determine whether the depth of this fragment is smaller (closer to the viewer) than that of the most recently written fragment at this location. If our new fragment is closer, it will be visible, and we update both the color buffer and the z-buffer with the new fragment's information. If the new fragment is not closer, however, it is occluded by another fragment, and we can safely discard it without updating either buffer.
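
In WebGL, the depth test is off by default. A typical setup, as a sketch assuming a context gl, is:

```js
// Enable the z-buffer test so occluded fragments are discarded.
gl.enable(gl.DEPTH_TEST);

// Clear both buffers at the start of each frame; stale depth values from the
// previous frame would otherwise incorrectly occlude the new one.
gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
```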

These depth checks assume our objects are not semi-transparent. For semi-transparent objects, we must be more careful in how we process color fragments. This is an advanced topic.