7_Enter the Matrix


In this lesson, we build a Matrix class that applies what we learned about calculating geometric transformations and integrate it with our CG framework.

Now that we have learned about the usefulness of matrix calculations for geometric transformations, we can implement a new component in our CG framework that provides matrices for specific types of transformations. Our framework will always use 4x4 matrices so that we can easily handle both 2D and 3D graphics. When rendering in 2D, we simply use a consistent value for all the $z$ coordinates so everything renders on the same plane.

The Matrix Class

Our Matrix class will have static methods that return transformation matrices for the given parameters. With static methods, we do not need to create an instance of the Matrix class or manage state variables. Simply calling these methods will give us the matrix we need to do geometric transformations.

Try it!
In your graphics/core folder, create a new file called matrix.py.
Open matrix.py for editing and add the following code:

# graphics/core/matrix.py
from math import sin, cos, tan, pi
import numpy as np

We first import the math functions for sine, cosine, and tangent to use when calculating our matrices, along with the constant $\pi$ (pi) for converting view angles to radians. We also import NumPy so that we can create each of our matrices with the numpy.array function. This gives us the performance benefits of the NumPy library as well as the @ operator for easy matrix multiplication. Without the @ operator, we would need to unpack each matrix and calculate the product manually, but with NumPy we can simply write matrix1 @ matrix2 instead.
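To see what the @ operator saves us, here is a minimal sketch that compares it with a product calculated by hand. The two small matrices and the names product and manual are made up for this illustration only and are not part of our framework.

```python
import numpy as np

# two small example matrices (values chosen only for illustration)
matrix1 = np.array(((1, 2), (3, 4))).astype(float)
matrix2 = np.array(((0, 1), (1, 0))).astype(float)

# matrix product using the @ operator
product = matrix1 @ matrix2

# the same product calculated by hand, one entry at a time
manual = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        for k in range(2):
            manual[i, j] += matrix1[i, k] * matrix2[k, j]

print(product)
print(np.array_equal(product, manual))
```

Both approaches should give the same result, but the @ version is a single readable expression.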

Add the following code to matrix.py for defining the class and its method which gives the identity matrix.

class Matrix:
    """Provides four-dimensional matrices for various geometric transformations"""

    # the 4D identity matrix
    __identity = np.array((
        (1, 0, 0, 0),
        (0, 1, 0, 0),
        (0, 0, 1, 0),
        (0, 0, 0, 1)
    )).astype(float)

    @classmethod
    def identity(cls):
        """A copy of the 4D identity matrix"""
        return cls.__identity.copy()

Here we create the identity matrix as a NumPy array and store it in a class variable. Then we use the @classmethod decorator to make a class method that gives access to the identity matrix. Class methods access class variables and methods through the cls parameter without the need to create an instance of the class itself. Our applications will be able to call these class methods directly on the class, for example Matrix.identity().

The identity() method returns a copy of the identity matrix instead of the matrix itself in order to prevent the original value from being changed. If we handed out the original array, our apps could change its values, either on purpose or by accident. And since the identity matrix is used by all objects in the scene, any changes to it would create all kinds of confusion in our apps! The stored array is not truly read-only, but because callers only ever receive copies, the original stays protected.
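We can check this behavior with a small sketch. It repeats a trimmed-down version of the class so that it runs on its own; the variable names m and fresh are just for this example.

```python
import numpy as np

class Matrix:
    # same pattern as above, trimmed to the identity matrix only
    __identity = np.array((
        (1, 0, 0, 0),
        (0, 1, 0, 0),
        (0, 0, 1, 0),
        (0, 0, 0, 1)
    )).astype(float)

    @classmethod
    def identity(cls):
        return cls.__identity.copy()

m = Matrix.identity()
m[0, 3] = 5.0              # change the copy we were given

fresh = Matrix.identity()  # ask for the identity matrix again
print(m[0, 3])             # the copy holds our change
print(fresh[0, 3])         # the stored matrix is unaffected
```

Changing m has no effect on later calls to Matrix.identity() because each call hands out a separate array.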

Note that when we create a NumPy array, all of its values must be the same type. So we fill each of our matrices with float values by calling the astype() method on each newly created array. Each of these matrix methods will return a NumPy ndarray.

Add the next code to the Matrix class for the translation matrix.

    @staticmethod
    def translation(x, y, z):
        """4D matrix for translating along vector <x, y, z>"""
        return np.array((
            (1, 0, 0, x),
            (0, 1, 0, y),
            (0, 0, 1, z),
            (0, 0, 0, 1)
        )).astype(float)

Here the @staticmethod decorator defines the translation method as a static method. Static methods are like class methods, except that they do not receive the class as a parameter and so do not access it. (Notice there is no cls parameter.) We can still call static methods directly on the class itself (for example, shift_matrix = Matrix.translation(1, 2, 3)).

The parameters x, y, and z are the components of the translation vector. We simply plug those values into their respective locations within a NumPy array and return it with all values converted to the float type.
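As a quick check of how this matrix behaves, the sketch below applies it to a point in homogeneous coordinates. The standalone translation function simply repeats the method body so the example runs by itself, and the point values are arbitrary.

```python
import numpy as np

def translation(x, y, z):
    """Same matrix as Matrix.translation, repeated so this sketch runs alone"""
    return np.array((
        (1, 0, 0, x),
        (0, 1, 0, y),
        (0, 0, 1, z),
        (0, 0, 0, 1)
    )).astype(float)

point = np.array((2.0, 0.0, 0.0, 1.0))  # homogeneous point with w = 1
moved = translation(1, 2, 3) @ point    # shift the point along <1, 2, 3>
print(moved)
```

Each component of the translation vector is added to the matching coordinate of the point, while the w component stays at 1.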

Next, add the following methods to the Matrix class for the rotation matrices around each of the three axes $x$, $y$, and $z$.

    @staticmethod
    def rotation_x(angle):
        """4D matrix for rotating around the x-axis by the given angle in radians"""
        c = cos(angle)
        s = sin(angle)
        return np.array((
            (1, 0,  0, 0),
            (0, c, -s, 0),
            (0, s,  c, 0),
            (0, 0,  0, 1)
        )).astype(float)

    @staticmethod
    def rotation_y(angle):
        """4D matrix for rotating around the y-axis by the given angle in radians"""
        c = cos(angle)
        s = sin(angle)
        return np.array((
            ( c, 0, s, 0),
            ( 0, 1, 0, 0),
            (-s, 0, c, 0),
            ( 0, 0, 0, 1)
        )).astype(float)

    @staticmethod
    def rotation_z(angle):
        """4D matrix for rotating around the z-axis by the given angle in radians"""
        c = cos(angle)
        s = sin(angle)
        return np.array((
            (c, -s, 0, 0),
            (s,  c, 0, 0),
            (0,  0, 1, 0),
            (0,  0, 0, 1)
        )).astype(float)

These methods all take the angle of rotation in radians. Then we can simply calculate sine and cosine of the angle before constructing a matrix with the appropriate values.
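For example, a quarter turn around the $z$-axis should carry a point on the positive $x$-axis onto the positive $y$-axis. Here is a minimal sketch of that check, with rotation_z repeated as a standalone function and math.radians used to convert from degrees.

```python
from math import sin, cos, radians
import numpy as np

def rotation_z(angle):
    """Same matrix as Matrix.rotation_z, repeated so this sketch runs alone"""
    c = cos(angle)
    s = sin(angle)
    return np.array((
        (c, -s, 0, 0),
        (s,  c, 0, 0),
        (0,  0, 1, 0),
        (0,  0, 0, 1)
    )).astype(float)

point = np.array((1.0, 0.0, 0.0, 1.0))       # a point on the positive x-axis
turned = rotation_z(radians(90)) @ point     # quarter turn, counterclockwise
print(np.round(turned, 6))                   # rounded to hide tiny float errors
```

The rounding is only for display, since cos of a quarter turn comes out as a very small number rather than exactly zero.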

Add the scale method to the Matrix class for the scaling matrix.

    @staticmethod
    def scale(r, s, t):
        """4D matrix for scaling dimensions x, y, and z by magnitudes r, s, and t respectively"""
        return np.array((
            (r, 0, 0, 0),
            (0, s, 0, 0),
            (0, 0, t, 0),
            (0, 0, 0, 1)
        )).astype(float)

Scaling can happen on any dimension, so we want to allow for scaling each dimension individually. Uniform scaling happens when all dimensions scale equally. In that case, we just need to use the same value for r, s, and t.
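The difference between uniform and non-uniform scaling can be sketched like this. The scale function repeats the method body so the example is self-contained, and the point values are arbitrary.

```python
import numpy as np

def scale(r, s, t):
    """Same matrix as Matrix.scale, repeated so this sketch runs alone"""
    return np.array((
        (r, 0, 0, 0),
        (0, s, 0, 0),
        (0, 0, t, 0),
        (0, 0, 0, 1)
    )).astype(float)

point = np.array((1.0, 2.0, 3.0, 1.0))

doubled = scale(2, 2, 2) @ point    # uniform: every dimension doubles
stretched = scale(1, 3, 1) @ point  # non-uniform: only y is tripled

print(doubled)
print(stretched)
```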

Finally, add the perspective method to the Matrix class for calculating the perspective projection matrix.

    @staticmethod
    def perspective(angle_of_view=60, aspect_ratio=1, near=0.1, far=1000):
        """4D matrix for a projection transformation to the given perspective"""
        a = angle_of_view * pi / 180.0
        d = 1.0 / tan(a/2)
        r = aspect_ratio
        b = (near + far) / (near - far)
        c = 2 * near * far / (near - far)
        return np.array((
            (d/r, 0,  0, 0),
            (  0, d,  0, 0),
            (  0, 0,  b, c),
            (  0, 0, -1, 0)
        )).astype(float)

The method definition provides values for a default perspective so we do not need to specify them in every app. The angle_of_view parameter is specified in degrees, so we need to convert it into radians, a, before we can calculate the distance between the projection window and the camera, d. The depth components b and c are calculated from the near clipping distance and far clipping distance as explained in the previous lesson. The aspect_ratio is just abbreviated as r for the sake of readability.
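One way to sanity-check the depth components is to project a point on the near clipping plane and another on the far clipping plane. After dividing by the w component, their depths should land at the two ends of the range from -1 to 1. The sketch below assumes the default parameters, and the helper to_ndc is a hypothetical function written only for this check.

```python
from math import tan, pi
import numpy as np

def perspective(angle_of_view=60, aspect_ratio=1, near=0.1, far=1000):
    """Same matrix as Matrix.perspective, repeated so this sketch runs alone"""
    a = angle_of_view * pi / 180.0
    d = 1.0 / tan(a / 2)
    r = aspect_ratio
    b = (near + far) / (near - far)
    c = 2 * near * far / (near - far)
    return np.array((
        (d / r, 0, 0, 0),
        (0, d, 0, 0),
        (0, 0, b, c),
        (0, 0, -1, 0)
    )).astype(float)

def to_ndc(clip):
    """Hypothetical helper: perspective division of x, y, z by w"""
    return clip[:3] / clip[3]

p = perspective()
on_near = to_ndc(p @ np.array((0.0, 0.0, -0.1, 1.0)))     # point at z = -near
on_far = to_ndc(p @ np.array((0.0, 0.0, -1000.0, 1.0)))   # point at z = -far
print(on_near[2], on_far[2])
```

OpenGL performs the division by w for us after the vertex shader runs, so we never write this step in our own apps.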

Now we have everything we need in the Matrix class. But before we can use it, we need to update our Uniform class to support 4x4 matrix data for uniform variables in our shader programs.

Updating the Uniform Class

Remember that our Uniform class manages a link between a piece of data in our application and a uniform variable in a shader program. The class uses the glUniform family of functions to upload that data based on its data type. Now that we want to use matrix data in our framework, we need to update the Uniform class to handle matrix values as well.

The shader language GLSL uses the mat4 data type for 4x4 matrices. We can use the glUniformMatrix4fv function to upload data for mat4 shader variables.

Try it!
Open the openGL.py file from your graphics/core folder and scroll down to the Uniform class.
Find the _VALID_TYPES variable inside the Uniform class and add ONLY the 'mat4' value to the tuple.

    _VALID_TYPES = ('int', 'bool', 'float', 'vec2', 'vec3', 'vec4', 'mat4')

Then, find the long if statement inside the upload_data method and add the following code at the end:

        elif self.data_type == "mat4":
            GL.glUniformMatrix4fv(self.variable_ref, 1, GL.GL_TRUE, self.data)

When calling the glUniformMatrix4fv function, the second parameter specifies the number of matrices to associate with the variable. For our Uniform objects, this will always be 1. The third parameter tells OpenGL that our matrix data is stored as an array of row vectors. If we ever give the data as an array of column vectors (spoiler alert: we won’t), then that parameter would be GL.GL_FALSE instead.
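The difference between the two storage orders can be seen without any OpenGL calls. In this sketch, which uses only NumPy, the example matrix and variable names are made up for illustration. NumPy lays out our arrays row by row by default, whereas OpenGL expects matrix data column by column unless the transpose parameter is GL_TRUE.

```python
import numpy as np

# a translation matrix with easily recognizable values in the last column
m = np.array((
    (1, 0, 0, 7),
    (0, 1, 0, 8),
    (0, 0, 1, 9),
    (0, 0, 0, 1)
)).astype(float)

row_major = m.flatten()              # rows one after another (NumPy default)
column_major = m.flatten(order="F")  # columns one after another

print(row_major[3])      # the x translation is the 4th value in row order
print(column_major[12])  # the same entry is the 13th value in column order
```

Passing GL.GL_TRUE asks OpenGL to do this reordering for us, so we can keep writing our matrices in the familiar row-by-row layout.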

A Test of Transformations

Now we are ready to build an app that uses our Matrix class for geometric transformations. The test app will be a 2D triangle that moves and rotates according to user input. On the left side of the keyboard, the WASD keys will control global translation while Q and E control global rotation. On the right side, the IJKL keys will control local translation while U and O control local rotation.

Our shader program will use two uniform mat4 variables: one for the projection matrix and one for the model matrix. Applying both of these matrices to the position vector will give us the object’s position on the projection window. The new vertex shader source code looks like this:

// GLSL version 330
in vec3 position;
uniform mat4 projectionMatrix;
uniform mat4 modelMatrix;
void main() {
    gl_Position = projectionMatrix * modelMatrix * vec4(position, 1.0);
}

Remember that the position vector won’t change while the program is running. Instead, all of the transformations for the object will apply to its modelMatrix. Then, the projectionMatrix will adjust the object’s vectors based on the perspective of the camera. This effectively makes shapes look smaller as they move away from the camera and bigger as they move closer.
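To get a feel for what the shader computes, we can mimic it on the CPU with NumPy. This is only a rough sketch: the project function is a hypothetical stand-in for the vertex shader followed by the perspective division that OpenGL performs afterwards, and the two depths are arbitrary example values.

```python
from math import tan, pi
import numpy as np

def translation(x, y, z):
    """Same matrix as Matrix.translation, repeated so this sketch runs alone"""
    return np.array((
        (1, 0, 0, x),
        (0, 1, 0, y),
        (0, 0, 1, z),
        (0, 0, 0, 1)
    )).astype(float)

def perspective(angle_of_view=60, aspect_ratio=1, near=0.1, far=1000):
    """Same matrix as Matrix.perspective, repeated so this sketch runs alone"""
    d = 1.0 / tan(angle_of_view * pi / 180.0 / 2)
    b = (near + far) / (near - far)
    c = 2 * near * far / (near - far)
    return np.array((
        (d / aspect_ratio, 0, 0, 0),
        (0, d, 0, 0),
        (0, 0, b, c),
        (0, 0, -1, 0)
    )).astype(float)

def project(position, model_matrix, projection_matrix):
    """Hypothetical CPU stand-in for the vertex shader plus perspective division"""
    clip = projection_matrix @ model_matrix @ np.append(position, 1.0)
    return clip[:3] / clip[3]

top_vertex = np.array((0.0, 0.3, 0.0))  # local position never changes
p = perspective()
near_y = project(top_vertex, translation(0, 0, -5), p)[1]
far_y = project(top_vertex, translation(0, 0, -10), p)[1]
print(near_y, far_y)
```

The same local vertex ends up half as high on the screen when the model matrix places the object twice as far from the camera.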

Try it!
In your main working folder, create a new file called test_7.py.
Open test_7.py for editing and add the following code:

# test_7.py
from math import pi
import OpenGL.GL as GL

from graphics.core.app import WindowApp
from graphics.core.opengl_utils import initialize_program
from graphics.core.openGL import Attribute, Uniform
from graphics.core.matrix import Matrix

class Test_7(WindowApp):
    """Tests geometric transformations by moving a triangle around the screen"""

    def startup(self):
        print("Starting up Test 7...")

        vs_code = """
        in vec3 position;
        uniform mat4 projectionMatrix;
        uniform mat4 modelMatrix;
        void main() {
            gl_Position = projectionMatrix * modelMatrix * vec4(position, 1.0);
        }
        """

        fs_code = """
        out vec4 fragColor;
        void main() {
            fragColor = vec4(1.0, 1.0, 0.0, 1.0);
        }
        """

        self.program_ref = initialize_program(vs_code, fs_code)

Next, we create an Attribute object for the position data and two Uniform objects for the two matrices. We use Matrix.translation once to set the initial value of the model matrix to the triangle’s initial position. Our update method later will also use Matrix.translation and Matrix.rotation_z to update the model matrix whenever the user moves or rotates the triangle. The camera itself stays stationary, so we call Matrix.perspective just once here in our startup method.

Inside the startup method of test_7.py, add the following code:

        # one VAO for the singular triangle
        vao_ref = GL.glGenVertexArrays(1)
        GL.glBindVertexArray(vao_ref)

        # triangle vertices are local to the object and do not change
        position_data = ( 
            ( 0.0,  0.3, 0.0 ),
            ( 0.2, -0.3, 0.0 ),
            (-0.2, -0.3, 0.0 )
        )
        self.vertex_count = len(position_data)
        position_attribute = Attribute("vec3", position_data)
        position_attribute.associate_variable(self.program_ref, "position")

        # the model matrix
        m_matrix = Matrix.translation(0, 0, -5)
        self.model_matrix = Uniform("mat4", m_matrix)
        self.model_matrix.locate_variable(self.program_ref, "modelMatrix")

        # the perspective matrix
        p_matrix = Matrix.perspective()
        self.projection_matrix = Uniform("mat4", p_matrix)
        self.projection_matrix.locate_variable(self.program_ref, "projectionMatrix")

        # movement speed in world units per second
        self.move_speed = 1.0
        # rotation speed in radians per second
        self.turn_speed = pi / 2

        # render settings
        GL.glClearColor(0.0, 0.0, 0.0, 1.0)
        GL.glEnable(GL.GL_DEPTH_TEST)

        GL.glUseProgram(self.program_ref)

Our triangle’s shape will be taller than it is wide so that we can see its orientation as it rotates around the screen. Since the camera is located at the origin, we would not be able to see the triangle if it were also on the $z=0$ plane. So we move the triangle in front of the camera along the $z$-axis to $z=-5$ by setting its initial model matrix to a translation matrix.

The movement speed is set to units in world space. This means that the greater the distance between the object and the camera, the slower it appears to move. On the other hand, the rotation speed is not influenced by the object’s distance from the camera, but it does appear to speed up as the object moves away from the center of rotation (that is, the $z$-axis). This will be clear when we compare local rotation (with the $z$-axis at the center of the triangle) to global rotation (with the $z$-axis at the center of the screen).
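The effect of distance from the axis of rotation can be checked with a short calculation. In this sketch, the frame rate of 60 frames per second and the two sample points are assumptions made for illustration, and distance_moved is a hypothetical helper.

```python
from math import sin, cos, pi
import numpy as np

def rotation_z(angle):
    """Same matrix as Matrix.rotation_z, repeated so this sketch runs alone"""
    c = cos(angle)
    s = sin(angle)
    return np.array((
        (c, -s, 0, 0),
        (s,  c, 0, 0),
        (0,  0, 1, 0),
        (0,  0, 0, 1)
    )).astype(float)

# one frame of rotation at pi/2 radians per second, assuming 60 frames per second
turn_amount = (pi / 2) / 60

def distance_moved(point):
    """Hypothetical helper: how far a point travels in one frame of rotation"""
    return np.linalg.norm(rotation_z(turn_amount) @ point - point)

close_to_axis = distance_moved(np.array((0.3, 0.0, -5.0, 1.0)))
far_from_axis = distance_moved(np.array((3.0, 0.0, -5.0, 1.0)))
print(far_from_axis / close_to_axis)
```

Both points turn through the same angle, but the point ten times farther from the $z$-axis covers ten times the distance in the same frame.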

Towards the end of the startup method, we use glEnable to enable OpenGL’s depth testing feature. Since we are rendering a 3D scene, we need depth testing to determine whether objects in the scene will block each other from view. This is an unnecessary calculation when there is only one object in the scene, but we turn it on now in case we want to add more objects to the scene later.

Now add the update method to the Test_7 class with the following code:

    def update(self):
        # first, get changes to position
        move_amount = self.move_speed * self.delta_time
        turn_amount = self.turn_speed * self.delta_time

As we did in the Animations lesson, we first calculate distances based on the time that has passed between frames. This time we also calculate a rotation amount, which is the angle in radians that the triangle should turn during this frame.
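As a small worked example, suppose every frame takes exactly 1/60 of a second. That frame time is an assumption for illustration only, since the real delta_time varies from frame to frame.

```python
from math import pi, isclose

move_speed = 1.0     # world units per second
turn_speed = pi / 2  # radians per second
delta_time = 1 / 60  # assumed time between frames, in seconds

# how far to move and turn during a single frame
move_amount = move_speed * delta_time
turn_amount = turn_speed * delta_time

# over 60 such frames, the per-frame amounts add up to the per-second speeds
total_move = move_amount * 60
total_turn = turn_amount * 60

print(isclose(total_move, move_speed))
print(isclose(total_turn, turn_speed))
```

Scaling by delta_time is what keeps the triangle moving at the same speed regardless of how fast the computer renders frames.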

Next, add code to the update method for handling translations in the global context.

        # global translations
        # "w" is upward movement in the positive y direction
        if self.input.is_key_pressed("w"):
            mat = Matrix.translation(0, move_amount, 0)
            self.model_matrix.data = mat @ self.model_matrix.data
        # "a" is leftward movement in the negative x direction
        if self.input.is_key_pressed("a"):
            mat = Matrix.translation(-move_amount, 0, 0)
            self.model_matrix.data = mat @ self.model_matrix.data
        # "s" is downward movement in the negative y direction
        if self.input.is_key_pressed("s"):
            mat = Matrix.translation(0, -move_amount, 0)
            self.model_matrix.data = mat @ self.model_matrix.data
        # "d" is rightward movement in the positive x direction
        if self.input.is_key_pressed("d"):
            mat = Matrix.translation(move_amount, 0, 0)
            self.model_matrix.data = mat @ self.model_matrix.data

Similar to our Test 4-6 application, we move the triangle in the direction specified by the key press. Here, we make a translation matrix and then multiply it by the existing model matrix to get a new model matrix. We can use the @ operator to compose the matrices since they are NumPy arrays.

Now add code for handling global rotations to the update method.

        # global rotations
        # "q" is counterclockwise rotation around the world z-axis
        if self.input.is_key_pressed("q"):
            mat = Matrix.rotation_z(turn_amount)
            self.model_matrix.data = mat @ self.model_matrix.data
        # "e" is clockwise rotation around the world z-axis
        if self.input.is_key_pressed("e"):
            mat = Matrix.rotation_z(-turn_amount)
            self.model_matrix.data = mat @ self.model_matrix.data

This code is similar to global translations except we use Matrix.rotation_z to get a rotation matrix instead. The Q key rotates in the positive (counterclockwise) direction and E rotates in the negative (clockwise) direction.

After that, add code for handling local translations to the update method.

        # local translations
        # "i" is movement in the triangle's positive y direction
        if self.input.is_key_pressed("i"):
            mat = Matrix.translation(0, move_amount, 0)
            self.model_matrix.data = self.model_matrix.data @ mat
        # "j" is movement in the triangle's negative x direction
        if self.input.is_key_pressed("j"):
            mat = Matrix.translation(-move_amount, 0, 0)
            self.model_matrix.data = self.model_matrix.data @ mat
        # "k" is movement in the triangle's negative y direction
        if self.input.is_key_pressed("k"):
            mat = Matrix.translation(0, -move_amount, 0)
            self.model_matrix.data = self.model_matrix.data @ mat
        # "l" is movement in the triangle's positive x direction
        if self.input.is_key_pressed("l"):
            mat = Matrix.translation(move_amount, 0, 0)
            self.model_matrix.data = self.model_matrix.data @ mat

Here we handle local coordinates instead of global coordinates. As discussed in the previous lesson, the only difference is the order in which we compose our matrices. The order of composition for local transformations is the reverse of that for global transformations, so the new translation matrix goes on the right side of the product. This way the translation acts on the vertices first, inside the triangle’s own coordinate system, and the existing model matrix is applied after it.
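The effect of the composition order is easiest to see on a triangle that has already been rotated. In the following sketch, the starting model matrix is a quarter turn around the $z$-axis, and we compare moving one unit "up" globally and locally. The helper functions repeat the Matrix methods so the example runs alone, and the variable names are for illustration only.

```python
from math import sin, cos, pi
import numpy as np

def translation(x, y, z):
    """Same matrix as Matrix.translation, repeated so this sketch runs alone"""
    return np.array((
        (1, 0, 0, x),
        (0, 1, 0, y),
        (0, 0, 1, z),
        (0, 0, 0, 1)
    )).astype(float)

def rotation_z(angle):
    """Same matrix as Matrix.rotation_z, repeated so this sketch runs alone"""
    c = cos(angle)
    s = sin(angle)
    return np.array((
        (c, -s, 0, 0),
        (s,  c, 0, 0),
        (0,  0, 1, 0),
        (0,  0, 0, 1)
    )).astype(float)

model = rotation_z(pi / 2)   # triangle already turned a quarter turn
step = translation(0, 1, 0)  # one unit in the positive y direction
origin = np.array((0.0, 0.0, 0.0, 1.0))

global_pos = (step @ model) @ origin  # global: new matrix on the left
local_pos = (model @ step) @ origin   # local: new matrix on the right

print(np.round(global_pos[:3], 6))  # moved up the screen
print(np.round(local_pos[:3], 6))   # moved along the triangle's own y-axis
```

With the global order, the triangle’s center moves up the world $y$-axis. With the local order, it moves along the triangle’s own $y$-axis, which now points in the negative $x$ direction of the world.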

Now add the final code for handling local rotations to the update method.

        # local rotations
        # "u" is counterclockwise rotation around the triangle's center
        if self.input.is_key_pressed("u"):
            mat = Matrix.rotation_z(turn_amount)
            self.model_matrix.data = self.model_matrix.data @ mat
        # "o" is clockwise rotation around the triangle's center
        if self.input.is_key_pressed("o"):
            mat = Matrix.rotation_z(-turn_amount)
            self.model_matrix.data = self.model_matrix.data @ mat

As local transformations, these rotations will apply to the object’s local coordinate axes. This effectively makes the triangle spin in place around its own center when pressing the U and O keys.
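We can confirm the "spin in place" behavior numerically by tracking where the triangle’s center ends up. This sketch starts with a triangle shifted two units to the right and applies the same quarter turn in both orders. As before, the helper functions repeat the Matrix methods, and the starting position is an arbitrary example.

```python
from math import sin, cos, pi
import numpy as np

def translation(x, y, z):
    """Same matrix as Matrix.translation, repeated so this sketch runs alone"""
    return np.array((
        (1, 0, 0, x),
        (0, 1, 0, y),
        (0, 0, 1, z),
        (0, 0, 0, 1)
    )).astype(float)

def rotation_z(angle):
    """Same matrix as Matrix.rotation_z, repeated so this sketch runs alone"""
    c = cos(angle)
    s = sin(angle)
    return np.array((
        (c, -s, 0, 0),
        (s,  c, 0, 0),
        (0,  0, 1, 0),
        (0,  0, 0, 1)
    )).astype(float)

model = translation(2, 0, 0)  # triangle sits two units right of the world z-axis
turn = rotation_z(pi / 2)     # quarter turn, counterclockwise
center = np.array((0.0, 0.0, 0.0, 1.0))  # the triangle's local origin

global_center = (turn @ model) @ center  # global: center orbits the world z-axis
local_center = (model @ turn) @ center   # local: center stays where it was

print(np.round(global_center[:3], 6))
print(np.round(local_center[:3], 6))
```

The global rotation swings the center around to the world $y$-axis, while the local rotation leaves the center in place and only changes which way the triangle points.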

Finally, add code for clearing the screen and drawing with our matrix data at the end of the update method.

        # reset the color buffer and the depth buffer
        GL.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT)

        # draw the scene
        self.projection_matrix.upload_data()
        self.model_matrix.upload_data()
        GL.glDrawArrays(GL.GL_TRIANGLES, 0, self.vertex_count)

# initialize and run this test
Test_7().run()

Just as we reset the color buffer for every frame, we also want to reset the depth buffer when depth testing is enabled. Then we upload the data for both the model matrix and the projection matrix before drawing the triangle. Don’t forget to also include the last line for running this test app!

Save the file and run it with the python test_7.py command in the terminal.
Confirm that you can see a yellow triangle on the screen.
Test that the WASD keys move the triangle globally.
Test that the Q and E keys rotate the triangle globally.
Test that the IJKL keys move the triangle locally.
Test that the U and O keys rotate the triangle locally.

Now that we have geometric transformations built into our framework, we can start thinking about 3D objects. Next time we will set up the basic components for rendering a scene with multiple 3D objects in it. Look forward to it!
