7_Enter the Matrix

In this lesson, we build a Matrix class that applies what we learned about calculating geometric transformations and integrate it with our CG framework.

Now that we have learned about the usefulness of matrix calculations for geometric transformations, let’s add a new component to our CG framework that provides matrices for transformations of objects in our apps. Our framework will always use 4x4 matrices so that we can easily handle both 2D and 3D graphics. When rendering in 2D, we simply use a consistent value for all the $z$ coordinates so everything renders on the same plane.

The Matrix Class

Our Matrix class will have static methods that return different transformation matrices from the given parameters. With static methods, we do not need to create an instance of the Matrix class or manage state variables. Simply calling these methods will give us the matrix data we need to do geometric transformations.

Try it!
In your graphics/core folder, create a new file called matrix.py.
Open matrix.py for editing and add the following code:

# graphics/core/matrix.py
from math import sin, cos, tan, pi
import numpy as np

We first import the math functions for sine, cosine, and tangent to use when calculating our matrices, and the constant $\pi$ (pi) for converting view angles to radians. We also use the numpy.array function to create NumPy arrays for each of our matrices. Compared with nested Python lists, this gives us the advantages of lower memory use, shorter processing times, and access to the @ operator for easy matrix multiplication. Instead of unpacking each matrix and calculating the product manually, we can simply multiply matrices as matrix1 @ matrix2.
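To make the last point concrete, here is a minimal sketch that compares the @ operator with a product calculated by hand. The two 2x2 matrices are made-up values for illustration only; our framework will use 4x4 matrices.

```python
import numpy as np

# two small matrices with made-up values, stored as NumPy arrays
matrix1 = np.array(((1, 2), (3, 4))).astype(float)
matrix2 = np.array(((0, 1), (1, 0))).astype(float)

# matrix product using the @ operator
product = matrix1 @ matrix2

# the same product calculated manually, one entry at a time
manual = [[sum(matrix1[i][k] * matrix2[k][j] for k in range(2))
           for j in range(2)] for i in range(2)]

# careful: the * operator multiplies entry by entry, which is NOT a matrix product
elementwise = matrix1 * matrix2

print(product)                        # rows (2, 1) and (4, 3)
print(np.allclose(product, manual))   # True
print(elementwise)                    # rows (0, 2) and (3, 0)
```

Note that the * operator on NumPy arrays gives an entry-by-entry product, so it is easy to get wrong results by using * where @ was intended.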

Add the following code to matrix.py for defining the class and its method which gives the identity matrix.

class Matrix:
    """Provides four-dimensional matrices for various geometric transformations"""

    # the 4D identity matrix
    __identity = np.array((
        (1, 0, 0, 0),
        (0, 1, 0, 0),
        (0, 0, 1, 0),
        (0, 0, 0, 1)
    )).astype(float)

    @classmethod
    def identity(cls):
        """A copy of the 4D identity matrix"""
        return cls.__identity.copy()

Here we create the identity matrix as a NumPy array and store it in a class variable. Then we use the @classmethod decorator to make a class method that gives access to the identity matrix. Class methods access class variables and methods through the cls parameter without creating an instance of the class. Our applications will be able to call these class methods directly on the class itself, for example Matrix.identity().

We are using a class variable and a class method that returns copies of the identity matrix in order to prevent the original value from being changed. If we do not give a copy of the matrix, then the method would give a reference to the value stored in the class. Then our apps would be able to accidentally change the original identity matrix itself. That would cause all kinds of confusion in our apps! So we make the matrix read-only using this approach.
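The difference between handing out a reference and handing out a copy can be seen in a small sketch. The Holder class and its two methods below are hypothetical stand-ins written only for this illustration; they are not part of our framework.

```python
import numpy as np

class Holder:
    """Hypothetical stand-in for the Matrix class, for illustration only"""
    _stored = np.array(((1, 0), (0, 1))).astype(float)

    @classmethod
    def as_reference(cls):
        return cls._stored          # hands out the stored array itself

    @classmethod
    def as_copy(cls):
        return cls._stored.copy()   # hands out an independent copy

safe = Holder.as_copy()
safe[0][0] = 99                     # only the copy changes
unchanged = Holder.as_copy()[0][0]  # the stored value is still 1.0

leaky = Holder.as_reference()
leaky[0][0] = 99                    # this changes the stored class variable!
changed = Holder.as_copy()[0][0]    # the stored value is now 99.0

print(unchanged, changed)           # 1.0 99.0
```

With the reference version, one careless assignment anywhere in an app would corrupt the matrix for every later caller, which is the situation our identity method avoids.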

Note that when we create a NumPy array, all of its values must be the same type. So we will fill each of our matrices with float values by calling the astype() method on each newly created array. The return type is a NumPy ndarray.
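As a quick sketch of why the conversion matters, the example below checks the data type of a few small arrays. The variable names are made up for this illustration.

```python
import numpy as np

whole = np.array(((1, 0), (0, 1)))      # only integers were given
mixed = np.array(((1, 0.5), (0, 1)))    # a single float promotes every value
converted = whole.astype(float)         # explicit conversion to float

print(whole.dtype.kind)    # i  (some integer type)
print(mixed.dtype)         # float64
print(converted.dtype)     # float64
```

Without astype(float), a matrix written with only whole numbers (like the identity matrix) would hold integers, while a matrix containing sines and cosines would hold floats. Converting every matrix keeps the data type consistent.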

Add the next code to the Matrix class which creates a translation matrix.

    @staticmethod
    def translation(x, y, z):
        """4D matrix for translating along vector <x, y, z>"""
        return np.array((
            (1, 0, 0, x),
            (0, 1, 0, y),
            (0, 0, 1, z),
            (0, 0, 0, 1)
        )).astype(float)

Here the @staticmethod decorator defines the translation method as a static method. Static methods are like class methods, except that they do not access the class. (Notice there is no cls parameter.) We can still call static methods directly on the class itself (for example, shift_matrix = Matrix.translation(1, 2, 3)).

The parameters x, y, and z are the components of the translation vector. We simply plug those values into their respective locations within a numpy array representing the translation matrix and return it with all values converted to the float type.
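To see the translation matrix at work, here is a small sketch that applies it to a point and to a direction. The standalone translation function simply repeats the method above so that the example runs on its own; the sample coordinates are made up.

```python
import numpy as np

def translation(x, y, z):
    """Same layout as the Matrix.translation method above"""
    return np.array((
        (1, 0, 0, x),
        (0, 1, 0, y),
        (0, 0, 1, z),
        (0, 0, 0, 1)
    )).astype(float)

# a point at (2, 0, -1) in homogeneous coordinates (w = 1)
point = np.array((2, 0, -1, 1)).astype(float)
# a direction uses w = 0, so translation has no effect on it
direction = np.array((2, 0, -1, 0)).astype(float)

shift_matrix = translation(1, 2, 3)
moved_point = shift_matrix @ point
moved_direction = shift_matrix @ direction

print(moved_point)       # [3. 2. 2. 1.]
print(moved_direction)   # [ 2.  0. -1.  0.]
```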

Next, add the following methods to the Matrix class that create matrices for rotations around each of the three axes $x$, $y$, and $z$.

    @staticmethod
    def rotation_x(angle):
        """4D matrix for rotating around the x-axis by the given angle in radians"""
        c = cos(angle)
        s = sin(angle)
        return np.array((
            (1, 0,  0, 0),
            (0, c, -s, 0),
            (0, s,  c, 0),
            (0, 0,  0, 1)
        )).astype(float)

    @staticmethod
    def rotation_y(angle):
        """4D matrix for rotating around the y-axis by the given angle in radians"""
        c = cos(angle)
        s = sin(angle)
        return np.array((
            ( c, 0, s, 0),
            ( 0, 1, 0, 0),
            (-s, 0, c, 0),
            ( 0, 0, 0, 1)
        )).astype(float)

    @staticmethod
    def rotation_z(angle):
        """4D matrix for rotating around the z-axis by the given angle in radians"""
        c = cos(angle)
        s = sin(angle)
        return np.array((
            (c, -s, 0, 0),
            (s,  c, 0, 0),
            (0,  0, 1, 0),
            (0,  0, 0, 1)
        )).astype(float)

These methods all take the angle of rotation in radians. Then we can simply calculate sine and cosine of the angle before constructing a matrix with the appropriate values.
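As a quick check of the rotation direction, the sketch below rotates a point on the positive $x$-axis by a quarter turn around the $z$-axis. The standalone rotation_z function repeats the method above so the example runs on its own.

```python
from math import sin, cos, pi
import numpy as np

def rotation_z(angle):
    """Same layout as the Matrix.rotation_z method above"""
    c = cos(angle)
    s = sin(angle)
    return np.array((
        (c, -s, 0, 0),
        (s,  c, 0, 0),
        (0,  0, 1, 0),
        (0,  0, 0, 1)
    )).astype(float)

# a point on the positive x-axis
point = np.array((1, 0, 0, 1)).astype(float)

# a quarter turn is 90 degrees, or pi / 2 radians
turned = rotation_z(pi / 2) @ point

# the point lands on the positive y-axis, up to floating-point rounding
print(np.allclose(turned, (0, 1, 0, 1)))   # True
```

A positive angle moves the point from the $x$-axis toward the $y$-axis, which is a counterclockwise turn when the $z$-axis points toward the viewer.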

Add the scale method to the Matrix class for scaling transformations.

    @staticmethod
    def scale(r, s, t):
        """4D matrix for scaling dimensions x, y, and z by magnitudes r, s, and t respectively"""
        return np.array((
            (r, 0, 0, 0),
            (0, s, 0, 0),
            (0, 0, t, 0),
            (0, 0, 0, 1)
        )).astype(float)

Scaling can happen on any dimension, so we want to allow for scaling each dimension individually. Uniform scaling happens when all dimensions scale equally. In that case, we just need to use the same value for r, s, and t.
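The difference between uniform and non-uniform scaling can be sketched as follows. The standalone scale function repeats the method above, and the sample point is made up for illustration.

```python
import numpy as np

def scale(r, s, t):
    """Same layout as the Matrix.scale method above"""
    return np.array((
        (r, 0, 0, 0),
        (0, s, 0, 0),
        (0, 0, t, 0),
        (0, 0, 0, 1)
    )).astype(float)

point = np.array((1, 2, 3, 1)).astype(float)

# uniform scaling: every dimension doubles
doubled = scale(2, 2, 2) @ point
# non-uniform scaling: only the y dimension is stretched
stretched = scale(1, 3, 1) @ point

print(doubled)     # [2. 4. 6. 1.]
print(stretched)   # [1. 6. 3. 1.]
```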

Finally, add the perspective method to the Matrix class for calculating the perspective projection matrix.

    @staticmethod
    def perspective(angle_of_view=60, aspect_ratio=1, near=0.1, far=1000):
        """4D matrix for a projection transformation to the given perspective"""
        a = angle_of_view * pi / 180.0
        d = 1.0 / tan(a/2)
        r = aspect_ratio
        b = (near + far) / (near - far)
        c = 2 * near * far / (near - far)
        return np.array((
            (d/r, 0,  0, 0),
            (  0, d,  0, 0),
            (  0, 0,  b, c),
            (  0, 0, -1, 0)
        )).astype(float)

This method definition provides values for a default perspective so we do not need to specify them in every app. The angle_of_view parameter should have a value in degrees, so we first convert it into radians (a). Then we can calculate d, the distance between the projection window and the camera. The depth components b and c are calculated from the near clipping distance and far clipping distance as explained in the previous lesson. The aspect_ratio is just abbreviated as r for the sake of readability.
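One way to sanity-check the depth components is to project points that sit exactly on the near and far clipping planes. In the sketch below, the standalone perspective function repeats the method above, and to_ndc is a hypothetical helper that divides by the $w$ component, the step OpenGL performs after the vertex shader runs.

```python
from math import tan, pi
import numpy as np

def perspective(angle_of_view=60, aspect_ratio=1, near=0.1, far=1000):
    """Same calculation as the Matrix.perspective method above"""
    a = angle_of_view * pi / 180.0
    d = 1.0 / tan(a / 2)
    r = aspect_ratio
    b = (near + far) / (near - far)
    c = 2 * near * far / (near - far)
    return np.array((
        (d / r, 0,  0, 0),
        (0,     d,  0, 0),
        (0,     0,  b, c),
        (0,     0, -1, 0)
    )).astype(float)

def to_ndc(matrix, point):
    """Hypothetical helper: applies the matrix, then divides by w"""
    clip = matrix @ np.array((*point, 1.0))
    return clip[:3] / clip[3]

p_matrix = perspective()
on_near = to_ndc(p_matrix, (0, 0, -0.1))    # a point on the near plane
on_far = to_ndc(p_matrix, (0, 0, -1000))    # a point on the far plane

print(round(on_near[2], 6))   # -1.0
print(round(on_far[2], 6))    # 1.0
```

After the division, depth values between the two clipping planes fall in the range from -1 to 1, which is the range OpenGL keeps when it clips the scene.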

Now we have everything we need in the Matrix class. But before we can use it, we need to update our Uniform class to support 4x4 matrix data for uniform variables in our shader programs.

Updating the `Uniform` Class

Remember that our Uniform class manages a link between a piece of data and a uniform variable in a shader program. The class uses glUniform functions to assign data based on its data type. Now we want to use matrix data in our framework, so we need to update the Uniform class to associate matrix data with variables as well.

The shader language GLSL uses the mat4 data type for 4x4 matrices. We can use the glUniformMatrix4fv function to upload data for mat4 shader variables.

Try it!
Open the openGL.py file from your graphics/core folder and scroll down to the Uniform class.
Find the _VALID_TYPES variable inside the Uniform class and add ONLY the 'mat4' value to the tuple.

    _VALID_TYPES = ('int', 'bool', 'float', 'vec2', 'vec3', 'vec4', 'mat4')

Then, find the long if statement inside the upload_data method and add the following code at the end:

        elif self.data_type == "mat4":
            GL.glUniformMatrix4fv(self.variable_ref, 1, GL.GL_TRUE, self.data)

When calling the glUniformMatrix4fv function, the second parameter specifies the number of matrices to associate with the variable. For our Uniform objects, this will always be 1. The third parameter is the transpose flag. Setting it to GL.GL_TRUE tells OpenGL that our matrix data is stored as an array of row vectors (row-major order), so OpenGL transposes the data as it reads it. If we ever give the data as an array of column vectors (spoiler alert: we won’t), then that parameter would be GL.GL_FALSE instead.
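The sketch below shows the two orderings side by side using NumPy alone, with no OpenGL calls. It assumes a translation matrix built the same way as Matrix.translation, and the variable names are made up for this illustration.

```python
import numpy as np

# a translation matrix for the vector <1, 2, 3>, written row by row
m = np.array((
    (1, 0, 0, 1),
    (0, 1, 0, 2),
    (0, 0, 1, 3),
    (0, 0, 0, 1)
)).astype(float)

# row-major order: one row after another, the way we write our matrices
row_major = m.flatten()
# column-major order: one column after another, OpenGL's default expectation
column_major = m.T.flatten()

print(row_major)      # x, y, z appear at positions 3, 7, and 11
print(column_major)   # x, y, z appear together at positions 12, 13, and 14
```

If we passed row-major data with the flag set to GL.GL_FALSE, OpenGL would read each of our rows as a column, and the shader would receive the transpose of the matrix we intended.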

A Test of Transformations

Now we are ready to build an app that uses our Matrix class for geometric transformations. The test app will be a 2D triangle that moves and rotates according to user input. On the left side of the keyboard, the WASD keys will control global translation while Q and E control global rotation. On the right side, the IJKL keys will control local translation while U and O control local rotation.

Our shader program will use two uniform mat4 variables: one for the projection matrix and one for the model matrix. Applying both of these matrices to the position vector will give us the object’s position on the projection window. The new vertex shader source code looks like this:

// GLSL version 330
in vec3 position;
uniform mat4 projectionMatrix;
uniform mat4 modelMatrix;
void main() {
    gl_Position = projectionMatrix * modelMatrix * vec4(position, 1.0);
}

Remember that the position vector won’t change while the program is running. Instead, all of the transformations for the object will apply to its modelMatrix. Then, the projectionMatrix will adjust the object’s vectors based on the perspective of the camera. This effectively makes shapes look smaller as they move away from the camera and bigger as they move closer.
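To get a feel for what the shader computes, we can imitate its multiplication in NumPy and compare the triangle's top vertex at two distances from the camera. This is only a sketch: the translation and perspective functions repeat the Matrix methods above, and project is a hypothetical helper that also performs the division by $w$ that OpenGL applies after the vertex shader.

```python
from math import tan, pi
import numpy as np

def translation(x, y, z):
    """Same layout as the Matrix.translation method"""
    return np.array(((1, 0, 0, x), (0, 1, 0, y),
                     (0, 0, 1, z), (0, 0, 0, 1))).astype(float)

def perspective(angle_of_view=60, aspect_ratio=1, near=0.1, far=1000):
    """Same calculation as the Matrix.perspective method"""
    d = 1.0 / tan(angle_of_view * pi / 180.0 / 2)
    b = (near + far) / (near - far)
    c = 2 * near * far / (near - far)
    return np.array(((d / aspect_ratio, 0, 0, 0), (0, d, 0, 0),
                     (0, 0, b, c), (0, 0, -1, 0))).astype(float)

def project(p_matrix, m_matrix, position):
    """Hypothetical helper: the shader's product, followed by division by w"""
    clip = p_matrix @ m_matrix @ np.array((*position, 1.0))
    return clip[:3] / clip[3]

top_vertex = (0.0, 0.3, 0.0)
p_matrix = perspective()

# the same vertex with the triangle 5 units and 10 units from the camera
near_top = project(p_matrix, translation(0, 0, -5), top_vertex)
far_top = project(p_matrix, translation(0, 0, -10), top_vertex)

print(round(near_top[1] / far_top[1], 6))   # 2.0
```

Doubling the distance from the camera halves the vertex's height on the projection window, which is why distant shapes look smaller.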

Try it!
In your main working folder, create a new file called test_7.py.
Open test_7.py for editing and add the following code:

# test_7.py
from math import pi
import OpenGL.GL as GL

from graphics.core.app import WindowApp
from graphics.core.openGLUtils import initialize_program
from graphics.core.openGL import Attribute, Uniform
from graphics.core.matrix import Matrix

class Test_7(WindowApp):
    """Tests geometric transformations by moving a triangle around the screen"""

    def startup(self):
        print("Starting up Test 7...")

        vs_code = """
        in vec3 position;
        uniform mat4 projectionMatrix;
        uniform mat4 modelMatrix;
        void main() {
            gl_Position = projectionMatrix * modelMatrix * vec4(position, 1.0);
        }
        """

        fs_code = """
        out vec4 fragColor;
        void main() {
            fragColor = vec4(1.0, 1.0, 0.0, 1.0);
        }
        """

        self.program_ref = initialize_program(vs_code, fs_code)

Next, we create an Attribute object for the position data and two Uniform objects for the two matrices. We use Matrix.translation once to set the initial value of the model matrix to the triangle’s initial position. Our update method later will also use Matrix.translation and Matrix.rotation_z to update the model matrix whenever the user moves or rotates the triangle. The camera itself stays stationary, so we call Matrix.perspective just once here in our startup method.

Inside the startup method of test_7.py, add the following code:

        # one VAO for the singular triangle
        vao_ref = GL.glGenVertexArrays(1)
        GL.glBindVertexArray(vao_ref)

        # triangle vertices are local to the object and do not change
        position_data = ( 
            ( 0.0,  0.3, 0.0 ),
            ( 0.2, -0.3, 0.0 ),
            (-0.2, -0.3, 0.0 )
        )
        self.vertex_count = len(position_data)
        position_attribute = Attribute("vec3", position_data)
        position_attribute.associate_variable(self.program_ref, "position")

        # the model matrix
        m_matrix = Matrix.translation(0, 0, -5)
        self.model_matrix = Uniform("mat4", m_matrix)
        self.model_matrix.locate_variable(self.program_ref, "modelMatrix")

        # the perspective matrix
        p_matrix = Matrix.perspective()
        self.projection_matrix = Uniform("mat4", p_matrix)
        self.projection_matrix.locate_variable(self.program_ref, "projectionMatrix")

        # movement speed in world units per second
        self.move_speed = 1.0
        # rotation speed in radians per second
        self.turn_speed = pi / 2

        # render settings
        GL.glClearColor(0.0, 0.0, 0.0, 1.0)
        GL.glEnable(GL.GL_DEPTH_TEST)

        GL.glUseProgram(self.program_ref)

Our triangle’s shape will be taller than it is wide so that we can see its orientation as it rotates around the screen. Since the camera is located at the origin, we would not be able to see the triangle if it were also on the $z=0$ plane. So we move the triangle in front of the camera by setting its initial model matrix to a translation matrix that moves the triangle down the $z$-axis to $z=-5$.

The movement speed is measured in world-space units per second. This means that the greater the distance between the object and the camera, the slower it appears to move. On the other hand, the rotation speed is not influenced by the object’s distance from the camera, but the object does appear to move faster the farther it is from the center of rotation (that is, the $z$-axis). This will be clear when we compare local rotation (with the $z$-axis at the center of the triangle) to global rotation (with the $z$-axis at the center of the screen).

Towards the end of the startup method, we use glEnable to enable OpenGL’s depth testing feature. Since we are rendering a 3D scene, we need depth testing to determine whether objects in the scene will block each other from view. This is an unnecessary calculation when there is only one object in the scene, but we turn it on now in case we want to add more objects to the scene later.

Now add the update method to the Test_7 class with the following code:

    def update(self):
        # first, get changes to position
        move_amount = self.move_speed * self.delta_time
        turn_amount = self.turn_speed * self.delta_time

As we did in the Animations lesson, we first calculate distances based on the time that has passed between frames. This time we also have a rotation distance calculated as the size of the rotation angle in radians.
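The following sketch shows why scaling by the frame time keeps the motion consistent. It assumes that delta_time is measured in seconds and that the frame rate is perfectly steady; totals_after_one_second is a hypothetical helper written only for this illustration.

```python
from math import pi

move_speed = 1.0        # world units per second
turn_speed = pi / 2     # radians per second

def totals_after_one_second(frames_per_second):
    """Adds up the per-frame amounts over one second of steady frames"""
    delta_time = 1.0 / frames_per_second
    moved = 0.0
    turned = 0.0
    for _ in range(frames_per_second):
        moved += move_speed * delta_time
        turned += turn_speed * delta_time
    return moved, turned

slow = totals_after_one_second(30)
fast = totals_after_one_second(60)

# both frame rates cover about 1 world unit and a quarter turn per second
print(round(slow[0], 6), round(fast[0], 6))   # 1.0 1.0
```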

Next, add code for handling translations in the global context to the update method.

        # global translations
        # "w" is upward movement in the positive y direction
        if self.input.is_key_pressed("w"):
            mat = Matrix.translation(0, move_amount, 0)
            self.model_matrix.data = mat @ self.model_matrix.data
        # "a" is leftward movement in the negative x direction
        if self.input.is_key_pressed("a"):
            mat = Matrix.translation(-move_amount, 0, 0)
            self.model_matrix.data = mat @ self.model_matrix.data
        # "s" is downward movement in the negative y direction
        if self.input.is_key_pressed("s"):
            mat = Matrix.translation(0, -move_amount, 0)
            self.model_matrix.data = mat @ self.model_matrix.data
        # "d" is rightward movement in the positive x direction
        if self.input.is_key_pressed("d"):
            mat = Matrix.translation(move_amount, 0, 0)
            self.model_matrix.data = mat @ self.model_matrix.data

Similar to our Test 4-6 application, we move the triangle in the direction specified by the key press. Here, we make a translation matrix for the movement and then multiply it by the existing model matrix to get a new model matrix. We use the @ operator to compose the matrices since they are NumPy arrays.

Now add code for handling global rotations to the update method.

        # global rotations
        # "q" is counterclockwise rotation around the world z-axis
        if self.input.is_key_pressed("q"):
            mat = Matrix.rotation_z(turn_amount)
            self.model_matrix.data = mat @ self.model_matrix.data
        # "e" is clockwise rotation around the world z-axis
        if self.input.is_key_pressed("e"):
            mat = Matrix.rotation_z(-turn_amount)
            self.model_matrix.data = mat @ self.model_matrix.data

This code is similar to global translations except we use Matrix.rotation_z to get a rotation matrix instead. The Q key rotates in the positive (counterclockwise) direction and E rotates in the negative (clockwise) direction.

After that, add code for handling local translations to the update method.

        # local translations
        # "i" is movement in the triangle's positive y direction
        if self.input.is_key_pressed("i"):
            mat = Matrix.translation(0, move_amount, 0)
            self.model_matrix.data = self.model_matrix.data @ mat
        # "j" is movement in the triangle's negative x direction
        if self.input.is_key_pressed("j"):
            mat = Matrix.translation(-move_amount, 0, 0)
            self.model_matrix.data = self.model_matrix.data @ mat
        # "k" is movement in the triangle's negative y direction
        if self.input.is_key_pressed("k"):
            mat = Matrix.translation(0, -move_amount, 0)
            self.model_matrix.data = self.model_matrix.data @ mat
        # "l" is movement in the triangle's positive x direction
        if self.input.is_key_pressed("l"):
            mat = Matrix.translation(move_amount, 0, 0)
            self.model_matrix.data = self.model_matrix.data @ mat

Here we handle local coordinates instead of global coordinates. As discussed in the previous lesson, the only difference is the order in which we compose our matrices. The order of composition for local transformations is the reverse of that for global transformations: the translation matrix now goes on the right side of the @ operator, so it is applied to the vertices first and the existing model matrix is applied after it.
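The effect of the order can be checked with a small sketch. It assumes a triangle that has already been turned a quarter turn counterclockwise, then moves it one unit in the positive $x$ direction both ways. The standalone functions repeat the Matrix methods so the example runs on its own.

```python
from math import sin, cos, pi
import numpy as np

def translation(x, y, z):
    """Same layout as the Matrix.translation method"""
    return np.array(((1, 0, 0, x), (0, 1, 0, y),
                     (0, 0, 1, z), (0, 0, 0, 1))).astype(float)

def rotation_z(angle):
    """Same layout as the Matrix.rotation_z method"""
    c, s = cos(angle), sin(angle)
    return np.array(((c, -s, 0, 0), (s, c, 0, 0),
                     (0, 0, 1, 0), (0, 0, 0, 1))).astype(float)

# the object's local origin in homogeneous coordinates
origin = np.array((0, 0, 0, 1)).astype(float)

model = rotation_z(pi / 2)       # a triangle already turned a quarter turn
step = translation(1, 0, 0)      # one unit in the positive x direction

global_center = (step @ model) @ origin   # global: new matrix on the left
local_center = (model @ step) @ origin    # local: new matrix on the right

print(np.round(global_center[:3], 6))   # about (1, 0, 0): along the world x-axis
print(np.round(local_center[:3], 6))    # about (0, 1, 0): along the triangle's own x-axis
```

Because the triangle has been turned, its own $x$-axis points along the world's $y$-axis, so the local move carries it upward on the screen.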

Now add the final code for handling local rotations to the update method.

        # local rotations
        # "u" is counterclockwise rotation around the triangle's center
        if self.input.is_key_pressed("u"):
            mat = Matrix.rotation_z(turn_amount)
            self.model_matrix.data = self.model_matrix.data @ mat
        # "o" is clockwise rotation around the triangle's center
        if self.input.is_key_pressed("o"):
            mat = Matrix.rotation_z(-turn_amount)
            self.model_matrix.data = self.model_matrix.data @ mat

As local transformations, these rotations will apply to the object’s local coordinate axes. This effectively makes the triangle spin in place around its own center when pressing the U and O keys.
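The same comparison works for rotations. The sketch below assumes a triangle that has been moved two units to the right, then applies a quarter turn globally and locally and tracks where the triangle's center ends up.

```python
from math import sin, cos, pi
import numpy as np

def translation(x, y, z):
    """Same layout as the Matrix.translation method"""
    return np.array(((1, 0, 0, x), (0, 1, 0, y),
                     (0, 0, 1, z), (0, 0, 0, 1))).astype(float)

def rotation_z(angle):
    """Same layout as the Matrix.rotation_z method"""
    c, s = cos(angle), sin(angle)
    return np.array(((c, -s, 0, 0), (s, c, 0, 0),
                     (0, 0, 1, 0), (0, 0, 0, 1))).astype(float)

# the triangle's center in its own local coordinates
center = np.array((0, 0, 0, 1)).astype(float)

model = translation(2, 0, 0)     # a triangle two units right of the world z-axis
turn = rotation_z(pi / 2)        # a quarter turn counterclockwise

global_center = (turn @ model) @ center   # global: orbits the world z-axis
local_center = (model @ turn) @ center    # local: spins in place

print(np.round(global_center[:3], 6))   # about (0, 2, 0)
print(np.round(local_center[:3], 6))    # about (2, 0, 0)
```

The global rotation swings the center around the world $z$-axis, while the local rotation leaves the center where it was and only changes the triangle's orientation.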

Finally, add code for clearing the screen and drawing with our matrix data at the end of the update method.

        # reset the color buffer and the depth buffer
        GL.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT)

        # draw the scene
        self.projection_matrix.upload_data()
        self.model_matrix.upload_data()
        GL.glDrawArrays(GL.GL_TRIANGLES, 0, self.vertex_count)

# initialize and run this test
Test_7().run()

Just as we reset the color buffer for every frame, we also want to reset the depth buffer when depth testing is enabled. Then we upload the data for both the model matrix and the projection matrix before drawing the triangle. Don’t forget to also include the last line for running this test app!

Save the file and run it with the python test_7.py command in the terminal.
Confirm that you can see a yellow triangle on the screen.
Test that the WASD keys move the triangle globally.
Test that the Q and E keys rotate the triangle globally.
Test that the IJKL keys move the triangle locally.
Test that the U and O keys rotate the triangle locally.

Now that we have geometric transformations built into our framework, we can start thinking about 3D objects. Next time we will set up the basic components for rendering a scene with multiple 3D objects in it. Look forward to it!
