For my junior-year student project we decided to implement a heavily multi-threaded game engine. The most practical way to multi-thread a modern game engine is a job system: you define units of asynchronous work and feed them to a system that schedules and runs them. The next few posts on here will be a deep look into that new job system.

This first part will focus on how jobs are declared, created and used.

All code content in this file is © 2018 DigiPen Institute of Technology.


To support the realities of games, a few synchronization methods needed to be defined. The most important is the ability to wait for another job to finish before continuing. To better support this, the concept of ‘parent’ and ‘child’ jobs was added, meaning one job could really just be a mask for potentially thousands of child jobs that represent the actual work required. This allows large, stupidly parallelizable tasks to be threaded optimally but invisibly.
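A minimal sketch of how the parent/child bookkeeping might work, assuming each job carries an atomic count of unfinished work plus a pointer to its parent. The names `SketchJob`, `AttachChild`, `Finish` and `IsDone` are hypothetical and only for illustration; they are not the engine's actual API.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical job record; the field names are illustrative assumptions.
struct SketchJob
{
    SketchJob* parent = nullptr;
    std::atomic_int unfinished{1}; // 1 for the job itself, +1 per open child
};

// Registers a child with its parent so the parent cannot complete early.
inline void AttachChild(SketchJob& child, SketchJob& parent)
{
    assert(child.parent == nullptr); // a child has exactly one parent
    child.parent = &parent;
    parent.unfinished.fetch_add(1);
}

// Called when a job's own work is done; propagates completion upward.
inline void Finish(SketchJob& job)
{
    // fetch_sub returns the previous value, so 1 means the count just hit zero.
    if (job.unfinished.fetch_sub(1) == 1 && job.parent != nullptr)
        Finish(*job.parent);
}

// A job counts as done only once it and all of its children have finished.
inline bool IsDone(const SketchJob& job)
{
    return job.unfinished.load() == 0;
}
```

With this scheme, waiting on the parent is enough to wait on every child, which is what lets one job act as a mask for many.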

To avoid cores idly waiting, a thread that is waiting for work to be completed executes other work until the task it’s waiting for is complete. This seemed fine at first, but it came with some problems that weren’t obvious until jobs that loaded files were added to the system. While the main thread was waiting for information it needed to draw, it could end up loading a large file instead. This was a bad thing: it meant that the game would occasionally halt for a few seconds.
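The help-while-waiting idea can be sketched roughly as follows. This is a simplified illustration, not the engine's scheduler: `SketchQueue` and `HelpUntilDone` are hypothetical names, and a single mutex-guarded queue stands in for whatever queues the real system uses.

```cpp
#include <atomic>
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical shared work queue (an assumption for this sketch).
class SketchQueue
{
public:
    void Push(std::function<void()> work)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        work_.push_back(std::move(work));
    }

    // Pops one unit of work into 'out'; returns false if the queue is empty.
    bool TryPop(std::function<void()>& out)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (work_.empty())
            return false;
        out = std::move(work_.front());
        work_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<std::function<void()>> work_;
};

// Instead of blocking, the waiting thread runs whatever is queued until
// 'remaining' reaches zero. Returns how many units of work it ran.
inline int HelpUntilDone(SketchQueue& queue, const std::atomic_int& remaining)
{
    int helped = 0;
    std::function<void()> work;
    while (remaining.load() > 0)
    {
        if (queue.TryPop(work))
        {
            assert(work); // the queue should never hold empty work
            work();       // may be completely unrelated to what we wait for
            ++helped;
        }
        else
        {
            std::this_thread::yield(); // nothing to run right now
        }
    }
    return helped;
}
```

Note that the waiting thread has no say in what it picks up. If the next item in the queue happens to be a slow file load, the waiter is stuck running it, which is exactly the stall described above.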

The solution to this was to somehow associate information about a job with the job itself. Without some compiler-specific C++14 attribute behavior, there isn’t really a neat way of associating data with functions directly, so my solution is a structure that is designed to be initialized globally when the function is defined.

enum struct JobType
{
    Tiny, Huge, IO, Graphics, Important, Misc,
    NumJobTypes,
    Null = -1
};

struct JobFunction
{
    JobFunction(JobFunctionPointer func, JobType type = JobType::Misc);

    JobFunctionPointer function = nullptr;
    unsigned char flags = 0;
};

This structure associates the information with the function pointer, and was initially intended to be used like this:

void ExampleJobFunction(Job* job)
{
    // Do a bunch of asynchronous work in here
}
JobFunction ExampleJob(ExampleJobFunction, JobType::Misc);

This works as a method for associating the data, but it has a few problems: a) You can’t call the job from within the job, which is a useful feature. b) It’s error prone for the user: it’s far too easy to forget the declaration or tag it wrong. c) It’s a bunch of typing, and programmers are lazy and will find a way around it.

Addressing point a isn’t too hard, just a little weird with some forward declaration.

void ExampleJobFunction(Job* job);
JobFunction ExampleJob(ExampleJobFunction, JobType::Misc);
void ExampleJobFunction(Job* job)
{
    // Do a bunch of asynchronous work in here
}

This solved point a, and now we have jobs that are able to recursively call themselves. However, this made problems b and c way worse. What good is this extra layer of information if it’s too much effort for anyone to actually add?

I solved the remaining two problems at one stroke with some macros (I know, but bear with me).

DECLARE_JOB(ExampleJob)
{
    // Do a bunch of asynchronous work in here
}

This removes any real possibility of messing it up, and expands into exactly the forward-declared form from the previous example. The explicit declarations from the earlier examples do still work, but this is much more concise and very difficult to get wrong.

In order to make it simpler to use, I also defined variations on the macro for each JobType.

DECLARE_GRAPHICS_JOB(ExampleGraphicsJob)
{
    // Do a bunch of asynchronous graphics work here
}

DECLARE_IO_JOB(ExampleFileIOJob)
{
    // Do a bunch of asynchronous file work here
}

Now it’s relatively simple for anyone to create job functions in the engine.
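For the curious, here is one way such macros might be written. This is a sketch under assumptions rather than the engine's actual definitions: the helper macro `DECLARE_TYPED_JOB`, the `Function` suffix on the generated function name, and the simplified `Job` and `JobFunction` stand-ins (which store the type directly instead of as flags) are all hypothetical.

```cpp
#include <cassert>

// Simplified stand-ins so the sketch compiles on its own (assumptions).
struct Job { int runs = 0; };
using JobFunctionPointer = void (*)(Job*);

enum struct JobType
{
    Tiny, Huge, IO, Graphics, Important, Misc,
    NumJobTypes,
    Null = -1
};

struct JobFunction
{
    JobFunction(JobFunctionPointer func, JobType jobType = JobType::Misc)
        : function(func), type(jobType) {}

    JobFunctionPointer function = nullptr;
    JobType type = JobType::Misc;
};

// Forward-declares the function, defines the global JobFunction that tags it,
// then opens the function definition so the user only writes the body.
#define DECLARE_TYPED_JOB(name, type)          \
    void name##Function(Job* job);             \
    JobFunction name(name##Function, type);    \
    void name##Function(Job* job)

#define DECLARE_JOB(name)          DECLARE_TYPED_JOB(name, JobType::Misc)
#define DECLARE_GRAPHICS_JOB(name) DECLARE_TYPED_JOB(name, JobType::Graphics)
#define DECLARE_IO_JOB(name)       DECLARE_TYPED_JOB(name, JobType::IO)

// Because the JobFunction is defined before the body, a job can refer to itself.
DECLARE_JOB(ExampleJob)
{
    assert(job != nullptr);
    if (job->runs < 3)
    {
        ++job->runs;
        ExampleJob.function(job); // recursive use of the job's own tag
    }
}

DECLARE_IO_JOB(ExampleFileIOJob)
{
    assert(job != nullptr);
    ++job->runs;
}
```

Since the macro defines a global object, a sketch like this belongs in a source file rather than a header that several translation units include.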


To improve system performance through cache locality, jobs are allocated in an internal buffer and there is no public constructor. To create a new job, the syntax looks like this:

Job* job = Job::Create(ExampleJob);
Job* childJob = Job::CreateChild(ExampleFileIOJob, job);
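A rough sketch of what buffer-backed creation with a private constructor can look like. It assumes a fixed-size ring buffer that is large enough for old slots to be finished before they are reused; `PooledJob`, `PooledJobFn`, `CAPACITY` and `NoOpWork` are hypothetical names, and the real engine's allocation strategy may differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

class PooledJob;
using PooledJobFn = void (*)(PooledJob*);

class PooledJob
{
public:
    // Hands out the next slot of a contiguous buffer instead of calling new.
    static PooledJob* Create(PooledJobFn fn)
    {
        assert(fn != nullptr);
        const std::size_t index = Next().fetch_add(1) % CAPACITY;
        PooledJob* job = Pool() + index;
        job->function_ = fn;
        job->parent_ = nullptr;
        return job;
    }

    static PooledJob* CreateChild(PooledJobFn fn, PooledJob* parent)
    {
        PooledJob* job = Create(fn);
        job->parent_ = parent;
        return job;
    }

    PooledJob* Parent() const { return parent_; }
    void Run() { function_(this); }

private:
    PooledJob() = default; // only the pool below may construct jobs

    static constexpr std::size_t CAPACITY = 4096; // assumed pool size

    // Contiguous storage keeps jobs created together close in memory.
    static PooledJob* Pool()
    {
        static PooledJob pool[CAPACITY];
        return pool;
    }

    static std::atomic<std::size_t>& Next()
    {
        static std::atomic<std::size_t> next{0};
        return next;
    }

    PooledJobFn function_ = nullptr;
    PooledJob* parent_ = nullptr;
};

// Placeholder work function for the usage example (hypothetical).
inline void NoOpWork(PooledJob*) {}
```

Two jobs created back to back land in adjacent slots, so a parent and its children tend to sit next to each other in memory.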

When designing the Job class, I realized early on that in order to minimize false sharing between CPU cores of this shared data, the jobs would need to fit evenly on cache lines. This meant that there were plenty of free bytes within each job. Since jobs require data to operate on, these padding bytes seemed an ideal place to keep it.

To achieve this effect, I made use of some C++11 constant expressions to calculate the guaranteed size. (Obviously this requires compiler padding to be disabled for this class, since it handles its own padding and alignment.)

//Size that we want jobs to be
static constexpr size_t TARGET_JOB_SIZE = 128;
//Amount of data within a job
static constexpr size_t PAYLOAD_SIZE = 2 * sizeof(JobFunctionPointer) + sizeof(std::atomic_int) + sizeof(Job*) + sizeof(std::atomic_char) + sizeof(unsigned char);
//Amount of bytes to add in order to reach target size
static constexpr size_t PADDING_BYTES = TARGET_JOB_SIZE - PAYLOAD_SIZE;
//Padding bytes
unsigned char padding_[PADDING_BYTES];

While this could have been achieved with some cunning nested classes, it would have cluttered the interface. The size of the job is calculated by taking the sum of the size of its members. Once the size is known, an array of bytes is declared to add the remaining size.
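As a hedged illustration of the same calculation, the sketch below swaps in a slightly different technique: rather than disabling compiler padding, it orders the members from largest to smallest alignment so the compiler has no reason to insert padding, and uses `static_assert` to catch any platform where that assumption fails. `PaddedJob`, `PaddedJobFn`, `CACHE_LINE_SIZE` and `StartsOnCacheLine` are hypothetical names, and the 64-byte cache line is an assumption about the target hardware.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct PaddedJob;
using PaddedJobFn = void (*)(PaddedJob*);

constexpr std::size_t CACHE_LINE_SIZE = 64;  // assumed cache line size
constexpr std::size_t TARGET_JOB_SIZE = 128; // two assumed cache lines

struct alignas(CACHE_LINE_SIZE) PaddedJob
{
    // Members ordered by decreasing alignment to avoid implicit padding.
    PaddedJobFn function = nullptr;
    PaddedJobFn callback = nullptr;
    PaddedJob* parent = nullptr;
    std::atomic_int unfinished{0};
    std::atomic_char state{0};
    unsigned char flags = 0;

    // Sum of the member sizes, mirroring the calculation in the post.
    static constexpr std::size_t PAYLOAD_SIZE =
        2 * sizeof(PaddedJobFn) + sizeof(PaddedJob*) + sizeof(std::atomic_int) +
        sizeof(std::atomic_char) + sizeof(unsigned char);
    static_assert(PAYLOAD_SIZE < TARGET_JOB_SIZE, "Job members exceed target size");

    // Remaining bytes, usable as per-job data storage.
    unsigned char padding[TARGET_JOB_SIZE - PAYLOAD_SIZE];
};

// Compile-time proof that the manual padding produced the intended layout.
static_assert(sizeof(PaddedJob) == TARGET_JOB_SIZE, "Unexpected compiler padding");
static_assert(alignof(PaddedJob) == CACHE_LINE_SIZE, "Unexpected alignment");

// Runtime check that a job begins on a cache line boundary.
inline bool StartsOnCacheLine(const PaddedJob* job)
{
    assert(job != nullptr);
    return reinterpret_cast<std::uintptr_t>(job) % CACHE_LINE_SIZE == 0;
}
```

The `static_assert` lines are the useful part whichever approach is used: if someone later adds a member and forgets to update the payload calculation, the build fails instead of silently producing jobs that straddle cache lines.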

To use these bytes to hold the desired job data, there are also two templated functions for setting and getting that information while hiding the ugly casts. The obvious downside of this solution is that there is now no type checking of input vs output, which means that users have to be careful. There is simply no way I know of to remember what was stored within the bytes and make sure it is accessed in the same way later.

template<typename T>
inline void Job::SetData(const T& data)
{
	//Verify that the data will fit in the allocated space
	static_assert(sizeof(T) <= PADDING_BYTES, "Job data too large, recommend passing by pointer");

	//Put the data in the padding bytes
	*reinterpret_cast<T*>(padding_) = data;
}
template<typename T>
inline T & Job::GetData()
{
	//Verify that the requested data is small enough to be in the padding space
	static_assert(sizeof(T) <= PADDING_BYTES, "Job data too large, recommend passing by pointer");

	//Get the data from the padding bytes
	return *reinterpret_cast<T*>(padding_);
}

Because it is most often used immediately after the Create functions, I also defined a templated create, allowing for jobs to be created with appropriate JobType information and with job data all in one line.

This means that creating a job with associated data now looks like this:

struct ExampleJobData
{
  int num;
  double t;
};
Job::Create(ExampleJob, ExampleJobData{1, 1.2});
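A sketch of how a templated create could wrap the two steps, under stated assumptions: `DataJob`, `DataJobFn`, `DATA_BYTES` and `ExampleWork` are hypothetical stand-ins, the pool is a simple non-thread-safe static array, and placement new is used in place of the assignment through a cast so that the sketch also checks alignment and trivial copyability.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

struct DataJob;
using DataJobFn = void (*)(DataJob*);

struct DataJob
{
    static constexpr std::size_t DATA_BYTES = 64; // assumed free bytes per job

    DataJobFn function = nullptr;
    alignas(std::max_align_t) unsigned char data_[DATA_BYTES];

    template<typename T>
    void SetData(const T& data)
    {
        static_assert(sizeof(T) <= DATA_BYTES, "Job data too large, recommend passing by pointer");
        static_assert(alignof(T) <= alignof(std::max_align_t), "Job data is over-aligned");
        static_assert(std::is_trivially_copyable<T>::value, "Job data must be trivially copyable");
        new (data_) T(data); // construct a copy inside the job's spare bytes
    }

    template<typename T>
    T& GetData()
    {
        static_assert(sizeof(T) <= DATA_BYTES, "Job data too large, recommend passing by pointer");
        return *reinterpret_cast<T*>(data_); // caller must request the stored type
    }

    // Plain create: takes the next slot from a small static pool.
    static DataJob* Create(DataJobFn fn)
    {
        assert(fn != nullptr);
        static DataJob pool[256];    // not thread-safe; sketch only
        static std::size_t next = 0;
        DataJob* job = &pool[next++ % 256];
        job->function = fn;
        return job;
    }

    // Templated create: allocate and attach the data in one call.
    template<typename T>
    static DataJob* Create(DataJobFn fn, const T& data)
    {
        DataJob* job = Create(fn);
        job->SetData(data);
        return job;
    }
};

struct ExampleJobData
{
    int num;
    double t;
};

// Example work function that reads and updates its attached data.
inline void ExampleWork(DataJob* job)
{
    ExampleJobData& data = job->GetData<ExampleJobData>();
    data.num += 1;
}
```

The job function pulls its data back out with the same type it was created with; nothing enforces that match, so the caveat about type checking still applies.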

That’s all for job declaration and creation. In my next post I’ll talk about the lifespan of a job and callbacks.