In Part 1 we explored what ECS was and roughly how it worked. Today it is time to build a basic ECS system in C++.
Design
First we should probably look back and see how we want to use ECS.
- We want to define Components as structs/classes
- We want to create and destroy Entities
- We want to add and remove Components from Entities
- We want to be able to query data in Systems
We’re going to be evolving our API over time, but we can find a starting API v0. Something that works, but isn’t going to be released.
class Moveable {
Position position;
Moveable(float x, float y) : position(x, y) {}
void Walk(Position walkTo);
}
class Nameable {
string name;
Nameable(string n) : name(n) {}
}
class Purrable {
Purrable() {}
void Purr();
}
void AllCatsPurr(EcsManager* manager) {
for (size_t i = 0; i < manager.size<Purrable>(); ++i) {
EntityID entity = manager.entityAt<Purrable>(i);
if (!manager.has<Nameable>(entity)) {
continue;
}
std::cout << "The cat "
<< manager.getComponent<Nameable>(entity)->name
<< " purrs\n";
}
}
EcsManager<Moveable, Nameable, Purrable> manager;
manager.addSystem(AllCatsPurr);
EntityID kitty = manager.createEntity();
manager.addComponent<Nameable>(kitty, Nameable("Kitty"));
manager.addComponent<Moveable>(kitty, Moveable(1, 2));
manager.addComponent<Purrable>(kitty, Purrable());
manager.run();
Let’s break down some of the choices made here.
Using C++ templates adds compile time type checking. This can make debugging a challenge when someone forgets to update the EcsManager’s definition. I think for this first pass this is reasonable. Not needing to do runtime checks will help keep performance high.
addComponent is getting a stack object passed into it. We could use an emplace style and later on we will. Adding to an API is a bit easier than changing or removing. APIs are forever. But not launching while we work on the perfect API is also bad.
All interactions happen through the EcsManager.
Some frameworks return a more special Entity where components can be added that way.
These are just a simple wrapper that contain the Entity’s ID and the manager.
All calls are forwarded to the same manager.addComponent<C>(id, ...) that we used.
The extra memory use is small, since these are generally short lived objects.
The compiler most likely inlines these function calls, so there’s no performance hit.
I’m just choosing to not implement that here.
Build It
Entity
We’re going to start off at the lowest level and slowly build our way to the top. What’s at the bottom of the stack? The Entity ID. What is an ID? It’s a number. Sort of like a key or row index for looking data up later.
typedef size_t EntityID
Cool we’re done.
First question: what happens when we run out of IDs? size_t on your machine is most likely 18,446,744,073,709,551,615. You could spawn 1,000,000 entities per second and run out of IDs after about… oh… 584 years. According to Steam 99% of users are on 64-bit. It’s not impossible someone is still on 32-bit or some device where size_t isn’t 64-bit, but I’m leaving that as an exercise to the reader.
Is there a downside to using really large IDs? Now that is a possibility. If we were to keep everything in a simple vector, then the front of our vector is going to be very empty. We’re going to use sparse maps to solve that, but there’s different ways to do sparse maps. Some implementations use a bitset to track which fields are empty or non-empty. For all I talked about performance, we might be taking a slower approach. The trade off is maintainability. Using a simple map to point an ID to an actual row is fast enough. It’s also simple.
SparseMap
Speaking of that problem of storing data. We can’t use a simple vector. There are two problems:
- If we have 1,000,000 entities but 1 moveable, then we are wasting 999,999 entries of RAM.
- Space for IDs previously used or ones not yet used is wasting RAM.
- Iterating over a vector is fast, but we waste time iterating over empty entries.
(darn those off by 1 errors)
Sparse maps are going to give us a way to compact all of our data in a small vector, but still have a way to look up any entity we want. It can be as easy as it sounds. You can also make it harder, but I’m not going to. So we need a vector to hold data and a map to figure out where an Entity is. Remember: an index in a vector is of type size_t.
template <typename T>
class SparseMap {
private:
unordered_map<EntityID, size_t> entityToIndex;
vector<T> values;
}
Something we might need later is a way to say what Entity lives at an index.
If we are iterating over values, then how can we know what Entity it is?
So let’s add another vector that is kept in sync with values.
template <typename T>
class SparseMap {
private:
unordered_map<EntityID, size_t> entityToIndex;
vector<EntityID> entities;
vector<T> values;
}
Adding entries
Adding things to the SparseMap is fairly simple. We can just add everything to the end. Something we need to choose is what happens when an Entity already has a Component. We could use a vector of vectors. We can also just tell people they get maximum 1 component per entity. That brings up another decision: how to handle when someone tries anyway. We could
- Throw an error and hope someone isn’t playing the game right now.
- Ignore the recently added things and hope that doesn’t cause wild bugs.
- Drop the previous value… and still hope that doesn’t cause wild bugs.
Spoiler alert, all answers are wrong and result in bad endings. Let’s just throw an error for now. We could return a status, but developers could ignore that. Throwing isn’t the most efficient, but it will do for now.
Now I did earlier say we were basically going to emplace data to avoid copies. Let’s start with something like a push back since it is a bit simpler.
public:
void add(Entity entity, T value) {
assert(entityToIndex.find(entity) == entityToIndex.end()
&& "Entity already has that component");
size_t index = entities.size();
entities.push_back(entity);
values.push_back(value);
entityToIndex[entity] = index;
}
Removing entries
Removing is a bit more complicated. If we simply remove an entry, then our vector will no longer be compact. Iterating over it will give us some old data we don’t need. But, we don’t really care what the order of the entries are right? We can just swap the last entry with the one we want to remove. Popping off the last item is pretty easy!
Once again we still want to check our edge cases. Removing something that doesn’t exist?
Still going with throwing an error.
We have no way of knowing what bugs are going to result if we ignore it.
If we have the entity, then we can safely assume that it is not empty.
Therefore there is always a last element and we won’t do 0 - 1 on a size_t.
public:
void remove(Entity entity) {
assert(entityToIndex.find(entity) != entityToIndex.end()
&& "Entity does not have that component");
size_t index = entityToIndex[entity];
size_t last = entities.size() - 1;
EntityID swappedEntity = entities[last]; // see, entities came in handy!
// swap the last one to the new index
entityToIndex[swappedEntity] = index;
std::swap(entities[index], entities[last]);
std::swap(values[index], values[last]);
// and finally remove it
entities.pop();
values.pop();
entityToIndex.erase(entity);
}
Invalid Iteration
Before we get to fetching data there’s one issue with how we have done adding and removing. If data is added or removed in a loop we will hit what is called iterator invalidation.
Imagine we are iterating our data. We are processing index 3 and decide to delete it. We are swapping the last item, say index 6, with index 3. The next iteration is index 4. The index 6 entry is now 3 and we never processed it. Even if we didn’t do the swap and just bumped index 4 down to 3, then we still could miss 3.
One way around this is we can hold onto new items and soon to be removed items. We can then do the actions at sync points where we can guarantee that data isn’t being iterated on. This will make sure that our iterators are valid at all times. The tradeoff here is that our iterators will not be processing new data until the next sync point. We will give developers ways to control those sync points if they want or just once per frame.
Let’s change our API a bit to make this safer.
public:
void add(Entity entity, T value) {
assert(entityToIndex.find(entity) == entityToIndex.end()
&& "Entity already has that component");
assert(toBeAdded.find(entity) == toBeAdded.end()
&& "Entity already is adding that component");
toBeAdded[entity] = value;
}
void remove(Entity entity) {
assert(entityToIndex.find(entity) != entityToIndex.end()
&& "Entity does not have that component");
assert(entityToIndex.find(entity) != entityToIndex.end()
&& "Entity is already being deleted");
toBeRemoved.insert(entity);
}
void finalizeUpdates() {
for (const auto& entity : toBeRemoved) {
size_t index = entityToIndex[entity];
size_t last = entities.size() - 1;
EntityID swappedEntity = entities[last]; // see, entities came in handy!
// swap the last one to the new index
entityToIndex[swappedEntity] = index;
std::swap(entities[index], entities[last]);
std::swap(values[index], values[last]);
// and finally remove it
entities.pop();
values.pop();
entityToIndex.erase(entity);
}
toBeRemoved.clear();
for (const auto& it : toBeAdded) {
EntityID entity = it.first;
size_t index = entities.size();
entities.push_back(entity);
values.push_back(it.second);
entityToIndex[entity] = index;
}
toBeAdded.clear();
}
private:
unordered_map<EntityID, T> toBeAdded;
unordered_set<EntityID> toBeRemoved;
Getting data
The rest of the Sparse map is fairly simple fetchers. Getting data out is just returning data. Or returning an iterator. Here’s some handy getters to look up an index or a value or an entity. We need to return pointers to our data, so people can modify the data.
There’s one edge case on our getters. What if someone requests data that does not exist? We could throw an error or return an optional. Both have valid cases. Returning an optional has the advantage that the caller is forced to deal with the optional. They have to conciously force get the data to cause a crash. On the other hand everything else we have done thus far is throw errors. Simply to maintain consistency I will throw errors here with assert.
public:
size_t size() {
return entities.size();
}
bool has(EntityID entity) {
return entityToIndex.find(entity) != entityToIndex.end();
}
EntityID entityAt(size_t index) {
return entities.at(index);
}
T* valueAt(size_t index) {
assert(index < values.size()
&& "That index is not valid for a component lookup");
return &values.at(index);
}
T* valueOf(EntityID entity) {
assert(entityToIndex.find(entity) != entityToIndex.end()
&& "Entity does not have component");
auto i = entityToIndex.find(entity);
return &values.at(i->second);
}
Systems
Before we get too high on the stack of ECS we need to address another building block: Systems. These can be simple or very complex. In the simple case we assume all systems will have the same signature. That is they are going to return nothing and take exactly 1 parameter: the EcsManager.
using System = function<void(EcsManager*)>;
What is with using vs typedef?
Defining a type with using will let us do tricks with our types later.
A typedef is not able to be templated.
If we ever wanted a system with custom args, we should use using now.
Another question: why std::function instead of just a C style function pointer?
This is a bit of a personal choice. C++ std::function gives us some power later.
We can take class methods and turn them into plain functions.
We can do some silly depedency injection stuff in a bunch of chapters from now.
(Spoiler alert, that already was posted. I got ahead of myself)
EcsManager
We have our Entity. We have our System. We have a way to store our data without completely wasting our RAM. Next in the stack is we need the EcsManager to hold everything.
As stated earlier I am choosing to use C++ templates for their compile time checking. There are definite pros and cons that would make for a good blog post. I know, a very non-answer, but right now our priority is: get something working.
Same thing as the Sparse Map we will start with the data.
template <typename... Components>
class EcsManager {
using System = std::function<void(EcsManager<Components...>*)>;
private:
tuple<SparseMap<Components>...> data;
vector<System> systems;
}
There’s something a bit hidden here. A leaky absraction that is about to occur. Using templates and tuple has locked us into a world where a Component type can only have one entry in data. We cannot have two sets of data for a Component. If you find yourself needing that, I leave that exercise to the reader. I don’t need that ability, so I am sticking with the tuple.
Now let’s start with managing all of our Entities.
Adding Entities
We made Entities just a simple size_t. We chose not to re-use IDs.
Threading might change exactly how we allocate IDs,
but for now we will go with the simple answer: the incrementing number.
We could do an edge case check to make sure the nextId doesn’t overflow, but on 64-bit odds are we will never run out of IDs. I won’t say it is impossible, but unlikely.
public:
EntityID createEntity() {
return nextId++;
}
private:
EntityID nextId;
Adding Components to Entities
Adding a component to an entity is just a matter of adding an entry to our maps. There are a few edge cases we need to consider.
What happens if someone adds a component to an entity that doesn’t exist? Maybe it wasn’t created yet. We can check if the passed in ID is greater or equal to the nextID.
What if we already called destroy on the entity? This is a bit trickier. This whole post I have tried to take a stance on bug prevention. How can we know if an ID is destroyed? How can we tell if an ID that was meant to be destroyed vs one that has no components? We could keep track of every currently active ID. It will increase our RAM usage. What happens if we have no concept of entity destruction? It’s possible bugs may occur, but how severe do we think they are? I suspect, though I can’t be certain, that these types of bugs will be lower severity. I am going to decide to assume entities can always be “revived” by simply adding components.
Also we could check that T is in Components for EcsManager.
The tuple and std::get also will do that.
If we added our own check, then we could add friendly compiler messages.
I’m not going to for now, but it is a thing we could do.
public:
template <typename T>
void addComponent(EntityID entity, T componentData) {
assert(entity < nextId && "Cannot add components to entities not yet created");
std::get<SparseMap<T>>(data).add(entity, componentData);
}
Removing Components
What we add we will need to remove at some point. Similar question: what if we try to remove a component from an entity that doesn’t exist? If it doesn’t exist, then it has no component. It’s the same as what if we try to remove a component that isn’t attached. The Sparse Map has already handled that for us!
public:
template <typename T>
void removeComponent(EntityID entity) {
std::get<SparseMap<T>>(data).remove(entity);
}
Removing All Components
I said we won’t support deleting entities. Once an ID exists it is forever. But there are still cases where we will need to essentially wipe out an entity. In our model that is done by removing all components and then forgetting that ID existed.
We can make use of parameter packs to force C++ to fill in all the lines for us.
We write one removal and then unpack it into all removals.
The EcsManager class already has typename... Components that we can unpack.
The , operator is going to let us run multiple commands in a row.
The ... after is the unpack to repeat the last operation for every type in Components.
Something we do have to be aware of is that we made our remove in the SpareMap fail if the entity does not have the component. So we need to wrap that removal in a non-fail way.
public:
void removeIfExists(EntityID entity) {
SparseMap<T>& map = std::get<SparseMap<T>>(data);
if (map.has(entity)) {
map.remove(entity);
}
}
void removeAllComponents(EntityID entity) {
(removeIfExists<Components>(entity), ...);
}
Update Sync Points
Back in the development of the Sparse Map we realized we need to protect
ourselves from iterator invalidation.
We created a finalizeUpdates method to synchronize changes.
Let’s create a way to create sync points on demand and as a system.
The sync point system just simply adds the finalizeUpdates as a system.
A system is anything that takes 1 parameter: the EcsManager.
Fun fact, any class instance method that takes 0 parameters fits that.
We can cast the instance function into a plain function and store that.
public:
void addSyncPointSystem() {
std::function<void(EcsManager<Components...>*)> sync =
std::mem_fn(&EcsManager<Components...>::finalizeUpdates);
systems.push_back(sync);
}
void finalizeUpdates() {
(std::get<SparseMap<Components>>(data).finalizeUpdates(), ...);
}
Getting Entity Data
If all we want to do is fetch some entity data, then it’s a matter of relaying to the correct Sparse Map.
We handled the edge cases for data not existing.
We also assumed that std::get is taking care of T not being in Components.
public:
template <typename T>
T* getComponent(EntityID entity) {
return std::get<SparseMap<T>>(data).valueOf(entity);
}
template <typename T>
bool hasComponent(EntityID entity) {
return std::get<SparseMap<T>>(data).has(entity);
}
Iterating Data
Iterating data can be as simple as we want it to be. In our design we have the ability to iterate over a single component and then check if that entity has the other desired components. This has the flexibility that users can check if conditions are met. It’s not super powerful, but it is easy to maintain.
Why are we using for (size_t i = 0; ... ++i) instead of an iterator?
This is just for simplicity of our code.
We want something that works. We can add iteration ability later.
To do this we already need to add getters for using indices. We have everything in Sparse Map ready to go.
public:
template <typename T>
size_t size() {
return std::get<SparseMap<T>>(data).size();
}
template <typename T>
EntityID entityAt(size_t index) {
return std::get<SparseMap<T>>(data).entityAt(index);
}
template <typename T>
T* valueAt(size_t index) {
return std::get<SparseMap<T>>(data).valueAt(index);
}
What more could be done?
There’s so much more we could do with our ECS.
We chose to throw errors. We could return status or optionals. That would allow code to recover from errors.
We chose to use C++ templates to limit our EcsManager’s Component list. Many frameworks do not. They just let you use whatever Components as you go. Downside is that could create a runtime question of if the component even exists.
We chose an near-infinitely growing ID. We could recycle IDs. We could track what IDs are in use.
We chose to only iterate over a single component. We could make it easier for users to query data like a database.
We chose to not use iterators and instead iterate over indices. We could create an iterator.
We chose to only support passing one argument to each system. In reality we will want to be able to pass lots of arguments. Maybe even different arguments.
We could also restrict what data a system can use. Forcing the system to specify what data it reads and writes. As we get into multi threading we may want a way to enforce two things can’t write at the same time.
Next time we will spend some time evolving how we interact with our Component data. We will build a View class that can select entities if they have required components. We will limit what data that View can have access to to set up some concurrency one day. And last we will add those iterators to make the developer experience just that tiny bit better. Tune in for the next (not) exciting adventure of Summer of ECS!