فارِس أحمد بَخيت

C++ Modules are here to stay

Within C++, there is a much smaller and cleaner language struggling to get out.

—Bjarne Stroustrup

C++ enthusiasts will often bash you for using the C preprocessor instead of the language's latest and greatest metaprogramming feature. Yet until recently, any meaningful use of C++ required at least one preprocessor directive (#include). That is no longer true.

C++20 modules provide a way to encapsulate a library (or a namespace) such as Qt, cv, or std [1].

Know your modules

Using modules is as easy as

import std;

auto main() -> int {
  std::println("Hello world!");
}

and creating your own module is no harder, but we ought to have some terminology laid down first:

With those definitions out of the way, we can begin by declaring our first module, a data structures and algorithms module:

// dsa.cpp
export module dsa;

namespace dsa {
export int pow(int a, int b) {
  ...
}
}
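The body of pow is elided above. As one possible way to fill it in, here is a minimal sketch using exponentiation by squaring. It is written as a plain function in a single file (no export module line) so that it compiles without module-aware tooling. The restriction to non-negative exponents and the lack of overflow checking are assumptions of this sketch, not something the listing above specifies.

```cpp
#include <cassert>

namespace dsa {
// Hypothetical body for pow: exponentiation by squaring.
// Assumes b >= 0 and that the result fits in an int (no overflow check).
inline int pow(int a, int b) {
  assert(b >= 0);
  int result = 1;
  while (b > 0) {
    if (b & 1) {
      result *= a;  // the current bit of the exponent is set
    }
    b >>= 1;
    if (b > 0) {
      a *= a;  // square only while exponent bits remain
    }
  }
  return result;
}
}  // namespace dsa
```

Inside the module interface unit, the same body would simply sit behind the export keyword shown in the listing.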

Well, that was easy. How about we add Red-Black Tree as a submodule?

// rbtree.cpp
export module dsa.rbtree;

import std; // for std::less

export namespace dsa {
enum class AllowDuplicates : bool {
    No,
    Yes,
};

template<typename T, AllowDuplicates AllowDuplicates, typename Compare = std::less<T>>
class RedBlackTree {
  ...
};
}

It’s the same thing. In fact, from the compiler’s perspective submodules are not a thing; dsa.rbtree is to dsa what “openai” is to “open”.

Since there’s no such thing as “submodules”, there’s no way for modules to interact except through their public interfaces, and that is by design. But this also means that you’d end up with one gigantic module unit full of interrelated parts and implementation details; navigating such code would be a nightmare.

Module partitions come to the rescue: they are module units that can be imported only by other units of the same named module, including its other partitions.

For example, suppose you’ve added a bunch of linked list variants to your DSA library and they all share the private Node structure. You can split the code of that one module into several interdependent partitions that are visible only to each other and to the module itself.

// linked_list.cpp
export module dsa.linked_list;

export import :circular_list;
export import :ordered_list;
export import :unordered_list;

// circular_list.cpp
export module dsa.linked_list:circular_list;

import :node; // Declares the (private) Node structure that is
              // shared between all 3 list variants.

export namespace dsa {
template <typename T>
class CircularSinglyLinkedList {
};
}
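As a rough picture of what could sit behind those partitions, here is a single-file sketch written without modules so that it compiles with any toolchain. The detail namespace and the members push_back, size, and to_vector are assumptions made for illustration, not the actual implementation; comments mark where the partition boundaries would fall.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

namespace dsa::detail {
// In the modular layout this would live in the `:node` partition and stay
// unexported, so importers of dsa.linked_list could never name it.
template <typename T>
struct Node {
  T value;
  Node* next;
};
}  // namespace dsa::detail

namespace dsa {
// In the modular layout this would be exported from `:circular_list`.
template <typename T>
class CircularSinglyLinkedList {
 public:
  CircularSinglyLinkedList() = default;
  CircularSinglyLinkedList(const CircularSinglyLinkedList&) = delete;
  CircularSinglyLinkedList& operator=(const CircularSinglyLinkedList&) = delete;

  ~CircularSinglyLinkedList() {
    if (tail_ == nullptr) return;
    detail::Node<T>* current = tail_->next;  // head of the list
    tail_->next = nullptr;                   // break the cycle before walking
    while (current != nullptr) {
      detail::Node<T>* next = current->next;
      delete current;
      current = next;
    }
  }

  // Appends a value; the new node becomes the tail and points at the head.
  void push_back(T value) {
    auto* node = new detail::Node<T>{std::move(value), nullptr};
    if (tail_ == nullptr) {
      node->next = node;  // a single node points at itself
    } else {
      node->next = tail_->next;
      tail_->next = node;
    }
    tail_ = node;
    ++size_;
  }

  std::size_t size() const { return size_; }

  // Copies the elements out in insertion order, starting from the head.
  std::vector<T> to_vector() const {
    std::vector<T> out;
    if (tail_ == nullptr) return out;
    const detail::Node<T>* current = tail_->next;
    for (std::size_t i = 0; i < size_; ++i) {
      out.push_back(current->value);
      current = current->next;
    }
    return out;
  }

 private:
  detail::Node<T>* tail_ = nullptr;
  std::size_t size_ = 0;
};
}  // namespace dsa
```

With headers, hiding Node relies on the detail naming convention alone; with partitions, the compiler enforces it because the :node partition is never re-exported.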

There’s one missing piece of the puzzle: backwards compatibility. Yes, you can use libraries that don’t support modules inside of modules and even upgrade your code incrementally to use modules through what is called the “global module fragment” (module;). There’s also a private module fragment, but we won’t discuss it.

module;
#define GLFW_INCLUDE_NONE
#include <GLFW/glfw3.h>
#include <glad/gl.h>
export module dsa.sortvis;

That’s about everything you need to know to get started using modules!

You might wonder, “Why go through the hassle when all of my C++ projects use header files and work perfectly fine?” That’s true, but at some point the C inclusion model begins to show its age, particularly in the time it takes to compile a big project. Not to mention that with enough preprocessor hacks (or enough time until you resort to them), your abstractions become leaky and you’re faced with Hyrum’s Law.
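To make the leak concrete, here is a small single-file sketch. The header contents, the RING_CAPACITY macro, and both functions are hypothetical; the "header" is pasted inline only so that the example compiles on its own.

```cpp
// --- Contents of a hypothetical legacy header, pasted inline. ---
#define RING_CAPACITY 64  // meant to be an implementation detail

namespace legacy {
inline int ring_capacity() { return RING_CAPACITY; }
}  // namespace legacy
// --- End of the hypothetical header. ---

namespace app {
// User code that quietly starts depending on the leaked macro. If the
// library ever renames or removes RING_CAPACITY, this stops compiling.
inline int scratch_size() { return RING_CAPACITY * 2; }
}  // namespace app
```

A named module does not export macros, so if legacy were a module instead, RING_CAPACITY would not be visible to app and this accidental dependency could not form in the first place.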

Fast compile times

It’s actually pretty easy to notice that C++ has a “compile time” problem. For quite some time, I’ve taken it upon myself to go through most of the CSES Problem Set. In competitive-programming-style problems like these, you spend most of your time analyzing the problem and coming up with a plausible algorithm, yet I’ve found the compile times of the major C++ compilers to be a real bottleneck; having to wait more than 4 seconds [2] simply interrupts my flow.

It’s apparent from the µbenchmark on my solved problems that C++20 modules are the clear winner: they provide an 8.6x speedup over stock Clang and a 1.2x speedup over PCH. See footnote [3] for the competitive programming template and script using modules.

I won’t explain how to work with C++20 modules in Clang: it would bore you to death and, frankly, I’d do a bad job of it. Instead, the amazing developers behind Clang wrote a comprehensive article: Standard C++ Modules (Clang documentation). You don’t need to read it unless you’re building tooling around Clang, which the CMake folks have already done.

But support

Lagging support for modules, from compiler vendors to tooling, is a valid reason not to consider modules at all in your projects. But your personal projects don’t need the guarantees C++ often holds for commercial projects (and I think most commercial projects don’t either), and the story is only half bad. As of now, most major compilers implement the spec either completely or partially, and CMake provides complete support for modules but only experimental support for import std;, which is enough in my book.

Here’s the minimal CMakeLists.txt to get you started:

cmake_minimum_required(VERSION 3.28)

project(dsa)

set(CMAKE_CXX_SCAN_FOR_MODULES ON)
set(CMAKE_CXX_STANDARD 23)

add_library(dsa)
target_sources(dsa
  PUBLIC FILE_SET dsa_public_modules TYPE CXX_MODULES
  FILES
    src/dsa.cpp
)

add_executable(hello src/bin/hello.cpp)
target_link_libraries(hello PRIVATE dsa)

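or, if you really want to do import std; (EXPERIMENTAL), you need to add the following lines. Note that the experimental gate must be set before the project() call, and that this requires CMake 3.30 or newer: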

# You can find this UUID in the CMake/Help/dev/experimental.rst file of your version.
set(CMAKE_EXPERIMENTAL_CXX_IMPORT_STD "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX")
set(CMAKE_CXX_MODULE_STD ON)

Footnotes

  1. The modules std and std.compat are standardized in C++23 under the standard library modules feature.

  2. This is due to including the “everything from std” file (#include <bits/stdc++.h>) which is non-standard, or its modern (C++23) equivalent: import std;. Also excuse my humble machine.

  3. Here’s the extremely fast ./run script,

    #!/bin/bash
    STDPCM="std.pcm"
    STDCPPM="/usr/share/libc++/v1/std.cppm"
    INFILE="./$1"
    BINDIR="./bin"
    OUTFILE="$BINDIR/${1%.*}"
    [ -d "$BINDIR" ] || mkdir -p "$BINDIR" && \
    [ -f "$STDPCM" ] \
      || clang++ -fsanitize=address              \
                 -std=c++23                      \
                 -Wno-reserved-module-identifier \
                 -stdlib=libc++                  \
                 --precompile                    \
                 -o "$STDPCM" "$STDCPPM" && \
    [ -f "$OUTFILE" -a "$INFILE" -ot "$OUTFILE" ] \
      || clang++ -fsanitize=address               \
                 @compile_flags.txt               \
                 -o "$OUTFILE" "$INFILE" &&       \
    exec "$OUTFILE"
    

    its corresponding compile_flags.txt (also useful for clangd),

    -std=c++23
    -stdlib=libc++
    -fmodule-file=std=std.pcm
    -Wall
    -Wextra
    -Wno-unused-const-variable
    -DNJUDGE
    

    and the “header” of each C++ file

    #ifdef NJUDGE
    import std;
    #else
    #include <bits/stdc++.h>
    #endif
    using namespace std;