ctx caching and threading

The ctx rasterizer/renderer is compact and strives for a minimal memory footprint so that it can be used on microcontrollers. For the SDL2 and KMS/fbdev backends, ctx creates multiple renderer threads, by default half the number of available cores; this can be overridden with the environment variable CTX_THREADS. No mutable state is shared between the rasterizers: each has its own gradient cache and shape cache (explained below), and each rasterizes fully independently and lock-free, with read-only access to a shared drawlist and textures.
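The thread-count rule described above can be sketched roughly as follows. This is an illustration only, not ctx's own code: the helper names threads_from_setting and pick_render_threads are hypothetical, the core count is queried with POSIX sysconf, and the handling of invalid or non-positive CTX_THREADS values (fall back to the default) is an assumption.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper, not part of the ctx API.
 * A setting that parses to a positive integer wins; otherwise use half of
 * the available cores, but never fewer than one thread. */
static int threads_from_setting (const char *setting, long cores)
{
  if (setting)
  {
    int n = atoi (setting);
    if (n > 0)
      return n;
  }
  if (cores < 2)
    return 1;
  return (int) (cores / 2);
}

/* Reads the CTX_THREADS environment variable named in the text and the
 * number of online cores reported by the operating system. */
static int pick_render_threads (void)
{
  return threads_from_setting (getenv ("CTX_THREADS"),
                               sysconf (_SC_NPROCESSORS_ONLN));
}
```

Under these assumptions, running with CTX_THREADS=1 would force single-threaded rendering, which can be handy when comparing against the threaded default.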

grid-hash-cache

The rendering target is split into a grid of 8x8 cells, and a hash of the rendering commands affecting each cell is accumulated. Only cells whose hash has changed since the preceding frame are queued for rendering in threads. This way most content can remain untouched when all that is happening is text being edited in a terminal shell or editor. This type of caching is similar to how React uses a virtual DOM: clients do a full rerender and the system figures out which parts it can avoid rerendering. A longer writeup of the same idea can be found on the cached software rendering page, which describes a similar implementation in the lite text editor.
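A minimal sketch of the per-cell hashing idea, assuming the grid is 8 columns by 8 rows over the whole target and that every command comes with a pixel bounding box and a precomputed 32-bit hash. The type GridHash, the function names and the FNV-style mixing step are all hypothetical and are not taken from the ctx sources.

```c
#include <stdint.h>
#include <string.h>

#define GRID_COLS 8
#define GRID_ROWS 8

typedef struct {
  int      width, height;               /* render target size in pixels    */
  uint32_t cur[GRID_ROWS * GRID_COLS];  /* hashes for the frame in progress */
  uint32_t prev[GRID_ROWS * GRID_COLS]; /* hashes of the preceding frame    */
} GridHash;

/* width and height are assumed to be positive */
static void grid_init (GridHash *g, int width, int height)
{
  memset (g, 0, sizeof (*g));
  g->width  = width;
  g->height = height;
}

/* Fold one command's hash into every cell its bounding box touches.
 * The mixing is order dependent, so reordered commands also count as a
 * change. */
static void grid_add_command (GridHash *g, int x0, int y0, int x1, int y1,
                              uint32_t cmd_hash)
{
  if (x0 < 0) x0 = 0;
  if (y0 < 0) y0 = 0;
  if (x1 >= g->width)  x1 = g->width - 1;
  if (y1 >= g->height) y1 = g->height - 1;
  if (x0 > x1 || y0 > y1)
    return; /* entirely outside the target */

  int col0 = x0 * GRID_COLS / g->width,  col1 = x1 * GRID_COLS / g->width;
  int row0 = y0 * GRID_ROWS / g->height, row1 = y1 * GRID_ROWS / g->height;

  for (int row = row0; row <= row1; row++)
    for (int col = col0; col <= col1; col++)
    {
      uint32_t *h = &g->cur[row * GRID_COLS + col];
      *h = (*h ^ cmd_hash) * 16777619u;
    }
}

/* Returns a bitmask with bit (row * GRID_COLS + col) set for every cell
 * whose hash differs from the preceding frame, then starts a new frame. */
static uint64_t grid_end_frame (GridHash *g)
{
  uint64_t dirty = 0;
  for (int i = 0; i < GRID_ROWS * GRID_COLS; i++)
    if (g->cur[i] != g->prev[i])
      dirty |= (uint64_t) 1 << i;
  memcpy (g->prev, g->cur, sizeof (g->prev));
  memset (g->cur, 0, sizeof (g->cur));
  return dirty;
}
```

The client replays its full command list every frame; only the cells flagged in the returned mask would need to be handed to the renderer threads.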

shape cache

As part of constructing a poly-line in the early stages of filling a shape, a hash of the shape is computed.

For paths whose bounding boxes are small enough to fall within the cached range, coverage masks are cached. When this compile-time option is enabled, the caching happens in the buffer allocator for temporary coverage mask surfaces. Without further adaptations, the cache is useful for subsequent frames whose updates do not involve scrolling. Using monospace text, or snapping text start positions to specific sub-pixel choices like 1/4 pixels, increases the hit rate.
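Snapping a text start position to quarter pixels could look like the sketch below. The function snap_subpixel is a hypothetical helper on the application side, not a ctx call; the point is that with four possible horizontal phases per glyph, the same glyph at the same size produces only a handful of distinct shapes, so their hashes repeat and the cached coverage masks get reused.

```c
/* Hypothetical helper: round x to the nearest 1/steps of a pixel,
 * e.g. steps = 4 for quarter pixels. Halves round away from zero.
 * Written without math.h so it has no library dependencies. */
static float snap_subpixel (float x, int steps)
{
  float scaled = x * (float) steps;
  long  n      = (long) (scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
  return (float) n / (float) steps;
}
```

An application would apply this to the x coordinate (and possibly y) before positioning each run of text, for example snap_subpixel (x, 4).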

For desktop/phone use, the cache is already a large win even though it is duplicated for each thread. On microcontrollers with RAM left over for caching it can also be useful, but it introduces dynamic allocations where ctx could otherwise have run without any.

As rasterization has been finely optimized, the threshold size for using the shape cache has been reduced. The optimizations made possible by not touching coverage mask bytes or target pixel bytes for transparent parts matter more than avoiding traversal of the data, at least for fonts with few bezier curves, like sans typefaces.

inter-frame protocol compression

Applications that use ctx as a library can either run directly with the above-mentioned SDL2, KMS or fbdev backends, or they can run as clients inside the ctx terminal and window manager. When running as clients, the HTML5 2D Context is serialized in compact form using the ctx protocol. When transmitting the commands for the current frame, the ctx terminal permits reuse of segments of the previous frame's raw data as a form of compression. Avoiding global absolute coordinates when drawing, by using relative coordinates and/or transforms, makes more of the data reusable. This caching permits complex immediate mode UIs to be updated over a low-bandwidth link. (NOTE: as of late June 2021 these code paths are not active by default, as they add latency and some features are unstable with them.)
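The general idea of reusing the previous frame's bytes can be illustrated with a deliberately simple scheme: measure how many leading and trailing bytes of the new frame are identical to the previous one, and transmit only the middle. This is not the encoding the ctx protocol uses; FrameReuse and frame_reuse are hypothetical names chosen for this sketch.

```c
#include <stddef.h>

typedef struct {
  size_t prefix; /* leading bytes identical to the previous frame  */
  size_t suffix; /* trailing bytes identical to the previous frame */
} FrameReuse;

/* Hypothetical helper: find the shared head and tail of two serialized
 * frames. Only cur_len - prefix - suffix bytes would need to be sent,
 * together with the two lengths. */
static FrameReuse frame_reuse (const char *prev, size_t prev_len,
                               const char *cur,  size_t cur_len)
{
  FrameReuse r   = { 0, 0 };
  size_t     max = prev_len < cur_len ? prev_len : cur_len;

  while (r.prefix < max && prev[r.prefix] == cur[r.prefix])
    r.prefix++;

  /* the tail must not overlap the head already counted */
  while (r.suffix < max - r.prefix &&
         prev[prev_len - 1 - r.suffix] == cur[cur_len - 1 - r.suffix])
    r.suffix++;

  return r;
}
```

It also shows why relative coordinates help: when a widget moves, commands expressed relative to a transform serialize to the same bytes as before, so only the transform itself falls into the changed region.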

texture caching

The caching scheme used by ctx dictates that a texture is valid for reuse within the current frame, and also if it was used in the previous frame. The ctx terminal and its clients implement this policy separately.

When a NULL eid is passed to define_texture, the sha1 of the pixel data is computed to serve as the eid. If the eid is already registered and valid, we stop there and issue a ctx_texture call instead. Client code that knows its eids can thus always call define_texture without overhead, and can create unique strings for new frames of video/animations.
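The validity rule and the define_texture shortcut can be sketched together as below. Everything here is hypothetical and only mirrors the behaviour described in the text: the TextureCache type, the fixed-size table and the function names are invented, and a 64-bit FNV-1a hash stands in for sha1 to keep the example short.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_TEXTURES 16

typedef struct {
  char eid[64];
  int  last_used_frame;
  int  in_use;
} TextureEntry;

typedef struct {
  TextureEntry entries[MAX_TEXTURES];
  int          frame; /* current frame number */
} TextureCache;

/* Stand-in for sha1 (assumption): derive an eid string from pixel data. */
static void content_eid (const void *pixels, size_t len, char out[64])
{
  const unsigned char *p = pixels;
  uint64_t h = 14695981039346656037ULL;
  for (size_t i = 0; i < len; i++)
  {
    h ^= p[i];
    h *= 1099511628211ULL;
  }
  snprintf (out, 64, "%016llx", (unsigned long long) h);
}

/* A texture is valid if it was used in the current or the previous frame. */
static TextureEntry *find_valid (TextureCache *c, const char *eid)
{
  for (int i = 0; i < MAX_TEXTURES; i++)
  {
    TextureEntry *e = &c->entries[i];
    if (e->in_use && strcmp (e->eid, eid) == 0 &&
        (e->last_used_frame == c->frame ||
         e->last_used_frame == c->frame - 1))
      return e;
  }
  return NULL;
}

/* Returns 0 when an existing texture is reused (no pixel upload needed),
 * 1 when the pixel data had to be stored, -1 when the table is full. */
static int define_texture_sketch (TextureCache *c, const char *eid,
                                  const void *pixels, size_t len)
{
  char computed[64];
  if (!eid)
  {
    content_eid (pixels, len, computed);
    eid = computed;
  }

  TextureEntry *t = find_valid (c, eid);
  if (t)
  {
    t->last_used_frame = c->frame; /* keeps it alive for the next frame */
    return 0;
  }

  for (int i = 0; i < MAX_TEXTURES; i++)
  {
    TextureEntry *e = &c->entries[i];
    if (!e->in_use || e->last_used_frame < c->frame - 1)
    {
      snprintf (e->eid, sizeof (e->eid), "%s", eid);
      e->last_used_frame = c->frame;
      e->in_use = 1;
      return 1; /* a real implementation would copy or upload pixels here */
    }
  }
  return -1;
}
```

A client that skips a texture for one whole frame has to supply the pixel data again, while a client that touches it every frame pays only for the lookup. For video, passing a fresh eid string per frame avoids hashing the pixel data at all.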