ctx rasterizer

The ctx vector rasterizer is an active edge table scanline rasterizer, with per-scanline choice between related rasterization strategies.

The best case is when all edges crossing the scanline are steeper than 45 degrees. The AA for each coverage span is then a single pixel at the start and a single pixel at the end of the span.

The second best case is when all portions of AA in a scanline are linear ramps and no edges intersect. These scanlines consist of three types of spans: opaque, AA gradient and no coverage.

Scanlines where the number of active edges changes within the scanline are rasterized with 15-level vertical oversampling. This is the fallback when the other strategies fail, and it is expensive: it is the algorithm's worst-case scanline.

For other scanlines we use either 3, 5 or 15 levels of oversampling depending on the steepest edge crossing the scanline.
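The per-scanline choice described above can be sketched as a small decision function. This is only an illustration, not the code ctx uses: every identifier is hypothetical, the slope is assumed to be measured as horizontal travel per scanline, and the thresholds that pick between 3, 5 and 15 levels are invented, since the text does not state them.

```c
#include <stdbool.h>

/* Hypothetical names; none of these identifiers come from ctx itself. */
typedef enum {
  SCAN_AA_SINGLE_PIXEL, /* all edges steeper than 45 degrees        */
  SCAN_AA_LINEAR_RAMPS, /* linear AA ramps, no intersecting edges   */
  SCAN_OVERSAMPLE_3,
  SCAN_OVERSAMPLE_5,
  SCAN_OVERSAMPLE_15
} ScanlineStrategy;

typedef struct {
  bool  edge_count_changes; /* an edge starts or ends inside the scanline */
  bool  edges_intersect;    /* two active edges cross inside the scanline */
  float max_abs_dx_per_row; /* largest |dx/dy| among the active edges     */
} ScanlineInfo;

ScanlineStrategy choose_strategy (const ScanlineInfo *s)
{
  /* Worst case: the active edge set changes within the scanline. */
  if (s->edge_count_changes)
    return SCAN_OVERSAMPLE_15;

  /* Best case: less than one pixel of x travel per scanline,
     i.e. every edge is steeper than 45 degrees. */
  if (s->max_abs_dx_per_row < 1.0f)
    return SCAN_AA_SINGLE_PIXEL;

  /* Second best case: shallow edges, but none of them intersect. */
  if (!s->edges_intersect)
    return SCAN_AA_LINEAR_RAMPS;

  /* Remaining scanlines: the thresholds below are assumptions. */
  if (s->max_abs_dx_per_row < 3.0f)
    return SCAN_OVERSAMPLE_3;
  if (s->max_abs_dx_per_row < 5.0f)
    return SCAN_OVERSAMPLE_5;
  return SCAN_OVERSAMPLE_15;
}
```

The ordering puts the cheap tests first, so a rasterizer built this way only pays for oversampling on scanlines that cannot be handled analytically.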

For an introduction to how scanline rasterization and vertical oversampling work, which might prove useful in understanding the above explanation, see How the stb_truetype Anti-Aliased Software Rasterizer v2 Works.
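As a rough illustration of vertical oversampling, the sketch below accumulates coverage for one scanline by sampling the active edges at several sub-scanline positions. It is a simplified model and not taken from ctx: the `Edge` layout and function names are hypothetical, edges are assumed to be already sorted left to right and paired as span start / span end, and coverage is kept as floats in the range 0 to 1.

```c
/* Hypothetical edge record: x position at the top of the scanline and
   the change in x across the full height of the scanline. */
typedef struct { float x0; float dx; } Edge;

/* Add the horizontal span [xa, xb) to the accumulator, weighting
   partially covered pixels by the fraction that is covered. */
static void add_span (float *acc, int width, float xa, float xb, float weight)
{
  if (xa < 0.0f) xa = 0.0f;
  if (xb > (float) width) xb = (float) width;
  if (xb <= xa) return;
  int first = (int) xa;
  int last  = (int) xb; /* may equal width when the span reaches the edge */
  if (first == last)
  {
    acc[first] += (xb - xa) * weight;
    return;
  }
  acc[first] += ((float) (first + 1) - xa) * weight;
  for (int x = first + 1; x < last; x++)
    acc[x] += weight;
  if (last < width)
    acc[last] += (xb - (float) last) * weight;
}

/* Compute per-pixel coverage for one scanline using `levels`
   sub-scanlines, each sampled at its vertical centre. */
void oversample_scanline (const Edge *edges, int n_edges, int levels,
                          float *coverage, int width)
{
  float weight = 1.0f / (float) levels;
  for (int x = 0; x < width; x++)
    coverage[x] = 0.0f;
  for (int i = 0; i < levels; i++)
  {
    float t = ((float) i + 0.5f) / (float) levels;
    for (int e = 0; e + 1 < n_edges; e += 2)
    {
      float xa = edges[e].x0     + edges[e].dx     * t;
      float xb = edges[e + 1].x0 + edges[e + 1].dx * t;
      add_span (coverage, width, xa, xb, weight);
    }
  }
}
```

With vertical edges a single level already gives the exact result, which is why more levels only pay off as edges become shallower.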

Render targets handled natively are 8bit sRGBA (RGBA8), floating point scRGB (RGBAF), 8bit and floating point grayscale with alpha (GRAYA8 and GRAYAF), and floating point CMYK (CMYKAF). Integration points are catered for in API and protocol for color management, which will be done with babl. The formats RGB332, RGB565, RGB565_BYTESWAPPED, CMYKA8, RGB8, GRAY1, GRAY2, GRAY4, GRAY8 and GRAYF are handled by converting processed scanlines back and forth to one of the supported targets. BGRA8 is handled by swapping components in the compositing source.
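To make the "back and forth" conversion concrete, here is a minimal sketch of an RGB565 to RGBA8 round trip for a single pixel. The function names are hypothetical, and the bit-replication used to expand 5 and 6 bit channels to 8 bits is one common choice assumed here; it is not claimed to be what ctx does.

```c
#include <stdint.h>

/* Expand one RGB565 pixel into four RGBA8 bytes; alpha is forced opaque.
   The top bits are replicated into the low bits so that full-scale
   values map to 255. */
void rgb565_to_rgba8 (uint16_t in, uint8_t out[4])
{
  uint8_t r5 = (uint8_t) ((in >> 11) & 0x1f);
  uint8_t g6 = (uint8_t) ((in >> 5) & 0x3f);
  uint8_t b5 = (uint8_t) (in & 0x1f);
  out[0] = (uint8_t) ((r5 << 3) | (r5 >> 2));
  out[1] = (uint8_t) ((g6 << 2) | (g6 >> 4));
  out[2] = (uint8_t) ((b5 << 3) | (b5 >> 2));
  out[3] = 255;
}

/* Pack RGBA8 back into RGB565 by truncating the low bits; alpha is dropped. */
uint16_t rgba8_to_rgb565 (const uint8_t in[4])
{
  return (uint16_t) (((in[0] >> 3) << 11) |
                     ((in[1] >> 2) << 5)  |
                      (in[2] >> 3));
}
```

With this pairing, expanding a pixel and packing it again returns the original 16 bit value, so untouched pixels survive the trip through the RGBA8 compositing path unchanged.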

ctx supports grayscale, RGB and CMYK color models, all of which can be used and freely mixed while drawing. Conversion to the device/compositing representation is done during rasterization / rendering. Conversion between ICC matrix profiles for RGB spaces is currently supported when babl support is built in. Making a hard-coded set of primaries known to match the specific display used, without babl, would be nice for microcontroller use.

The default RGB color space for both device and user is sRGB, so code from elsewhere specifying sRGB colors will work as expected. By adding an RGB matrix display profile in /tmp/ctx.icc, the SDL, KMS and fbdev backends use the display space instead of sRGB for compositing.

optimization vs binary size

ctx is designed from the beginning to act as a software GPU for modern microcontrollers, some of which are more powerful than the PCs of the mid 90s. ctx can be tuned for microcontrollers down to ~7kb of RAM + 42kb of code + 12kb of fontdata. Combined with an immediate mode UI that can be re-run, it is sufficient to have a framebuffer covering one or a few scanlines. More RAM permits a more flexible arrangement of more components, like the parser for the text version of the CTX protocol. The resource constrained programming suitable for a microcontroller is also suitable for rendering cores as represented by threads; in this scenario all the rendering threads render from a shared read-only drawlist, while sharing textures.
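The idea of re-running an immediate mode UI against a framebuffer that covers only a few scanlines can be sketched as follows. This is a toy model and does not use the ctx API: the 8x8 grayscale "display", the band height, and all function names are hypothetical, and a real target would push each band to the panel instead of copying it into an array.

```c
#include <stdint.h>
#include <string.h>

#define WIDTH      8
#define HEIGHT     8
#define BAND_ROWS  2   /* the only pixel memory held while rendering */

/* Stand-in for the physical panel, used here so results can be inspected. */
uint8_t display[HEIGHT][WIDTH];

typedef void (*DrawFn)  (uint8_t *band, int band_y0, int band_rows);
typedef void (*FlushFn) (const uint8_t *band, int band_y0, int band_rows);

/* Fill a rectangle given in screen coordinates, clipped to the band. */
static void fill_rect (uint8_t *band, int band_y0, int band_rows,
                       int x, int y, int w, int h, uint8_t value)
{
  for (int row = 0; row < band_rows; row++)
  {
    int sy = band_y0 + row;
    if (sy < y || sy >= y + h)
      continue;
    for (int sx = x; sx < x + w; sx++)
      if (sx >= 0 && sx < WIDTH)
        band[row * WIDTH + sx] = value;
  }
}

/* The immediate mode "UI": the same drawing code is run for every band. */
void ui_scene (uint8_t *band, int band_y0, int band_rows)
{
  fill_rect (band, band_y0, band_rows, 2, 1, 4, 5, 255);
}

/* Copy a finished band to the simulated panel. */
void display_flush (const uint8_t *band, int band_y0, int band_rows)
{
  for (int row = 0; row < band_rows; row++)
    memcpy (display[band_y0 + row], band + row * WIDTH, WIDTH);
}

/* Render the whole screen while holding only BAND_ROWS scanlines in RAM. */
void render_banded (DrawFn draw, FlushFn flush)
{
  uint8_t band[WIDTH * BAND_ROWS];
  for (int y0 = 0; y0 < HEIGHT; y0 += BAND_ROWS)
  {
    int rows = HEIGHT - y0 < BAND_ROWS ? HEIGHT - y0 : BAND_ROWS;
    memset (band, 0, sizeof band);
    draw (band, y0, rows);
    flush (band, y0, rows);
  }
}
```

Calling `render_banded (ui_scene, display_flush)` draws a rectangle that straddles three bands. The trade is CPU time for RAM: the scene is issued once per band, so a smaller band means more re-runs.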

font data size:    18027 bytes (A sans font subsetted to only ASCII,
                                latin1 ~= 33kb )
RGBA8 rasterizer:  43597 bytes (compiled with -Os, can triple in size with -O3)
ctx parser:        24608 bytes (not needed for direct use from C, but also
                                on embedded this can be useful for ease of
                                integration with other languages or directly
                                using ctx+microcontroller+display as a serial
                                display.)

The RAM requirements are small. By tuning the engine to have only a couple of save/restore states, and paths with fewer than 256 edges, the total RAM footprint of the rasterizer can be as low as ~5kb on 32bit platforms when a display with a retained framebuffer is used; the parser for the ctx protocol needs an additional 1kb. Where the framebuffer is too large to fit in RAM, the allocation needed for scanline(s) must be weighed against the RAM needed for the renderstream. Commands take a multiple of 9 bytes; there is code and provision for runtime compacting of the renderstream in prior git revisions.
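The trade-off between scanline buffer and renderstream can be worked through with a few lines of arithmetic. The helper names below are hypothetical, and the only figure taken from the text is the 9 byte command granularity; display size, pixel depth and RAM budget are whatever the caller supplies.

```c
#include <stddef.h>

/* From the text: commands take a multiple of 9 bytes. */
#define ENTRY_BYTES 9

/* Bytes needed to hold `rows` scanlines of the target format. */
size_t scanline_buffer_bytes (size_t width, size_t bytes_per_pixel, size_t rows)
{
  return width * bytes_per_pixel * rows;
}

/* Bytes occupied by a renderstream of `entries` 9 byte entries. */
size_t renderstream_bytes (size_t entries)
{
  return entries * ENTRY_BYTES;
}

/* How many 9 byte entries fit in what remains of a RAM budget once the
   scanline buffer has been allocated. */
size_t renderstream_capacity (size_t ram_budget, size_t scanline_bytes)
{
  if (scanline_bytes >= ram_budget)
    return 0;
  return (ram_budget - scanline_bytes) / ENTRY_BYTES;
}
```

As an illustrative case, a 320 pixel wide RGB565 display buffered four scanlines at a time needs 2560 bytes; out of an 8192 byte budget that leaves room for 625 entries of renderstream. Doubling the band height to eight scanlines would cut that to 341 entries.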