Serialization Proposal 1 #52
|
Thanks for the PR. I agree that the storage of the glyphs should be optimized to save space, but I would prefer not to implement manual serialization and deserialization if possible. My idea was to store the glyphs as a struct of arrays instead of an array of structs:
struct BdfFont<'a, Coord, Size, Index> {
    top_left: &'a [Coord],
    size: &'a [Size],
    device_width: &'a [Size],
    data_index: &'a [Index],
}
const TOP_LEFT: [i8; 256 * 2] = ...;
const SIZE: [u8; 256 * 2] = ...;
const WIDTH: [u8; 256] = ...;
const DATA_INDEX: [u16; 256] = ...;
pub const MY_FONT: BdfFont<'static, i8, u8, u16> = BdfFont {
    top_left: &TOP_LEFT,
    size: &SIZE,
    device_width: &WIDTH,
    data_index: &DATA_INDEX,
};
The advantage is that no manual serialization is necessary and that each font can use the minimum data size required, because of the type parameters in BdfFont. But my proposal only works for fonts embedded in the source code; a serialization format would still be useful for other applications which load fonts at runtime from some external storage. If you want to continue with this PR, I would suggest looking into a serialization format like https://github.com/jamesmunns/postcard. |
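To make the struct-of-arrays idea concrete, here is a minimal sketch of how a single glyph could be read back out of such a font. It is only an illustration: the `Glyph` type, the `glyph` method, the two-glyph `TINY_FONT`, and the assumption that `top_left` and `size` hold interleaved pairs (two entries per glyph, as the `256 * 2` lengths suggest) are all hypothetical and not part of the comment above.

```rust
/// Struct-of-arrays font as proposed in the comment above.
pub struct BdfFont<'a, Coord, Size, Index> {
    pub top_left: &'a [Coord],    // assumed layout: x, y pairs, two entries per glyph
    pub size: &'a [Size],         // assumed layout: width, height pairs, two entries per glyph
    pub device_width: &'a [Size], // one entry per glyph
    pub data_index: &'a [Index],  // one entry per glyph
}

/// Hypothetical per-glyph view, assembled on demand from the arrays.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Glyph<Coord, Size, Index> {
    pub top_left: (Coord, Coord),
    pub size: (Size, Size),
    pub device_width: Size,
    pub data_index: Index,
}

impl<'a, Coord: Copy, Size: Copy, Index: Copy> BdfFont<'a, Coord, Size, Index> {
    /// Gathers the fields of glyph `i`, or returns `None` if `i` is out of range.
    pub fn glyph(&self, i: usize) -> Option<Glyph<Coord, Size, Index>> {
        let j = i.checked_mul(2)?; // index of the first element of each pair
        Some(Glyph {
            top_left: (*self.top_left.get(j)?, *self.top_left.get(j + 1)?),
            size: (*self.size.get(j)?, *self.size.get(j + 1)?),
            device_width: *self.device_width.get(i)?,
            data_index: *self.data_index.get(i)?,
        })
    }
}

// Made-up two-glyph font, only to exercise the lookup.
const TOP_LEFT: [i8; 4] = [0, -7, 1, -5];
const SIZE: [u8; 4] = [5, 7, 3, 5];
const WIDTH: [u8; 2] = [6, 4];
const DATA_INDEX: [u16; 2] = [0, 35];

pub const TINY_FONT: BdfFont<'static, i8, u8, u16> = BdfFont {
    top_left: &TOP_LEFT,
    size: &SIZE,
    device_width: &WIDTH,
    data_index: &DATA_INDEX,
};
```

With this sketch, `TINY_FONT.glyph(1)` gathers the second glyph's fields and `TINY_FONT.glyph(2)` returns `None`. A font with larger glyphs could pick wider type parameters without changing the lookup code.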
|
Representing fonts in source code would make it harder to use those fonts in other languages. For example, generating custom fonts for u8g2_fonts requires compiling C source code, since that's the only format u8g2's tooling generates. Using postcard, would it be possible to index serialized data without deserializing it? For a large font, creating a whole copy of it could exhaust a machine's memory. |
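For reference, this is roughly what indexing serialized data in place looks like when every record has a fixed size. The sketch says nothing about what postcard can or cannot do; it uses a hand-rolled layout, and the 6-byte record (a little-endian `u32` code point followed by a little-endian `u16` bitmap offset), `Record`, `record_count`, and `record_at` are all hypothetical names chosen for illustration.

```rust
/// Hypothetical fixed-size record: code point (u32, little-endian)
/// followed by a bitmap offset (u16, little-endian).
const RECORD_LEN: usize = 6;

#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Record {
    pub code_point: u32,
    pub bitmap_offset: u16,
}

/// Number of complete records in the buffer.
pub fn record_count(bytes: &[u8]) -> usize {
    bytes.len() / RECORD_LEN
}

/// Decodes record `i` directly from the byte slice. Only the six bytes of
/// that record are read; the rest of the font is never copied or parsed.
pub fn record_at(bytes: &[u8], i: usize) -> Option<Record> {
    let start = i.checked_mul(RECORD_LEN)?;
    let end = start.checked_add(RECORD_LEN)?;
    let r = bytes.get(start..end)?; // `None` if the record is out of range
    Some(Record {
        code_point: u32::from_le_bytes([r[0], r[1], r[2], r[3]]),
        bitmap_offset: u16::from_le_bytes([r[4], r[5]]),
    })
}
```

Because the position of record `i` is just `i * RECORD_LEN`, the font can stay in flash or in a memory-mapped file and be read one record at a time.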
|
Although the serialization in this PR tries to replicate the
As of now, finding a glyph is O(n), and there are 17 bytes of metadata per glyph. This makes rendering slow for CJK characters and emoji, and means that the metadata of a font can get larger than the bitmap data. Sorting the list of glyphs would allow for a binary search, but storing ranges of contiguous glyphs would make the search functionally O(1), since fonts usually group glyphs together in large blocks. This would also remove the need to store the corresponding character of each glyph, reducing the character struct to 13 bytes per glyph and potentially saving up to 200 kB for a font like Unifont. I propose: |
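As an illustration of the range-based lookup described in the comment above (not the format this PR actually implements), a minimal sketch might look like the following. `GlyphRange`, its fields, and `glyph_index` are hypothetical names; the sketch assumes the ranges are sorted by first code point and do not overlap.

```rust
/// A run of consecutive code points whose glyphs are stored back to back.
#[derive(Debug, Clone, Copy)]
pub struct GlyphRange {
    pub first: u32,       // first code point in the run
    pub len: u32,         // number of code points in the run
    pub first_glyph: u32, // index of the run's first glyph in the glyph table
}

/// Maps a character to its index in the glyph table.
/// `ranges` must be sorted by `first` and must not overlap.
pub fn glyph_index(ranges: &[GlyphRange], c: char) -> Option<u32> {
    let cp = c as u32;
    // Count the ranges that start at or before `cp`; the last of them is the
    // only one that can contain it.
    let n = ranges.partition_point(|r| r.first <= cp);
    let r = ranges.get(n.checked_sub(1)?)?;
    let offset = cp - r.first;
    if offset < r.len {
        Some(r.first_glyph + offset)
    } else {
        None // `cp` falls in the gap after this range
    }
}
```

The binary search runs over ranges rather than glyphs, so a font made of a handful of large blocks needs only a few comparisons per lookup, and no per-glyph code point has to be stored.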
These additions provide a way to represent and render a parsed BDF font as a slice of u8s, and tooling for that conversion in the font converter and previewer.
The format for serialization attempts to be as close to the BdfFont struct as possible while saving space, using 17 bytes per character: