Thanks for sharing. Simplicity is a good thing.
To speed it up by reducing the number of malloc() calls, it may be worth adding a simple arena allocator: allocate one larger block up front (e.g. 1 MB) and double its size each time you run out. All malloc()/calloc() calls then become local salmagundi_alloc() calls, which can be plain macro invocations that return the current arena pointer and advance it as a side effect.
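To make the idea concrete, here is a minimal sketch of a bump allocator over a single block. The names (Arena, arena_init, arena_alloc, arena_free_all) are made up for illustration and are not from the project. Growth is deliberately left out: the sketch returns NULL when the block is full, because growing by reallocating the block can move it and invalidate pointers already handed out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names; not taken from the project under discussion. */
typedef struct {
    unsigned char *base;  /* start of the single backing block */
    size_t cap;           /* size of the block in bytes */
    size_t used;          /* bytes handed out so far */
} Arena;

/* Obtain the backing block with one malloc() call. Returns 0 on success. */
static int arena_init(Arena *a, size_t cap) {
    a->base = malloc(cap);
    a->cap = a->base ? cap : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Bump allocation: round the offset up to the strictest alignment,
 * hand out that address, then advance the offset past the request. */
static void *arena_alloc(Arena *a, size_t n) {
    const size_t align = _Alignof(max_align_t);
    assert((align & (align - 1)) == 0);   /* the mask trick needs a power of two */
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->cap || n > a->cap - start)
        return NULL;                      /* block exhausted; no growth in this sketch */
    a->used = start + n;
    return a->base + start;
}

/* Everything allocated from the arena is released in one call. */
static void arena_free_all(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->cap = a->used = 0;
}
```

A function is used here instead of a macro to keep the bounds check readable; a compiler will typically inline something this small anyway.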
I also recommend having a look at David R. Hanson's book "C Interfaces and Implementations", which has a few neat C API tricks that your code could benefit from (e.g. for reducing namespace pollution, for avoiding null pointer argument errors, API method naming etc.).
If you double a memory block with realloc(), the block may be moved to a different address (the C standard permits this; it happens at least on macOS). All pointers into the old block are then invalid. This can be addressed by adding new blocks to a list instead of resizing the existing one, but that is more complex than a minimal arena allocator.
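A rough sketch of the block-list approach, assuming hypothetical names (Block, ChainArena, chain_alloc, chain_free_all) and a starting payload size picked by the caller. Each new block is twice the size of the previous one, and existing blocks are never moved, so earlier pointers stay valid:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names for illustration only. */
typedef struct Block {
    struct Block *next;   /* older block, or NULL */
    size_t cap, used;     /* payload size and bytes handed out */
    max_align_t data[];   /* payload, aligned for any object type */
} Block;

typedef struct {
    Block *head;          /* newest block; allocations come from here */
    size_t next_cap;      /* payload size of the next block to create */
} ChainArena;

static void *chain_alloc(ChainArena *a, size_t n) {
    const size_t align = _Alignof(max_align_t);
    if (n > SIZE_MAX / 4)
        return NULL;                       /* keeps the arithmetic below from overflowing */
    n = (n + align - 1) & ~(align - 1);    /* round up so every result stays aligned */

    Block *b = a->head;
    if (b == NULL || n > b->cap - b->used) {
        /* Out of room: chain a new, larger block instead of calling realloc(). */
        size_t cap = a->next_cap ? a->next_cap : 4096;
        while (cap < n)
            cap *= 2;
        b = malloc(sizeof(Block) + cap);
        if (b == NULL)
            return NULL;
        b->next = a->head;
        b->cap = cap;
        b->used = 0;
        a->head = b;
        a->next_cap = cap <= SIZE_MAX / 2 ? cap * 2 : cap;
    }
    void *p = (unsigned char *)b->data + b->used;
    b->used += n;
    return p;
}

/* Walk the list and release every block. */
static void chain_free_all(ChainArena *a) {
    Block *b = a->head;
    while (b != NULL) {
        Block *older = b->next;
        free(b);
        b = older;
    }
    a->head = NULL;
}
```

Starting from `ChainArena a = { NULL, 1 << 20 };` gives the 1 MB initial block mentioned above. The trade-off is that the unused tail of a full block is wasted when a new block is chained.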
Furthermore, a typical arena allocator only grows and never shrinks. Deleting an item does not release its memory; the space stays occupied until the whole arena is freed. You will need a more sophisticated allocator to support deletion. For fixed-size items, a memory pool may be an option.
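For the fixed-size case, one common way to build such a pool is a free list threaded through the unused slots. The sketch below assumes made-up names (Pool, pool_init, pool_get, pool_put, pool_destroy) and a fixed slot count; it is not meant as the only way to do it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names for illustration only. */
typedef struct {
    unsigned char *mem;   /* one block holding all slots */
    void *free_list;      /* first free slot, or NULL when the pool is full */
    size_t slot_size;     /* item size rounded up for alignment */
    size_t count;         /* number of slots */
} Pool;

static int pool_init(Pool *p, size_t item_size, size_t count) {
    const size_t align = _Alignof(max_align_t);
    if (item_size < sizeof(void *))
        item_size = sizeof(void *);        /* a free slot must hold the list link */
    if (item_size > SIZE_MAX - align)
        return -1;
    size_t slot = (item_size + align - 1) & ~(align - 1);
    if (count == 0 || slot > SIZE_MAX / count)
        return -1;
    p->mem = malloc(slot * count);
    if (p->mem == NULL)
        return -1;
    p->slot_size = slot;
    p->count = count;
    p->free_list = NULL;
    /* Push slots in reverse so that slot 0 is handed out first. */
    for (size_t i = count; i-- > 0; ) {
        void *s = p->mem + i * slot;
        *(void **)s = p->free_list;
        p->free_list = s;
    }
    return 0;
}

/* Pop a slot from the free list; returns NULL when none are left. */
static void *pool_get(Pool *p) {
    void *s = p->free_list;
    if (s != NULL)
        p->free_list = *(void **)s;
    return s;
}

/* Deleting an item pushes its slot back, so the memory is reused. */
static void pool_put(Pool *p, void *s) {
    assert(s != NULL);
    *(void **)s = p->free_list;
    p->free_list = s;
}

static void pool_destroy(Pool *p) {
    free(p->mem);
    p->mem = NULL;
    p->free_list = NULL;
}
```

Both pool_get and pool_put are constant time, and a released slot is the next one to be handed out again.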