This wiki is being migrated to http://www.github.com/z88dk/z88dk/wiki

User Tools

Site Tools


library:memory_allocation

MEMORY ALLOCATION (balloc.h, malloc.h)

Header {z88dk}/include/balloc.h {z88dk}/include/malloc.h
Source {z88dk}/libsrc/balloc {z88dk}/libsrc/malloc
Include #include <balloc.h> #include <malloc.h>
Linking -lballoc -lmalloc
Compile from {z88dk}/libsrc : make balloc.lib ; make install from {z88dk}/libsrc : make malloc.lib ; make install
Comments none none

This page discusses automatic, static and dynamic memory allocation under z88dk. Automatic memory allocation occurs when a function is called in order to allocate memory for local variables. Static memory allocation is performed by the compiler to allocate permanent memory space for global and static variables. Dynamic memory allocation is performed by the program to gain a pointer to a free memory block at runtime. Some z88dk libraries perform dynamic memory allocation implicitly – this is also described here.

This page only deals with the usual 64k near memory model of the z80; it does not discuss the far memory model, where a flat memory model is used to address more than 64k, nor does it discuss the ramdisk model where memory beyond the z80's 64k address space is treated as a disk drive. The far memory model is supported by a far malloc library not described here.

Automatic Memory Allocation

Local, non-static variables declared in a C function are allocated on the stack as a function is entered and are popped off the stack as a function is exited. These variables are dynamic, in the sense that memory is allocated and deallocated from the stack as required at runtime, and transient, in that they no longer exist after the function terminates. More specific details for assembly language programmers can be found in the interfacing assembler reference.

Static Memory Allocation

In C, static variables include global variables, declared outside the scope of any function, and those variables declared static within functions. Global variables can be referred to from any function at any time and therefore must always be available while the program runs. Static local variables must retain state between function calls and therefore must also be available for the lifetime of the program. For this reason, these variables are allocated permanent memory space in the binary generated by the compiler. The assembler used by z88dk does not, at this time, have any concept of separate code or data segments. Therefore the compiler simply generates a single monolithic binary that contains compiled code and space for globally and statically declared variables. This can be an issue for code destined for a ROM – see the discussion on generating ROMable code for remedies. Even if ROMable code is not your interest, the static memory allocator discussed there may be of interest. More specific details for assembly language programmers can be found in the interfacing assembler reference.

NOTE: z88dk can separate the global/static variables space from the compiled code if DEFVARS is used but this is not the default behaviour.

Dynamic Memory Allocation

There are currently two dynamic memory allocators in z88dk that can operate individually, separately or in tandem. The first is the standard C malloc library that allocates variable sized blocks of memory. z88dk's implementation of the standard C malloc library also supports multiple heaps. The second is a dynamic block memory allocator that allocates fixed size blocks on demand.

Standard C Malloc Library (malloc.h)

The standard C malloc library manages a list of free memory blocks, called “heap”, from which it can allocate variably-sized blocks on demand. The implementation is a simple first-fit algorithm.

In the tiny memory space of the z80, exactly what parts of memory are available for the heap is a problem that varies from machine to machine and application to application. The solution chosen by z88dk's implementation is to have the user program initialize the heap by first declaring it empty and then by declaring what memory is available by adding one or more available memory blocks to the heap. This way free memory fragments from around the memory map can be made available through malloc().

The program must globally declare the standard heap's heap pointer in the main.c file using one of two methods:

long heap;                       // method #1 - heap pointer statically compiled as part of the binary
extern long heap(50000);         // method #2 - heap pointer's location specified in RAM

The first declaration above reserves four bytes for the heap pointer that will be compiled statically as part of the final binary generated by the compiler. This may not be a good idea if the program needs be ROMable or if you want to keep the size of the final binary as small as possible.

The second method above uses the special extern syntax to place the four-byte heap variable at a specific address (50000 in this case). The variable will not be statically compiled into the final binary and all references to the variable will access four bytes in RAM at address 50000.

With the heap pointer declared, a call to mallinit() should be made to initialize the heap pointer to empty. One or more calls to sbrk() should follow to specify available memory blocks to be made available through malloc. With these chores performed, dynamic memory can be allocated and deallocated using the usual ansi-C functions malloc, calloc, realloc and free.

NOTE: An older version of this library made use of the macro HEAPSIZE(bp) and the function heapinit(a) to initialize the malloc library. Macros have been included in the header file to allow programs using the old api to continue to work.

Malloc API

The heap pointer must be globally declared in the program as described above.

1. mallinit()

Must be called on program initialization to empty the heap. An alternative is to zero the heap pointer with an assignment like “heap = 0L;”

void mallinit(void);

2. sbrk()

Make the indicated memory block available from malloc. Any number of disjoint memory blocks from around the memory map may be added to malloc's heap but take care that none overlap.

void sbrk(void *addr, uint size);
// addr = address of available memory block
// size = size of available block in bytes (must be >= 4)

3. mallinfo() ; Uses AF'

Return the size of the largest single block and the total number of bytes available in the heap.

void mallinfo(uint *total, uint *largest);
// total   = address of an integer where the total number of free bytes in the heap will be stored
// largest = address of an integer where the size of the largest available block in the heap will be stored

4. calloc()

Returns a pointer to a memory block large enough to accomodate an array of nobj objects of size size bytes (total request size = nobj*size). The memory block is initialized to zeroes. If the request cannot be satisfied, 0 is returned and the carry flag is reset.

void *calloc(uint nobj, uint size);
// nobj = number of objects
// size = size of each object in bytes

5. malloc()

Returns a pointer to a memory block of given size. The space is not initialized. If the request cannot be satisfied, 0 is returned and the carry flag is reset.

void *malloc(uint size);
// size = size of memory request in bytes

6. free()

Returns a memory block to the heap for reuse. The block must have been allocated by malloc() or calloc().

void free(void *addr);
// addr = address of the memory block as returned by malloc() or calloc()

7. realloc()

Changes the size of the object pointed to by p to size. The contents will be unchanged up to the minimum of the old and new sizes. If the new size is larger, the new space is uninitialized. realloc returns a pointer to the new space or 0 and carry reset if the request cannot be satisfied, in which case p is unchanged.

void *realloc(void *p, uint size);
// addr = address of existing memory block as returned by malloc() or calloc()
// size = new size requested

Code Example

#include <malloc.h>
 
...
 
extern long heap(60000);    // place malloc's heap pointer at address 60000
 
...
 
main()
{
 
   ...
 
   // initialize the heap before malloc is used
 
   mallinit();              // heap cleared to empty
   sbrk(30000,2000);        // add 2000 bytes from addresses 30000-31999 inclusive to the heap
   sbrk(65000,536);         // add 536 bytes from addresses 65000-65535 inclusive to the heap
 
   ...
 
   a = (int *)(malloc(20*sizeof(int)));    // dynamically allocate an array of 20 integers
 
   ...
 
   free(a);                 // return memory to heap for reuse
 
   ...
 
}

Multiple Named Heap Malloc Library (malloc.h)

The standard C library allocates and deallocates memory from a single heap, implicitly named “heap”. z88dk's implementation of the malloc library can support multiple named heaps using an API that specifies from which heap allocation and deallocation should occur.

There are two good reasons to make use of multiple heaps on a z80 machine:

  1. In a multitasking environment, giving each task its own heap means there will be no synchronization issues or critical section problem that would otherwise occur with the use of a single shared heap.
  2. In machines that support more than 64k, a heap can be created for each 64k memory configuration, allowing programs to allocate valid near memory from the heap in force for the current memory configuration. This can be an alternative or complementary to the far malloc library. The far malloc library can allocate memory valid in any banking configuration, however that memory can only be accessed through 24-bit far pointers that entail sluggish banking code automatically inserted by the compiler when accessing that memory. The far malloc library can allocate sluggish memory that can be shared across memory configurations whereas with multiple heaps, fast near (normal?) memory can be allocated that is valid only in specific memory configurations.

As with the standard C malloc library, each heap must have its own heap pointer declared using one of two methods:

long myheap;                     // method #1 - heap pointer statically compiled as part of the binary
extern long myheap(50000);       // method #2 - heap pointer's location specified in RAM

The first declaration above reserves four bytes for the myheap pointer that will be compiled statically as part of the final binary generated by the compiler. This may not be a good idea if the program needs be ROMable or if you want to keep the size of the final binary as small as possible.

The second method above uses the special extern syntax to place the four-byte myheap variable at a specific address (50000 in this case). The variable will not be statically compiled into the final binary and all references to the variable will access four bytes in RAM at address 50000.

The name of this heap pointer variable, in this case “myheap”, is known as the name of the heap.

With the myheap pointer declared, a call to HeapCreate(&myheap) should be made to initialize the myheap pointer to empty. One or more calls to HeapSbrk(&myheap,…) should follow to add memory blocks to be made available through the heap. With these chores performed, dynamic memory can be allocated and deallocated using the named heap API: HeapAlloc(), HeapCalloc(), HeapRealloc() and HeapFree().

The name of the standard C library's heap is “heap”. This means a call to malloc(…), for example, is equivalent to a call to HeapAlloc(&heap,…).

Named Heap API

All heap pointers must be globally declared in the program as described above.

1. HeapCreate()

Must be called on program initialization to empty the specified heap. Corresponds to mallinit().

void HeapCreate(void *heap);
// heap = named heap pointer

2. HeapSbrk()

Make the indicated memory block available to the specified heap. Any number of disjoint memory blocks from around the memory map may be added to the heap but take care that none overlap. Corresponds to sbrk().

void HeapSbrk(void *heap, void *addr, uint size);
// heap = named heap pointer
// addr = address of available memory block
// size = size of available block in bytes (must be >= 4)

3. HeapInfo() ; Uses AF'

Return the size of the largest single block and the total number of bytes available in the specified heap. Corresponds to mallinfo().

void HeapInfo(uint *total, uint *largest, void *heap);
// heap    = named heap pointer
// total   = address of an integer where the total number of free bytes in the heap will be stored
// largest = address of an integer where the size of the largest available block in the heap will be stored

4. HeapCalloc()

Returns a pointer to a memory block large enough to accomodate an array of nobj objects of size size bytes (total request size = nobj*size) from the specified heap. The memory block is initialized to zeroes. If the request cannot be satisfied, 0 is returned and the carry flag is reset. Corresppnds to calloc().

void *HeapCalloc(void *heap, uint nobj, uint size);
// heap = named heap pointer
// nobj = number of objects
// size = size of each object in bytes

5. HeapAlloc()

Returns a pointer to a memory block of given size from the specified heap. The space is not initialized. If the request cannot be satisfied, 0 is returned and the carry flag is reset. Corresponds to malloc().

void *HeapAlloc(void *heap, uint size);
// heap = named heap pointer
// size = size of memory request in bytes

6. HeapFree()

Returns a memory block to the specified heap for reuse. The block must have been allocated by HeapAlloc() or HeapCalloc() from the same heap. Corresponds to free().

void HeapFree(void *heap, void *addr);
// heap = named heap pointer
// addr = address of the memory block as returned by malloc() or calloc()

7. HeapRealloc()

Changes the size of the object pointed to by p to size. The contents will be unchanged up to the minimum of the old and new sizes. If the new size is larger, the new space is uninitialized. realloc returns a pointer to the new space from the heap specified or 0 and carry reset if the request cannot be satisfied, in which case p is unchanged. Corresponds to realloc().

void *HeapRealloc(void *heap, void *p, uint size);
// heap = named heap pointer
// addr = address of existing memory block as returned by malloc() or calloc()
// size = new size requested

Block Memory Allocator (balloc.h)

The block memory allocator maintains several memory queues, each of which contains a list of free blocks of fixed size. Queue #0 might contain 6-byte blocks, queue #1 14-byte blocks, etc. Memory requests are made from a specific queue, retrieving a block of known size. Because the free memory blocks are maintained in a list, allocation and deallocation is very fast. Another important feature of the block memory allocator is that scattered free memory from all over the memory map can be made available in the queues.

The balloc library requires the program to reserve space for its queue table, an array of size numQ*2 bytes where numQ is the number of memory queues desired. This can be done in one of two ways.

The first method is to declare the following macro, available from the header file, with the rest of the globally declared variables in main.c:

BAQTBL(numQ)               // numQ is the number of memory queues desired

The macro will declare a global int array of size numQ*2 bytes that will be compiled statically as part of the final binary. This may not be desirable if the final binary needs to be ROMable or if the goal is to minimize the size of the final binary.

The second method is to declare the location of the queue table using the following C code:

extern int ba_qtbl(50000)  // locate the queue table at address 50000

This will place the queue table at address 50000.

The queues must be initialized to empty prior to use with a call to ba_Init(). Available memory blocks must then be added to the queues with calls to ba_AddMem(). Each queue should contain blocks of the same size.

Balloc API

The location of the queue table must be specified as described above.

1. ba_Init()

Clear all the memory queues to empty.

void ba_Init(uchar numQ);
// numQ = number of memory queues to manage, must be >= 1 and <= 127

2. ba_BlockCount() ; O(N)

Return the number of memory blocks available in a specific queue.

uint ba_BlockCount(uchar q);
// q = specifies memory queue

3. ba_AddMem()

Adds memory blocks to a specific queue from available memory designated by the caller. Returns the address of the next available byte following the designated memory area added to the queue. There is a one byte overhead incurred for each block. This means, for example, if twenty 10-byte blocks are added to a queue, a total of 20*(10+1) = 220 bytes will be consumed.

void *ba_AddMem(uchar q, uchar numblocks, uint size, void *addr);
//         q = specifies which memory queue to add blocks to
// numblocks = number of memory blocks to add, >= 1 and <= 255
//      size = size of each memory block in bytes, >= 2
//      addr = location of available memory from which the blocks will be drawn

4. ba_Malloc()

Return the address of a memory block retrieved from a specific queue. If no memory block is available, 0 is returned and the carry flag will be reset.

void *ba_Malloc(uchar q);
// q = specific queue from which to allocate a memory block

5. ba_Free()

Return the memory block to the queues for reuse. The block must have been allocated with ba_Malloc() or ba_BestFit().

void ba_Free(void *addr);
// addr = address of the memory block to be freed as returned by ba_Malloc() or ba_BestFit()

6. ba_BestFit()

Return the address of an available memory block from a range of possible queues. If no memory block is available, 0 is returned and the carry flag will be reset. ba_BestFit() will try to allocate memory beginning from a specific queue and then from following queues in increasing order. It will stop once it is successful. If queues are arranged such that they hold memory blocks in increasing size as queue number increases, ba_BestFit() will attempt to allocate a memory block of a minimum size (determined from the starting queue number) and will satisfy the request, if necessary, with larger blocks from succeeding queues.

void *ba_BestFit(uchar q, uchar numQ);
//    q = starting queue from which to attempt allocation
// numQ = number of queues to attempt allocation from before giving up, >= 1

Code Example

In this scenario the program makes requests for three different block sizes, namely 4-byte blocks, 6-byte blocks and 24-byte blocks.

#include <balloc.h>
 
...
 
extern int ba_qtbl(65530)  // locate the queue table at address 65530
 
...
 
main()
{
 
   ...
 
   // initialize the block memory allocator's queues
 
   ba_Init(3);                    // three memory queues initialized to empty; the table will occupy addresses 65530-65535 inclusive
   ba_AddMem(0, 10,  4, 25000);   // add ten 4-byte blocks to queue #0 from memory available at address 25000; addresses 25000-25049 inclusive will be used
   ba_AddMem(0,  3,  4, 49878);   // add three 4-byte blocks to queue #0 from memory available at address 49878; addresses 49878-49892 inclusive will be used
   ba_AddMem(1, 90,  6, 32768);   // add ninety 6-byte blocks to queue #1 from memory available at address 32768; addresses 32768-33397 inclusive will be used
   ba_AddMem(2, 12, 24, 33398);   // add twelve 24-byte blocks to queue #2 from memory available at address 33398; addresses 33398-33697 inclusive will be used
 
   ...
 
   f = (struct four_byte *)(ba_BestFit(0, 2));   // get a block from queue #0 or queue #1, trying #0 first; want at least a 4-byte block but will accept a 6-byte
   s = (struct six_byte *)(ba_Malloc(1));        // get a 6-byte block from queue #1; ba_BestFit() not used because don't want to dip into 24-byte blocks
   t = (struct twofour_byte *)(ba_Malloc(2));    // get a 24-byte block
 
   if (!s)
      printf("Ran out of 6-byte blocks!\n");
 
   ...
 
   ba_Free(f);
   ba_Free(s);
   ba_Free(t);
 
   ...
 
}

Malloc or Balloc? Which to Choose?

The short answer is that balloc is better if many allocation and deallocation requests are made and malloc is more convenient to use in other circumstances.

Malloc is very convenient – it allows allocation of any size memory blocks but in order to provide that convenience a lot of overhead, relatively speaking, is incurred keeping track of the size and locations of available chunks inside the heap. This overhead manifests as both speed overhead and memory overhead. Another weakness of malloc is that because it allows any size memory block allocation, a situation known as external fragmentation can occur. Over the course of a program's lifetime where many allocations and deallocations occur, the heap can become fragmented so that many small pockets of free memory exist but memory allocation can no longer occur because no single pocket is large enough to fulfill requests. This, despite the fact that if all those small pockets were merged together, plenty of memory would be available. This situation is more likely to occur for limited heap sizes (like on small micros such as the z80), long-running programs, and programs that frequently allocate and free variable-size memory blocks. It is almost likely for certain types of long running z80 programs to grind to a halt because of this external fragmentation issue.

The balloc library, a dynamic block memory allocator, was designed to resolve this issue at the expense of some convenience. The loss in convenience is that programs can no longer request arbitrarily sized memory blocks. Instead they must request memory blocks of fixed and known size from specific queues. However this loss of convenience is accompanied by some advantages as well. One is that allocation and deallocation is very fast. A second is that the memory overhead is just one byte per block, compared to two for malloc; this can be noticeable when many small requests are made. And a third is that external fragmentation is no longer an issue – the program will not grind to a halt as may occur with prolonged use of malloc. Although external fragmentation is no longer an issue, internal fragmentation can be. Because balloc only makes available blocks of fixed size, should a program require a block that doesn't match what is available in the queues, it must request the next larger block size that is available. This means several bytes in the block actually allocated is wasted and this is known as internal fragmentation.

Some of z88dk's libraries perform dynamic memory allocation implicitly and this usually involves allocation requests for many small, fixed size memory blocks. This fits well with the balloc model. There is also no law against using both allocators – malloc to take care of occasional variable size requests and balloc to handle frequent fixed size requests.

Memory Allocation Implicitly Performed by z88dk Libraries

Some z88dk libraries need to perform dynamic memory allocation in the course of performing their functions. A couple of examples include the abstract data types library where various data types need to allocate memory to act as containers for user items and the SP1 sprite library that needs to allocate memory to hold descriptors of sprites and sprite characters. Any library functions that perform implicit memory allocation will be documented as doing so in their API description.

Because the z80 has such a small memory space, how memory is managed is very programmer- and application-dependent. Rather than make assumptions about how libraries should perform their dynamic memory allocation (they could have, for example, called malloc directly to satisfy memory requests), libraries instead call user-supplied functions that satisfy memory requests on their behalf. That way the user program has complete control over how library memory requests are satisfied.

Library memory allocation and deallocation requests are made through two functions:

1. u_malloc()

Returns the address of an available memory block of the specified size. Must return with the carry flag set if successful. If unsuccessful, the carry flag must be reset and 0 returned. C functions can make use of the special z88dk keywords return_c() and return_nc() to return with known carry flag state.

void *u_malloc(uint size);
// size = size of memory request in bytes

2. u_free()

Deallocates the memory block passed in and makes it available for future allocation.

void u_free(void *addr);
// addr = address of memory block to be freed; must ignore 0

If a program uses library functions that need to perform implicit memory allocation, it must supply these functions to perform memory allocations on the library's behalf. The malloc and balloc libraries have been written to be compatible with this scheme so it is entirely possible to allow malloc or balloc to handle all requests simply as the following code examples demonstrate.

Code Example 1

In this scenario we just want malloc to take care of everything. Only the u_malloc and u_free function definitions are shown; none of the initialization for malloc is shown.

#include <malloc.h>
 
...
 
void *u_malloc(int size) {
   return malloc(size);             // let malloc take care of things; carry flag set properly by the malloc() call
}
 
void u_free(void *addr) {
   free(addr);                      // free() deallocates memory allocated by malloc() and ignores addr==0
}
 
...

Code Example 2

In this scenario we have library functions that allocate three block sizes: 4-byte, 6-byte and 10-byte. balloc will be used to handle memory allocations; queue #0 will contain 4-byte blocks, queue #1 6-byte blocks and queue #2 10-byte blocks. The initialization required for balloc is not shown.

#include <balloc.h>
 
...
 
void *u_malloc(int size) {
   if (size == 4)
      return ba_Malloc(0);          // ba_Malloc sets the carry flag appropriately
 
   if (size == 6)
      return ba_Malloc(1);
 
   return ba_Malloc(2);
}
 
void u_free(void *addr) {
   ba_Free(addr);                   // ba_Free() deallocates memory allocated by ba_Malloc() and ignores addr==0
}
 
...

Questions & Answers

Post common questions and answers here. Report any bugs or ask a question not listed here on the z88dk mailing list.

1. Can the BALLOC library manage queues of available blocks from different banks in machines with more than 64k?

Yes it can. My suggestion would be to logically separate groups of queues so that, for example, queues 0-3 apply to memory configuration A, queues 4-7 to memory congifuration B, etc. When memory blocks are required for memory configuration B, they should be requested from queues 4-7. Note that the BALLOC library code, the queue table and the memory blocks from the queues accessed must be paged in when making calls to the BALLOC library functions.

This does require the program to know which memory configuration is currently valid and which allocated blocks are valid in which memory configuration but this will generally be known. To allocate memory that is valid across any memory configuration, look into the far malloc library. The far malloc library's memory can always be accessed, with the compiler supplying any necessary bankswitching code automatically. The price paid is that accessing far memory is more sluggish than near memory.

2. Do you have to free memory to the same NAMED HEAP it was allocated from? Shouldn't this be the compiler's responsibility?

Yes, with the named heap API you must free memory to the same heap it was allocated from. This means the program must know which heap a specific memory block was allocated from in order to free it.

The named API was envisioned to be useful in two scenarios. In the first scenario, each task in a multitasking environment is given its own heap. In this case it is clear that each task knows from which heap its memory blocks are allocated from. In the second scenario, a heap is created for each memory configuration in a machine with more than 64k. In this case the program must know from which heap a block was allocated from because it must know when that block's memory configuration is active and therefore when the pointer is valid.

It would have been possible to store the heap from which blocks were allocated in the allocated blocks themselves and then a single free() function could serve all memory blocks, but this would entail an extra two bytes per allocated block overhead, and, on top of this, in scenario #2 above it would still not be clear whether it would be safe to perform the free() since it would be unknown whether the correct memory configuration was active.

If you are looking to allocate and deallocate memory that is always valid in any memory configuration you should be looking at the far malloc library. This will take care of all details, including having the compiler automatically generate banking code when accessing far memory, but of course this all comes at the expense of speed.

3. Can you change the heap that the standard C malloc library uses by changing the contents of the heap variable?

Yes, all the information concerning available memory in a particular heap is stored in the heap's four-byte heap pointer. In the case of the standard C malloc library, this is the contents of the long variable named “heap”. If you would like your program to use a specific heap when making calls to the standard C's malloc, realloc, calloc, free functions, a simple assignment like “heap=myotherheap;” will cause future standard C malloc functions to allocate from another heap. Great care must be taken when doing this, however. This is a near heap implementation, meaning it only knows about a 64k address space. If you have multiple heaps for different 64k spaces, freeing a block allocated from one heap into another may result in overlapping free blocks (two blocks occupying or overlapping the same address range). This will surely mess things up.

An interesting use of this nugget of information might occur in a multitasking system. Each individual task can use the usual standard C malloc functions, but on context switch, the current contents of the “heap” variable are stored into the exiting task's process structure and then the contents of the next task's process structure is used to set the new value of the “heap” variable before the next task is resumed. This way all tasks transparentely get independent heaps without resorting to the named heap api.

library/memory_allocation.txt · Last modified: 2017/03/24 07:22 (external edit)