============= Legacy BSD kernel memory allocator =============

Before we start on OpenSolaris' Vmem allocator, it will be instructive to look at the legacy BSD allocator and the comparatively recent Linux kernel memory allocator interfaces.

The legacy BSD "malloc" code was made famous by the SCO case (in which SCO laid claim to "intellectual property" in the Linux kernel).

The story: http://www.lemis.com/grog/SCO/code-comparison.html

The code:

    /*
     * Allocate 'size' units from the given map.
     * Return the base of the allocated space.
     * In a map, the addresses are increasing and the list
     * is terminated by a 0 size.
     * The core map unit is 64 bytes; the swap map unit is 512 bytes.
     * Algorithm is first-fit.
     */
    malloc(mp, size)
    struct map *mp;
    {
        register unsigned int a;
        register struct map *bp;

        for (bp = mp; bp->m_size && ((bp - mp) < MAPSIZ); bp++) {
            if (bp->m_size >= size) {
                a = bp->m_addr;
                bp->m_addr += size;
                if ((bp->m_size -= size) == 0) {
                    do {
                        bp++;
                        (bp-1)->m_addr = bp->m_addr;
                    } while ((bp-1)->m_size = bp->m_size);
                }
                return(a);
            }
        }
        return(0);
    }

The sequence of "struct map"s traversed by incrementing bp acts as a free list, in which the first chunk of size greater than or equal to the requested size is found. Note that the "struct map" pointed to by bp can be allocated either "in-band" or in a separate memory area. OpenSolaris chooses to allocate similar structures out-of-band, as explained in Ch. 11.3.4.1.

============= Linux generic kernel memory allocator API =============

Linux malloc with in-band boundary tags is explained in http://www.dent.med.uni-muenchen.de/~wmglo/malloc-slides.html

The 2.6 Linux kernel memory allocator API is described here: http://www.linuxjournal.com/article/6930

Note that kmalloc() is the function shared by all of the kernel's non-slab allocations (slabs are handled differently and are closer to the OpenSolaris KMEM allocator of Ch. 11.2, without the extra "magazine" and "depot" layers).
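For readers who want to experiment, the first-fit map algorithm quoted in the legacy BSD section above can be restated as modern, compilable C. This is a hypothetical reconstruction for illustration only: the struct field names follow the original, but MAPSIZ's value and the name map_malloc are chosen here.

```c
#include <stddef.h>

#define MAPSIZ 16   /* illustrative map capacity, not the historical value */

/* One free extent: a base address and a size, in allocation units.
 * The list is kept sorted by address and terminated by a 0 size,
 * exactly as in the legacy code. */
struct map {
    unsigned int m_addr;
    unsigned int m_size;
};

/* First-fit allocation from the map; returns the base of the
 * allocated extent, or 0 on failure (as the original does). */
unsigned int map_malloc(struct map *mp, unsigned int size)
{
    struct map *bp;

    for (bp = mp; bp->m_size && (bp - mp) < MAPSIZ; bp++) {
        if (bp->m_size >= size) {
            unsigned int a = bp->m_addr;
            bp->m_addr += size;
            if ((bp->m_size -= size) == 0) {
                /* Extent fully consumed: shift the rest of the list
                 * down one slot, copying up to the terminating 0 size. */
                do {
                    bp++;
                    (bp - 1)->m_addr = bp->m_addr;
                } while (((bp - 1)->m_size = bp->m_size) != 0);
            }
            return a;
        }
    }
    return 0;
}
```

Tracing it on a two-extent map shows the behavior described above: an exact-size allocation deletes its extent by sliding the tail of the list down.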
Flags from Table 4 in the above link determine whether a particular allocation can or cannot block, and also distinguish between several purposes of allocated memory.

============= OpenSolaris kernel's VMEM allocators =============

By contrast, the OpenSolaris interfaces allow multiple named pools of memory with uniform properties per pool. Essentially, a pool becomes a named object, in which the allocation and deallocation functions become methods. Pools can be nested and configured to obtain new allocations from an enclosing pool object when necessary.

The textbook stresses the generalized character of the VMEM allocator in Ch. 11.3, pp. 552--553. As described, VMEM allocates subranges of integers of the requested size within the initial range allocated at system boot. The integers are primarily meant to be address ranges (in particular, nested ones), but can also be integer ID ranges. This is stressed by calling the allocated ranges "resources", not "addresses". Although the allocator includes some special functions that are address-aware (vmem_xalloc, in particular, controls address range "coloring" as in Ch. 10.2.7), the interfaces try to be as agnostic about the nature of the ranges as possible, and treat allocation as a general algorithmic problem of handing out integer intervals economically.

The initial range is ultimately derived either from the static per-platform kernel memory layout, as in http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/i86pc/os/startup.c#358 -- 457, or from a fixed permissible range of IDs.

Page 554 summarizes the VMEM interface, explained in pp. 555-560. Read it before looking at the actual code.
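The idea of a named pool whose allocation function is a method can be sketched in a few lines of C. The sketch below is purely illustrative and much simpler than real vmem: all names (toy_arena, toy_alloc, etc.) are invented here, there is no segment coalescing, no nesting, and only first-fit over a small extent list; it only shows how a pool carries a name, a quantum, and its own range of integers.

```c
#include <string.h>

#define MAXSEGS 32

/* One free extent of the arena's integer range. */
struct toy_seg { unsigned long addr, size; };

/* A toy "arena": a named pool of integers [base, base+size),
 * handing out subranges first-fit.  Hypothetical, illustration only. */
struct toy_arena {
    char name[32];
    unsigned long quantum;          /* all sizes rounded up to this */
    struct toy_seg free[MAXSEGS];   /* free extents; size 0 = unused slot */
};

void toy_arena_init(struct toy_arena *a, const char *name,
                    unsigned long base, unsigned long size,
                    unsigned long quantum)
{
    memset(a, 0, sizeof *a);
    strncpy(a->name, name, sizeof a->name - 1);
    a->quantum = quantum;
    a->free[0].addr = base;
    a->free[0].size = size;
}

/* First-fit, like the legacy BSD map allocator; returns 0 on failure,
 * which is one reason a real ID arena would not start its range at 0. */
unsigned long toy_alloc(struct toy_arena *a, unsigned long size)
{
    int i;
    size = (size + a->quantum - 1) / a->quantum * a->quantum;
    for (i = 0; i < MAXSEGS; i++) {
        if (a->free[i].size >= size && size > 0) {
            unsigned long addr = a->free[i].addr;
            a->free[i].addr += size;
            a->free[i].size -= size;
            return addr;
        }
    }
    return 0;
}
```

In object-oriented terms, toy_arena_init plays the constructor role that vmem_create() plays for "struct vmem", and toy_alloc is the allocation method bound to a particular named pool.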
============= VMEM code walk-through =============

In the boot-time kernelheap_init(), observe the call to vmem_init() and then a series of calls to vmem_create() creating individual VMEM pool objects of type "struct vmem": http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/vm/seg_kmem.c#212

The pools are called "arenas" (arenas are created from ranges of consecutive integers/addresses).

The vmem_init() function in http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/vmem.c#1707 shows a very interesting feature of the VMEM objects created by vmem_create() (in OO terms, a constructor for VMEM instances): they are nested. At line 1723, "heap" is created from "heap_start" and "heap_size", which are static and platform-specific. At line 1728, "heap" is used as the source by vmem_metadata_arena, nested inside heap (both as an object and as an integer range). Note that the base and size arguments of vmem_metadata_arena are set to NULL and 0, because they are to be determined dynamically by calling "vmem_alloc" on "heap". Then vmem_seg_arena and vmem_hash_arena are similarly nested within vmem_metadata_arena, except that their ranges will be obtained from vmem_metadata_arena by calling another function, "heap_alloc".

This trick, in which a function is packaged together with the arguments (or other environment) for its subsequent calls, is known in Programming Languages as a "closure". Closures are a mainstay of dynamic interpreted languages. See http://www.perl.com/doc/FAQs/FAQ/oldfaq-html/Q3.14.html about closures. In the words of the Perl creator Larry Wall, "This is a notion out of the Lisp world that says if you define an anonymous function in a particular lexical context, it pretends to run in that context even when it's called outside of the context. In human terms, it's a funny way of passing arguments to a subroutine when you define it as well as when you call it. It's useful for setting up little bits of code to run later, such as callbacks."
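C has no closures, but the (function pointer, source arena) pair stored in an arena emulates one: the "environment" is captured when the arena is created and supplied automatically at every later call. A minimal sketch of this idiom, with all names (closure_call, pool_alloc, etc.) invented here for illustration:

```c
#include <stddef.h>

/* A poor man's closure in C: a function pointer bundled with the
 * argument it should later be applied to.  This mirrors how a vmem
 * arena stores its import function together with its source arena
 * and invokes them as a pair when it needs more resources. */

typedef size_t (*alloc_fn)(void *source, size_t size);

struct closure {
    alloc_fn func;   /* the code */
    void *source;    /* the captured environment */
};

/* "Call" the closure: apply the stored function to the stored
 * environment plus the call-time argument. */
size_t closure_call(struct closure *c, size_t size)
{
    return c->func(c->source, size);
}

/* A trivial source pool: a bump pointer over a fixed integer range. */
struct pool { size_t next; size_t limit; };

size_t pool_alloc(void *source, size_t size)
{
    struct pool *p = source;
    size_t base;
    if (p->next + size > p->limit)
        return 0;              /* range exhausted */
    base = p->next;
    p->next += size;
    return base;
}
```

The caller of closure_call never names the pool: it was bound once, at creation time, just as a nested arena never names its source arena again after vmem_create().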
The combination of *afunc and source in vmem_create() closely resembles a closure (pun intended).

The vmem_create() function, http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/vmem.c#1416 , which creates new named and possibly nested vmem objects, looks like a typical constructor that first allocates a chunk of RAM (vmp) and then fills it in. Note that some of the arenas (VMEM_INITIAL of them) are pre-allocated at boot as members of the vmem0[] array (cf. line 1443).

Freelists in a vmem instance correspond to powers of 2 (see Ch. 11.3.4.2, p. 558). Observe their initialization in lines 1464-1473. The hash table for allocated "resources" and their sizes is initialized immediately after.

Definition of struct vmem: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/sys/vmem_impl.h#113

Observe the logic of vmem_alloc() as it accommodates the different types of allocations encoded by the flags: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/vmem.c#1236

Note: VM_NEXTFIT ignores the freelists entirely and just gets the next free ID. "flist" is the index in the freelist array of the freelist appropriate to the requested size. Recall that highbit() of an integer value returns the position of its highest set bit + 1.

Note that line 1544 is the special case for vmem allocators nested inside others, as shown in the vmem_init() example above (with an initial NULL for the base address/resource and 0 for the length). vmem_add() will cause the "closure-like" allocation function *afunc to be called on source to grab the initial interval boundaries.

Finally, observe the use of the VMEM instance "ptms_minor" to manage an ID range: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/ptms_conf.c#281

In particular, the "base address" is (void *)1, the overall size is ptms_nslots, and the quantum is 1. The base is 1 rather than 0 because a return of 0 (NULL) from vmem_alloc() indicates failure; the cast of the integer resource to (void *) avoids type conversion warnings.
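The highbit() convention and the power-of-2 freelist indexing are easy to get wrong, so a small self-contained sketch may help. The freelist_index helper below is an invented name for illustration; the real vmem code also consults a bitmap of non-empty lists rather than computing the index alone.

```c
/* highbit(x): position of the highest set bit, counting from 1,
 * i.e. highbit(1) == 1, highbit(5) == 3; returns 0 for x == 0.
 * This matches the "highest set bit + 1" convention (bits counted
 * from 0) described in the walk-through. */
int highbit(unsigned long x)
{
    int h = 0;
    while (x) {
        h++;
        x >>= 1;
    }
    return h;
}

/* In a power-of-2 freelist scheme, a free segment of 'size' units
 * hangs off list highbit(size) - 1, so list n holds segments with
 * 2^n <= size < 2^(n+1).  Sketch of the index computation only. */
int freelist_index(unsigned long size)
{
    return highbit(size) - 1;
}
```

For example, a 4096-unit segment (2^12) lands on list 12, and so does a 4097-unit one, since both satisfy 2^12 <= size < 2^13.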