The Linux page allocator, implemented in mm/page_alloc.c, is the main memory allocation mechanism in the Linux kernel. It has to handle allocations from many parts of the kernel, under many different circumstances. Consequently, the page allocator is fairly complex and is easiest to understand in the context of its environment. This article therefore begins with an explanation of exactly what the page allocator needs to do, before going into the details of how it does it.
I am writing this article bit by bit whenever I feel like it. If you feel like writing something, go right ahead - RikvanRiel
Various different parts of the Linux kernel allocate memory, under different circumstances. Most memory allocations happen on behalf of userspace programs; these allocations can use any memory in the system (highmem, zone_normal and dma) and, if free memory is low, can wait for memory to be freed by the pageout code. Page cache and page table allocations can also use any memory in the system and can wait for memory to be freed.
Most kernel-level allocations are different and can only use memory that is directly mapped into kernel address space (zone_normal and dma). Most, though not all, kernel-level allocations can wait for memory to be freed if free memory is low.
Allocations from interrupt context are different again. They cannot wait for memory to be freed, so if free memory on the system is low at the time of the allocation, the allocation will simply fail.
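The three situations described above can be summarised as a small lookup from allocation context to constraints. This is only an illustrative sketch: the enum, the struct and the function `constraints_for` are hypothetical names, not kernel identifiers. It also assumes that interrupt-context allocations are limited to directly mapped memory like other kernel-level allocations, which the text above does not state explicitly.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; none of these exist in the kernel. */
enum alloc_context {
    CTX_USERSPACE,  /* on behalf of userspace; also page cache and page tables */
    CTX_KERNEL,     /* typical kernel-level allocation */
    CTX_INTERRUPT   /* allocation made from interrupt context */
};

struct alloc_constraints {
    bool may_use_highmem;  /* may the page come from highmem? */
    bool may_wait;         /* may the caller wait for memory to be freed? */
};

/* Map an allocation context to the constraints described in the text. */
static struct alloc_constraints constraints_for(enum alloc_context ctx)
{
    struct alloc_constraints c = { false, false };

    switch (ctx) {
    case CTX_USERSPACE:
        /* Any memory in the system, and the caller can wait. */
        c.may_use_highmem = true;
        c.may_wait = true;
        break;
    case CTX_KERNEL:
        /* Directly mapped memory only; most (not all) such callers can wait. */
        c.may_wait = true;
        break;
    case CTX_INTERRUPT:
        /* Cannot wait. Assumption: no highmem either. */
        break;
    }
    return c;
}
```

The page allocator does not use a table like this; callers communicate the same information through the flags they pass in.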
The gfp_mask is used to tell the page allocator which pages can be allocated, whether the allocator can wait for more memory to be freed, etc. All the gfp flags are defined in include/linux/gfp.h:
/* Zone modifiers in GFP_ZONEMASK (see linux/mmzone.h - low two bits) */
#define __GFP_DMA	0x01u
#define __GFP_HIGHMEM	0x02u

/*
 * Action modifiers - doesn't change the zoning
 *
 * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
 * _might_ fail. This depends upon the particular VM implementation.
 *
 * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
 * cannot handle allocation failures.
 *
 * __GFP_NORETRY: The VM implementation must not retry indefinitely.
 */
#define __GFP_WAIT	0x10u	/* Can wait and reschedule? */
#define __GFP_HIGH	0x20u	/* Should access emergency pools? */
#define __GFP_IO	0x40u	/* Can start physical IO? */
#define __GFP_FS	0x80u	/* Can call down to low-level FS? */
#define __GFP_COLD	0x100u	/* Cache-cold page required */
#define __GFP_NOWARN	0x200u	/* Suppress page allocation failure warning */
#define __GFP_REPEAT	0x400u	/* Retry the allocation. Might fail */
#define __GFP_NOFAIL	0x800u	/* Retry for ever. Cannot fail */
#define __GFP_NORETRY	0x1000u	/* Do not retry. Might fail */
#define __GFP_NO_GROW	0x2000u	/* Slab internal usage */
#define __GFP_COMP	0x4000u	/* Add compound page metadata */
#define __GFP_ZERO	0x8000u	/* Return zeroed page on success */
#define __GFP_NOMEMALLOC 0x10000u /* Don't use emergency reserves */
#define __GFP_NORECLAIM  0x20000u /* No realy zone reclaim during allocation */
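To make the bit layout concrete, here is a minimal userspace sketch that combines some of the flags listed above into masks and tests individual bits. The flag values are copied from the listing, and `GFP_ZONEMASK` is set to 0x03 on the basis of the "low two bits" comment. The three `EXAMPLE_*` masks and the helpers `gfp_may_wait` and `gfp_zone_bits` are hypothetical names chosen for illustration. The kernel header defines its own combined masks, which are not quoted in this article.

```c
#include <stdbool.h>

/* Redefined here so the sketch compiles outside the kernel. */
#define __GFP_DMA      0x01u
#define __GFP_HIGHMEM  0x02u
#define __GFP_WAIT     0x10u
#define __GFP_HIGH     0x20u
#define __GFP_IO       0x40u
#define __GFP_FS       0x80u

/* Zone modifiers occupy the low two bits of the mask. */
#define GFP_ZONEMASK   0x03u

/* Hypothetical example masks for the three situations described earlier. */
#define EXAMPLE_USER_MASK   (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HIGHMEM)
#define EXAMPLE_KERNEL_MASK (__GFP_WAIT | __GFP_IO | __GFP_FS)
#define EXAMPLE_ATOMIC_MASK (__GFP_HIGH)

/* True if the caller allows the allocator to wait and reschedule. */
static bool gfp_may_wait(unsigned int gfp_mask)
{
    return (gfp_mask & __GFP_WAIT) != 0;
}

/* Extract only the zone modifier bits from a mask. */
static unsigned int gfp_zone_bits(unsigned int gfp_mask)
{
    return gfp_mask & GFP_ZONEMASK;
}
```

With these definitions, `gfp_may_wait(EXAMPLE_ATOMIC_MASK)` is false, which matches the interrupt-context case, and `gfp_zone_bits(EXAMPLE_USER_MASK)` yields `__GFP_HIGHMEM`. Because the zone modifiers and action modifiers occupy disjoint bits, they can be combined freely with a bitwise OR.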
page allocation order
per-cpu page queues