Documentation and diagrams created by Adam Litke <agl@us.ibm.com>
Overview
The hugetlb pool exists to manage the allocation of fixed-size objects (huge pages). As special requirements have evolved, the complexity of the hugetlb pool management code has increased. This documentation exists to demystify the code, making it easier for others to work with the hugetlb pool and avoid mistakes.
Venn Diagram
This diagram shows how the counters relate to one another with respect to the huge pages and their various states. The three main counters are: nr_huge_pages, free_huge_pages, and resv_huge_pages. Pages can exist in each of the three domains (as described in the diagram). Each of the three counters is equal to the number of pages within its respective domain plus all pages in sub-domains. For example: Page "C" is a reserved huge page and its presence there is noted in all three counters; while page "B" is noted only in free_huge_pages and nr_huge_pages.
This logic shows that:
- nr_huge_pages represents the total number of huge pages IN ANY STATE.
- free_huge_pages counts the number of pages not currently in use (e.g. pages that are not mapped).
- resv_huge_pages represents a special kind of free_huge_page that is set aside for an existing shared mapping but is not yet mapped.
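The nesting described above can be checked with a little arithmetic. The following is a minimal user-space sketch, not kernel code: the struct and the helper names (counters_consistent, pages_in_use, pages_free_unreserved) are hypothetical and exist only to illustrate how the three counters relate.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the three global counters; not the kernel's data structure. */
struct pool_counters {
    unsigned long nr_huge_pages;   /* every huge page, in any state */
    unsigned long free_huge_pages; /* pages not in use (reserved ones included) */
    unsigned long resv_huge_pages; /* free pages set aside for a mapping */
};

/* Hypothetical check: nested domains imply resv <= free <= nr. */
static bool counters_consistent(const struct pool_counters *c)
{
    return c->resv_huge_pages <= c->free_huge_pages &&
           c->free_huge_pages <= c->nr_huge_pages;
}

/* Pages counted in nr_huge_pages only, i.e. pages currently in use. */
static unsigned long pages_in_use(const struct pool_counters *c)
{
    return c->nr_huge_pages - c->free_huge_pages;
}

/* Pages that are free but not reserved (like page "B"). */
static unsigned long pages_free_unreserved(const struct pool_counters *c)
{
    return c->free_huge_pages - c->resv_huge_pages;
}
```

For a pool holding one in-use page, one free page like "B", and one reserved page like "C", the counters would be nr_huge_pages = 3, free_huge_pages = 2, and resv_huge_pages = 1.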
State Diagram
During their existence in the hugetlb pool, pages will pass through many different states. This diagram shows the states a huge page can be in along with all valid state transitions. The colors of states shown here correspond to a page's domain (or position) in the Venn diagram above. With this information it is possible to work out how the global pool counters change for each state transition.
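To illustrate how counter changes follow from a state transition, here is a hedged user-space sketch. The four transitions below are assumptions derived only from the counter definitions above; they are not a transcription of the state diagram, and the function names are hypothetical rather than kernel functions.

```c
#include <assert.h>

/* Toy model of the global counters; not kernel code. */
struct pool_counters {
    unsigned long nr_huge_pages;
    unsigned long free_huge_pages;
    unsigned long resv_huge_pages;
};

/* Free -> reserved: the page stays free but is now also counted as reserved. */
static int reserve_page(struct pool_counters *c)
{
    if (c->free_huge_pages == c->resv_huge_pages)
        return -1; /* no unreserved free page available */
    c->resv_huge_pages++;
    return 0;
}

/* Reserved -> in use: the page leaves both the reserved and free domains. */
static int map_reserved_page(struct pool_counters *c)
{
    if (c->resv_huge_pages == 0)
        return -1;
    c->resv_huge_pages--;
    c->free_huge_pages--;
    return 0;
}

/* Free (unreserved) -> in use: only free_huge_pages changes. */
static int map_free_page(struct pool_counters *c)
{
    if (c->free_huge_pages == c->resv_huge_pages)
        return -1; /* all remaining free pages are reserved */
    c->free_huge_pages--;
    return 0;
}

/* In use -> free: the page returns to the free domain. */
static int release_page(struct pool_counters *c)
{
    if (c->free_huge_pages == c->nr_huge_pages)
        return -1; /* nothing is in use */
    c->free_huge_pages++;
    return 0;
}
```

In this model nr_huge_pages never changes, because every transition moves a page between domains without adding pages to or removing them from the pool.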
Surplus Huge Pages
So far, this documentation has not addressed the dynamic nature of the hugetlb pool. When dynamic pool resizing is enabled, surplus pages can be allocated to the pool from the buddy allocator to satisfy additional demand for huge pages. As shown by the Venn diagram (pages "D" and "E"), surplus huge pages affect the global pool counters in much the same way as normal huge pages. The main exception is that surplus huge pages will not exist in the free_huge_pages domain. This is because surplus pages are only allocated to satisfy an actual need for a huge page: when a surplus page is allocated, it is either used to satisfy a reservation or mapped into a process.
An additional global counter comes into play when there are surplus huge pages: surplus_huge_pages. This counter merely keeps track of how many of the huge pages in the pool (that is, how many of the nr_huge_pages pages) were allocated in surplus. Keeping this number allows the pool to gravitate back to its original size once the need for surplus huge pages ceases.
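The bookkeeping can be sketched as follows. This is an illustrative user-space model under stated assumptions, not the kernel implementation: the struct and the helpers alloc_surplus_page, release_surplus_page, and persistent_pool_size are hypothetical names, and the model assumes a surplus page is handed back to the buddy allocator as soon as it is no longer needed.

```c
#include <assert.h>

/* Toy model tracking only the two counters relevant to surplus pages. */
struct surplus_counters {
    unsigned long nr_huge_pages;      /* all huge pages, surplus included */
    unsigned long surplus_huge_pages; /* how many of those are surplus */
};

/* A surplus page joins the pool: both counters grow together. */
static void alloc_surplus_page(struct surplus_counters *c)
{
    c->nr_huge_pages++;
    c->surplus_huge_pages++;
}

/*
 * A surplus page is no longer needed (assumed to go back to the buddy
 * allocator). Returns 1 if a page was released, 0 if none was surplus.
 */
static int release_surplus_page(struct surplus_counters *c)
{
    if (c->surplus_huge_pages == 0)
        return 0;
    c->nr_huge_pages--;
    c->surplus_huge_pages--;
    return 1;
}

/* The size the pool gravitates back to once all surplus pages are gone. */
static unsigned long persistent_pool_size(const struct surplus_counters *c)
{
    return c->nr_huge_pages - c->surplus_huge_pages;
}
```

Because both counters move in step, nr_huge_pages - surplus_huge_pages stays constant across surplus allocations and releases in this model, which is what lets the pool return to its original size.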