The Call Chain at a Glance
Entering the Call Chain
Choosing Pages to Swap Out
Before we dive into shrink_lruvec(), it’s worth discussing what the LRU (Least Recently Used) cache is and how it works. Its purpose is to prioritize pages on the basis of readiness to be swapped out. It consists of four doubly linked lists of struct page: active and inactive anonymous pages, and active and inactive file pages. Each struct page is strung onto a list through its struct list_head lru member, and the list handles are kept in an array (there are macros to specify which one) in a struct lruvec (defined in include/linux/mmzone.h). This struct lruvec is the important parameter to shrink_lruvec().
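To make the layout concrete, here is a minimal user-space sketch of the arrangement described above. It is not the kernel's code: model_page, model_lruvec, the MODEL_* index names, and the helper functions are all hypothetical stand-ins, and only the list linkage is modeled.

```c
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical index names for the four lists described above. */
enum model_lru {
    MODEL_INACTIVE_ANON,
    MODEL_ACTIVE_ANON,
    MODEL_INACTIVE_FILE,
    MODEL_ACTIVE_FILE,
    MODEL_NR_LISTS
};

/* Simplified page: only the list linkage matters here. */
struct model_page {
    int id;
    struct list_head lru;
};

/* Simplified lruvec: one list handle per LRU list. */
struct model_lruvec {
    struct list_head lists[MODEL_NR_LISTS];
};

/* Recover the page that embeds a given list node. */
#define page_of(node) \
    ((struct model_page *)((char *)(node) - offsetof(struct model_page, lru)))

/* An empty list is a handle that points at itself. */
static void lruvec_init(struct model_lruvec *vec)
{
    for (int i = 0; i < MODEL_NR_LISTS; i++) {
        vec->lists[i].next = &vec->lists[i];
        vec->lists[i].prev = &vec->lists[i];
    }
}

/* New pages enter at the head of the chosen list. */
static void lru_add(struct model_lruvec *vec, enum model_lru lru,
                    struct model_page *page)
{
    struct list_head *head = &vec->lists[lru];

    page->lru.next = head->next;
    page->lru.prev = head;
    head->next->prev = &page->lru;
    head->next = &page->lru;
}

/* The tail holds the page that has waited longest; NULL if the list is empty. */
static struct model_page *lru_tail(struct model_lruvec *vec, enum model_lru lru)
{
    struct list_head *head = &vec->lists[lru];

    return head->prev == head ? NULL : page_of(head->prev);
}
```

Because pages are added at the head, the oldest page on any list is always reachable in constant time through the handle's prev pointer, which is why the tail is the natural place to look for swap candidates.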
The job of shrink_lruvec() is to identify swap candidates, considering pages at the tail ends of the two inactive LRU lists. (As we continue down the call chain, after the pages are swapped out, they will be removed from the LRU lists--this is how shrink_lruvec() will “shrink the LRU vector.”) If it doesn't find enough possible candidates right away, it will shuffle some of the pages from the active to the inactive lists. Pages are moved down the lists, and from the active to the inactive list, according to the number of times they've been scanned versus the number of times they've been rotated (i.e. found in the course of a scan to have been accessed, and thus moved to the head of their respective active LRU list).
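The shuffling from active to inactive can be pictured with a small second-chance loop. This is only a sketch under simplifying assumptions: the referenced flag, the scan_stats counters, and age_active_list() are hypothetical names, and the real kernel bases its decisions on considerably more state than a single boolean.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

struct model_page {
    int id;
    bool referenced;            /* set when the page has been accessed */
    struct list_head lru;
};

/* Hypothetical counters: pages looked at, and pages put back at the head. */
struct scan_stats {
    unsigned long scanned;
    unsigned long rotated;
};

#define page_of(node) \
    ((struct model_page *)((char *)(node) - offsetof(struct model_page, lru)))

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_del(struct list_head *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = node;
    node->prev = node;
}

/* Insert node right after head, i.e. at the front of the list. */
static void list_add(struct list_head *node, struct list_head *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

/*
 * Examine up to nr_to_scan pages from the tail of the active list.
 * A referenced page is "rotated": its flag is cleared and it returns to
 * the head of the active list. An unreferenced page moves to the head of
 * the inactive list.
 */
static void age_active_list(struct list_head *active,
                            struct list_head *inactive,
                            unsigned long nr_to_scan,
                            struct scan_stats *stats)
{
    for (; nr_to_scan > 0 && active->prev != active; nr_to_scan--) {
        struct list_head *node = active->prev;
        struct model_page *page = page_of(node);

        list_del(node);
        stats->scanned++;
        if (page->referenced) {
            page->referenced = false;
            list_add(node, active);
            stats->rotated++;
        } else {
            list_add(node, inactive);
        }
    }
}
```

In this model, a list whose rotated count stays close to its scanned count is full of pages that are still in use, so little would be gained by scanning it harder.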
Once the lists have been sufficiently shuffled, shrink_lruvec() will pass each one of the lists to be shortened down the chain in turn (the actual parameters will be the entire struct lruvec, the macro to index the intended list within it, and the number of pages at its tail to consider). It passes it via
which will verify and cull pages that can be evicted to swap. It creates an accessory page list (a doubly linked list like the LRU lists, with struct list_head l_hold), which it populates with eviction candidates by passing it to
(again, the actual parameters will be the lruvec and the macro to access the intended list within it). The pages are removed from the LRU list and added to the accessory list. After the accessory list is populated, shrink_inactive_list() passes it to
shrink_page_list() performs many types of checking and management on the eviction candidates it receives; for instance, it handles the situations in which a page is under writeback, dirty, or congested. Basically, it does a final sorting of the pages that the kernel really wants to keep from the ones it can let go of. The most important thing it does, though, is to send the outgoing pages to the actual swapping process. For each page on the accessory list, it makes a final check of its references to be sure the page isn’t being actively used. Then it removes the page from the accessory list and sends it along to add_to_swap().
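The last two steps, pulling candidates onto an accessory list and then making a final pass over it, can be sketched in user space as follows. Everything here is a simplified assumption: isolate_from_tail(), shrink_hold_list(), model_add_to_swap(), the keep list, and the boolean flags are hypothetical stand-ins, and the many other checks the kernel performs (writeback, congestion, and so on) are left out.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

struct model_page {
    int id;
    bool referenced;        /* still being actively used */
    bool in_swap_cache;
    bool dirty;
    struct list_head lru;
};

#define page_of(node) \
    ((struct model_page *)((char *)(node) - offsetof(struct model_page, lru)))

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_del(struct list_head *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = node;
    node->prev = node;
}

static void list_add(struct list_head *node, struct list_head *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

/* Move up to nr_to_scan pages from the tail of an LRU list onto hold. */
static unsigned long isolate_from_tail(struct list_head *lru,
                                       struct list_head *hold,
                                       unsigned long nr_to_scan)
{
    unsigned long taken = 0;

    while (taken < nr_to_scan && lru->prev != lru) {
        struct list_head *node = lru->prev;

        list_del(node);
        list_add(node, hold);
        taken++;
    }
    return taken;
}

/* Stand-in for the hand-off to swap: the page joins the swap cache and is
 * marked dirty so that it will be written out later. */
static void model_add_to_swap(struct model_page *page)
{
    page->in_swap_cache = true;
    page->dirty = true;
}

/* Final pass: referenced pages go to keep, the rest are sent to swap.
 * Returns the number of pages sent. */
static unsigned long shrink_hold_list(struct list_head *hold,
                                      struct list_head *keep)
{
    unsigned long sent = 0;

    while (hold->next != hold) {
        struct list_head *node = hold->next;
        struct model_page *page = page_of(node);

        list_del(node);
        if (page->referenced) {
            list_add(node, keep);
            continue;
        }
        model_add_to_swap(page);
        sent++;
    }
    return sent;
}
```

Working on a private accessory list means the candidates can be examined one by one without walking the shared LRU list again; in this sketch, pages that fail the final check simply end up on keep for the caller to put back.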
Sending a Page to Swap Space
For the rest of this article, you'll need to know how a location in swap is described, so take a look here: SwapEntryRepresentation
Also, if you would like to look at add_to_swap() in detail, here’s a step by step breakdown: FnAddToSwapInDetail
It marks the page as being part of the swap cache. The I/O operation needed to write it into its allocated location in the swap area will be carried out after add_to_swap_cache() returns. add_to_swap() marks the page as dirty, so that a kernel pdflush thread will do the actual write to disk.