Differences between revisions 3 and 7 (spanning 4 versions)

For 2.6.x kernels

Nitin Gupta

This page gives all the details collected together for compressed cache implementation. Many of the ideas have been taken from Rodrigo’s work and this paper: Adaptive Main Memory Compression by Irina Chihaia, Thomas Gross.

Here I will keep track of project progress. Please add any suggestions you have or edit anything you find incorrect.

Introduction

Compressed caching is the introduction of new layer in virtual memory hierarchy -- Compressed Cache. It compresses and stores pages that would otherwise have been swapped to slow disks or freed under memory pressure. This effectively increases RAM space and avoids /reduces accesses to slow disks. This basically takes advantage to rapidly increasing CPU power, faster, lower latency memories and sluggish hard-disk speed improvements.

Work Done

Currently the implementation simply replaces a page with a copy of itself (i.e. no actual compression yet) and replaces its reference in page cache with a ‘chunk_head’ (all this is detailed below). On page cache lookup, the original page is obtained by again copying and setting corresponding page cache entry back to this page instead of ‘chunk_head’. Also, one of compression algorithm (WKdm) has been ported to kernel space. The idea is to first have a framework ready for further implementation and gain more familiarity with VMM code.

General Background

The system maintains two LRU lists – active and inactive LRU lists. These lists may contain both page-cache (file backed) and swap-cache (anonymous) pages. When under memory pressure, pages in inactive list are freed as:

Swap-cache pages are written out to swap disksusing ‘swapper_space’ writepage() (swap_writepage()).
Dirty page-cache pages are flushed to filesystemdisks using filesystem specific writepage().
Clean page-cache pages are simply freed.

For compressed cache to be effective, it needs to store both swap-cache and page-cache (clean and dirty) pages. So, a way is needed to transparently (i.e. changes should be required within VMM subsystem only) take these pages in/out of compressed cache.

About Page Cache

Please see: http://www.linuxsymposium.org/2005/linuxsymposium_procv2.pdf -- paper ‘Examining Linux 2.6 Page-Cache Performance’.

Each open file has a separate radix tree to maintain its pages in page cache. So, in effect, each open file has its own page cache. The offset within the file is used as key to locate the corresponding page in memory.

About Swap Cache

All swap cache pages are part of a singleswapper_space – a single radix tree maintains all pages in the swap cache.swp_entry_t is used as a key to locate the corresponding pages in memory.

attachment:SwapCacheEntry.jpg

type identifies things we can swap to.

During Swap-Out

shrink_cache() prepares a list of pages to be freed (these pages are from inactive list) and hands over this list to shrink_list() which then tries to free pages in the list, a page at-a-time handling swap-cache pages and (clean/dirty) page-cache pages as above.

attachment:ShrinkCache.jpg

attachment:ShrinkList.jpg

During Swap-In

If page is anonymous, do_swap_page() looks-up swap cache first (using swp_entry_t stored in pte as key), if not in swap cache the required page is read in from swap disk (with some readahead), added to swap cache and the page is returned (mapped to process’ VMA).

For file-backed pages same logic applies (do_file_page()):

If page is file backed, page cache is looked up first (usingoffset within file stored in pte as key), if not in page cache the requiredpage is read in from filesystem disk (with some readahead), added to page cacheand returned (mapped to process’ VMA).

attachment:HandleFault.jpg

pt_none(*pte)==true means page not present in memory, so call do_nopage() to get it?

Handling Page Cache Pages

Taking page cache pages to compressed cache

When page is to be flushed to disk (dirty) or just freed (clean), determine if it is to be added to compressed cache (should_add_to_ccache()). If yes, then compress it (compress_page()) and add to compressed cache (add_to_ccache()).

add_to_ccache() will not maintain separate data structures to maintain location of various pages in ccache. Instead it will modify the radix tree node to now point to a chunk_head which will contain all the information needed to locate the page in ccache. This chunk_head will have flag PG_compressed set to identify that it does not point to a real page but is a container for information required to locate it in ccache.

So, when a page is looked-up up in page cache, it is determined if it is ccache just by checking PG_compressed flag. In that case the chunk_head contains information to locate the corres. page in ccache. The page will be taken from the ccache, decompressed to a page (newly allocated), and radix tree node will again be made to point to this newly decompressed page.

Swap-out for page cache pages

attachment:SwapOutCC.jpg

Swap In for page cache pages

attachment:SwapInCC.jpg

This approach eliminates the need for double lookups: On single page cache lookup you will be able to tell if page is in ccache or not and if it is compressed, then where it is places in ccache structure. This is better than requiring to lookup ccache every time a page is not found in page-cache.

Handling Swap Cache pages

We can take completely analogous approach to handle swap cache pages i.e.:

When a swap-cache page is to be moved to swap disk, mark it as PG_compressed, modify tree node (as is done for page-cache pages) to now point to a new struct page which contains all information required to locate page in ccache. When page is again looked-up in swap cache during page fault, simply note if PG_compressed is set and decompress, clear PG_compressed, set node to again point to this decompressed page and return this page.

Thereare MAX_SWAPFILES (default 32) swap disks differentiated using ‘type’ field (5bits) in swp_entry_t. Mark one (or some) of them as ‘virtual swap disks’ andassign it the highest priority of all swap disks. The swapper_space radix treecan then be used for ccache too and no separate data structure will be requiredto save location information of compressed page in ccache structure (i.e. nodouble lookups).

Compression Structure

Many of ideas here are taken from: Adaptive Main Memory Compression by Irina Chihaia, Thomas Gross.

(http://www.lst.inf.ethz.ch/research/publications/publications/USENIX_2005/USENIX_2005.pdf)

-  ⇤ ← Revision 3 as of 2006-04-27 23:26:55 → 
  Size: 2889
  Editor: triband-del-59
  Comment:
+   ← Revision 7 as of 2006-04-28 00:05:31 → ⇥
  Size: 6729
  Editor: triband-del-59
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 1:
-= Compressed Caching =
-Line 36:
+Line 35:
+== During Swap-Out ==
shrink_cache() prepares a list of pages to be freed (these pages are from inactive list) and hands over this list to shrink_list() which then tries to free pages in the list, a page at-a-time handling swap-cache pages and (clean/dirty) page-cache pages as above.

attachment:ShrinkCache.jpg

attachment:ShrinkList.jpg

== During Swap-In ==
If page is anonymous, '''do_swap_page()''' looks-up swap cache first (using swp_entry_t stored in pte as key), if not in swap cache the required page is read in from swap disk (with some readahead), added to swap cache and the page is returned (mapped to process’ VMA).

For file-backed pages same logic applies ('''do_file_page()'''):

If page is file backed, page cache is looked up first (usingoffset within file stored in pte as key), if not in page cache the requiredpage is read in from filesystem disk (with some readahead), added to page cacheand returned (mapped to process’ VMA).

attachment:HandleFault.jpg

pt_none(*pte)==true means page not present in memory, so call do_nopage() to get it?

== Handling Page Cache Pages ==
=== Taking page cache pages to compressed cache ===
When page is to be flushed to disk (dirty) or just freed (clean), determine if it is to be added to compressed cache (should_add_to_ccache()). If yes, then compress it (compress_page()) and add to compressed cache (add_to_ccache()).

add_to_ccache() will not maintain separate data structures to maintain location of various pages in ccache. Instead it will modify the radix tree node to now point to a '''chunk_head''' which will contain all the information needed to locate the page in ccache. This chunk_head will have flag PG_compressed set to identify that it does not point to a real page but is a container for information required to locate it in ccache.

So, when a page is looked-up up in page cache, it is determined if it is ccache just by checking PG_compressed flag. In that case the chunk_head contains information to locate the corres. page in ccache. The page will be taken from the ccache, decompressed to a page (newly allocated), and radix tree node will again be made to point to this newly decompressed page.

=== Swap-out for page cache pages ===
attachment:SwapOutCC.jpg

=== Swap In for page cache pages ===
attachment:SwapInCC.jpg

This approach eliminates the need for double lookups: On single page cache lookup you will be able to tell if page is in ccache or not and if it is compressed, then where it is places in ccache structure. This is better than requiring to lookup ccache every time a page is not found in page-cache.

== Handling Swap Cache pages ==
We can take completely analogous approach to handle swap cache pages i.e.: 



When a swap-cache page is to be moved to swap disk, mark it as PG_compressed, modify tree node (as is done for page-cache pages) to now point to a new struct page which contains all information required to locate page in ccache. When page is again looked-up in swap cache during page fault, simply note if PG_compressed is set and decompress, clear PG_compressed, set node to again point to this decompressed page and return this page.



Thereare MAX_SWAPFILES (default 32) swap disks differentiated using ‘type’ field (5bits) in swp_entry_t. Mark one (or some) of them as ‘virtual swap disks’ andassign it the highest priority of all swap disks. The swapper_space radix treecan then be used for ccache too and no separate data structure will be requiredto save location information of compressed page in ccache structure (i.e. nodouble lookups).



----

== Compression Structure ==
Many of ideas here are taken from: Adaptive Main Memory Compression by Irina Chihaia, Thomas Gross.

(http://www.lst.inf.ethz.ch/research/publications/publications/USENIX_2005/USENIX_2005.pdf)