Thoughts on load control:
- Load control code in BSD evolved from full process swapping.
- Due to enormous memory sizes and slow disks, full process swapping is not feasible on today's systems.
- However, some form of load control may be desired, to avoid thrashing.
- The old BSD habit of freeing thread stacks would be hard to implement in Linux and would free little memory (one or two pages per process).
- However, we could free page tables that only have linear file pages in them. This is easy to implement and frees up more memory.
- Page tables of shared memory segments are a big issue on large database systems.
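To get a feel for why page tables of shared memory segments matter, here is a rough back-of-the-envelope calculation. It assumes 4 KiB pages, 8-byte page table entries, no huge pages, and that every process maps the whole segment; the segment size and process count are made-up example values, and only the last page table level is counted.

```python
# Back-of-the-envelope page table overhead for a shared memory segment.
# Assumptions (not from the notes): 4 KiB pages, 8-byte page table
# entries, no huge pages, every process maps the whole segment.

PAGE_SIZE = 4 * 1024   # bytes per page
PTE_SIZE = 8           # bytes per page table entry
GIB = 1024 ** 3
MIB = 1024 ** 2


def pte_bytes_per_process(segment_bytes):
    """Bytes of last-level page tables needed to map the whole segment."""
    pages = segment_bytes // PAGE_SIZE
    return pages * PTE_SIZE


def total_pte_bytes(segment_bytes, nr_processes):
    """Page tables are per process, so the cost scales with process count."""
    return pte_bytes_per_process(segment_bytes) * nr_processes


segment = 64 * GIB   # hypothetical database shared memory segment
processes = 1000     # hypothetical number of database processes

per_process = pte_bytes_per_process(segment)
total = total_pte_bytes(segment, processes)
print(per_process // MIB, "MiB of page tables per process")
print(total // GIB, "GiB of page tables in total")
```

Under these assumptions each process carries 128 MiB of page tables for the segment, and 1000 processes together carry 125 GiB, more than the segment itself.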
Main goals of load control:
- Prevent the system from thrashing.
- Reduce the number of processes that fault pages in from swap simultaneously.
- Prevent thundering herds.
- Allow the processes that are still running to finish their work faster, so that the suspended ones can get their jobs done sooner too.
Methods:
- When memory pressure starts to get higher:
  - Put sleeping processes under load control.
    - Next to no-op "swap out".
    - Control their wakeup rate, to avoid a thundering herd of processes that need to access swap.
- When memory pressure gets really high:
  - Suspend active processes temporarily, so the still running ones can make progress.
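The two stages above could be modelled roughly as follows. This is a toy user-space sketch, not kernel code: the class name, the pressure scale (0.0 to 1.0), both thresholds, the "suspend half of the active processes" rule and the FIFO release order are all assumptions made for illustration.

```python
# Toy model of the two-stage load control policy. All names and
# thresholds are hypothetical; this is not kernel code.
from collections import deque

PRESSURE_HIGH = 0.7        # assumed: start parking sleeping processes
PRESSURE_VERY_HIGH = 0.9   # assumed: start suspending active processes


class LoadControl:
    def __init__(self, wakeups_per_tick=1):
        self.wakeups_per_tick = wakeups_per_tick
        self.parked = deque()      # sleeping processes under load control
        self.suspended = deque()   # active processes suspended temporarily

    def apply(self, pressure, sleeping, active):
        """Put processes under load control; return those left running."""
        if pressure >= PRESSURE_HIGH:
            # Next to no-op "swap out": the process sleeps anyway, we
            # only record that its wakeup is now rate limited.
            self.parked.extend(p for p in sleeping if p not in self.parked)
        if pressure >= PRESSURE_VERY_HIGH and len(active) > 1:
            # Arbitrary choice: suspend half of the active processes so
            # the remaining ones can make progress.
            keep = (len(active) + 1) // 2
            self.suspended.extend(active[keep:])
            active = active[:keep]
        return active

    def tick(self, pressure):
        """Release at most wakeups_per_tick processes per tick."""
        if pressure >= PRESSURE_VERY_HIGH:
            return []              # still too busy, release nobody
        released = []
        for _ in range(self.wakeups_per_tick):
            # Suspended processes go first, then parked sleepers.
            queue = self.suspended or self.parked
            if not queue:
                break
            released.append(queue.popleft())
        return released


lc = LoadControl(wakeups_per_tick=2)
running = lc.apply(0.95, sleeping=["a", "b", "c"], active=["d", "e", "f", "g"])
first = lc.tick(0.95)    # pressure still very high
second = lc.tick(0.5)    # pressure dropped
third = lc.tick(0.5)
print(running, first, second, third)
```

Capping the number of releases per tick is what prevents the thundering herd: however many processes are waiting, only a bounded number start faulting pages in from swap at the same time.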
What suspending processes buys us:
- Reduces contention on disk IO.
- Indirectly reduces contention on memory, because suspended processes are not paging anything in.
It is not clear what measurements to use to detect a busy VM or near-thrashing situations:
- Paging IO wait time?
- Kswapd has trouble freeing enough memory?
  - On which zones/nodes?
- ... ?
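One way the candidate measurements above might be combined is sketched below. Everything here is an assumption for illustration: the input names, the thresholds, the use of freed/scanned pages as a stand-in for "kswapd has trouble", and the decision to require both signals at once. It does not settle which measurements are the right ones.

```python
# Hypothetical near-thrashing heuristic. Inputs and thresholds are
# assumptions for illustration, not counters the kernel exports
# under these names.
from dataclasses import dataclass


@dataclass
class ZoneSample:
    name: str            # zone or node label, e.g. "node0/Normal"
    pages_scanned: int   # pages kswapd scanned during the interval
    pages_freed: int     # pages kswapd actually freed during the interval


def reclaim_efficiency(sample):
    """Fraction of scanned pages that were freed; 1.0 if nothing was scanned."""
    if sample.pages_scanned == 0:
        return 1.0
    return sample.pages_freed / sample.pages_scanned


def struggling_zones(samples, min_efficiency=0.1):
    """Names of the zones where kswapd frees too little of what it scans."""
    return [s.name for s in samples if reclaim_efficiency(s) < min_efficiency]


def near_thrashing(paging_wait_fraction, samples,
                   max_wait=0.5, min_efficiency=0.1):
    """True if paging IO wait is high AND kswapd struggles in some zone."""
    struggling = struggling_zones(samples, min_efficiency)
    return paging_wait_fraction > max_wait and bool(struggling)


samples = [
    ZoneSample("node0/DMA32", pages_scanned=1000, pages_freed=400),
    ZoneSample("node0/Normal", pages_scanned=50000, pages_freed=1000),
]
print(struggling_zones(samples))
print(near_thrashing(0.6, samples))
print(near_thrashing(0.2, samples))
```

Keeping the per-zone result separate from the global decision leaves the "on which zones/nodes?" question open: the caller can see which zones are struggling and decide whether one bad zone is enough.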