Thoughts on load control:

* Load control code in BSD evolved from full process swapping.
* Due to enormous memory sizes and slow disks, full process swapping is not feasible on today's systems.
* However, some form of load control may be desired, to avoid thrashing.
* The old BSD habit of freeing thread stacks would be hard to implement in Linux and would gain little memory (one or two pages per process).
* However, we could free page tables that only have linear file pages in them. Easy to implement and frees up more memory.
* Page tables of shared memory segments are a big issue on large database systems.

Main goals of load control:

* Prevent the system from thrashing.
* Reduce the number of processes that fault pages in from swap simultaneously.
* Prevent thundering herds.
* Allow the still running processes to finish their work faster, so the not currently running ones can get their jobs done faster too.

Methods:

* When memory pressure starts to get higher:
    * Put sleeping processes under load control.
        * Next to no-op "swap out".
        * Control their wakeup rate, to avoid a thundering herd of processes that need to access swap.
* When memory pressure gets really high:
    * Suspend active processes temporarily, so the still running ones can make progress.

It is not clear what measurements to use to detect a busy VM or near-thrashing situations:

* Paging IO wait time?
* Kswapd has trouble freeing enough memory?
    * On which zones/nodes?
* ... ?
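
The two-stage approach listed under "Methods" can be sketched as a toy user-space model. This is only an illustration under stated assumptions, not kernel code and not a proposed implementation: the names Pressure, Proc, classify and LoadController are hypothetical, and so are the 0..1 pressure reading, the 0.7 / 0.9 thresholds, the wakeups_per_tick limit and the keep_running count. In particular, the notes leave open which measurement should drive load control, so the pressure value here is a pure placeholder.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class Pressure(Enum):
    NORMAL = 0
    HIGH = 1      # "memory pressure starts to get higher"
    CRITICAL = 2  # "memory pressure gets really high"


@dataclass
class Proc:
    pid: int
    sleeping: bool
    held: bool = False       # under load control: wakeups are rate limited
    suspended: bool = False  # active process that was temporarily stopped


def classify(pressure, high=0.7, critical=0.9):
    """Map a hypothetical 0..1 pressure reading to a level (thresholds assumed)."""
    if pressure >= critical:
        return Pressure.CRITICAL
    if pressure >= high:
        return Pressure.HIGH
    return Pressure.NORMAL


class LoadController:
    def __init__(self, wakeups_per_tick=1):
        self.wakeups_per_tick = wakeups_per_tick
        self.wake_queue = deque()  # held processes that asked to wake up

    def apply(self, procs, level, keep_running=1):
        # Suspension is temporary: lift it whenever pressure is below CRITICAL.
        if level is not Pressure.CRITICAL:
            for p in procs:
                p.suspended = False
        if level is Pressure.NORMAL:
            # Release holds, except for processes still queued for wakeup.
            for p in procs:
                if p not in self.wake_queue:
                    p.held = False
            return
        # Stage 1: sleeping processes go under load control (cheap, no I/O).
        for p in procs:
            if p.sleeping:
                p.held = True
        # Stage 2: suspend all but keep_running of the active processes.
        if level is Pressure.CRITICAL:
            active = [p for p in procs if not p.sleeping and not p.suspended]
            for p in active[keep_running:]:
                p.suspended = True

    def request_wakeup(self, proc):
        """Wake immediately if not held; otherwise queue and return False."""
        if not proc.held:
            proc.sleeping = False
            return True
        if proc not in self.wake_queue:
            self.wake_queue.append(proc)
        return False

    def tick(self):
        """Release at most wakeups_per_tick queued processes, in FIFO order."""
        released = []
        while self.wake_queue and len(released) < self.wakeups_per_tick:
            proc = self.wake_queue.popleft()
            proc.held = False
            proc.sleeping = False
            released.append(proc.pid)
        return released
```

In this model, three held processes that all request a wakeup at the same moment are released one per tick() instead of all at once, which is the thundering-herd control the notes ask for. The FIFO queue and the fixed per-tick budget are arbitrary choices made to keep the sketch short; a real policy would presumably adapt the rate to whatever pressure signal is eventually chosen.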