Log from the [http://kernelnewbies.org/ #kernelnewbies] IRC channel.
<saxm> is everything from PAGE_OFFSET onwards paged?
<riel> saxm: depends, what do you mean by "paged" and what do you mean by "everything" ? ;))
* riel could find exceptions on either side of PAGE_OFFSET, depending on which meanings you want to use
<saxm> riel: "paged" as in hardware paged by the cpu, "everything" meaning addressable memory
<riel> after bootup, all memory is accessed through the MMU
<ahu> riel, do you recall when current mainline 2.6.10 will decide not to cache a file?
<riel> so everything before and after PAGE_OFFSET is paged
<riel> not everything can be demand paged, though ...
<ahu> for example, when I do: open() seek() read() close()
<ahu> I seem to recall that sequential reads were special cased?
<saxm> riel: but there's a difference between paging above and below PAGE_OFFSET?? Process pages below PAGE_OFFSET map to kernel pages above PAGE_OFFSET?
<riel> pages below PAGE_OFFSET belong to userspace
<riel> and can be demand paged
<riel> addresses above PAGE_OFFSET are kernel memory
<saxm> riel: so there is no linear mapping between pages in virtual memory and consecutive area of physical memory?
<riel> there is a linear mapping for the first 900 MB of kernel memory
<riel> where physical address 0 - 896 MB is mapped into PAGE_OFFSET - PAGE_OFFSET+896MB
<Bertl> (depending on the split)
<saxm> riel: ok, so there are 896*1024/4 physical frames addressable from PAGE_OFFSET->PAGE_OFFSET+896mb, and page directories/tables map userspace page accesses to the appropriate page within this range?
<riel> saxm: no, userspace does not have access to the virtual memory beyond PAGE_OFFSET
<riel> saxm: userspace only gets access to virtual addresses below PAGE_OFFSET
<saxm> riel: just trying to understand how virtual pages relate to this mapped area of memory from PAGE_OFFSET to PAGE_OFFSET+896?
<riel> memory above PAGE_OFFSET is kernel virtual memory
<riel> part of it is a direct map of the first part of physical memory
<riel> but that same physical memory could also get virtual mappings from elsewhere, eg. userspace
<riel> or vmalloc
<riel> also, userspace and vmalloc can map physical memory from outside the 896MB of direct mapped memory (as well as inside it)
<saxm> riel: ok, multiple mappings to physical pages, that clears things up for me!
<saxm> riel: so how does it work for kernel memory? kernel memory allocations (for page tables etc...) must come out of that 896meg chunk too?
<riel> most kernel memory allocation needs to come from that 896 MB, indeed
<riel> though page tables are the big exception ;)
<saxm> riel: which means they're resident in memory all the time - if that's where physical memory is mapped to?
<riel> kernel data structures are always resident
<saxm> riel: so where do page tables reside? Surely not below PAGE_OFFSET? Somewhere above PAGE_OFFSET+896mb then?
<riel> they could reside anywhere
<saxm> anywhere from 0->4gb (on x86 with no pae)?
<maks> once it was recommended for lower latency by audio folks, it turns out that today's ext3 is for them the best bet too.
<maks> echan pardon
<riel> saxm: yeah
<riel> saxm: so it could be either inside the low 896MB, or in highmem (or some page tables in both - more likely)
<saxm> riel: and that 896meg chunk of physical memory addressed at PAGE_OFFSET, is also pageable right? So kernel allocations (not including page tables) just set some flag to disable paging on that page?
<riel> ummmmmmmmmm, they map physical memory
<riel> physical memory is, by definition, not pageable
<riel> the contents of those pages might be pageable though
<riel> so you could have a page P at physical address 400MB
<riel> a process (eg. mozilla) is using that page
<riel> at virtual address 120MB
<riel> somewhere in its heap
<riel> the contents of the physical page can be paged out, at which point mozilla's heap page at 120MB is paged out
<riel> but the kernel mapping (at PAGE_OFFSET + 400MB) still maps the same page P
<riel> just with different contents ;)
<saxm> riel: thanks for that very helpful example!
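
The arithmetic behind the direct map and riel's page P example can be sketched in a few lines of Python. This is only a toy model, not kernel code. It assumes a 3 GB / 1 GB split (so PAGE_OFFSET is 0xC0000000, which the log leaves open with "depending on the split") and 4 KB pages (implied by saxm's 896*1024/4). Every function and variable name below is hypothetical and chosen for illustration.

```python
# Toy model of the 32-bit x86 layout discussed in the log.
# Assumptions (not stated in the log): 3 GB / 1 GB split and 4 KB pages.
# All names here are hypothetical, chosen for illustration only.

MB = 1024 * 1024
PAGE_SIZE = 4096              # assumed 4 KB pages
PAGE_OFFSET = 0xC0000000      # assumed 3G/1G split
DIRECT_MAP_SIZE = 896 * MB    # size of the direct map mentioned in the log


def direct_map_frames():
    """Number of physical page frames covered by the direct map."""
    return DIRECT_MAP_SIZE // PAGE_SIZE


def phys_to_kernel_virt(phys):
    """Kernel virtual address for a physical address inside the direct map.

    Returns None for physical addresses beyond the direct map (highmem),
    which have no fixed kernel virtual address in this model.
    """
    if 0 <= phys < DIRECT_MAP_SIZE:
        return PAGE_OFFSET + phys
    return None


def kernel_virt_to_phys(virt):
    """Inverse of phys_to_kernel_virt, valid only inside the direct map."""
    if PAGE_OFFSET <= virt < PAGE_OFFSET + DIRECT_MAP_SIZE:
        return virt - PAGE_OFFSET
    return None


def is_user_address(virt):
    """Userspace only sees virtual addresses below PAGE_OFFSET."""
    return 0 <= virt < PAGE_OFFSET


# riel's example: page P at physical 400 MB, used by a process at virtual 120 MB.
page_p_phys = 400 * MB
user_virt = 120 * MB
user_page_table = {user_virt: page_p_phys}   # toy per-process mapping

# The same physical page is also reachable through the kernel direct map.
kernel_virt = phys_to_kernel_virt(page_p_phys)
print("kernel mapping of P:", hex(kernel_virt))

# Paging out the process's heap page removes only the userspace mapping.
del user_page_table[user_virt]

# The direct map is pure arithmetic, so it still points at page P,
# whatever contents that physical page holds now.
print("still mapped at:", hex(phys_to_kernel_virt(page_p_phys)))
```

The point of the model is that the direct-map address is computed from the physical address alone, so it does not change when a userspace mapping of the same page is torn down.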