(IA32, non-PAE specific information)

Kernel paging actually begins early in the boot process. head.S contains code to create provisional page tables and get the kernel up and running, but that is beyond the scope of this overview.

'''Initialization:'''

Paging is initialized in arch/i386/mm/init.c. The function paging_init() is called once by setup_arch() during kernel initialization. In the non-PAE case it immediately calls pagetable_init().

pagetable_init() starts by defining the base of the page table directory:

{{{
pgd_t *pgd_base = swapper_pg_dir;
}}}

swapper_pg_dir is also defined in head.S, using .org directives. It sits at offset 0x1000 above the 'root' of kernel memory. Kernel memory is defined to start at PAGE_OFFSET, which on x86 is 0xC0000000, or 3 gigabytes. This is where the 3 GB/1 GB split is defined: every virtual address at or above PAGE_OFFSET belongs to the kernel, and every address below it belongs to user space.

After some capability checking, pagetable_init() calls kernel_physical_mapping_init(). This function performs the lion's share of the kernel page table setup.

Want to turn this [http://kernelnewbies.org/ #kernelnewbies] irc log into a real explanation? Feel free to create yourself an account (UserPreferences) and edit this page ;-)

{{{
is everything from PAGE_OFFSET onwards paged?
saxm: depends, what do you mean by "paged" and what do you mean by "everything" ? ;))
* riel could find exceptions on either side of PAGE_OFFSET, depending on which meanings you want to use
riel: "paged" as in hardware paged by the cpu, "everything" meaning addressable memory
after bootup, all memory is accessed through the MMU
riel, do you recall when current mainline 2.6.10 will decide not to cache a file?
so everything before and after PAGE_OFFSET is paged
not everything can be demand paged, though ...
for example, when I do: open() seek() read() close() I seem to recall that sequential reads were special cased?
riel: but there's a difference between paging above and below PAGE_OFFSET??
Process pages below PAGE_OFFSET map to kernel pages above PAGE_OFFSET?
pages below PAGE_OFFSET belong to userspace and can be demand paged
addresses above PAGE_OFFSET are kernel memory
riel: so there is no linear mapping between pages in virtual memory and consecutive area of physical memory?
there is a linear mapping for the first 900 MB of kernel memory
where physical address 0 - 896 MB is mapped into PAGE_OFFSET - PAGE_OFFSET+896MB (depending on the split)
riel: ok, so there are 896*1024/4 physical frames addressable from PAGE_OFFSET->PAGE_OFFSET+896mb, and page directorys/tables map userspace page accesses to the appropriate page within this range?
saxm: no, userspace does not have access to the virtual memory beyond PAGE_OFFSET
saxm: userspace only gets access to virtual addresses below PAGE_OFFSET
riel: just trying to understand how virtual pages relate to this mapped area of memory from PAGE_OFFSET to PAGE_OFFSET+896?
memory above PAGE_OFFSET is kernel virtual memory
part of it is a direct map of the first part of physical memory
but that same physical memory could also get virtual mappings from elsewhere, eg. userspace or vmalloc
also, userspace and vmalloc can map physical memory from outside the 896MB of direct mapped memory (as well as inside it)
riel: ok, multiple mappings to physical pages, that clears things up for me!
riel: so how does it works for kernel memory? kernel memory allocations (for page tables etc...) must come out of that 896meg chunk too?
most kernel memory allocation needs to come from that 896 MB, indeed
though page tables are the big exception ;)
riel: which means they're resident in memory all the time - if that's where physical memory is mapped to?
kernel data structures are always resident
riel: so where do page tables reside? Surely not below PAGE_OFFSET? Somewhere above PAGE_OFFSET+896mb then?
they could reside anywhere
anywhere from 0->4gb (on x86 with no pae)?
once it was recommended for lower latency by audio folks, it turns out that todays ext3 is for them the best bet too.
echan pardon
saxm: yeah
saxm: so it could be either inside the low 896MB, or in highmem (or some page tables in both - more likely)
riel: and that 896meg chunk of physical memory addressed at PAGE_OFFSET, is also pagaeble right? So kernel allocations (not including page tables) just set some flag to disable paging on that page?
ummmmmmmmmm, they map physical memory
physical memory is, by definition, not pageable
the contents of those pages might be pageable though
so you could have a page P at physical address 400MB
a process (eg. mozilla) is using that page at virtual address 120MB somewhere in its heap
the contents of the physical page can be paged out, at which point mozilla's heap page at 120MB is paged out
but the kernel mapping (at PAGE_OFFSET + 400MB) still maps the same page P
just with different contents ;)
riel: thanks for that very helpful example!
}}}
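
The address arithmetic discussed in the log above can be sketched in a few lines of C. This is only an illustration, assuming the default 3G/1G split, 4 KiB pages and the two-level (10/10/12 bit) non-PAE layout. The helper names (is_kernel_address, direct_map_virt, direct_map_phys, pgd_index_of, pte_index_of, lowmem_frames) and the LOWMEM_SIZE constant are made up for this page and are not kernel APIs; the real kernel has its own macros for the same arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: default 3G/1G split on non-PAE IA32. */
#define PAGE_SHIFT    12             /* 4 KiB pages */
#define PGDIR_SHIFT   22             /* one page directory entry covers 4 MiB */
#define PTRS_PER_PTE  1024u          /* entries per page table */
#define PAGE_OFFSET   0xC0000000u    /* start of kernel virtual memory */
#define LOWMEM_SIZE   (896u << 20)   /* hypothetical name: size of the direct map from the log */

/* Addresses at or above PAGE_OFFSET are kernel, everything below is user space. */
static inline int is_kernel_address(uint32_t vaddr)
{
    return vaddr >= PAGE_OFFSET;
}

/* Direct map: physical 0..896MB appears at PAGE_OFFSET..PAGE_OFFSET+896MB. */
static inline uint32_t direct_map_virt(uint32_t paddr)
{
    assert(paddr < LOWMEM_SIZE);     /* highmem has no permanent mapping */
    return paddr + PAGE_OFFSET;
}

/* Inverse of the direct map, valid only inside the direct-mapped window. */
static inline uint32_t direct_map_phys(uint32_t vaddr)
{
    assert(vaddr >= PAGE_OFFSET && vaddr - PAGE_OFFSET < LOWMEM_SIZE);
    return vaddr - PAGE_OFFSET;
}

/* Top 10 bits of a virtual address select the page directory entry. */
static inline uint32_t pgd_index_of(uint32_t vaddr)
{
    return vaddr >> PGDIR_SHIFT;
}

/* Middle 10 bits select the entry inside the page table. */
static inline uint32_t pte_index_of(uint32_t vaddr)
{
    return (vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/* Number of 4 KiB frames covered by the direct map (896*1024/4). */
static inline uint32_t lowmem_frames(void)
{
    return LOWMEM_SIZE >> PAGE_SHIFT;
}
```

With these assumptions, pgd_index_of(PAGE_OFFSET) is 768, so the kernel owns the top 256 page directory entries and user space the 768 below them. The page P from the example, at physical address 400MB, shows up in the direct map at direct_map_virt(400u << 20), which is 0xD9000000, while the mozilla heap address at 120MB is a user address that happens to map the same frame.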