== Remappable memory ==

Drivers often implement mmap() to allow userspace to have direct access to memory that was allocated/reserved inside the kernel. For example, you may wish to allow userspace to have direct access to a kernel-allocated buffer that is used for DMA with a PCI device.

[http://lwn.net/Kernel/LDD3/ LDD3] chapter 15 discusses this topic, but one important point was not made clear to me: you cannot remap memory allocated with kmalloc() through the fault/nopage technique. mmap is a page-oriented interface, but kmalloc() does not provide pages; it provides another type of memory object. You may be tempted to use virt_to_page() to get a struct page pointer to a kmalloced region and then remap that in your page fault handler, but this is a violation of the abstraction, as [http://marc.info/?l=linux-mm&m=121238525325385&w=2 explained by Johannes Weiner]. At least with the SLAB allocator, virt_to_page() on a kmalloced region returns pages with the PG_slab flag set, which will cause "Bad page state" messages during the munmap path (see free_pages_check() in mm/page_alloc.c).

That aside, it is legal to remap kmalloced memory into userspace by using e.g. remap_pfn_range(). This technique does not touch the underlying page structures.

If you do need the flexibility offered by the nopage/fault VM handlers, use an interface such as alloc_pages() to obtain memory which is safe to remap through a page fault handler. Another option is to use vmalloc() to obtain virtually contiguous memory regions, which can safely be converted to page structure pointers using vmalloc_to_page().

== mmap and real files ==

This is a cut'n'paste of an IRC conversation on the #kernelnewbies channel. One day this should be rewritten into a more easily readable article...

{{{
if you're using mmap on a file descriptor, how are the changes eventually written to disk? what gets called? does the normal read/write function eventually get called?
bronaugh: two times
bronaugh: changes are written to disk either at/after msync(2) time, or after munmap(2) time
bronaugh: or, if the system has a memory shortage, by the pageout code
alright. and that uses the normal read/write calls?
but I believe he means if the actual sys_read() / sys_write() code ie getting called.
to that, no, the actual "dirty" pages are written
bronaugh: no
bronaugh: data changed through mmap does not go through read/write syscalls
ok.
here's why I'm asking. I'm modifying framebuffer code for some nefarious purposes. I don't want a memory-backed framebuffer; I want all calls like that to go over the network.
now, framebuffers have an fb_read and an fb_write call associated with them. these end up being called in fbmem.c by the main handler for read and write, which is set up in the file_operation struct.
my question is -- will those routines be called? (given that they will be called normally by a read/write system call)
<-- SGH has quit (Quit: Client exiting)
sorry if I might be a bit confusing here.. just trying to get a handle on it myself
if you set those routines as the mmap read and write functions, yes
ohh, special functions. ok. I'll dig into that.
you can set them at mmap(2) time
ok, so how does one do that? (set the mmap read and write functions)
lets take a look at drivers/video/skeletonfb.c
static struct fb_ops xxxfb_ops = {
alright. wish I'd looked at that. heh.
you can see it set .fb_read and .fb_write and .fb_mmap functions ?
yup. I've set those up in my driver. they're stubbed but present.
wait, I forgot something important that is device driver specific
on a frame buffer, you want writes to show up on the screen immediately
you don't want to wait on msync() for your changes to hit the screen
yeah. but this is a network framebuffer, so batching up writes is a plus. though you don't want to go -too- far with that.
we'll just say it's a normal framebuffer as a simplifying assumption.
normal but remote (ie, not in the same memory space)
one thing you could do every once in a while is initiate the msync from kernel space
not the cheapest thing to do, but ...
it'd work in a pinch.
easy to verify the functionality, transparently to userspace
ok so... back on topic. I don't see skeletonfb having an mmap func, just a stub.
sorry. not a stub, just a declaration with no implementation.
indeed, the mmap function is in fbmem.c
the main one, yeah. but that dispatches to others if they are present.
I've looked at the main one, but I don't understand io_remap_pfn_range. I've followed the code, I know that eventually it mucks with page table entries. but beyond that it is opaque to me.
bronaugh: basically it maps physical addresses to page table entries
bronaugh: and may not be what you want when your frame buffer is backed by non-physically contiguous memory
yeah, I was wondering about that.
I'm wondering if you might be better off hacking up ramfs and using a virtual file as your framebuffer
so is there an alternate type of memory mapping I can set up; one such as used with files? because clearly that eventually has to call functions to do the IO; the problem is equivalent, a device with a different kind of address space.
hmm, filemap.c...
anyhow, how would one set up a mapping of that sort?
make it a file inside the page cache
then the VM can handle page faults for you
ok, that's definitely what I want. but how do I go about doing that? is there somewhere I can read?
try fs/ramfs/
alright.
wow. short.
ramfs was written as a demonstration of what the VFS can do (and what filesystems do not have to do themselves)
sounds like a worthy goal.
ok, hmm. generic_file_mmap
you'll be able to chainsaw out lots of code from ramfs, since you won't need mounting, a directory, etc...
yeah. it seems to me that I should be able to just plug in generic_file_mmap as my mmap handler. but - I need to see the code first.
}}}

----
["CategoryLinuxMMInternals"]