
Remappable memory

Drivers often implement mmap() to allow userspace to have direct access to memory that was allocated/reserved inside the kernel. For example, you may wish to allow userspace to have direct access to a kernel-allocated buffer that is used for DMA with a PCI device.

[http://lwn.net/Kernel/LDD3/ LDD3] chapter 15 discusses this topic, but one important point was not made clear to me: you cannot remap memory allocated with kmalloc(). mmap() is a page-oriented interface, but kmalloc() does not hand out pages; it hands out objects carved from slab caches. You may be tempted to use virt_to_page() to get a struct page pointer for a kmalloc()ed region and then remap that in your page fault handler, but this is a violation of the abstraction, as [http://marc.info/?l=linux-mm&m=121238525325385&w=2 explained by Johannes Weiner]. At least with the SLAB allocator, virt_to_page() on a kmalloc()ed region returns pages with the PG_slab flag set, which causes "Bad page state" messages in the munmap path (see free_pages_check() in mm/page_alloc.c).

Instead, use an interface such as alloc_pages() to obtain memory which is safe to remap through a page fault handler.

Another option is to use vmalloc() to obtain virtually-contiguous memory regions, which can safely be converted to page structure pointers using vmalloc_to_page().
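
A minimal sketch of the vmalloc() approach follows. It assumes a reasonably recent kernel; the fault handler signature has changed several times since this page was written, so adjust it to your tree. The names demo_buf, demo_buf_size, demo_vm_fault, demo_vm_ops and demo_mmap are hypothetical, and the buffer is assumed to have been allocated elsewhere (for example with vmalloc_user()) in a size that is a multiple of PAGE_SIZE. This is a kernel-side fragment, so it only builds as part of a module.

```c
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>

/* Hypothetical driver state: one vmalloc'ed buffer shared with userspace. */
static void *demo_buf;              /* assumed: allocated with vmalloc_user() */
static unsigned long demo_buf_size; /* assumed: a multiple of PAGE_SIZE */

/* Called on the first access to each page of the mapping. */
static vm_fault_t demo_vm_fault(struct vm_fault *vmf)
{
	unsigned long offset = vmf->pgoff << PAGE_SHIFT;
	struct page *page;

	/* Refuse accesses beyond the end of the buffer. */
	if (offset >= demo_buf_size)
		return VM_FAULT_SIGBUS;

	/* vmalloc memory is backed by ordinary pages; look this one up. */
	page = vmalloc_to_page((char *)demo_buf + offset);
	if (!page)
		return VM_FAULT_SIGBUS;

	/* Take a reference for the mapping; it is dropped again on unmap. */
	get_page(page);
	vmf->page = page;
	return 0;
}

static const struct vm_operations_struct demo_vm_ops = {
	.fault = demo_vm_fault,
};

/* mmap handler: install the vm_ops and let pages be faulted in lazily. */
static int demo_mmap(struct file *file, struct vm_area_struct *vma)
{
	vma->vm_ops = &demo_vm_ops;
	return 0;
}
```

demo_mmap would be set as the .mmap member of the driver's struct file_operations. The same fault handler shape should work for pages obtained from alloc_pages(), with the vmalloc_to_page() lookup replaced by whatever bookkeeping the driver uses to find the right struct page.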

mmap and real files

This is a cut'n'paste of an IRC conversation on the #kernelnewbies channel. One day this should be rewritten into a more easily readable article...

<bronaugh> if you're using mmap on a file descriptor, how are the changes eventually written to disk? what gets called?
<bronaugh> does the normal read/write function eventually get called?
<riel> bronaugh: two times
<riel> bronaugh: changes are written to disk either at/after msync(2) time, or after munmap(2) time
<riel> bronaugh: or, if the system has a memory shortage, by the pageout code
<bronaugh> alright. and that uses the normal read/write calls?
<rene> but I believe he means if the actual sys_read() / sys_write() code is getting called. to that, no, the actual "dirty" pages are written
<riel> bronaugh: no
<riel> bronaugh: data changed through mmap does not go through read/write syscalls
<bronaugh> ok. here's why I'm asking.
<bronaugh> I'm modifying framebuffer code for some nefarious purposes. I don't want a memory-backed framebuffer; I want all calls like that to go over the network.
<bronaugh> now, framebuffers have an fb_read and an fb_write call associated with them. these end up being called in fbmem.c by the main handler for read and write, which is set up in the file_operation struct.
<bronaugh> my question is -- will those routines be called?
<bronaugh> (given that they will be called normally by a read/write system call)
<bronaugh> sorry if I might be a bit confusing here.. just trying to get a handle on it myself
<riel> if you set those routines as the mmap read and write functions, yes
<bronaugh> ohh, special functions. ok.
<bronaugh> I'll dig into that.
<riel> you can set them at mmap(2) time
<bronaugh> ok, so how does one do that?
<bronaugh> (set the mmap read and write functions)
<riel> lets take a look at drivers/video/skeletonfb.c
<riel> static struct fb_ops xxxfb_ops = {
<bronaugh> alright.
<bronaugh> wish I'd looked at that. heh.
<riel> you can see it set .fb_read and .fb_write and .fb_mmap functions ?
<bronaugh> yup.
<bronaugh> I've set those up in my driver.
<bronaugh> they're stubbed but present.
<riel> wait, I forgot something important that is device driver specific
<riel> on a frame buffer, you want writes to show up on the screen immediately
<riel> you don't want to wait on msync() for your changes to hit the screen
<bronaugh> yeah.
<bronaugh> but this is a network framebuffer, so batching up writes is a plus.
<bronaugh> though you don't want to go -too- far with that.
<bronaugh> we'll just say it's a normal framebuffer as a simplifying assumption.
<bronaugh> normal but remote
<bronaugh> (ie, not in the same memory space)
<riel> one thing you could do every once in a while is initiate the msync from kernel space
<riel> not the cheapest thing to do, but ...
<bronaugh> it'd work in a pinch.
<riel> easy to verify the functionality, transparently to userspace
<bronaugh> ok so... back on topic. I don't see skeletonfb having an mmap func, just a stub.
<bronaugh> sorry. not a stub, just a declaration with no implementation.
<riel> indeed, the mmap function is in fbmem.c
<bronaugh> the main one, yeah. but that dispatches to others if they are present.
<bronaugh> I've looked at the main one, but I don't understand io_remap_pfn_range.
<bronaugh> I've followed the code, I know that eventually it mucks with page table entries.
<bronaugh> but beyond that it is opaque to me.
<riel> bronaugh: basically it maps physical addresses to page table entries
<riel> bronaugh: and may not be what you want when your frame buffer is backed by non-physically contiguous memory
<bronaugh> yeah, I was wondering about that.
<riel> I'm wondering if you might be better off hacking up ramfs and using a virtual file as your framebuffer
<bronaugh> so is there an alternate type of memory mapping I can set up; one such as used with files?
<bronaugh> because clearly that eventually has to call functions to do the IO; the problem is equivalent, a device with a different kind of address space.
<bronaugh> hmm, filemap.c...
<bronaugh> anyhow, how would one set up a mapping of that sort?
<riel> make it a file inside the page cache
<riel> then the VM can handle page faults for you
<bronaugh> ok, that's definitely what I want.
<bronaugh> but how do I go about doing that? is there somewhere I can read?
<riel> try fs/ramfs/
<bronaugh> alright.
<bronaugh> wow. short.
<riel> ramfs was written as a demonstration of what the VFS can do (and what filesystems do not have to do themselves)
<bronaugh> sounds like a worthy goal.
<bronaugh> ok, hmm. generic_file_mmap
<riel> you'll be able to chainsaw out lots of code from ramfs, since you won't need mounting, a directory, etc...
<bronaugh> yeah.
<bronaugh> it seems to me that I should be able to just plug in generic_file_mmap as my mmap handler.
<bronaugh> but - I need to see the code first.
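
The first part of the conversation (stores through a mapping reach the file without read()/write() being called, and msync() forces writeback) can be checked from userspace. The following is a small sketch, not taken from the conversation; the function name mmap_write_then_read and the scratch file path passed by the caller are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Store msg into a scratch file through a shared mapping, flush it with
 * msync(), then read it back with pread(). Returns 0 if the bytes read
 * match msg, -1 on any error. The scratch file is removed afterwards.
 */
int mmap_write_then_read(const char *path, const char *msg)
{
	size_t len = strlen(msg);
	char buf[256];
	int ret = -1;
	void *map;
	int fd;

	if (len == 0 || len > sizeof(buf))
		return -1;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* The file must be at least as large as the range we will touch. */
	if (ftruncate(fd, (off_t)len) < 0)
		goto out_close;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* A plain memory store: it dirties a page, no write(2) is involved. */
	memcpy(map, msg, len);

	/* Ask the kernel to write the dirty range back and wait for it. */
	if (msync(map, len, MS_SYNC) < 0)
		goto out_unmap;

	/* pread(2) on the same file sees the data stored via the mapping. */
	if (pread(fd, buf, len, 0) != (ssize_t)len)
		goto out_unmap;

	ret = memcmp(buf, msg, len) == 0 ? 0 : -1;

out_unmap:
	munmap(map, len);
out_close:
	close(fd);
	unlink(path);
	return ret;
}
```

Calling mmap_write_then_read() with a writable scratch path and a short string should return 0 on a regular filesystem. Note that this only exercises an ordinary file; a driver such as the framebuffer discussed above supplies its own mmap handler, so its behaviour depends on what that handler maps.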


["CategoryLinuxMMInternals"]

LinuxMM: DeviceDriverMmap (last edited 2008-06-04 09:29:55 by rp073a)