Message-Id: <20090727165329.4acfda1c.akpm@linux-foundation.org>
Date: Mon, 27 Jul 2009 16:53:29 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Roland Dreier <rdreier@...co.com>
Cc: linux-kernel@...r.kernel.org, jsquyres@...co.com,
rostedt@...dmis.org
Subject: Re: [PATCH v2] ummunotify: Userspace support for MMU notifications
On Fri, 24 Jul 2009 15:56:17 -0700
Roland Dreier <rdreier@...co.com> wrote:
> As discussed in <http://article.gmane.org/gmane.linux.drivers.openib/61925>
> and follow-up messages, libraries using RDMA would like to track
> precisely when application code changes memory mapping via free(),
> munmap(), etc. Current pure-userspace solutions using malloc hooks
> and other tricks are not robust, and the feeling among experts is that
> the issue is unfixable without kernel help.
>
> We solve this not by implementing the full API proposed in the email
> linked above but rather with a simpler and more generic interface,
> which may be useful in other contexts. Specifically, we implement a
> new character device driver, ummunotify, that creates a /dev/ummunotify
> node. A userspace process can open this node read-only and use the fd
> as follows:
>
> 1. ioctl() to register/unregister an address range to watch in the
> kernel (cf struct ummunotify_register_ioctl in <linux/ummunotify.h>).
>
> 2. read() to retrieve events generated when a mapping in a watched
> address range is invalidated (cf struct ummunotify_event in
> <linux/ummunotify.h>). select()/poll()/epoll() and SIGIO are
> handled for this IO.
>
> 3. mmap() one page at offset 0 to map a kernel page that contains a
> generation counter that is incremented each time an event is
> generated. This allows userspace to have a fast path that checks
> that no events have occurred without a system call.
>
> Thanks to Jason Gunthorpe <jgunthorpe@...idianresearch.com> for
> suggestions on the interface design. Also thanks to Jeff Squyres
> <jsquyres@...co.com> for prototyping support for this in Open MPI, which
> helped find several bugs during development.
>
> ...
>
> +config UMMUNOTIFY
> + tristate "Userspace MMU notifications"
> + select MMU_NOTIFIER
> + help
> + The ummunotify (userspace MMU notification) driver creates a
> + character device that can be used by userspace libraries to
> + get notifications when an application's memory mapping
> + changed. This is used, for example, by RDMA libraries to
> + improve the reliability of memory registration caching, since
> + the kernel's MMU notifications can be used to know precisely
> + when to shoot down a cached registration.
Does `select' dtrt here if UMMUNOTIFY=m? I never trust it...
<searches in vain for ummunotify.txt>
Oh well :(
A little test app would be nice - I assume you have one. We could toss
it in the tree as a how-to-use example, and people could perhaps turn
it into a regression test - perhaps the LTP people would take it.
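
For reference, the fast path from item 3 of the quoted description (check a
mapped generation counter, and only fall back to read() when it has moved)
might look roughly like the sketch below. This is not taken from the patch:
the counter page is simulated by an ordinary atomic variable so the logic
can run without the driver, the 64-bit counter width is an assumption, and
fake_counter_page, struct reg_cache, drain_events() and cache_check() are
all hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for the page mmap()ed from /dev/ummunotify at offset 0.
 * Assumption: the real page would hold a counter the kernel bumps on
 * every event; here an ordinary atomic simulates it. */
static _Atomic uint64_t fake_counter_page;

/* Hypothetical per-library state for a registration cache. */
struct reg_cache {
	_Atomic uint64_t *counter;	/* points at the (simulated) mapped page */
	uint64_t last_seen;		/* generation observed at the last drain */
	int slow_path_calls;		/* times the slow path had to run */
};

/* Hypothetical slow path. Real code would read() the pending events
 * from the fd here and drop any cached registrations they cover. */
static void drain_events(struct reg_cache *c, uint64_t now)
{
	c->slow_path_calls++;
	c->last_seen = now;
}

/* Returns 1 if the counter has not moved since the last drain, so the
 * cache can be used with no system call. Returns 0 after running the
 * slow path because at least one event was generated. */
static int cache_check(struct reg_cache *c)
{
	uint64_t now;

	assert(c && c->counter);
	now = atomic_load_explicit(c->counter, memory_order_acquire);
	if (now == c->last_seen)
		return 1;
	drain_events(c, now);
	return 0;
}
```

A caller would invoke cache_check() before each cache lookup. In the common
case it costs one load and one compare.
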
>
> ...
>
> + if (test_bit(UMMUNOTIFY_FLAG_HINT, &reg->flags)) {
> + clear_bit(UMMUNOTIFY_FLAG_HINT, &reg->flags);
> + } else {
> + set_bit(UMMUNOTIFY_FLAG_HINT, &reg->flags);
It's a shame that change_bit() didn't return the old (or new) value.
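
To illustrate the point: a toggle that also reports the previous state can
be built from an atomic fetch-and-xor. The sketch below is a userspace
analogue using C11 atomics, not kernel code, and toggle_bit_return_old()
is a hypothetical name. The kernel's bitops do also provide
test_and_change_bit(), which toggles a bit and returns its old value.
Whether that fits the elided branches here can't be judged from this
excerpt.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Atomically flip bit nr in *flags and report whether it was set
 * before the flip. Userspace illustration only. */
static bool toggle_bit_return_old(_Atomic unsigned long *flags, unsigned int nr)
{
	unsigned long mask, old;

	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	mask = 1UL << nr;
	old = atomic_fetch_xor(flags, mask);	/* returns the value before the xor */
	return (old & mask) != 0;
}
```

With something like this the caller can branch on the returned old value
rather than testing the bit first and then setting or clearing it.
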
The overall userspace interface seems a bit klunky, but I can't really
suggest anything better. Netlink delivery?