Message-ID: <alpine.DEB.1.10.0906241154350.30928@makko.or.mcafeemobile.com>
Date: Wed, 24 Jun 2009 15:45:11 -0700 (PDT)
From: Davide Libenzi <davidel@...ilserver.org>
To: Rusty Russell <rusty@...tcorp.com.au>
cc: Gregory Haskins <ghaskins@...ell.com>, mst@...hat.com,
kvm@...r.kernel.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
avi@...hat.com, paulmck@...ux.vnet.ibm.com,
Ingo Molnar <mingo@...e.hu>
Subject: Re: [PATCH 3/3] eventfd: add internal reference counting to fix
notifier race conditions

On Wed, 24 Jun 2009, Rusty Russell wrote:
> On Tue, 23 Jun 2009 03:33:22 am Davide Libenzi wrote:
> > What you're doing there, is setting up a kernel-to-kernel (since
> > userspace only role is to create the eventfd) communication, using a file*
> > as accessory. That IMO is plain wrong.
>
> The most sensible is that userspace can use these fds; an in-kernel variant is
> possible too, but not primary IMHO.
>
> It's nice that userspace create the fds; it can then use the same fd for
> multiple event sources.
>
> But I didn't see anything wrong with the way eventfd used to work: you have a
> kvm ioctl to say "attach this eventfd to this guest notification" and that does
> the eventfd_fget. A detach ioctl does the fput (as does release of the kvm
> fd).
>
> If they close the eventfd and don't do the detach ioctl, it's their problem.

Some components would like to know when userspace has dropped the fd, and
take proper action accordingly (release resources, drop module instances,
etc.). POLLHUP helps with that, but without decoupling the file* memory
from the eventfd_ctx memory, it becomes pretty tricky (if feasible at all)
to handle the event in a race-free way.

Once the file* is decoupled from the eventfd_ctx, it becomes saner to
expose the internal kernel API via the eventfd_ctx.
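
To make the decoupling concrete, here is a minimal userspace model of the idea, not the actual kernel code. All names (model_ctx, model_file, model_ctx_get, model_ctx_put, model_file_release, freed_flag) are hypothetical and exist only for illustration. The file object and any in-kernel user each hold their own reference on the context, so the release of the file can flag the hangup while the context memory stays valid until the last holder drops it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace model only: these names are hypothetical, not the kernel API. */
struct model_ctx {
	int refs;        /* holders: the file plus any in-kernel users */
	bool hung_up;    /* set once the owning file is released (POLLHUP analog) */
	int *freed_flag; /* illustration hook: set to 1 when the memory is freed */
};

struct model_file {
	struct model_ctx *ctx; /* the file owns exactly one reference */
};

/* Take an extra reference on behalf of another holder. */
static struct model_ctx *model_ctx_get(struct model_ctx *ctx)
{
	assert(ctx->refs > 0);
	ctx->refs++;
	return ctx;
}

/* Drop one reference; the last holder frees the context. */
static void model_ctx_put(struct model_ctx *ctx)
{
	assert(ctx->refs > 0);
	if (--ctx->refs == 0) {
		if (ctx->freed_flag)
			*ctx->freed_flag = 1;
		free(ctx);
	}
}

/* Create a file together with its context (one reference, held by the file). */
static struct model_file *model_file_create(int *freed_flag)
{
	struct model_file *f = malloc(sizeof(*f));
	struct model_ctx *ctx = malloc(sizeof(*ctx));

	if (!f || !ctx) {
		free(f);
		free(ctx);
		return NULL;
	}
	ctx->refs = 1;
	ctx->hung_up = false;
	ctx->freed_flag = freed_flag;
	f->ctx = ctx;
	return f;
}

/* Last close of the file: flag the hangup, then drop only the file's reference. */
static void model_file_release(struct model_file *f)
{
	f->ctx->hung_up = true;
	model_ctx_put(f->ctx);
	free(f);
}
```

In this model, a component that called model_ctx_get() before the file went away can still read hung_up safely afterwards and then call model_ctx_put() at its own pace. The sketch is single-threaded; real kernel code would need an atomic reference count (struct kref, for instance) and proper locking around the notification.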

Another thing that comes to mind (which might not matter for some
components) is the effect of userspace doing things like:

	for (;;) {
		fd = eventfd(...);
		ioctl(xfd, XXX_ADD, fd);
		close(fd);
	}

That might lead to unprivileged users consuming kernel memory without any
userspace accountability, if not properly handled.
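
The accounting concern can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the names (model_user, model_attach, model_detach_on_hangup, model_run_loop) and the limit MODEL_MAX_ATTACHED are hypothetical and do not correspond to any kernel interface. It contrasts a component that releases its state when it learns the fd was closed with one that never does and can only fall back on a per-user cap.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_ATTACHED 4 /* hypothetical per-user limit */

struct model_user {
	size_t attached; /* contexts currently pinned on this user's behalf */
};

/* Charge one attachment to the user; returns 0 on success, -1 at the limit. */
static int model_attach(struct model_user *u)
{
	if (u->attached >= MODEL_MAX_ATTACHED)
		return -1;
	u->attached++;
	return 0;
}

/* Called when the component learns that the fd was closed. */
static void model_detach_on_hangup(struct model_user *u)
{
	assert(u->attached > 0);
	u->attached--;
}

/*
 * Model of the create/attach/close loop.
 * Returns how many attach requests succeeded.
 */
static size_t model_run_loop(struct model_user *u, size_t iterations,
			     int release_on_close)
{
	size_t ok = 0;

	for (size_t i = 0; i < iterations; i++) {
		if (model_attach(u) != 0)
			continue; /* limit reached: attach rejected */
		ok++;
		if (release_on_close)
			model_detach_on_hangup(u);
	}
	return ok;
}
```

With release_on_close set, the loop can run indefinitely while the pinned count stays at zero. Without it, the only thing standing between the loop and unbounded growth is the cap, which is the kind of handling the paragraph above calls for.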
- Davide