Message-ID: <send-serie.davidel@xmailserver.org.16376.1233372314.0>
Date: Fri, 30 Jan 2009 19:25:14 -0800
From: Davide Libenzi <davidel@...ilserver.org>
To: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Ingo Molnar <mingo@...e.hu>, David Miller <davem@...emloft.net>
Subject: [patch 0/7] epoll keyed wakeups - introduction
The following patch set introduces wakeup hints for some of the most
popular (from epoll POV) devices, so that epoll code can avoid spurious
wakeups on its waiters.
The problem with epoll is that the callback-based wakeups do not, ATM,
carry any information about the events the wakeup is related to.
So the only choice epoll has (not being able to call f_op->poll() from
inside the callback), is to add the file* to a ready-list and resolve
the real events later on, at epoll_wait() (or its own f_op->poll()) time.
This can cause spurious wakeups, since the wake_up() itself might be
for an event the caller is not interested in.
The rate of these spurious wakeups can be pretty high when many
network sockets are being monitored.
By allowing devices to report the events the wakeups refer to (at least
the two major classes - POLLIN/POLLOUT), we are able to spare useless
wakeups by proper handling inside the epoll's poll callback.
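
The filtering described above can be sketched in plain userspace C. This
is only a model under stated assumptions, not the patch itself: struct
item and item_wakeup() are hypothetical names, and a key of 0 is assumed
to mean "the device gave no hint".

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical stand-in for an epoll item: one monitored file. */
struct item {
	unsigned long events;	/* events the waiter registered for */
	int on_ready_list;	/* set once the item is queued as ready */
};

/*
 * Model of the wakeup callback. 'key' is the event hint from the device,
 * or 0 when the device does not provide one (an assumption of this sketch).
 * Returns 1 if the waiter would be woken, 0 if the wakeup is dropped.
 */
int item_wakeup(struct item *it, unsigned long key)
{
	assert(it != NULL);

	/* A hint is present and matches nothing we care about: skip. */
	if (key && !(key & it->events))
		return 0;

	/* No hint, or a matching one: queue the item and wake the waiter. */
	it->on_ready_list = 1;
	return 1;
}
```

With an item registered for POLLIN only, item_wakeup(&it, POLLOUT) drops
the wakeup, while a POLLIN key or a key of 0 still queues the item.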
Epoll will in any case have to call f_op->poll() on the file* later on,
since the change needed to send the full event set via the wakeup is
too invasive for the way our f_op->poll() system works (the full event
set is calculated inside the poll function, there are too many of them
to even start thinking about the change, and poll/select would need to
change too).
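
The "resolve the real events later" step can be modelled the same way.
In this sketch the names file_model, model_poll(), ready_item and
resolve_ready() are all hypothetical; the only idea taken from the text
is that the full event set comes from the file's poll function and is
then masked with what the waiter asked for.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical stand-in for a file whose poll function computes the
 * full current event set, in the way f_op->poll() does. */
struct file_model {
	unsigned int (*poll)(const struct file_model *f);
	int readable;
	int writable;
};

/* Example poll function: derive the event mask from the file state. */
unsigned int model_poll(const struct file_model *f)
{
	unsigned int mask = 0;

	if (f->readable)
		mask |= POLLIN;
	if (f->writable)
		mask |= POLLOUT;
	return mask;
}

/* Hypothetical ready-list entry: a file plus the requested events. */
struct ready_item {
	struct file_model *file;
	unsigned int events;
};

/*
 * Walk the ready list, re-poll each file and report only the events
 * that are both pending and requested. Returns the number reported.
 */
size_t resolve_ready(const struct ready_item *ready, size_t n,
		     unsigned int *revents_out)
{
	size_t reported = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned int revents =
			ready[i].file->poll(ready[i].file) & ready[i].events;

		if (revents)
			revents_out[reported++] = revents;
	}
	return reported;
}
```

An item that reached the ready list because of a write-space wakeup, but
was registered for POLLIN only, yields nothing here. That is the wasted
work the event hints are meant to avoid.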
Epoll is changed so that both devices that send event hints and the
ones that don't are handled correctly. The former will gain some
efficiency, though.
The general rule for devices would be to add an event mask, using the
key-aware wakeup macros, when waking up poll wait queues.
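
The device side of that rule might look like the following userspace
model. Everything here is a hypothetical stand-in (struct waiter,
wake_up_keyed(), the dev_*() helpers); the kernel uses its own wait
queue types and macros. The sketch only shows a device passing an event
mask along with the wakeup, and a device that was not converted passing
none.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical wait-queue entry: a callback plus per-waiter state. */
struct waiter {
	int (*func)(struct waiter *w, unsigned long key);
	unsigned long events;	/* events this waiter cares about */
	unsigned int woken;	/* wakeups actually delivered */
};

/* Hypothetical fixed-size wait queue. */
struct wait_queue {
	struct waiter *entries[8];
	size_t count;
};

/* Key-aware callback: accept matching wakeups, or ones with no hint. */
int keyed_callback(struct waiter *w, unsigned long key)
{
	if (key && !(key & w->events))
		return 0;
	w->woken++;
	return 1;
}

/* Pass the event mask to every waiter; return how many were woken. */
size_t wake_up_keyed(struct wait_queue *q, unsigned long key)
{
	size_t n = 0;

	assert(q != NULL);
	for (size_t i = 0; i < q->count; i++)
		n += (size_t)q->entries[i]->func(q->entries[i], key);
	return n;
}

/* Device side: data arrived, so hint POLLIN. */
size_t dev_data_arrived(struct wait_queue *q)
{
	return wake_up_keyed(q, POLLIN);
}

/* Device side: buffer space freed, so hint POLLOUT. */
size_t dev_space_freed(struct wait_queue *q)
{
	return wake_up_keyed(q, POLLOUT);
}

/* Device that was not converted: wake everyone without a hint. */
size_t dev_legacy_wakeup(struct wait_queue *q)
{
	return wake_up_keyed(q, 0);
}
```

With one reader and one writer on the queue, dev_data_arrived() wakes
only the reader and dev_space_freed() only the writer, while
dev_legacy_wakeup() wakes both, as before.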
Test program available here:
http://www.xmailserver.org/epoll_test.c
Andrew, these are based directly on the bits you already have in -mm.
- Davide