Message-ID: <20060731103322.GA1898@2ka.mipt.ru>
Date: Mon, 31 Jul 2006 14:33:22 +0400
From: Evgeniy Polyakov <johnpol@....mipt.ru>
To: Ulrich Drepper <drepper@...hat.com>
Cc: Zach Brown <zach.brown@...cle.com>,
David Miller <davem@...emloft.net>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [RFC 1/4] kevent: core files.
On Sat, Jul 29, 2006 at 09:18:47AM -0700, Ulrich Drepper (drepper@...hat.com) wrote:
> Evgeniy Polyakov wrote:
> > Btw, why do we want a mapped ring of ready events?
> > If a user requested some events, he definitely wants to get them back
> > when they are ready, not to check a buffer first and then fetch them, no?
> > Could you please explain more on this issue?
>
> It of course makes no sense to enter the kernel to actually get the
> event. This should be done by storing the event in the ring buffer.
> I.e., there are two ways to get an event:
>
> - with a syscall. This can report as many events at once as the caller
> provides space for. And no event which is reported in the ring buffer
> should be reported this way
>
> - if there is space, report it in the ring buffer. Yes, the buffer
> can be optional, then all events are reported by the system call.
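
If I understand the mapped ring correctly, the shared area would look
roughly like the sketch below (all names here are made up for
illustration, this is not a proposed ABI):

        #define RING_NR_EVENTS  4096    /* assumed ring capacity */

        /* One ready event as the kernel would publish it. */
        struct ring_event {
                unsigned int    type;           /* event source: socket, timer, aio, ... */
                unsigned int    ret_flags;      /* readiness bits filled in by the kernel */
                unsigned long   user_data;      /* opaque cookie from registration time */
        };

        /* mmap()ed ring shared between kernel and userspace. */
        struct event_ring {
                unsigned int    kidx;   /* next slot the kernel will fill */
                unsigned int    uidx;   /* next slot the user will consume */
                struct ring_event event[RING_NR_EVENTS];
        };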
That requires a copy, though, and the copy can outweigh the syscall
overhead it is supposed to save. Do we really want that?
> So the use case would be like this:
>
>
> wait_and_get_event:
>
> is buffer empty ?
>
> yes -> make syscall
>
> no -> get event from buffer
>
>
> To avoid races, the syscall needs to take a parameter indicating the
> last event checked out from the buffer. If in the meantime the kernel
> has put another event in the buffer, the syscall returns immediately.
> Similar to what we do in the futex syscall.
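
To make that race handling concrete, the consumer loop could look like
this (still using the made-up ring above; kevent_wait() is a
hypothetical syscall that takes the last index the caller has seen and,
like futex(), returns immediately if the kernel has already moved kidx
past it):

        /*
         * Check the ring first and enter the kernel only when it is
         * empty.  Passing uidx to the kernel closes the race: if an
         * event was queued between our emptiness check and the
         * syscall, kevent_wait() sees kidx != uidx and returns at
         * once instead of sleeping.  Real code would also need memory
         * barriers between reading the indices and the event payload.
         */
        struct ring_event *get_event(struct event_ring *ring)
        {
                unsigned int uidx;

                while ((uidx = ring->uidx) == ring->kidx)
                        kevent_wait(uidx);      /* hypothetical syscall */

                ring->uidx = (uidx + 1) % RING_NR_EVENTS;
                return &ring->event[uidx];
        }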
And how "misordering" between queue and buffer is going to be managed?
I.e. when buffer is full and events are placed into queue, so syscall
could get them, and then syscall is called to get events from the queue
but not from the buffer - we can endup taking events from buffer while
old are placed in the queue.
And how will waiting be done without syscalls? Will glibc take care of
it?
> The question is how to best represent the ring buffer. Zach and some
> others had some ready responses in Ottawa. The important thing is to
> avoid cache line ping pong when possible.
>
> Is the ring buffer absolutely necessary? Probably not. But it has the
> potential to help quite a bit. Don't look at the problem to solve in
> the context of heavy I/O operations when another syscall here and there
> doesn't matter. With this single event mechanism for every possible
> event the kernel can generate, programming can look quite different.
> E.g., every read() call can implicitly be changed into an async read
> call followed by a user-level reschedule. This rescheduling allows
> another thread of execution to run while the read request is processed.
> I.e., it's basically a setjmp() followed by a goto into the inner loop
> to get the next event. And now suddenly the event notification
> mechanism really should be as fast as possible. If we submit basically
> every request asynchronously and are not creating dedicated threads for
> specific tasks anymore we
>
> a) have a lot more event notifications
>
> b) the probability of an event already being reported when we want to
> receive the next one is higher (i.e., the case where no syscall vs
> syscall makes a difference)
>
> Yes, all this will require changes in the way programs are written, but we
> shouldn't limit the way we can write programs unnecessarily. I think
> that given increasing discrepancies in relative speed/latency of the
> peripherals and the CPU this is one possible solution to keep the CPUs
> busy without resorting to a gazillion separate threads in each program.
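On the cache-line ping-pong: at the very least the kernel-written and
user-written indices should live on separate cache lines, so that the
consumer bumping uidx does not keep stealing the line the kernel writes
kidx into. The ring from the earlier sketch could become (assuming a
64-byte line):

        struct event_ring {
                /* written by the kernel, read by userspace */
                unsigned int kidx __attribute__((aligned(64)));
                /* written by userspace, read by the kernel */
                unsigned int uidx __attribute__((aligned(64)));
                struct ring_event event[RING_NR_EVENTS];
        };
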
Ok, let's do it in the following way:
I will present a new version of kevent with the new syscalls and the
issues mentioned before fixed; while people look at it, we can finish
the mapped buffer design.
Is that ok?
> --
> ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
>
--
Evgeniy Polyakov