Message-ID: <CA+FuTSd-HvtPVwRto0EGExm-Pz7dGpxAt+1sTb51P_QBd-N9KQ@mail.gmail.com>
Date: Wed, 2 Nov 2022 19:09:38 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jens Axboe <axboe@...nel.dk>
Cc: Willem de Bruijn <willemdebruijn.kernel@...il.com>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCHSET v3 0/5] Add support for epoll min_wait
On Wed, Nov 2, 2022 at 1:54 PM Jens Axboe <axboe@...nel.dk> wrote:
>
> On 11/2/22 11:46 AM, Willem de Bruijn wrote:
> > On Sun, Oct 30, 2022 at 6:02 PM Jens Axboe <axboe@...nel.dk> wrote:
> >>
> >> Hi,
> >>
> >> tldr - we saw a 6-7% CPU reduction with this patch. See patch 6 for
> >> full numbers.
> >>
> >> This adds support for EPOLL_CTL_MIN_WAIT, which allows setting a minimum
> >> time that epoll_wait() should wait for events on a given epoll context.
> >> Some justification and numbers are in patch 6, patches 1-5 are really
> >> just prep patches or cleanups.
> >>
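
To make the proposed API concrete, here is a sketch of how I read the
usage from this posting. The calling convention (-1 for the fd, the
time carried in the event data, and the opcode value) is my assumption
from the description, not taken from the actual patches:

    #include <sys/epoll.h>

    #ifndef EPOLL_CTL_MIN_WAIT
    #define EPOLL_CTL_MIN_WAIT 4    /* assumed value; not in released headers */
    #endif

    /* Ask epoll_wait() on this context to gather events for at least
     * min_usec before returning. Done once at setup time.
     */
    static int set_min_wait(int epfd, unsigned long long min_usec)
    {
            struct epoll_event ev = { .data.u64 = min_usec };

            return epoll_ctl(epfd, EPOLL_CTL_MIN_WAIT, -1, &ev);
    }
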
> >> Sending this out to get some input on the API, basically. This is
> >> obviously a per-context type of operation in this patchset, which isn't
> >> necessarily ideal for all use cases. Questions to be debated:
> >>
> >> 1) Would we want this to be available through epoll_wait() directly?
> >> That would allow this to be done on a per-epoll_wait() basis, rather
> >> than be tied to the specific context.
> >>
> >> 2) If the answer to #1 is yes, would we still want EPOLL_CTL_MIN_WAIT?
> >>
> >> I think there are pros and cons to both, and perhaps the answer to both is
> >> "yes". There are some benefits to doing this at epoll setup time, for
> >> example - it nicely isolates it to that part rather than needing to be
> >> done dynamically every time epoll_wait() is called. This also helps the
> >> application code, as it can turn off any busyness tracking based on
> >> whether the setup accepted EPOLL_CTL_MIN_WAIT or not.
> >>
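
In code, that setup-time switch might look like the below, reusing the
assumed set_min_wait() helper sketched earlier in this mail (the 200
usec value is arbitrary):

    /* Probe once at setup: if the kernel rejects EPOLL_CTL_MIN_WAIT
     * (-1/EINVAL on older kernels), keep the userspace batching path;
     * otherwise let the kernel enforce the minimum wait.
     */
    int use_userspace_batching = set_min_wait(epfd, 200) < 0;
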
> >> Anyway, tossing this out there as it yielded quite good results in some
> >> initial testing; we're running more of it. Sending out a v3 now since
> >> someone reported that nonblock issue, which is annoying. Hoping to get
> >> some more discussion this time around, or at least some...
> >
> > My main question is whether the cycle gains justify the code
> > complexity and runtime cost in all other epoll paths.
> >
> > Syscall overhead is quite dependent on architecture and things like KPTI.
>
> Definitely interested in experiences from other folks, but what other
> runtime costs do you see compared to the baseline?

Nothing specific. Possible cost from added branches and moving local
variables into structs with possibly cold cachelines.
> > Indeed, I was also wondering whether an extra timeout arg to
> > epoll_wait would give the same feature with fewer side effects. Then no
> > need for that new ctrl API.
>
> That was my main question in this posting - what's the best api? The
> current one, epoll_wait() addition, or both? The nice thing about the
> current one is that it's easy to integrate into existing use cases, as
> the decision to do batching on the userspace side or by utilizing this
> feature can be kept in the setup path. If you do epoll_wait() and get
> -1/EINVAL or a false success on older kernels, then that's either a
> loss because the application thinks it worked, or the fast path needs
> to check for this specifically every time epoll_wait() is called,
> rather than just at init/setup time.
>
> But this is very much the question I already posed and wanted to
> discuss...

I see the value in being able to detect whether the feature is present.
But a pure epoll_wait implementation seems a lot simpler to me, and
more elegant: timeout is an argument to epoll_wait already.

A new epoll_wait variant would have to be a new system call, so it
would be easy to infer support for the feature.
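
Purely to illustrate the shape such a variant could take, along the
lines of epoll_pwait2() (the name and argument order below are
invented for discussion):

    /* Hypothetical new syscall: epoll_pwait2() already takes a struct
     * timespec timeout, so a variant could add a second timespec for
     * the minimum wait.
     */
    int epoll_pwait3(int epfd, struct epoll_event *events, int maxevents,
                     const struct timespec *timeout,
                     const struct timespec *min_wait,
                     const sigset_t *sigmask);

    /* Old kernels return -1/ENOSYS for a syscall number they do not
     * implement, so runtime feature detection is trivial, unlike
     * overloading the existing epoll_wait() arguments.
     */
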
>
> --
> Jens Axboe