Message-ID: <4281b354-d67d-2883-d966-a7816ed4f811@kernel.dk>
Date: Mon, 7 Nov 2022 14:38:52 -0700
From: Jens Axboe <axboe@...nel.dk>
To: Stefan Hajnoczi <stefanha@...hat.com>
Cc: linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCHSET v3 0/5] Add support for epoll min_wait
On 11/7/22 1:56 PM, Stefan Hajnoczi wrote:
> Hi Jens,
> NICs and storage controllers have interrupt mitigation/coalescing
> mechanisms that are similar.
Yep
> NVMe has an Aggregation Time (timeout) and an Aggregation Threshold
> (counter) value. When a completion occurs, the device waits until the
> timeout or until the completion counter value is reached.
>
> If I've read the code correctly, min_wait is computed at the beginning
> of epoll_wait(2). NVMe's Aggregation Time is computed from the first
> completion.
>
> It makes me wonder which approach is more useful for applications. With
> the Aggregation Time approach applications can control how much extra
> latency is added. What do you think about that approach?
We only tested the current approach, where the time is noted at entry,
not when the first event arrives. I suspect the NVMe approach is better
suited to the hw side. Timing from entry helps ensure that we batch
within xx usec, rather than xx usec + whatever the delay is until the
first event arrives, which is why it's handled that way currently. That
gives you a fixed batch latency.
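
To make the difference concrete, here is a minimal Python sketch (a toy
model, not the kernel code) comparing a window timed from entry with one
timed from the first event. The function names, the arrival times, and
the behaviour when nothing arrives inside the window are assumptions
made for illustration; maxevents and the overall epoll timeout are
ignored.

```python
from typing import List, Tuple


def wait_from_entry(arrivals_us: List[int], min_wait_us: int) -> Tuple[int, int]:
    """Window starts at entry (t=0). Returns (return_time_us, events)."""
    if not arrivals_us:
        raise ValueError("need at least one arrival")
    arrivals = sorted(arrivals_us)
    batch = [t for t in arrivals if t <= min_wait_us]
    if batch:
        # Return time is fixed by the window, independent of the arrivals.
        return min_wait_us, len(batch)
    # Assumption: nothing arrived in the window, so return on the
    # first later event (plus any others with the same timestamp).
    first = arrivals[0]
    return first, sum(1 for t in arrivals if t == first)


def wait_from_first_event(arrivals_us: List[int], window_us: int) -> Tuple[int, int]:
    """Window starts at the first event, NVMe Aggregation Time style."""
    if not arrivals_us:
        raise ValueError("need at least one arrival")
    arrivals = sorted(arrivals_us)
    # Return time depends on when the first event shows up.
    deadline = arrivals[0] + window_us
    batch = [t for t in arrivals if t <= deadline]
    return deadline, len(batch)


# Hypothetical arrival times in usec, relative to entry.
arrivals = [30, 60, 120, 190]
print(wait_from_entry(arrivals, 100))        # (100, 2)
print(wait_from_first_event(arrivals, 100))  # (130, 3)
```

In this model the entry-timed wait returns at 100 usec regardless of
when the first event shows up, while the first-event-timed wait returns
at 30 + 100 = 130 usec, i.e. the window plus the delay until the first
arrival.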
--
Jens Axboe