Message-ID: <tencent_31DEA62F31CFF96D3ED356F1508707594C0A@qq.com>
Date:   Fri, 21 Apr 2023 01:44:35 +0800
From:   Wen Yang <wenyang.linux@...mail.com>
To:     Jens Axboe <axboe@...nel.dk>,
        Christian Brauner <brauner@...nel.org>
Cc:     Alexander Viro <viro@...iv.linux.org.uk>,
        Christoph Hellwig <hch@....de>, Dylan Yudaken <dylany@...com>,
        David Woodhouse <dwmw@...zon.co.uk>,
        Paolo Bonzini <pbonzini@...hat.com>, Fu Wei <wefu@...hat.com>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] eventfd: support delayed wakeup for non-semaphore eventfd
 to reduce cpu utilization


On 2023/4/20 00:42, Jens Axboe wrote:
> On 4/19/23 3:12 AM, Christian Brauner wrote:
>> On Tue, Apr 18, 2023 at 08:15:03PM -0600, Jens Axboe wrote:
>>> On 4/17/23 10:32 AM, Wen Yang wrote:
>>>> On 2023/4/17 22:38, Jens Axboe wrote:
>>>>> On 4/16/23 5:31 AM, wenyang.linux@...mail.com wrote:
>>>>>> From: Wen Yang <wenyang.linux@...mail.com>
>>>>>>
>>>>>> For the NON SEMAPHORE eventfd, if its counter has a nonzero value,
>>>>>> then a read(2) returns 8 bytes containing that value, and the counter's
>>>>>> value is reset to zero. Therefore, in the NON SEMAPHORE scenario,
>>>>>> N eventfd writes can be consumed by ONE eventfd read.
>>>>>>
>>>>>> However, the current implementation wakes up the reading thread
>>>>>> immediately in eventfd_write, so CPU utilization increases unnecessarily.
>>>>>>
>>>>>> By adding a configurable delay after eventfd_write, these unnecessary
>>>>>> wakeup operations are avoided, thereby reducing CPU utilization.
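
For reference, a minimal userspace sketch of the non-semaphore
behaviour described above (error handling omitted, the write count
is illustrative):

#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/eventfd.h>

int main(void)
{
        uint64_t one = 1, sum = 0;
        int i, efd = eventfd(0, 0);     /* no EFD_SEMAPHORE: non-semaphore mode */

        for (i = 0; i < 5; i++)         /* N writes, each adds 1 to the counter */
                write(efd, &one, sizeof(one));

        read(efd, &sum, sizeof(sum));   /* ONE read returns 5; counter resets to 0 */
        printf("read %llu\n", (unsigned long long)sum);
        close(efd);
        return 0;
}
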
>>>>> What's the real world use case of this, and what would the expected
>>>>> delay be there? Using a delayed work item for this, there's
>>>>> certainly a pretty wide grey zone in terms of delay where this would
>>>>> perform considerably worse than not doing any delayed wakeups at all.
>>>>
>>>> Thanks for your comments.
>>>>
>>>> We have found that the CPU usage of the message middleware is high in
>>>> our environment, because sensor messages from the MCU are reported very
>>>> frequently and constantly, possibly several hundred thousand times per
>>>> second. As a result, the message-receiving thread is frequently woken
>>>> up to process short messages.
>>>>
>>>> The following is the simplified test code:
>>>> https://github.com/w-simon/tests/blob/master/src/test.c
>>>>
>>>> And the test code in this patch is further simplified.
>>>>
>>>> Finally, only a configuration item has been added here, giving users
>>>> more choice.
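
A further simplified sketch of that workload (the thread structure is
illustrative; the test.c linked above is the authoritative version).
Each write on the sensor side wakes the reader, even when another
write follows almost immediately:

#include <pthread.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

static int efd;

static void *sensor_writer(void *arg)   /* stands in for the MCU source */
{
        uint64_t one = 1;

        for (;;)
                write(efd, &one, sizeof(one));  /* each write wakes the reader */
        return NULL;
}

static void *msg_reader(void *arg)
{
        uint64_t cnt;

        for (;;)
                read(efd, &cnt, sizeof(cnt));   /* woken per write; cnt is whatever accumulated */
        return NULL;
}

int main(void)
{
        pthread_t w, r;

        efd = eventfd(0, 0);
        pthread_create(&w, NULL, sensor_writer, NULL);
        pthread_create(&r, NULL, msg_reader, NULL);
        pthread_join(w, NULL);          /* never returns; threads run forever */
        return 0;
}
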
>>> I think you'd have a higher chance of getting this in if the delay
>>> setting was per eventfd context, rather than a global thing.
>> That patch seems really weird. Is that an established paradigm to
>> address problems like this through a configured wakeup delay? Because
>> naively this looks like a pretty brutal hack.
> It is odd, and it is a brutal hack. My worries were outlined in an
> earlier reply, there's quite a big gap where no delay would be better
> and the delay approach would be miserable because it'd cause extra
> latency and extra context switches. It'd be much cleaner if you KNEW
> there'd be more events coming, as you could then get rid of that delayed
> work item completely. And I suspect, if this patch makes sense, that
> it'd be better to have a number+time limit as well: if you hit the
> event count limit you'd notify inline, and the delayed work handling
> would have some smarts to just not do anything if nothing is pending.
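
If we understand the suggestion correctly, it could look roughly like
the sketch below (all struct, field, and function names here are
hypothetical and only illustrate the count+time idea; none of this is
actual kernel code):

#include <linux/kernel.h>
#include <linux/poll.h>
#include <linux/wait.h>
#include <linux/workqueue.h>

/* Hypothetical context; the real struct eventfd_ctx in fs/eventfd.c
 * has none of the batching fields below. */
struct eventfd_batch_ctx {
        wait_queue_head_t wqh;
        __u64 count;
        unsigned int pending;           /* writes since the last wakeup */
        unsigned int count_limit;       /* notify inline once this many accumulate */
        unsigned long delay;            /* time limit, in jiffies */
        struct delayed_work wake_work;
};

static void delayed_wake_fn(struct work_struct *work)
{
        struct eventfd_batch_ctx *ctx = container_of(to_delayed_work(work),
                                        struct eventfd_batch_ctx, wake_work);

        spin_lock_irq(&ctx->wqh.lock);
        if (ctx->count) {               /* do nothing if nothing is pending */
                ctx->pending = 0;
                wake_up_locked_poll(&ctx->wqh, EPOLLIN);
        }
        spin_unlock_irq(&ctx->wqh.lock);
}

/* Called from the write path with ctx->wqh.lock held. */
static void batched_notify(struct eventfd_batch_ctx *ctx)
{
        if (++ctx->pending >= ctx->count_limit) {
                ctx->pending = 0;
                wake_up_locked_poll(&ctx->wqh, EPOLLIN);        /* notify inline */
        } else {
                schedule_delayed_work(&ctx->wake_work, ctx->delay);
        }
}

The count limit bounds how many events can batch up before an inline
wakeup, the delayed work bounds the added latency, and the pending
check avoids a spurious wakeup when the reader has already drained
the counter.
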

Thank you very much for your suggestion.

We will improve the implementation according to your suggestions and
send v2 later.


--

Best wishes,

Wen

