linux-kernel - Re: [PATCH] eventfd: support delayed wakeup for non-semaphore eventfd to reduce cpu utilization


Message-ID: <tencent_F479CB8FE699B64A8B7A39E865F23F5D9005@qq.com>
Date:   Wed, 19 Apr 2023 23:23:15 +0800
From:   Wen Yang <wenyang.linux@...mail.com>
To:     Christian Brauner <brauner@...nel.org>,
        Jens Axboe <axboe@...nel.dk>
Cc:     Alexander Viro <viro@...iv.linux.org.uk>,
        Christoph Hellwig <hch@....de>, Dylan Yudaken <dylany@...com>,
        David Woodhouse <dwmw@...zon.co.uk>,
        Paolo Bonzini <pbonzini@...hat.com>, Fu Wei <wefu@...hat.com>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] eventfd: support delayed wakeup for non-semaphore eventfd
 to reduce cpu utilization


On 2023/4/19 17:12, Christian Brauner wrote:
> On Tue, Apr 18, 2023 at 08:15:03PM -0600, Jens Axboe wrote:
>> On 4/17/23 10:32 AM, Wen Yang wrote:
>>> On 2023/4/17 22:38, Jens Axboe wrote:
>>>> On 4/16/23 5:31 AM, wenyang.linux@...mail.com wrote:
>>>>> From: Wen Yang <wenyang.linux@...mail.com>
>>>>>
>>>>> For the NON SEMAPHORE eventfd, if its counter has a nonzero value,
>>>>> then a read(2) returns 8 bytes containing that value, and the counter's
>>>>> value is reset to zero. Therefore, in the NON SEMAPHORE scenario,
>>>>> N event_writes vs ONE event_read is possible.
>>>>>
>>>>> However, the current implementation wakes up the read thread immediately
>>>>> in eventfd_write so that the cpu utilization increases unnecessarily.
>>>>>
>>>>> By adding a configurable delay after eventfd_write, these unnecessary
>>>>> wakeup operations are avoided, thereby reducing cpu utilization.
>>>> What's the real world use case of this, and what would the expected
>>>> delay be there? With using a delayed work item for this, there's
>>>> certainly a pretty wide grey zone in terms of delay where this would
>>>> perform considerably worse than not doing any delayed wakeups at all.
>>>
>>> Thanks for your comments.
>>>
>>> We have found that the CPU usage of the message middleware is high in
>>> our environment, because sensor messages from MCU are very frequent
>>> and constantly reported, possibly several hundred thousand times per
>>> second. As a result, the message receiving thread is frequently
>>> awakened to process short messages.
>>>
>>> The following is the simplified test code:
>>> https://github.com/w-simon/tests/blob/master/src/test.c
>>>
>>> And the test code in this patch is further simplified.
>>>
>>> Finally, only a configuration item has been added here, allowing users
>>> to make more choices.
>> I think you'd have a higher chance of getting this in if the delay
>> setting was per eventfd context, rather than a global thing.

Thank you.
We will follow your suggestion and change the global configuration to a
per-eventfd setting.

> That patch seems really weird. Is that an established paradigm to
> address problems like this through a configured wakeup delay? Because
> naively this looks like a pretty brutal hack.

Thanks.

Your concern may be that a crude delay could cause additional
problems, which is indeed worth considering.

Meanwhile, prolonged and frequent eventfd_write calls are effectively
another form of attack.

If we change it to this:

When consecutive eventfd_write calls exceed a certain threshold within
a short period of time, a delay is added to the wakeup as a penalty.

Do you think this is acceptable?


--

Best wishes,

Wen