Message-ID: <efd77e7d-aeb0-427b-8829-dd17b1795094@geekplace.eu>
Date: Tue, 14 Jan 2025 17:45:28 +0100
From: Florian Schmaus <flo@...kplace.eu>
To: Kent Overstreet <kent.overstreet@...ux.dev>,
 Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
 linux-bcachefs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] bcachefs: set rebalance thread to SCHED_BATCH and
 nice 19

On 14/01/2025 16.25, Kent Overstreet wrote:
> On Tue, Jan 14, 2025 at 03:32:14PM +0100, Peter Zijlstra wrote:
>> On Tue, Jan 14, 2025 at 01:47:28PM +0100, Florian Schmaus wrote:
>>> While the rebalance thread is usually not compute bound, it does cause
>>> a considerable amount of I/O. Since "reducing" the nice level from 0
>>> to 19 also implicitly reduces the thread's best-effort I/O scheduling
>>> class level from 4 to 7, the rebalance thread's I/O will be deprioritized
>>> relative to normal I/O.
>>>
>>> Furthermore, we set the rebalance thread's scheduling class to BATCH,
>>> which means that it will potentially receive a higher scheduling
>>> latency, making room for threads that need a low
>>> scheduling latency (e.g., interactive ones).
>>
>> sorta.. what worries me most about these patches are the claims without
>> backing numbers.
>>
>> Supposedly there is a problem, and this here fixes it, but it doesn't
>> really get quantified much here.
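
The "4 to 7" figures quoted above can be reproduced with a little
arithmetic. The sketch below assumes that the default best-effort I/O
priority level is derived from the nice value as (nice + 20) / 5 with
integer division, which is consistent with the 0 -> 4 and 19 -> 7 mapping
in the patch message. The function name is hypothetical and only serves
as an illustration.

```python
def nice_to_be_ioprio_level(nice: int) -> int:
    """Map a nice value to a best-effort I/O priority level (0..7).

    Assumption: level = (nice + 20) // 5, so lower levels mean higher
    I/O priority. This mirrors the figures in the patch message; it is
    not taken from the patch itself.
    """
    if not -20 <= nice <= 19:
        raise ValueError("nice must be in the range -20..19")
    return (nice + 20) // 5


if __name__ == "__main__":
    for nice in (-20, 0, 19):
        print(f"nice {nice:>3} -> best-effort level {nice_to_be_ioprio_level(nice)}")
```

Under that assumption, a thread at nice 19 lands on the lowest
best-effort level, which is why its I/O yields to that of nice-0 tasks.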

I am sorry, Peter; I know that changes should be motivated by some data, 
but I unfortunately don't have any in this case.

As you wrote, the difference between BATCH and NORMAL tasks is that the 
former will not immediately kick a running task from the CPU.

With that in mind, it made sense that janitorial tasks running in the 
background and not requiring a low scheduling latency should run under 
BATCH (and not NORMAL). Bcachefs' rebalance thread is a prime example of 
such a task.
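
For readers who want to try the same combination on an ordinary process,
here is a minimal userspace sketch. It is only an analogue: the patch
changes a kernel thread from inside the kernel, whereas this uses the
Linux-only scheduling calls exposed by Python's os module. The helper
name make_background_batch is hypothetical.

```python
import os


def make_background_batch(pid: int = 0, nice_level: int = 19) -> None:
    """Switch a process to SCHED_BATCH and raise its nice value.

    pid 0 means the calling process. Linux only. Moving from the default
    policy to SCHED_BATCH and raising the nice value normally needs no
    special privileges.
    """
    # SCHED_BATCH is a non-real-time policy, so the static priority must be 0.
    os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    # A higher nice value means a lower share of CPU time under contention.
    os.setpriority(os.PRIO_PROCESS, pid, nice_level)


if __name__ == "__main__":
    make_background_batch()
    print("policy is SCHED_BATCH:", os.sched_getscheduler(0) == os.SCHED_BATCH)
    print("nice:", os.getpriority(os.PRIO_PROCESS, 0))
```

Whether this measurably improves latency for other tasks is exactly the
open question raised above; the sketch only shows how the settings are
applied, not that they help.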

Additionally, I believe, but please correct me if I am wrong, that tasks 
using BATCH instead of NORMAL grant the scheduler more flexibility to 
provide scheduling-latency-sensitive tasks with lower latency. But you 
are right, I should have run some experiments to check whether this is 
really the case.


> yeah, it was explained to me and made sense at the time, but things
> somehow keep falling out of my overflowing brain.
> 
> Florian, could you update the patch message with that? Was it intended
> as a partial workaround for the rebalance spinning issue some users have
> been hitting?

I did not run into that issue myself, but the patch would probably help 
mitigate the effects of the periods during which the rebalance task is 
CPU bound.

- Florian

