Message-ID: <20240905095134.GC15400@noisy.programming.kicks-ass.net>
Date: Thu, 5 Sep 2024 11:51:34 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Jens Axboe <axboe@...nel.dk>
Cc: linux-kernel@...r.kernel.org, tglx@...utronix.de, rafael@...nel.org,
	daniel.lezcano@...aro.org, linux-pm@...r.kernel.org
Subject: Re: [PATCHSET v6 0/4] Split iowait into two states

On Wed, Sep 04, 2024 at 08:41:23AM -0600, Jens Axboe wrote:

> > Yeah, but *WHY* !?!? I have some vague memories from last time around,
> > but patches should really keep this information.
> 
> To decouple the frequency boost on short waits from the accounting side,
> as lots of tooling equates iowait time with busy time and reports it as
> such. Yeah that's garbage and a reporting issue, but decades of
education hasn't really improved on that. We should've dumped iowait
once we moved away from 1-2 processor systems or got preemptible
kernels, but alas we did not and here we are in 2024.

There's 'WAIT' in the name, what broken piece of garbage reports it as
busy time? That has *NEVER* been right. Even on UP systems where IO-wait
is actually a sensible number, it is explicitly the time it *could* have
been busy, if only the IO were faster.

And are we really going to make the whole kernel situation worse just
because there's a bunch of broken userspace?

> >> Patches 1..3 are prep patches, changing the type of
> >> task_struct->nr_iowait and adding helpers to manipulate the iowait counts.
> >>
> >> Patch 4 does the actual splitting.
> >>
> >> This has been sitting for a while, would be nice to get this queued up
> >> for 6.12. Comments welcome!
> > 
> > Ufff, and all this because menu-governor does something insane :-(
> > 
> > Rafael, why can't we simply remove this from menu? All the nr_iowait*()
> > users are basically broken and I would much rather fix broken than
> > work around broken like this.
> > 
> > That is, from where I'm sitting this all makes the io-wait situation far
> > worse instead of better.
> 
> IMHO what we need is a way to propagate expected wait times for a
> sleeper. Right now iowait serves this purpose in a very crude way, in
> that it doesn't really tell you the expected wait, just that it's a
> short one.

Expected wait time is one thing, but you then *still* have no clue what
CPU it will get back on. Very typically it will be another CPU in the
same cache cluster. One that had no consideration of it when it went to
sleep.

A sleeping task is not associated with a CPU. There is a fundamental
mismatch there.

Using io-wait for idle state selection is very tricky because of this.

> If we simply remove iowait frequency boosting, then we'll have big
> regressions particularly for low/sync storage IO.

The frequency boosting thing I don't object to. That happens on wakeup,
after we know that (and where) a task is going to run.
