lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZvQo9gJgTlZ3nB03@freefall.freebsd.org>
Date: Wed, 25 Sep 2024 15:15:02 +0000
From: Suleiman Souhlal <ssouhlal@...ebsd.org>
To: David Woodhouse <dwmw2@...radead.org>
Cc: Suleiman Souhlal <suleiman@...gle.com>, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Paolo Bonzini <pbonzini@...hat.com>, joelaf@...gle.com,
	vineethrp@...gle.com, linux-kernel@...r.kernel.org,
	kvm@...r.kernel.org, Srikar Dronamraju <srikar@...ux.ibm.com>,
	Sean Christopherson <seanjc@...gle.com>
Subject: Re: [PATCH v2] sched: Don't try to catch up excess steal time.

On Wed, Sep 25, 2024 at 03:26:56PM +0100, David Woodhouse wrote:
> On Wed, 2024-09-25 at 13:25 +0000, Suleiman Souhlal wrote:
> > > 
> > > I'm still not utterly convinced this is the right thing to do, though.
> > > It means you will constantly undermeasure the accounting of steal time
> > > as you discard the excess each time.
> > > 
> > > The underlying bug here is that we are sampling the steal time and the
> > > time slice at *different* times. This update_rq_clock_task() function
> > > could be called with a calculated 'delta' argument... and then
> > > experience a large amount of steal time *before* calling
> > > paravirt_steal_clock(), which is how we end up in the situation where
> > > the calculated steal time exceeds the running time of the previous
> > > task.
> > > 
> > > Which task *should* that steal time be accounted to? At the moment it
> > > ends up being accounted to the next task to run — which seems to make
> > > sense to me. In the situation I just described, we can consider the
> > > time stolen in update_rq_clock_task() itself to have been stolen from
> > > the *incoming* task, not the *outgoing* one. But that seems to be what
> > > you're objecting to?
> > 
> > This is a good description of the problem, except that the time stolen
> > in update_rq_clock_task() itself is actually being stolen from the 
> > outgoing task. This is because we are still trying to calculate how long
> > it ran for (update_curr()), and time hasn't started ticking for the
> > incoming task yet. We haven't set the incoming task's exec_start with the
> > new clock_task time yet.
> > 
> > So, in my opinion, it's wrong to give that time to the incoming task.
> 
> That makes sense. That steal time is actually stolen from *neither*
> task, since it's after the 'end' timestamp of the outgoing task, and
> before the 'start' timestamp of the incoming task.
> 
> So where *should* it be accounted?
> 
> Or is it actually correct to drop it completely?
> 
> If you can make a coherent case for the fact that dropping it is really
> the right thing to do (not *just* that it doesn't belong to the
> outgoing task, which is the case you make in your existing commit
> message), then I suppose I'm OK with your patch as-is.
> > 

Yes, that's a good way to put it: The excess steal time isn't actually
being stolen from anyone.
And since it's not being stolen from anyone, isn't the right thing to do
to drop it?

There might still be extra steal time that doesn't exceed the current
'delta' from the race between reading the two values, that would still
be erroneously accounted to the outgoing task, which this patch doesn't
address, but we know that any steal > delta is from this race and should
be dropped.

-- Suleiman


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ