Message-ID: <20240409092104.GA2665@noisy.programming.kicks-ass.net>
Date: Tue, 9 Apr 2024 11:21:04 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Chen Yu <yu.c.chen@...el.com>
Cc: Abel Wu <wuyun.abel@...edance.com>, Ingo Molnar <mingo@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Juri Lelli <juri.lelli@...hat.com>, Tim Chen <tim.c.chen@...el.com>,
	Tiwei Bie <tiwei.btw@...group.com>,
	Honglei Wang <wanghonglei@...ichuxing.com>,
	Aaron Lu <aaron.lu@...el.com>, Chen Yu <yu.chen.surf@...il.com>,
	linux-kernel@...r.kernel.org,
	kernel test robot <oliver.sang@...el.com>
Subject: Re: [RFC PATCH] sched/eevdf: Return leftmost entity in pick_eevdf()
 if no eligible entity is found

On Mon, Apr 08, 2024 at 09:11:39PM +0800, Chen Yu wrote:
> On 2024-04-08 at 13:58:33 +0200, Peter Zijlstra wrote:
> > On Thu, Feb 29, 2024 at 05:00:18PM +0800, Abel Wu wrote:
> > 
> > > > According to the log, vruntime is 18435852013561943404, the
> > > > cfs_rq->min_vruntime is 763383370431, the load is 629 + 2048 = 2677,
> > > > thus:
> > > > s64 delta = (s64)(18435852013561943404 - 763383370431) = -10892823530978643
> > > >      delta * 2677 = 7733399554989275921
> > > > that is to say, the result of the multiplication overflows s64, which
> > > > turns the negative value into a positive one, so the eligibility check fails.
> > > 
> > > Indeed.
> > 
> > From the data presented it looks like min_vruntime is wrong and needs
> > update. If you can readily reproduce this, dump the vruntime of all
> > tasks on the runqueue and see if min_vruntime is indeed correct.
> >
> 
> This was the dump of all the entities on the tree, from left to right,

Oh, my bad, I thought it was the pick path.

> and also from the top down in in-order traversal, when this issue happens:
> 
> [  514.461242][ T8390] cfs_rq avg_vruntime:386638640128 avg_load:2048 cfs_rq->min_vruntime:763383370431
> [  514.535935][ T8390] current on_rq se 0xc5851400, deadline:18435852013562231446
> 			min_vruntime:18437121115753667698 vruntime:18435852013561943404, load:629
> 
> 
> [  514.536772][ T8390] Traverse rb-tree from left to right
> [  514.537138][ T8390]  se 0xec1234e0 deadline:763384870431 min_vruntime:763383370431 vruntime:763383370431 non-eligible  <-- leftmost se
> [  514.537835][ T8390]  se 0xec4fcf20 deadline:763762447228 min_vruntime:763760947228 vruntime:763760947228 non-eligible
> 
> [  514.538539][ T8390] Traverse rb-tree from topdown
> [  514.538877][ T8390]  middle se 0xec1234e0 deadline:763384870431 min_vruntime:763383370431 vruntime:763383370431 non-eligible   <-- root se
> [  514.539605][ T8390]  middle se 0xec4fcf20 deadline:763762447228 min_vruntime:763760947228 vruntime:763760947228 non-eligible
> 
> The tree looks like:
> 
>           se (0xec1234e0)
>                   |
>                   |
>                   ----> se (0xec4fcf20)
> 
> 
> The root se 0xec1234e0 is also the leftmost se; its min_vruntime and
> vruntime are both 763383370431, which matches
> cfs_rq->min_vruntime. It seems that the cfs_rq's min_vruntime gets
> updated correctly, because it is monotonically increasing.

Right.

> My guess is that, for some reason, a newly forked se in a newly
> created task group has sat in the rb-tree without being picked for a
> long time (maybe because it was not eligible). Its vruntime stayed at
> a negative value (near (unsigned long)(-(1LL << 20))) for a long time,
> so it is far behind the cfs_rq->vruntime, and thus the overflow happens.

I'll have to do the math again, but that's something on the order of not
picking a task for about a day, which would be 'bad' :-)

Is there any sane way to reproduce this, and how often does it happen?
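
For reference, the overflow arithmetic quoted at the top of this message
can be checked with a small userspace sketch. This is not the kernel's
code: the helper names (vruntime_delta, wrapping_mul, mul_overflows) and
the macros are made up for illustration, only the three numbers come from
the quoted log, and it assumes GCC or Clang (for __builtin_mul_overflow
and for wrapping unsigned-to-signed conversion).

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the quoted log; the macro names are hypothetical. */
#define SE_VRUNTIME   UINT64_C(18435852013561943404)
#define MIN_VRUNTIME  UINT64_C(763383370431)
#define TOTAL_LOAD    INT64_C(2677)	/* 629 + 2048 */

/* Signed distance of a vruntime from min_vruntime, like the quoted (s64) cast. */
static int64_t vruntime_delta(uint64_t vruntime, uint64_t min_vruntime)
{
	return (int64_t)(vruntime - min_vruntime);
}

/*
 * delta * load truncated to 64 bits. The multiply is done on unsigned
 * values so the wraparound is well defined in C; converting the result
 * back to int64_t relies on the two's-complement behaviour of GCC/Clang.
 */
static int64_t wrapping_mul(int64_t delta, int64_t load)
{
	return (int64_t)((uint64_t)delta * (uint64_t)load);
}

/* True if the exact product delta * load does not fit in an s64. */
static bool mul_overflows(int64_t delta, int64_t load)
{
	int64_t unused;

	return __builtin_mul_overflow(delta, load, &unused);
}
```

With the logged values, vruntime_delta(SE_VRUNTIME, MIN_VRUNTIME) gives
-10892823530978643, and wrapping_mul() of that delta with TOTAL_LOAD gives
7733399554989275921: a negative delta whose weighted product comes out
positive, as described in the quoted report. mul_overflows() reports the
same condition without relying on wraparound.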
