Message-ID: <20160204120412.GA29586@e106622-lin>
Date:	Thu, 4 Feb 2016 12:04:12 +0000
From:	Juri Lelli <juri.lelli@....com>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Clark Williams <williams@...hat.com>,
	John Kacur <jkacur@...hat.com>,
	Daniel Bristot de Oliveira <bristot@...hat.com>,
	Juri Lelli <juri.lelli@...il.com>
Subject: Re: [BUG] Corrupted SCHED_DEADLINE bandwidth with cpusets

On 04/02/16 09:54, Juri Lelli wrote:
> Hi Steve,
> 
> first of all thanks a lot for your detailed report, if only all bug
> reports were like this.. :)
> 
> On 03/02/16 13:55, Steven Rostedt wrote:

[...]

> 
> Right. I think this is the same thing that happens after hotplug. IIRC
> the code paths are actually the same. Hotplug and cpuset
> reconfiguration operations are destructive w.r.t. root_domains, so we
> lose bandwidth information when they happen. The underlying problem is
> that root_domain stores only cumulative bandwidth information, while
> information about which task belongs to which cpuset is stored in the
> cpuset data structures.
> 
> I tried to fix this a while back, but my attempt was broken: I failed
> to get the locking right and, even though it seemed to fix the issue
> for me, it was prone to race conditions. You might still want to have a
> look at that for reference: https://lkml.org/lkml/2015/9/2/162
> 

[...]

> 
> It's good that we can recover, but that's still a bug yes :/.
> 
> I'll try to see if my broken patch makes what you are seeing apparently
> disappear, so that we can at least confirm that we are seeing the same
> problem; you could do the same if you want, I pushed that here
> 

No, it doesn't solve this :/. I placed the restoring code in the hotplug
workfn, so updates generated by toggling sched_load_balance don't get
caught, of course. But this at least tells us that we should solve this
someplace else.

Best,

- Juri
