Message-ID: <20170515083629.kpowe7tcbnfvg6wk@e106622-lin>
Date:   Mon, 15 May 2017 09:36:29 +0100
From:   Juri Lelli <juri.lelli@....com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Byungchul Park <byungchul.park@....com>, peterz@...radead.org,
        mingo@...nel.org, linux-kernel@...r.kernel.org,
        juri.lelli@...il.com, bristot@...hat.com, kernel-team@....com
Subject: Re: [PATCH v4 1/5] sched/deadline: Refer to cpudl.elements atomically

Hi,

On 12/05/17 10:25, Steven Rostedt wrote:
> On Fri, 12 May 2017 14:48:45 +0900
> Byungchul Park <byungchul.park@....com> wrote:
> 
> > cpudl.elements is an instance that should be protected with a spin lock.
> > Without it, the code would be insane.
> 
> And how much contention will this add? Spin locks in the scheduler code
> that are shared among a domain can cause huge latency. This was why I
> worked hard not to add any in the cpupri code.
> 
> 
> > 
> > Current cpudl_find() has problems like,
> > 
> >    1. cpudl.elements[0].cpu might not match with cpudl.elements[0].dl.
> >    2. cpudl.elements[0].dl(u64) might not be referred atomically.
> >    3. Two cpudl_maximum()s might return different values.
> >    4. It's just insane.
> 
> And lockless algorithms usually are insane. But locks come with a huge
> cost. What happens when we have 32 core domains. This can cause
> tremendous contention and makes the entire cpu priority for deadlines
> useless. Might as well rip out the code.
> 

Right. So, the rationale for not taking any lock in the find() path (at
the risk of reading bogus values) is that we don't want to pay too much
in terms of contention, especially considering that find_lock_later_rq()
might then release the rq lock, possibly making the search useless
anyway (if things change in the meantime). The update path, instead, is
guarded by a lock, to ensure consistency.

Experiments on reasonably big machines (48 cores, IIRC) showed that the
approach was "good enough", so we looked elsewhere for improvements (as
there are many places to improve :). That of course doesn't prevent us
from looking at this again now and seeing if we need to do something
about it.

Having numbers on both the overhead introduced by locking and the wrong
decisions caused by the lockless find() path would help a lot in
understanding what can (and should) be done.

Thanks!

- Juri
