linux-kernel - Re: [PATCH 1/2] sched: Fix balance

Message-ID: <20200911122547.GI1362448@hirez.programming.kicks-ass.net>
Date:   Fri, 11 Sep 2020 14:25:47 +0200
From:   peterz@...radead.org
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     mingo@...nel.org, vincent.guittot@...aro.org, tglx@...utronix.de,
        linux-kernel@...r.kernel.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bristot@...hat.com, swood@...hat.com
Subject: Re: [PATCH 1/2] sched: Fix balance_callback()

On Fri, Sep 11, 2020 at 01:17:02PM +0100, Valentin Schneider wrote:
> On 11/09/20 09:17, Peter Zijlstra wrote:
> > The intent of balance_callback() has always been to delay executing
> > balancing operations until the end of the current rq->lock section.
> > This is because balance operations must often drop rq->lock, and that
> > isn't safe in general.
> >
> > However, as noted by Scott, there were a few holes in that scheme;
> > balance_callback() was called after rq->lock was dropped, which means
> > another CPU can interleave and touch the callback list.
> >
> 
> So that can be say __schedule() tail racing with some setprio; what's the
> worst that can (currently) happen here? Something like say two consecutive
> enqueuing of push_rt_tasks() to the callback list?

Yeah, but that isn't in fact the case I worry most about.

What can happen (and what I've spotted once before) is that someone
attempts to enqueue a balance_callback from a rq->lock region that
doesn't handle the callbacks.

Currently that 'works', that is, it will get run _eventually_. But
ideally we'd want that to not work and issue a WARN. We want the
callbacks to be timely.

So basically all of these machinations are in order to add the WARN :-)
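
To illustrate the idea, here is a toy userspace model, not the kernel code
and not the patch itself. All names (toy_rq, toy_lock, toy_queue_callback,
toy_unlock, the handles_callbacks flag) are made up for this sketch. It
assumes each lock section declares up front whether it will run callbacks
at its end; queueing from a section that will not is warned about and
rejected, instead of leaving the callback around to be run eventually by
somebody else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct toy_rq;

/* Hypothetical callback node; loosely modelled on a singly linked list. */
struct toy_callback {
	struct toy_callback *next;
	void (*func)(struct toy_rq *rq);
};

/* Hypothetical stand-in for a runqueue; 'locked' models rq->lock. */
struct toy_rq {
	bool locked;
	bool handles_callbacks;	/* will this lock section run callbacks? */
	struct toy_callback *head;
	int warnings;		/* number of rejected enqueue attempts */
	int ran;		/* number of callbacks executed */
};

/* Example callback: just count that it ran. */
static void toy_count_run(struct toy_rq *rq)
{
	rq->ran++;
}

static void toy_lock(struct toy_rq *rq, bool handles_callbacks)
{
	assert(!rq->locked);
	rq->locked = true;
	rq->handles_callbacks = handles_callbacks;
}

/* Queue work for the end of the current lock section. */
static void toy_queue_callback(struct toy_rq *rq, struct toy_callback *cb,
			       void (*func)(struct toy_rq *rq))
{
	assert(rq->locked);

	if (!rq->handles_callbacks) {
		/* Assumption of this sketch: warn and refuse to queue. */
		rq->warnings++;
		fprintf(stderr, "WARN: callback queued from a section that does not run callbacks\n");
		return;
	}

	cb->func = func;
	cb->next = rq->head;
	rq->head = cb;
}

/* Run pending callbacks at the end of the section, then release. */
static void toy_unlock(struct toy_rq *rq)
{
	struct toy_callback *head, *next;

	assert(rq->locked);

	/* Detach the list while still locked so nobody can interleave. */
	head = rq->head;
	rq->head = NULL;

	while (head) {
		next = head->next;
		head->next = NULL;
		head->func(rq);
		head = next;
	}

	rq->locked = false;
}
```

With this model, a callback queued inside toy_lock(&rq, true) runs exactly
once when that section ends, while one queued inside toy_lock(&rq, false)
only bumps the warning count and never runs.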