Date:   Tue, 07 Jul 2020 11:20:49 +0100
From:   Valentin Schneider <valentin.schneider@....com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Dave Jones <davej@...emonkey.org.uk>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Linux Kernel <linux-kernel@...r.kernel.org>, mingo@...nel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        paul.gortmaker@...driver.com, paulmck@...nel.org
Subject: Re: weird loadavg on idle machine post 5.7


On 07/07/20 09:17, Peter Zijlstra wrote:
> On Tue, Jul 07, 2020 at 12:56:04AM +0100, Valentin Schneider wrote:
>
>> > @@ -2605,8 +2596,20 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
>> >        *
>> >        * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
>> >        * __schedule().  See the comment for smp_mb__after_spinlock().
>> > +	 *
>> > +	 * Form a control-dep-acquire with p->on_rq == 0 above, to ensure
>> > +	 * schedule()'s deactivate_task() has 'happened' and p will no longer
>> > +	 * care about its own p->state. See the comment in __schedule().
>> >        */
>> > -	smp_rmb();
>> > +	smp_acquire__after_ctrl_dep();
>>
>> Apologies for asking again, but I'm foolishly hopeful I'll someday be able
>> to grok those things without half a dozen tabs open with documentation and
>> Paul McKenney papers.
>>
>> Do I get it right that the 'acquire' part hints this is equivalent to
>> issuing a load-acquire on whatever was needed to figure out whether or not
>> to take the branch (in this case, p->on_rq, amongst other things); IOW
>> ensures any memory access appearing later in program order has to happen
>> after the load?
>>
>> That at least explains to me the load->{load,store} wording in
>> smp_acquire__after_ctrl_dep().
>
> Yes.
>
> So the thing is that hardware MUST NOT speculate stores, or rather, if
> it does, it must take extreme measures to ensure they do not become
> visible in any way shape or form, since speculative stores lead to
> instant OOTA problems.
>
> Therefore we can say that branches order stores and if the branch
> condition depends on a load, we get a load->store order. IOW the load
> must complete before we can resolve the branch, which in turn enables
> the store to become visible/happen.
>

Right, I think that point is made clear in memory-barriers.txt.
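
To spell that out with something I can actually compile, here's a minimal
userspace C11 sketch of a store sitting behind a branch whose condition
depends on a load (made-up `flag`/`data` variables, not kernel code):

/*
 * Userspace illustration only: the store to `data` is reached through a
 * branch on the value loaded from `flag`, so it cannot become visible
 * before that load resolves, i.e. load->store ordering.
 * Build with: cc -pthread ctrl_dep.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int flag;
static atomic_int data;

static void *producer(void *arg)
{
	(void)arg;
	atomic_store_explicit(&flag, 1, memory_order_relaxed);
	return NULL;
}

static void *consumer(void *arg)
{
	(void)arg;
	/* load -> branch -> store: the control dependency */
	if (atomic_load_explicit(&flag, memory_order_relaxed))
		atomic_store_explicit(&data, 42, memory_order_relaxed);
	return NULL;
}

int main(void)
{
	pthread_t p, c;

	pthread_create(&p, NULL, producer, NULL);
	pthread_create(&c, NULL, consumer, NULL);
	pthread_join(p, NULL);
	pthread_join(c, NULL);
	printf("data = %d\n", atomic_load_explicit(&data, memory_order_relaxed));
	return 0;
}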

> If we then add an smp_rmb() to the branch to order load->load, we end up
> with a load->{load,store} ordering, which is equivalent to a
> load-acquire.
>
> The reason to do it like that is that load-acquire would otherwise
> require an smp_mb(), since for many platforms that's the only barrier
> that has load->store ordering.
>
> The down-side of doing it like this, as Paul will be quick to point out,
> is that the C standard doesn't recognise control dependencies and thus
> the compiler would be within its rights to 'optimize' our conditional away.
>

Yikes!
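
For my own notes, here's a userspace C11 analogue of the two options above
(compile-only sketch; the `on_rq`/`state` names are illustrative, not the
kernel's, and an acquire fence stands in for smp_rmb() since C11 has no
exact read-barrier equivalent):

#include <stdatomic.h>

static atomic_int on_rq = 1;
static atomic_int state;

/* Option 1: a plain load-acquire. On many architectures this needs a
 * heavier barrier than smp_rmb(). */
int wait_acquire(void)
{
	if (atomic_load_explicit(&on_rq, memory_order_acquire) == 0)
		return atomic_load_explicit(&state, memory_order_relaxed);
	return -1;
}

/* Option 2: relaxed load + branch + fence. The branch already orders any
 * store placed inside it (control dependency, load->store); the fence adds
 * the load->load half, so the relaxed load of `state` below cannot be
 * hoisted above the on_rq load. Together: load->{load,store}, i.e.
 * acquire semantics. */
int wait_ctrl_dep(void)
{
	if (atomic_load_explicit(&on_rq, memory_order_relaxed) == 0) {
		atomic_thread_fence(memory_order_acquire);
		return atomic_load_explicit(&state, memory_order_relaxed);
	}
	return -1;
}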

> We're relying on the compilers not having done this in the past and
> there being sufficient compiler people interested in compiling Linux to
> avoid this from happening.
>
>
> Anyway, this patch is basically:
>
>       LOAD p->state		LOAD-ACQUIRE p->on_rq == 0
>       MB
>       STORE p->on_rq, 0	STORE p->state, TASK_WAKING
>
> which ensures the TASK_WAKING store happens after the p->state load.
> Just a wee bit complicated due to not actually adding any barriers while
> adding additional ordering.
>

Your newer changelog also helped in that regard.

Thanks a ton for the writeup!
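
For anyone following along, a userspace C11 analogue of that two-column
diagram (a sketch, not the scheduler code: made-up variables stand in for
p->state and p->on_rq, and the TASK_WAKING value is illustrative; thread A
plays the __schedule() side, thread B the try_to_wake_up() side):

/* Build with: cc -pthread ttwu_order.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define TASK_RUNNING	0
#define TASK_WAKING	0x200	/* illustrative value only */

static atomic_int state = TASK_RUNNING;
static atomic_int on_rq = 1;

static void *schedule_side(void *arg)
{
	int s;

	(void)arg;
	s = atomic_load_explicit(&state, memory_order_relaxed);	/* LOAD p->state */
	atomic_thread_fence(memory_order_seq_cst);			/* MB */
	atomic_store_explicit(&on_rq, 0, memory_order_relaxed);	/* STORE p->on_rq, 0 */
	(void)s;
	return NULL;
}

static void *wakeup_side(void *arg)
{
	(void)arg;
	/* LOAD-ACQUIRE p->on_rq == 0: only proceed once the other side's
	 * deactivation is visible; the acquire guarantees the store below
	 * happens after the other side's LOAD p->state. */
	if (atomic_load_explicit(&on_rq, memory_order_acquire) == 0)
		atomic_store_explicit(&state, TASK_WAKING, memory_order_relaxed);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, schedule_side, NULL);
	pthread_create(&b, NULL, wakeup_side, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	printf("state=%#x on_rq=%d\n",
	       atomic_load_explicit(&state, memory_order_relaxed),
	       atomic_load_explicit(&on_rq, memory_order_relaxed));
	return 0;
}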

> Anyway, let me now endeavour to write a coherent Changelog for this mess
> :-(
