linux-kernel - Re: [PATCH -v2 1/5] sched: Fix ttwu() race

Message-ID: <20200721113719.GI119549@hirez.programming.kicks-ass.net>
Date:   Tue, 21 Jul 2020 13:37:19 +0200
From:   peterz@...radead.org
To:     Chris Wilson <chris@...is-wilson.co.uk>
Cc:     mingo@...nel.org, tglx@...utronix.de, linux-kernel@...r.kernel.org,
        juri.lelli@...hat.com, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de, paulmck@...nel.org, frederic@...nel.org,
        torvalds@...ux-foundation.org, hch@....de
Subject: Re: [PATCH -v2 1/5] sched: Fix ttwu() race

On Tue, Jul 21, 2020 at 11:49:05AM +0100, Chris Wilson wrote:
> Quoting Peter Zijlstra (2020-06-22 11:01:23)
> > @@ -2378,6 +2385,9 @@ static inline bool ttwu_queue_cond(int c
> >  static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
> >  {
> >         if (sched_feat(TTWU_QUEUE) && ttwu_queue_cond(cpu, wake_flags)) {
> > +               if (WARN_ON_ONCE(cpu == smp_processor_id()))
> > +                       return false;
> > +
> >                 sched_clock_cpu(cpu); /* Sync clocks across CPUs */
> >                 __ttwu_queue_wakelist(p, cpu, wake_flags);
> >                 return true;
> 
> We've been hitting this warning frequently, but have never seen the
> rcu-torture-esque oops ourselves.

How easy is it to hit this? What, if anything, can I do to make my own
computer go bang?

> <4> [181.766705] RIP: 0010:ttwu_queue_wakelist+0xbc/0xd0
> <4> [181.766710] Code: 00 00 00 5b 5d 41 5c 41 5d c3 31 c0 5b 5d 41 5c 41 5d c3 31 c0 f6 c3 08 74 f2 48 c7 c2 00 ad 03 00 83 7c 11 40 01 77 e4 eb 80 <0f> 0b 31 c0 eb dc 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 bf 17
> <4> [181.766726] RSP: 0018:ffffc90000003e08 EFLAGS: 00010046
> <4> [181.766733] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: ffff888276a00000
> <4> [181.766740] RDX: 000000000003ad00 RSI: ffffffff8232045b RDI: ffffffff8233103e
> <4> [181.766747] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
> <4> [181.766754] R10: 00000000d3fa25c3 R11: 0000000053712267 R12: ffff88825b912940
> <4> [181.766761] R13: 0000000000000000 R14: 0000000000000087 R15: 000000000003ad00
> <4> [181.766769] FS:  0000000000000000(0000) GS:ffff888276a00000(0000) knlGS:0000000000000000
> <4> [181.766777] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> <4> [181.766783] CR2: 000055b8245814e0 CR3: 0000000005610003 CR4: 00000000003606f0
> <4> [181.766790] Call Trace:
> <4> [181.766794]  <IRQ>
> <4> [181.766798]  try_to_wake_up+0x21b/0x690
> <4> [181.766805]  autoremove_wake_function+0xc/0x50
> <4> [181.766858]  __i915_sw_fence_complete+0x1ee/0x250 [i915]
> <4> [181.766912]  dma_i915_sw_fence_wake+0x2d/0x40 [i915]

Please don't trim oopses.

> We are seeing this on the ttwu_queue() path, so with p->on_cpu=0, and the
> warning is cleared up by
> 
> -               if (WARN_ON_ONCE(cpu == smp_processor_id()))
> +               if (WARN_ON_ONCE(p->on_cpu && cpu == smp_processor_id()))
> 
> which would appear to restore the old behaviour for ttwu_queue() and
> seem to be consistent with the intent of this patch. Hopefully this
> helps identify the problem correctly.

Hurmph, that's actively wrong. We should never queue to self, as that
would result in self-IPI, which is not possible on a bunch of archs. It
works for you because x86 can in fact do that.

So ttwu_queue_cond() will only return true when:

 - target-cpu and current-cpu do not share cache;
   so it cannot be this condition, because you _always_
   share cache with yourself.

 - when WF_ON_CPU and target-cpu has nr_running <= 1;
   which means p->on_cpu == true.

So now you have cpu == smp_processor_id() && p->on_cpu == 1; however,
your modified WARN only goes quiet when p->on_cpu == 0, which
contradicts that.

*puzzle*
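
For reference, the two conditions listed above can be sketched as a small
userspace model. This is not the kernel code: struct model_cpu, its llc_id
field, the model_* helpers and the WF_ON_CPU value used here are assumptions
made for illustration only. It merely shows why a self-wakeup should pass
the check only when WF_ON_CPU is set.

```c
#include <stdbool.h>

/* Flag value assumed for this model only. */
#define WF_ON_CPU 0x08

/* Hypothetical per-CPU state: a cache-domain id and a runqueue length. */
struct model_cpu {
	int llc_id;
	unsigned int nr_running;
};

/* Two CPUs share cache when they sit in the same (modelled) LLC domain. */
static inline bool model_cpus_share_cache(const struct model_cpu *cpus,
					  int a, int b)
{
	return cpus[a].llc_id == cpus[b].llc_id;
}

/* Model of the two cases in which the wakelist would be used. */
static inline bool model_ttwu_queue_cond(const struct model_cpu *cpus,
					 int this_cpu, int cpu, int wake_flags)
{
	/* Case 1: no shared cache. Never true when cpu == this_cpu. */
	if (!model_cpus_share_cache(cpus, this_cpu, cpu))
		return true;

	/* Case 2: the task is still on a CPU and the target is nearly idle. */
	if ((wake_flags & WF_ON_CPU) && cpus[cpu].nr_running <= 1)
		return true;

	return false;
}
```

In this model a call with cpu == this_cpu and wake_flags == 0 always returns
false, so a self-queue could only come from the WF_ON_CPU case.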