Message-ID: <20150420053954.GA9923@gmail.com>
Date:	Mon, 20 Apr 2015 07:39:55 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Guenter Roeck <linux@...ck-us.net>, Rabin Vincent <rabin@....in>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"Paul E. McKenney" <paulmck@...ibm.com>
Subject: Re: qemu:arm test failure due to commit 8053871d0f7f (smp: Fix
 smp_call_function_single_async() locking)


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Sun, Apr 19, 2015 at 11:01 AM, Ingo Molnar <mingo@...nel.org> wrote:
> >
> > That's all fine and good, but why is an IPI sent to a non-existent 
> > CPU? It's not like we don't know which CPU is up and down.
> 
> I agree that the qemu-arm behavior smells like a bug, plain and 
> simple. Nobody sane should send random IPI's to CPU's that they 
> don't even know are up or not.

Yeah, and a warning would have caught that bug a bit earlier, at the 
cost of restricting the API:
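
Something like this in generic_exec_single(), instead of failing 
silently - a rough sketch, not even compile-tested:

        if ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu)) {
                /* Warn early instead of silently returning -ENXIO: */
                WARN_ONCE(1, "IPI requested for offline CPU %d\n", cpu);
                csd_unlock(csd);
                return -ENXIO;
        }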

> That said, I could imagine that we would have some valid case where 
> we just do a cross-cpu call to (for example) do lock wakeup, and the 
> code would use some optimistic algorithm that prods the CPU after 
> the lock has been released, and there could be some random race 
> where the lock data structure has already been released (ie imagine 
> the kind of optimistic unlocked racy access to "sem->owner" that we 
> discussed as part of the rwsem_spin_on_owner() thread recently).
> 
> So I _could_ imagine that somebody would want to do optimistic "prod 
> other cpu" calls that in all normal cases are for existing cpus, but 
> could be racy in theory.

Yes, and I don't disagree with such optimizations in principle (it 
allows fewer references to be taken in the fast path), but is it 
really safe?

If a CPU is going down and we potentially race against that, and send 
off an IPI, the IPI might be 'in flight' for an indeterminate amount 
of time, especially on wildly non-deterministic hardware like virtual 
platforms.

So if the CPU goes down during that window, it will typically ignore 
all IPIs that arrive afterwards. End result: the sender of the IPI may 
hang indefinitely (with the synchronous API), waiting for the CSD lock 
to be released...
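
The synchronous caller just spins until the target clears 
CSD_FLAG_LOCK - roughly this (simplified):

        static void csd_lock_wait(struct call_single_data *csd)
        {
                /* Cleared by the target CPU once the callback has run: */
                while (smp_load_acquire(&csd->flags) & CSD_FLAG_LOCK)
                        cpu_relax();
        }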

With the non-wait API we could also hang, but in an even more 
interesting way: we wouldn't hang trying to send to the downed CPU, 
we'd hang on the _next_ cross-call, possibly one sent to another CPU, 
because the sender CPU's CSD_FLAG_LOCK is never released by the 
offlined CPU that was supposed to unlock it.
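
That is because the next non-wait caller reuses the same per-CPU csd 
and has to wait for it to become unlocked first - roughly this, from 
memory:

        /* smp_call_function_single(), the !wait case: */
        csd = &csd_stack;
        if (!wait) {
                csd = this_cpu_ptr(&csd_data);
                csd_lock(csd);          /* spins in csd_lock_wait() */
        }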

Also note the existing warning we already have in 
flush_smp_call_function_queue():

        /* There shouldn't be any pending callbacks on an offline CPU. */
        if (unlikely(warn_cpu_offline && !cpu_online(smp_processor_id()) &&
                     !warned && !llist_empty(head))) {
                warned = true;
                WARN(1, "IPI on offline CPU %d\n", smp_processor_id());

Couldn't this trigger in the opportunistic, imprecise IPI case, if 
IRQs are ever enabled when or after the CPU is marked offline?

My suggested warning would simply catch this kind of unsafe-looking 
race a bit sooner, instead of silently ignoring the cross-call 
attempt.

Now your suggestion could be made to work by:

  - polling for CPU status in the CSD-lock code as well. (We don't do
    this currently; see the sketch after this list.)

  - making sure that if a late IPI arrives to an already-down CPU it
    does not attempt to execute CSD functions. (I.e. silence the
    above, already existing warning.)

  - auditing the CPU-down path to make sure it does not get surprised
    by late external IPIs. (I think this should already be so, given
    the existing warning.)

  - IPIs can be pending indefinitely, so make sure a pending IPI won't 
    confuse the machinery after a CPU has been onlined again.

  - pending IPIs may consume hardware resources when not received 
    properly. For example, I think x86 will keep trying to resend it. 
    I'm not sure there's any timeout mechanism at the hardware level - 
    the sending APIC might even get stuck? Audit all hardware for this 
    kind of possibility.
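
For the first item, the CSD-lock wait loop would have to do something 
like this - a hand-waving, untested sketch (csd_lock_wait_cpu() is a 
made-up name, and the stuck csd would still have to be unlocked and 
the error propagated somehow):

        static void csd_lock_wait_cpu(struct call_single_data *csd, int cpu)
        {
                while (smp_load_acquire(&csd->flags) & CSD_FLAG_LOCK) {
                        if (!cpu_online(cpu)) {
                                /* The target went away - don't spin forever: */
                                WARN_ONCE(1, "CSD lock held by offlined CPU %d\n", cpu);
                                break;
                        }
                        cpu_relax();
                }
        }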

So unless I'm missing something, to me it looks like the current code 
is only safe to be used on offline CPUs if the 'offline' CPU is never 
ever brought online in the first place.

> It doesn't sound like the qemu-arm case is that kind of situation, 
> though. That one just sounds like a stupid "let's send an ipi to a 
> cpu whether it exists or not".
> 
> But I committed it without any new warning, because I could in theory 
> see it as being a valid use.

So my point is that we already have a 'late' warning, and I think it 
actively hid the qemu-arm bug, because the 'offline' CPU was so 
offline (due to being non-existent) that it never had a chance to 
print a warning.

With an 'early' warning we could flush out such bugs (or code 
uncleanliness: clearly the ARM code was at least functional before) a 
bit sooner.

But I don't have strong feelings about it: SMP cross-call users are a 
lot less frequent than, say, locking facility users, so the strength 
of the debugging isn't nearly as important, and it's fine with me 
either way!

Thanks,

	Ingo
