lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFyWiRPsfySBnEpkfHiqzBt=uZgeBnJrDA_b=ndW4BEzGw@mail.gmail.com>
Date:	Wed, 1 Apr 2015 08:36:36 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Chris J Arges <chris.j.arges@...onical.com>
Cc:	Rafael David Tinoco <inaddy@...ntu.com>,
	Ingo Molnar <mingo@...nel.org>, Peter Anvin <hpa@...or.com>,
	Jiang Liu <jiang.liu@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Jens Axboe <axboe@...nel.dk>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Gema Gomez <gema.gomez-solano@...onical.com>,
	"the arch/x86 maintainers" <x86@...nel.org>
Subject: Re: smp_call_function_single lockups

On Wed, Apr 1, 2015 at 7:32 AM, Chris J Arges
<chris.j.arges@...onical.com> wrote:
>
> I included the full patch in reply to Ingo's email, and when running with that
> I no longer get the ack_APIC_irq WARNs.

Ok. That means that the printk's themselves just change timing enough,
or change the compiler instruction scheduling so that it hides the
apic problem.

Which very much indicates that these things are interconnected.

For example, Ingo's printk patch does

                        cfg->move_in_progress =
                           cpumask_intersects(cfg->old_domain, cpu_online_mask);
+                       if (cfg->move_in_progress)
+                               pr_info("apic: vector %02x,
same-domain move in progress\n", cfg->vector);
                        cpumask_and(cfg->domain, cfg->domain, tmp_mask);

and that means that now the setting of move_in_progress is serialized
with the cpumask_and() in a way that it wasn't before.

And while the code takes the "vector_lock" and disables interrupts,
the interrupts themselves can happily continue on other cpu's, and
they don't take the vector_lock. Neither does send_cleanup_vector(),
which clears that bit, afaik.

I don't know. The locking there is odd.

> My next homework assignments are:
> - Testing with irqbalance disabled

Definitely.

> - Testing w/ the appropriate dump_stack() in Ingo's patch
> - L0 testing

Thanks,

                         Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ