lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 6 Sep 2018 09:56:32 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     pheragu@...eaurora.org
cc:     linux-kernel@...r.kernel.org, ckadabi@...eaurora.org,
        tsoni@...eaurora.org, bryanh@...eaurora.org,
        Prasad Sodagudi <psodagud@...eaurora.org>
Subject: Re: [PATCH] genirq: Avoid race between cpu hot plug and irq_desc()
 allocation paths

On Wed, 5 Sep 2018, pheragu@...eaurora.org wrote:

> On 2018-09-05 11:23, Thomas Gleixner wrote:
> > On Wed, 5 Sep 2018, Prakruthi Deepak Heragu wrote:
> > 
> > > One of the cores might have just allocated irq_desc() and other core
> > > might be doing irq migration in the hot plug path. In the hot plug path
> > > during the IRQ migration, for_each_active_irq macro is trying to get
> > > irqs whose bits are set in allocated_irqs bit map but there is no return
> > > value check after irq_to_desc for desc validity.
> > 
> > Confused. All parts involved, irq allocation/deallocation and the CPU
> > hotplug code take sparse_irq_lock to prevent exavtly that.
> > 
> Removing the NULL pointer check and adding this sparse_irq_lock
> that you suggested will solve this issue. The code looks like
> this now. Is this okay?

No, it's not okay at all. You _CANNOT_ take sparse_irq_lock() there.

I wrote that _ALL_ parts take that lock already. Care to look at the code?

takedown_cpu()
{
 	/*
	 * Prevent irq alloc/free while the dying cpu reorganizes the
         * interrupt affinities.
         */
        irq_lock_sparse();

        /*
         * So now all preempt/rcu users must observe !cpu_active().
         */
	err = stop_machine_cpuslocked(take_cpu_down, NULL, cpumask_of(cpu));

and __cpu_disable() which in turn calls irq_migrate_all_off_this_cpu() is
called from take_cpu_down(). So this deadlocks in any case.

I have no idea what kind of modifications you have in your kernel which
make

   1) The thing explode in the first place

   2) Not immediately deadlock with that hack you just sent

and honestly I don't want to know at all.

Thanks,

	tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ