linux-kernel - Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug

Message-ID: <4AB8743F.5080309@cn.fujitsu.com>
Date:	Tue, 22 Sep 2009 14:52:47 +0800
From:	Xiao Guangrong <xiaoguangrong@...fujitsu.com>
To:	Suresh Siddha <suresh.b.siddha@...el.com>
CC:	Peter Zijlstra <peterz@...radead.org>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"mm-commits@...r.kernel.org" <mm-commits@...r.kernel.org>,
	"jens.axboe@...cle.com" <jens.axboe@...cle.com>,
	"mingo@...e.hu" <mingo@...e.hu>,
	"nickpiggin@...oo.com.au" <nickpiggin@...oo.com.au>,
	"rusty@...tcorp.com.au" <rusty@...tcorp.com.au>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch
 added to -mm tree



Suresh Siddha wrote:
> On Sun, 2009-09-20 at 21:04 -0700, Xiao Guangrong wrote:
>> Suresh Siddha wrote:
>>> I am referring to the missing csd_lock_wait() here that you had in the
>>> first version of your patch. Let's say, if cpu X is going offline, we
>>> need to ensure that the smp_call_function() initiated by cpu X (i.e.,
>>> smp_call_function IPI sent to some other cpu's from cpu X) got serviced
>>> before cpu X goes offline. We can't do csd_lock_wait() here, as that
>>> might deadlock (as all the other cpu's are already in stop machine with
>>> interrupts disabled).
>>>
>> It not happen because the preemption is disabled while send IPI request and
>> can't schedule to stop machine path, it also stop cpu down.
> 
> Xiao, I am getting confused. I am referring to case '1' mentioned by you
> here http://marc.info/?l=linux-kernel&m=125265516529139&w=2
> 

Ah, do you mean that we can't do csd_lock_wait() in the CPU_DEAD
notification path of my first patch version? Like below:

+static int
+hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
	...
+
+#ifdef CONFIG_HOTPLUG_CPU
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
+
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		local_irq_save(flags);
+		__generic_smp_call_function_interrupt(cpu, 0);
+		__generic_smp_call_function_single_interrupt(cpu, 0);
+		local_irq_restore(flags);
+
	/* Do you mean we can't do csd_lock_wait() here??? */
+		csd_lock_wait(&cfd->csd);
+		free_cpumask_var(cfd->cpumask);
+		break;
+#endif
+	};
+
+	return NOTIFY_OK;
+}

The CPU_DEAD notification is not sent in the stop machine path; see the
_cpu_down() function in kernel/cpu.c.
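
To illustrate the ordering claimed above, here is a minimal userspace sketch,
not kernel code. It assumes the _cpu_down() flow of that era: CPU_DOWN_PREPARE
is sent first, take_cpu_down() runs under stop machine (where CPU_DYING is
sent), and CPU_DEAD is sent only after stop machine has returned. All names
below (model_cpu_down, event_log, and so on) are hypothetical and exist only
for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel API. */
enum phase { PHASE_NORMAL, PHASE_STOP_MACHINE };

#define MAX_EVENTS 8

struct event {
	const char *name;	/* notification name, e.g. "CPU_DEAD" */
	enum phase phase;	/* phase in which it was delivered */
};

struct event_log {
	struct event ev[MAX_EVENTS];
	int n;
};

static enum phase current_phase = PHASE_NORMAL;

/* Record a notification together with the phase we are currently in. */
static void notify(struct event_log *log, const char *name)
{
	assert(log->n < MAX_EVENTS);
	log->ev[log->n].name = name;
	log->ev[log->n].phase = current_phase;
	log->n++;
}

/* Stands in for take_cpu_down(), which runs inside stop machine. */
static void model_take_cpu_down(struct event_log *log)
{
	notify(log, "CPU_DYING");
}

/* Stands in for _cpu_down(): only the middle step is under stop machine. */
static void model_cpu_down(struct event_log *log)
{
	notify(log, "CPU_DOWN_PREPARE");

	current_phase = PHASE_STOP_MACHINE;
	model_take_cpu_down(log);
	current_phase = PHASE_NORMAL;	/* stop machine has returned */

	notify(log, "CPU_DEAD");
}

/* True if the named event was delivered inside the stop-machine window. */
static bool delivered_in_stop_machine(const struct event_log *log,
				      const char *name)
{
	for (int i = 0; i < log->n; i++)
		if (strcmp(log->ev[i].name, name) == 0)
			return log->ev[i].phase == PHASE_STOP_MACHINE;
	return false;
}
```

In this model, calling model_cpu_down() on an empty log records CPU_DEAD in
PHASE_NORMAL, i.e. at a point where the other CPUs are no longer held with
interrupts disabled, so a wait such as csd_lock_wait() in a CPU_DEAD handler
would not be waiting on CPUs stuck in stop machine.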

Suresh, if I have misunderstood you again, could you elaborate?

My first patch version is not clean and not complete, as you pointed out in
your previous mail:
" I am referring to this latest patch only. We are calling the interrupt
  handler manually and not doing the callbacks in that context. In future,
  we might see other side affects if we miss some of these smp ipi's."
  
How about the second patch?

Thanks,
Xiao