Date:	Tue, 15 Feb 2011 11:39:10 +0300
From:	Cyrill Gorcunov <gorcunov@...il.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Suresh Siddha <suresh.b.siddha@...el.com>,
	Yinghai Lu <yhlu.kernel@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: [RFC 1/2 -tip/master] x86, x2apic: minimize IPI register writes
 using cluster groups

On 2/15/11, Ingo Molnar <mingo@...e.hu> wrote:
>
> * Cyrill Gorcunov <gorcunov@...il.com> wrote:
>
>> On 02/14/2011 02:45 PM, Ingo Molnar wrote:
>> >
>> >* Cyrill Gorcunov<gorcunov@...il.com>  wrote:
>> >
>> >> In the case of x2apic cluster mode we can group IPI register writes
>> >> based on the cluster group instead of individual per-cpu destination
>> >> messages. This reduces the APIC register writes and the number of IPI
>> >> messages (in the best case we can reduce it by a factor of 16).
>> >>
>> >> With this change, a microbenchmark measuring the cost of
>> >> flush_tlb_others(), with the flush TLB IPI being sent from a cpu in
>> >> socket-1 to all the logical cpus in socket-2 (on a Westmere-EX system
>> >> that has 20 logical cpus per socket), is now 3x better than the former
>> >> 'send one-by-one' algorithm.
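
A minimal sketch of the cluster-grouping idea described above, not the
actual patch: cpus_in_cluster and ipi_mask are the per-cpu masks the patch
introduces, while x86_cpu_to_logical_apicid and __x2apic_send_IPI_dest()
are existing x2apic helpers; the loop itself is only an illustration of
how destinations could be folded into one ICR write per cluster.

static DEFINE_PER_CPU(cpumask_var_t, cpus_in_cluster);
static DEFINE_PER_CPU(cpumask_var_t, ipi_mask);

static void x2apic_send_IPI_mask_clustered(const struct cpumask *mask, int vector)
{
	struct cpumask *ipi_mask_ptr;
	unsigned long flags;
	unsigned int cpu, i;

	local_irq_save(flags);

	/* Per-cpu scratch copy of the CPUs still waiting for the IPI. */
	ipi_mask_ptr = per_cpu(ipi_mask, smp_processor_id());
	cpumask_copy(ipi_mask_ptr, mask);

	for_each_cpu(cpu, ipi_mask_ptr) {
		u32 dest = 0;

		/* Fold every requested CPU that shares this CPU's cluster
		 * into a single logical destination ... */
		for_each_cpu_and(i, ipi_mask_ptr, per_cpu(cpus_in_cluster, cpu)) {
			dest |= per_cpu(x86_cpu_to_logical_apicid, i);
			cpumask_clear_cpu(i, ipi_mask_ptr);
		}

		/* ... and cover the whole cluster group with one ICR write. */
		if (dest)
			__x2apic_send_IPI_dest(dest, vector, APIC_DEST_LOGICAL);
	}

	local_irq_restore(flags);
}
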
>> >
>> >Pretty nice!
>> >
>> >I have a few structural and nitpicking comments:
>>
>> Thanks a lot for the review, Ingo! I'll address all the nits this week.
>>
>> ...
>> >
>> >>+void x2apic_init_cpu_notifier(void)
>> >>+{
>> >>+	int cpu = smp_processor_id();
>> >>
>> >>+	zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
>> >>+	zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
>> >>+	BUG_ON(!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu));
>> >
>> > Such a BUG_ON() is not particularly user friendly - and this could
>> > trigger during CPU hotplug events, i.e. while the system is fully
>> > booted up, right?
>> >
>> >Thanks,
>> >
>> >	Ingo
>>
>> Yup, it's not that friendly, but it's called during system bootup;
>> hotplug events are handled by
>>
>> +static int __cpuinit
>> +cluster_setup(struct notifier_block *nfb, unsigned long action, void
>> *hcpu)
>> +{
>> +	unsigned int cpu = (unsigned long)hcpu;
>> +	int err = 0;
>> +
>> +	switch (action) {
>> +	case CPU_UP_PREPARE:
>> +		zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
>> +		zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
>> +		if (!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu)) {
>> +			free_cpumask_var(per_cpu(cpus_in_cluster, cpu));
>> +			free_cpumask_var(per_cpu(ipi_mask, cpu));
>> +			err = -ENOMEM;
>> +		}
>> +		break;
>>
>> so it returns -ENOMEM on failure. And by the way, I just noticed that we
>> forgot to put x2apic_init_cpu_notifier into the __init section.
>>
>> Or am I missing something?
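
A sketch of how the boot-time path and the quoted cluster_setup() hotplug
callback would typically be tied together, with the __init annotation
mentioned above; register_hotcpu_notifier() and __cpuinitdata are the
interfaces kernels of this era used for CPU notifiers, and the exact
wiring here is illustrative rather than the actual patch:

static struct notifier_block __cpuinitdata x2apic_cpu_notifier = {
	.notifier_call = cluster_setup,
};

void __init x2apic_init_cpu_notifier(void)
{
	int cpu = smp_processor_id();

	/* Boot CPU only: CPUs brought up later allocate their masks in the
	 * CPU_UP_PREPARE case of cluster_setup(), which can fail the
	 * hotplug operation with -ENOMEM instead of hitting a BUG_ON(). */
	zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
	zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
	BUG_ON(!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu));

	register_hotcpu_notifier(&x2apic_cpu_notifier);
}
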
>
> Is there no GFP_NOFAIL or GFP_FAIL_ON_PANIC variant that could be used to
> express the 'must not fail' property of the boot-time allocation?
>
> Thanks,
>
> 	Ingo
>

Unless I'm missing something obvious, we can set GFP_NOFAIL, which ends up
in an endless loop if the allocation fails (slab should already be running
at this point of boot). Another option might be to switch to no-apic mode
if there is not enough memory to allocate these masks (though I guess that
if an allocation fails at this point, we are likely to fail in further
kernel allocations anyway).
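
If the 'must not fail' route were taken, the flag is actually spelled
__GFP_NOFAIL in the kernel; a sketch of what the boot-time allocation
would look like with it (x2apic_alloc_boot_masks is a hypothetical name
used only for illustration):

/* Hypothetical helper, for illustration only. */
static void __init x2apic_alloc_boot_masks(void)
{
	int cpu = smp_processor_id();

	/* __GFP_NOFAIL makes the allocator retry indefinitely instead of
	 * returning NULL, so the BUG_ON() check becomes redundant; the
	 * trade-off is a silent endless loop rather than an explicit
	 * panic if early-boot memory really is exhausted. */
	zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu),
			   GFP_KERNEL | __GFP_NOFAIL);
	zalloc_cpumask_var(&per_cpu(ipi_mask, cpu),
			   GFP_KERNEL | __GFP_NOFAIL);
}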
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
