Message-ID: <alpine.DEB.2.20.1711071729270.1716@nanos>
Date:   Fri, 10 Nov 2017 01:43:23 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Prarit Bhargava <prarit@...hat.com>
cc:     linux-kernel@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Piotr Luc <piotr.luc@...el.com>,
        Kan Liang <kan.liang@...el.com>, Borislav Petkov <bp@...e.de>,
        Stephane Eranian <eranian@...gle.com>,
        Arvind Yadav <arvind.yadav.cs@...il.com>,
        Andy Lutomirski <luto@...nel.org>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Tom Lendacky <thomas.lendacky@....com>,
        He Chen <he.chen@...ux.intel.com>,
        Mathias Krause <minipli@...glemail.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>
Subject: Re: [PATCH v5 2/3] x86/topology: Avoid wasting 128k for package id array

On Sun, 5 Nov 2017, Prarit Bhargava wrote:
> [v5]: Change kmalloc to GFP_ATOMIC to fix "sleeping function" warning on
> virtual machines.

What does this have to do with virtual machines? The very same issue exists
on physical hardware, because this is called from the early CPU bringup
code with interrupts and preemption disabled.
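
For context, the "sleeping function" warning comes from the might_sleep()
check which a GFP_KERNEL allocation performs. A hypothetical reconstruction
of the pre-v5 allocation (assuming only the flag changed) shows why it
fires on any machine:

	/*
	 * Early CPU bringup runs with interrupts and preemption disabled,
	 * on bare metal just as in a VM. A GFP_KERNEL allocation may sleep,
	 * so with CONFIG_DEBUG_ATOMIC_SLEEP this yields:
	 *   BUG: sleeping function called from invalid context
	 */
	ltp_pkg_map_new = kmalloc(logical_packages * sizeof(u16), GFP_KERNEL);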

> +	/* Allocate and copy a new array */
> +	ltp_pkg_map_new = kmalloc(logical_packages * sizeof(u16), GFP_ATOMIC);
> +	BUG_ON(!ltp_pkg_map_new);

Having an allocation in that code path is a bad idea. First of all, the
error handling in there is just crap, because the only thing you can do is
panic. Aside from that, atomic allocations should be avoided when we can,
and here we can.

Sorry, I missed that when looking at the patch earlier. Something along
these lines makes it work properly:

struct pkg_map {
	unsigned int	size;	/* Allocated map[] slots */
	unsigned int	used;	/* Slots filled so far */
	unsigned int	map[];	/* Logical to physical package ids */
};

static struct pkg_map *logical_to_physical_pkg_map __read_mostly;

static int resize_pkg_map(void)
{
	struct pkg_map *newmap, *oldmap = logical_to_physical_pkg_map;
	int size;

	/* Still room for another package id? */
	if (oldmap->size > oldmap->used)
		return 0;

	size = sizeof(*oldmap) + sizeof(unsigned int) * oldmap->size;
	/* One extra slot for the new package */
	newmap = kzalloc(size + sizeof(unsigned int), GFP_KERNEL);
	if (!newmap)
		return -ENOMEM;

	memcpy(newmap, oldmap, size);
	newmap->size++;
	logical_to_physical_pkg_map = newmap;
	kfree(oldmap);
	return 0;
}

int __cpu_up(....)
{
	if (resize_pkg_map())
		return -ENOMEM;
	return smp_ops.cpu_up(....);
}

static void update_map(....)
{
	struct pkg_map *map = logical_to_physical_pkg_map;

	if (find_map())
		return;
	map->map[map->used] = physid;
	map->used++;
}

static void smp_init_package_map(void)
{
	struct pkg_map *map;

	map = kzalloc(sizeof(*map) + sizeof(unsigned int), GFP_KERNEL);
	map->size = 1;
	logical_to_physical_pkg_map = map;
}

See? No BUG_ON() in the early secondary CPU boot code. If the memory
allocation fails, the whole thing backs out gracefully.

Locking and barriers are omitted, as you have several choices here:

   1) RCU

      Needs the proper RCU magic for the lookup and the pointer swap.

      That also requires a proper barrier between the assignment of the
      new id and the increment of the used count, plus the corresponding
      one on the read side.

   2) mutex

      Must be held when swapping the pointers and across the lookup

      Same barrier requirement as RCU

   3) raw_spinlock

      Must be held when swapping the pointers and across the lookup

      No barriers are needed as long as you hold the lock across the
      assignment and the increment.

All of that works. There is no way to make sure that a lookup is fully
serialized against a concurrent update. Even if the lookup holds
cpus_read_lock(), the new package might arrive right after the unlock.
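
For illustration, a minimal sketch of the RCU variant 1), assuming the
pkg_map code above. lookup_pkg_id() is a hypothetical helper name, and
resize_pkg_map() would additionally publish the new pointer with
rcu_assign_pointer() and wait for readers with synchronize_rcu() before
the kfree():

static int lookup_pkg_id(unsigned int physid)
{
	struct pkg_map *map;
	unsigned int i, used;
	int ret = -1;

	rcu_read_lock();
	map = rcu_dereference(logical_to_physical_pkg_map);
	/* Pairs with smp_store_release() in update_map() */
	used = smp_load_acquire(&map->used);
	for (i = 0; i < used; i++) {
		if (map->map[i] == physid) {
			ret = i;
			break;
		}
	}
	rcu_read_unlock();
	return ret;
}

static void update_map(unsigned int physid)
{
	struct pkg_map *map = logical_to_physical_pkg_map;

	/* Make the new id visible before the incremented count */
	map->map[map->used] = physid;
	smp_store_release(&map->used, map->used + 1);
}

The acquire/release pair is the barrier pair required for variants 1) and
2); with the raw_spinlock variant 3) it is not needed, because both sides
hold the lock.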

Thanks,

	tglx
