Message-ID: <010101745b9ada0c-eb39681f-a76f-479a-8bad-0fbbe605aea9-000000@us-west-2.amazonses.com>
Date:   Sat, 5 Sep 2020 00:11:49 +0000
From:   "Isaac J. Manjarres" <isaacm@...eaurora.org>
To:     linux-kernel@...r.kernel.org
Cc:     "Isaac J. Manjarres" <isaacm@...eaurora.org>,
        christian.brauner@...ntu.com, akpm@...ux-foundation.org,
        mingo@...nel.org, peterz@...radead.org, ebiederm@...ssion.com,
        esyr@...hat.com, tglx@...utronix.de, christian@...lner.me,
        areber@...hat.com, shakeelb@...gle.com, cyphar@...har.com,
        psodagud@...eaurora.org, pratikp@...eaurora.org
Subject: [RFC PATCH] fork: Free per-cpu cached vmalloc'ed thread stacks with vfree_atomic()

The per-cpu cached vmalloc'ed stacks are currently freed in the
CPU hotplug teardown path by the free_vm_stack_cache() callback,
which invokes vfree(), which may result in purging the list of
lazily freed vmap areas.
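
For reference, the callback modified below looks roughly as follows in
kernel/fork.c (reconstructed around the diff context; the per-cpu array
declaration and constants are approximated, not verbatim source):

/* Approximate context for the hunk below -- not verbatim kernel source. */
static int free_vm_stack_cache(unsigned int cpu)
{
	struct vm_struct **cached_vm_stacks =
		per_cpu_ptr(cached_stacks, cpu);
	int i;

	for (i = 0; i < NR_CACHED_STACKS; i++) {
		struct vm_struct *vm_stack = cached_vm_stacks[i];

		if (!vm_stack)
			continue;

		/* This vfree() call is where the lazy purge can kick in. */
		vfree(vm_stack->addr);
		cached_vm_stacks[i] = NULL;
	}

	return 0;
}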

Purging all of the lazily freed vmap areas can take a long time
when the list of vmap areas is large. This is problematic because
free_vm_stack_cache() is invoked before the offline CPU's timers are
migrated, so a long purge delays timer migration in the CPU hotplug
teardown path and timer callbacks end up running long after their
timers have expired.
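
The reason a single vfree() call can do this much work is the lazy-purge
scheme in mm/vmalloc.c: freed vmap areas are queued on a purge list, and
once the number of pending pages crosses lazy_max_pages(), whichever task
happens to be calling vfree() drains the whole list itself. A heavily
simplified sketch of that path (reconstructed from memory of kernels of
this era; names and details are approximate):

/* mm/vmalloc.c, simplified sketch -- not verbatim kernel source. */
static void free_vmap_area_noflush(struct vmap_area *va)
{
	unsigned long nr_lazy;

	nr_lazy = atomic_long_add_return((va->va_end - va->va_start) >>
					 PAGE_SHIFT, &vmap_lazy_nr);

	/* Defer the area: queue it on the global purge list. */
	llist_add(&va->purge_list, &vmap_purge_list);

	/*
	 * If too many pages are pending, the current vfree() caller purges
	 * the entire list synchronously -- a TLB flush plus a walk over
	 * every deferred area, which is the __purge_vmap_area_lazy() work
	 * seen in the trace below.
	 */
	if (unlikely(nr_lazy > lazy_max_pages()))
		try_purge_vmap_area_lazy();
}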

For example, on a system that has only one online CPU (CPU 1) that is
running a heavy workload, and another CPU that is being offlined,
the online CPU will invoke free_vm_stack_cache() to free the cached
vmalloc'ed stacks for the CPU being offlined. With 2702 lazily freed
vmap areas totaling 13498 pages, free_vm_stack_cache() takes over
2 seconds to execute:

[001]   399.335808: cpuhp_enter: cpu: 0005 target:   0 step:  67 (free_vm_stack_cache)

/* The first vmap area to be freed */
[001]   399.337157: __purge_vmap_area_lazy: [0:2702] 0xffffffc033da8000 - 0xffffffc033dad000 (5 : 13498)

/* After two seconds */
[001]   401.528010: __purge_vmap_area_lazy: [1563:2702] 0xffffffc02fe10000 - 0xffffffc02fe15000 (5 : 5765)

Instead of freeing the per-cpu cached vmalloc'ed stacks synchronously in
the CPU hotplug teardown path, free them asynchronously so that the
teardown state machine can make progress quickly.
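
vfree_atomic() avoids doing that work in the hotplug path: it only pushes
the address onto a per-cpu llist and schedules a worker, so the actual
vfree() -- and any purging it triggers -- happens later in workqueue
context. Simplified sketch of the deferral in mm/vmalloc.c (reconstructed,
not verbatim source):

/* mm/vmalloc.c, simplified sketch of vfree_atomic() -- not verbatim. */
static void free_work(struct work_struct *w)
{
	struct vfree_deferred *p = container_of(w, struct vfree_deferred, wq);
	struct llist_node *t, *llnode;

	/* The deferred addresses are actually freed here, in the worker. */
	llist_for_each_safe(llnode, t, llist_del_all(&p->list))
		vfree(llnode);
}

void vfree_atomic(const void *addr)
{
	struct vfree_deferred *p = raw_cpu_ptr(&vfree_deferred);

	if (!addr)
		return;

	/* Queue the pointer and kick the worker; no purging in this path. */
	if (llist_add((struct llist_node *)addr, &p->list))
		schedule_work(&p->wq);
}

With this change, any expensive purge work triggered by freeing the cached
stacks runs in a worker rather than stalling the teardown state machine.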

Signed-off-by: Isaac J. Manjarres <isaacm@...eaurora.org>
---
 kernel/fork.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 4d32190..68346a0 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -202,7 +202,7 @@ static int free_vm_stack_cache(unsigned int cpu)
 		if (!vm_stack)
 			continue;
 
-		vfree(vm_stack->addr);
+		vfree_atomic(vm_stack->addr);
 		cached_vm_stacks[i] = NULL;
 	}
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
