Message-ID: <20240916050811.473556-7-Neeraj.Upadhyay@amd.com>
Date: Mon, 16 Sep 2024 10:38:11 +0530
From: Neeraj Upadhyay <Neeraj.Upadhyay@....com>
To: <linux-kernel@...r.kernel.org>
CC: <john.johansen@...onical.com>, <paul@...l-moore.com>, <jmorris@...ei.org>,
	<serge@...lyn.com>, <linux-security-module@...r.kernel.org>,
	<gautham.shenoy@....com>, <Santosh.Shukla@....com>, <Ananth.Narayan@....com>,
	<Raghavendra.KodsaraThimmappa@....com>, <paulmck@...nel.org>,
	<boqun.feng@...il.com>, <vinicius.gomes@...el.com>, <mjguzik@...il.com>,
	<dennis@...nel.org>, <tj@...nel.org>, <cl@...ux.com>, <linux-mm@...ck.org>,
	<rcu@...r.kernel.org>
Subject: [RFC 6/6] apparmor: Switch labels to percpu ref managed mode

Nginx performance testing with AppArmor enabled (with Nginx running in
the unconfined profile), on kernel versions 6.1 and 6.5, shows a
significant drop in throughput scalability when Nginx workers are scaled
to use a higher number of CPUs across various L3 cache domains.

Below is sample data on the throughput scalability loss, based on
results from an AMD Zen4 system with 96 cores and 2 SMT threads per
core:

Config      Cache Domains     apparmor=off        apparmor=on
                             scaling eff (%)      scaling eff (%)
8C16T          1                  100%             100%
16C32T         2                   95%              94%
24C48T         3                   94%              93%
48C96T         6                   92%              88%
96C192T        12                  85%              68%

There is a significant drop in scaling efficiency for 96 cores/192 SMT
threads.

The perf tool shows most of the contention coming from the following
places:
6.56%     nginx  [kernel.vmlinux]      [k] apparmor_current_getsecid_subj
6.22%     nginx  [kernel.vmlinux]      [k] apparmor_file_open

The majority of the CPU cycles is spent on memory contention caused by
the atomic_fetch_add() and atomic_fetch_sub() operations issued by
kref_get() and kref_put() on AppArmor labels.
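
For reference, below is a minimal sketch of the pattern behind this
contention (illustrative struct and function names, not the actual
AppArmor code): every get/put is an atomic read-modify-write on the
same cacheline, which bounces between CPUs across L3 cache domains.

#include <linux/kref.h>
#include <linux/slab.h>

/* Illustrative stand-in for a refcounted object such as an AppArmor label. */
struct example_label {
	struct kref count;	/* single shared atomic counter */
};

static void example_label_release(struct kref *kref)
{
	kfree(container_of(kref, struct example_label, count));
}

static inline struct example_label *example_label_get(struct example_label *l)
{
	kref_get(&l->count);	/* atomic increment on a shared cacheline */
	return l;
}

static inline void example_label_put(struct example_label *l)
{
	/* atomic decrement-and-test on the same shared cacheline */
	kref_put(&l->count, example_label_release);
}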

Part of the contention was fixed by commit 2516fde1fa00 ("apparmor:
Optimize retrieving current task secid"). With this commit included, the
scaling efficiency improved as shown below:

Config      Cache Domains     apparmor=on        apparmor=on (patched)
                             scaling eff (%)      scaling eff (%)
8C16T          1                  100%             100%
16C32T         2                   97%              93%
24C48T         3                   94%              92%
48C96T         6                   88%              88%
96C192T        12                  65%              79%

However, the impact on scaling efficiency is still significant even
with that commit included. The performance impact is even higher for
>192 CPUs. In addition, the memory contention would increase further
when there is a high frequency of label update operations and labels
are marked stale more frequently.

Use the new percpu ref managed mode for tracking the release of all
AppArmor labels. Using a percpu refcount for AppArmor label refcounting
improves throughput scalability for Nginx (a usage sketch follows the
results table):

Config      Cache Domains     apparmor=on (percpuref)
                              scaling eff (%)
8C16T          1                  100%
16C32T         2                   96%
24C48T         3                   94%
48C96T         6                   93%
96C192T        12                  90%
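
As a rough usage sketch, assuming the managed-mode API introduced
earlier in this series (percpu_ref_switch_to_managed()) and illustrative
names: once a label has been switched to percpu mode,
percpu_ref_get()/percpu_ref_put() become per-CPU increments and
decrements, so there is no shared cacheline left to contend on.

#include <linux/percpu-refcount.h>
#include <linux/slab.h>

struct example_label {
	struct percpu_ref count;	/* per-CPU counters in percpu mode */
};

static void example_label_release(struct percpu_ref *ref)
{
	kfree(container_of(ref, struct example_label, count));
}

static int example_label_init(struct example_label *l)
{
	/*
	 * Starting in atomic mode is an assumption of this sketch;
	 * short-lived labels that are never inserted stay cheap.
	 */
	return percpu_ref_init(&l->count, example_label_release,
			       PERCPU_REF_INIT_ATOMIC, GFP_KERNEL);
}

static void example_label_make_long_lived(struct example_label *l)
{
	/*
	 * Once the label becomes long-lived (e.g. inserted into the
	 * labelset tree, as in __label_insert() below), hand it over to
	 * the managed percpu mode so that gets/puts become per-CPU
	 * operations while the final release is still detected.
	 */
	percpu_ref_switch_to_managed(&l->count);
}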

Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@....com>
---

The apparmor_file_open() refcount contention has been resolved recently
by commit f4fee216df7d ("apparmor: try to avoid refing the label in
apparmor_file_open"). I have posted this series to get feedback on the
approach to improving refcount scalability within the apparmor
subsystem.


 security/apparmor/label.c     | 1 +
 security/apparmor/policy_ns.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/security/apparmor/label.c b/security/apparmor/label.c
index aa9e6eac3ecc..016a45a180b1 100644
--- a/security/apparmor/label.c
+++ b/security/apparmor/label.c
@@ -710,6 +710,7 @@ static struct aa_label *__label_insert(struct aa_labelset *ls,
 	rb_link_node(&label->node, parent, new);
 	rb_insert_color(&label->node, &ls->root);
 	label->flags |= FLAG_IN_TREE;
+	percpu_ref_switch_to_managed(&label->count);
 
 	return aa_get_label(label);
 }
diff --git a/security/apparmor/policy_ns.c b/security/apparmor/policy_ns.c
index 1f02cfe1d974..18eb58b68a60 100644
--- a/security/apparmor/policy_ns.c
+++ b/security/apparmor/policy_ns.c
@@ -124,6 +124,7 @@ static struct aa_ns *alloc_ns(const char *prefix, const char *name)
 		goto fail_unconfined;
 	/* ns and ns->unconfined share ns->unconfined refcount */
 	ns->unconfined->ns = ns;
+	percpu_ref_switch_to_managed(&ns->unconfined->label.count);
 
 	atomic_set(&ns->uniq_null, 0);
 
@@ -377,6 +378,7 @@ int __init aa_alloc_root_ns(void)
 	}
 	kernel_t = &kernel_p->label;
 	root_ns->unconfined->ns = aa_get_ns(root_ns);
+	percpu_ref_switch_to_managed(&root_ns->unconfined->label.count);
 
 	return 0;
 }
-- 
2.34.1

