[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOQ4uxiRgXy-0WkTBbt6qNJ0+wbE=xBQLyOYnD7nPwQP1weV9g@mail.gmail.com>
Date: Thu, 7 Nov 2024 11:19:13 +0100
From: Amir Goldstein <amir73il@...il.com>
To: Vinicius Costa Gomes <vinicius.gomes@...el.com>, brauner@...nel.org
Cc: hu1.chen@...el.com, miklos@...redi.hu, malini.bhandaru@...el.com,
tim.c.chen@...el.com, mikko.ylinen@...el.com, linux-unionfs@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/4] overlayfs: Optimize override/revert creds
On Thu, Nov 7, 2024 at 1:57 AM Vinicius Costa Gomes
<vinicius.gomes@...el.com> wrote:
>
> Hi,
>
> Changes from v3:
> - Another reorganization of the series: separate the pure mechanical
> changes into their own (Amir Goldstein)
>
> The series now reads:
>
> Patch 1: Introduce the _light() version of the override/revert cred operations;
> Patch 2: Convert backing-file.c to use those;
> Patch 3: Mechanical change to introduce the ovl_revert_creds() helper;
> Patch 4: Make the ovl_{override,convert}_creds() use the _light()
> creds helpers, and fix the reference counting issue that would happen;
>
For the record, this series depends on backing_file API cleanup patch by Miklos:
https://lore.kernel.org/linux-fsdevel/20241021103340.260731-1-mszeredi@redhat.com/
> Changes from v2:
> - Removed the "convert to guard()/scoped_guard()" patches (Miklos Szeredi);
> - In the overlayfs code, convert all users of override_creds()/revert_creds() to the _light() versions by:
> 1. making ovl_override_creds() use override_creds_light();
> 2. introduce ovl_revert_creds() which calls revert_creds_light();
> 3. convert revert_creds() to ovl_revert_creds()
> (Amir Goldstein);
> - Fix an potential reference counting issue, as the lifetime
> expectations of the mounter credentials are different (Christian
> Brauner);
>
I pushed these patches to:
https://github.com/amir73il/linux/commits/ovl_creds
rebased overlayfs-next on top of them and tested.
Christian,
Since this work is mostly based on your suggestions,
I thought that you might want to author and handle this PR?
Would you like to take the patches from ovl_creds (including the backing_file
API cleanup) to a stable branch in your tree for me to base overlayfs-next on?
Or would you rather I include them in the overlayfs PR for v6.13 myself?
Thanks,
Amir.
P.S. some of the info below is relevant for the PR message and
some of it is completely stale...
> The series is now much simpler:
>
> Patch 1: Introduce the _light() version of the override/revert cred operations;
> Patch 2: Convert backing-file.c to use those;
> Patch 3: Do the conversion to use the _light() version internally;
> Patch 4: Fix a potential refcounting issue
>
> Changes from v1:
> - Re-organized the series to be easier to follow, more details below
> (Miklos Szeredi and Amir Goldstein);
>
> The series now reads as follows:
>
> Patch 1: Introduce the _light() version of the override/revert cred operations;
> Patch 2: Convert backing-file.c to use those;
> Patch 3: Introduce the overlayfs specific _light() helper;
> Patch 4: Document the cases that the helper cannot be used (critical
> section may change the cred->usage counter);
> Patch 5: Convert the "rest" of overlayfs to the _light() helpers (mostly mechanical);
> Patch 6: Introduce the GUARD() helpers;
> Patch 7: Convert backing-file.c to the GUARD() helpers;
> Patch 8-15: Convert each overlayfs/ file to use the GUARD() helpers,
> also explain the cases in which the scoped_guard() helper is
> used. Note that a 'goto' jump that crosses the guard() should
> fail to compile, gcc has a bug that fails to detect the
> error[1].
> Patch 16: Remove the helper introduced in Patch 3 to close the series,
> as it is no longer used, everything was converted to use the
> safer/shorter GUARD() helpers.
this info is stale and confusing in the context of this series.
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91951
>
> This bug was also noticed here:
>
> https://lore.kernel.org/all/20240730050927.GC5334@ZenIV/
>
> Link to v1:
>
> https://lore.kernel.org/r/20240403021808.309900-1-vinicius.gomes@intel.com/
>
> Changes from RFC v3:
> - Removed the warning "fixes" patches, as they could hide potencial
> bugs (Christian Brauner);
> - Added "cred-specific" macros (Christian Brauner), from my side,
> added a few '_' to the guards to signify that the newly introduced
> helper macros are preferred.
> - Changed a few guard() to scoped_guard() to fix the clang (17.0.6)
> compilation error about 'goto' bypassing variable initialization;
>
> Link to RFC v3:
>
> https://lore.kernel.org/r/20240216051640.197378-1-vinicius.gomes@intel.com/
>
> Changes from RFC v2:
> - Added separate patches for the warnings for the discarded const
> when using the cleanup macros: one for DEFINE_GUARD() and one for
> DEFINE_LOCK_GUARD_1() (I am uncertain if it's better to squash them
> together);
> - Reordered the series so the backing file patch is the first user of
> the introduced helpers (Amir Goldstein);
> - Change the definition of the cleanup "class" from a GUARD to a
> LOCK_GUARD_1, which defines an implicit container, that allows us
> to remove some variable declarations to store the overriden
> credentials (Amir Goldstein);
> - Replaced most of the uses of scoped_guard() with guard(), to reduce
> the code churn, the remaining ones I wasn't sure if I was changing
> the behavior: either they were nested (overrides "inside"
> overrides) or something calls current_cred() (Amir Goldstein).
>
> New questions:
> - The backing file callbacks are now called with the "light"
> overriden credentials, so they are kind of restricted in what they
> can do with their credentials, is this acceptable in general?
> - in ovl_rename() I had to manually call the "light" the overrides,
> both using the guard() macro or using the non-light version causes
> the workload to crash the kernel. I still have to investigate why
> this is happening. Hints are appreciated.
>
> Link to the RFC v2:
>
> https://lore.kernel.org/r/20240125235723.39507-1-vinicius.gomes@intel.com/
>
> Original cover letter (lightly edited):
>
> It was noticed that some workloads suffer from contention on
> increasing/decrementing the ->usage counter in their credentials,
> those refcount operations are associated with overriding/reverting the
> current task credentials. (the linked thread adds more context)
>
> In some specialized cases, overlayfs is one of them, the credentials
> in question have a longer lifetime than the override/revert "critical
> section". In the overlayfs case, the credentials are created when the
> fs is mounted and destroyed when it's unmounted. In this case of long
> lived credentials, the usage counter doesn't need to be
> incremented/decremented.
>
> Add a lighter version of credentials override/revert to be used in
> these specialized cases. To make sure that the override/revert calls
> are paired, add a cleanup guard macro. This was suggested here:
>
> https://lore.kernel.org/all/20231219-marken-pochen-26d888fb9bb9@brauner/
>
> With a small number of tweaks:
> - Used inline functions instead of macros;
> - A small change to store the credentials into the passed argument,
> the guard is now defined as (note the added '_T ='):
>
> DEFINE_GUARD(cred, const struct cred *, _T = override_creds_light(_T),
> revert_creds_light(_T));
>
> - Allow "const" arguments to be used with these kind of guards;
>
> Some comments:
> - If patch 1/5 and 2/5 are not a good idea (adding the cast), the
> alternative I can see is using some kind of container for the
> credentials;
> - The only user for the backing file ops is overlayfs, so these
> changes make sense, but may not make sense in the most general
> case;
>
> For the numbers, some from 'perf c2c', before this series:
> (edited to fit)
>
> #
> # ----- HITM ----- Shared
> # Num RmtHitm LclHitm Symbol Object Source:Line Node
> # ..... ....... ....... .......................... ................ .................. ....
> #
> -------------------------
> 0 412 1028
> -------------------------
> 41.50% 42.22% [k] revert_creds [kernel.vmlinux] atomic64_64.h:39 0 1
> 15.05% 10.60% [k] override_creds [kernel.vmlinux] atomic64_64.h:25 0 1
> 0.73% 0.58% [k] init_file [kernel.vmlinux] atomic64_64.h:25 0 1
> 0.24% 0.10% [k] revert_creds [kernel.vmlinux] cred.h:266 0 1
> 32.28% 37.16% [k] generic_permission [kernel.vmlinux] mnt_idmapping.h:81 0 1
> 9.47% 8.75% [k] generic_permission [kernel.vmlinux] mnt_idmapping.h:81 0 1
> 0.49% 0.58% [k] inode_owner_or_capable [kernel.vmlinux] mnt_idmapping.h:81 0 1
> 0.24% 0.00% [k] generic_permission [kernel.vmlinux] namei.c:354 0
>
> -------------------------
> 1 50 103
> -------------------------
> 100.00% 100.00% [k] update_cfs_group [kernel.vmlinux] atomic64_64.h:15 0 1
>
> -------------------------
> 2 50 98
> -------------------------
> 96.00% 96.94% [k] update_cfs_group [kernel.vmlinux] atomic64_64.h:15 0 1
> 2.00% 1.02% [k] update_load_avg [kernel.vmlinux] atomic64_64.h:25 0 1
> 0.00% 2.04% [k] update_load_avg [kernel.vmlinux] fair.c:4118 0
> 2.00% 0.00% [k] update_cfs_group [kernel.vmlinux] fair.c:3932 0 1
>
> after this series:
>
> #
> # ----- HITM ----- Shared
> # Num RmtHitm LclHitm Symbol Object Source:Line Node
> # ..... ....... ....... .................... ................ ................ ....
> #
> -------------------------
> 0 54 88
> -------------------------
> 100.00% 100.00% [k] update_cfs_group [kernel.vmlinux] atomic64_64.h:15 0 1
>
> -------------------------
> 1 48 83
> -------------------------
> 97.92% 97.59% [k] update_cfs_group [kernel.vmlinux] atomic64_64.h:15 0 1
> 2.08% 1.20% [k] update_load_avg [kernel.vmlinux] atomic64_64.h:25 0 1
> 0.00% 1.20% [k] update_load_avg [kernel.vmlinux] fair.c:4118 0 1
>
> -------------------------
> 2 28 44
> -------------------------
> 85.71% 79.55% [k] generic_permission [kernel.vmlinux] mnt_idmapping.h:81 0 1
> 14.29% 20.45% [k] generic_permission [kernel.vmlinux] mnt_idmapping.h:81 0 1
>
> The contention is practically gone.
>
> Link: https://lore.kernel.org/all/20231018074553.41333-1-hu1.chen@intel.com/
>
> Vinicius Costa Gomes (4):
> cred: Add a light version of override/revert_creds()
> fs/backing-file: Convert to revert/override_creds_light()
> ovl: use wrapper ovl_revert_creds()
> ovl: Optimize override/revert creds
>
> fs/backing-file.c | 20 ++++++++++----------
> fs/overlayfs/copy_up.c | 2 +-
> fs/overlayfs/dir.c | 17 +++++++++++------
> fs/overlayfs/file.c | 14 +++++++-------
> fs/overlayfs/inode.c | 20 ++++++++++----------
> fs/overlayfs/namei.c | 10 +++++-----
> fs/overlayfs/overlayfs.h | 1 +
> fs/overlayfs/readdir.c | 8 ++++----
> fs/overlayfs/util.c | 11 ++++++++---
> fs/overlayfs/xattrs.c | 9 ++++-----
> include/linux/cred.h | 18 ++++++++++++++++++
> kernel/cred.c | 6 +++---
> 12 files changed, 82 insertions(+), 54 deletions(-)
>
> --
> 2.47.0
>
Powered by blists - more mailing lists