lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240125235723.39507-1-vinicius.gomes@intel.com>
Date: Thu, 25 Jan 2024 15:57:19 -0800
From: Vinicius Costa Gomes <vinicius.gomes@...el.com>
To: brauner@...nel.org,
	amir73il@...il.com,
	hu1.chen@...el.com
Cc: miklos@...redi.hu,
	malini.bhandaru@...el.com,
	tim.c.chen@...el.com,
	mikko.ylinen@...el.com,
	lizhen.you@...el.com,
	linux-unionfs@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Vinicius Costa Gomes <vinicius.gomes@...el.com>
Subject: [RFC v2 0/4] overlayfs: Optimize override/revert creds

Hi,

It was noticed that some workloads suffer from contention on
increasing/decrementing the ->usage counter in their credentials,
those refcount operations are associated with overriding/reverting the
current task credentials. (the linked thread adds more context)

In some specialized cases, overlayfs is one of them, the credentials
in question have a longer lifetime than the override/revert "critical
section". In the overlayfs case, the credentials are created when the
fs is mounted and destroyed when it's unmounted. In this case of long
lived credentials, the usage counter doesn't need to be
incremented/decremented.

Add a lighter version of credentials override/revert to be used in
these specialized cases. To make sure that the override/revert calls
are paired, add a cleanup guard macro. This was suggested here:

https://lore.kernel.org/all/20231219-marken-pochen-26d888fb9bb9@brauner/

With a small number of tweaks:
 - Used inline functions instead of macros;
 - A small change to store the credentials into the passed argument,
   the guard is now defined as (note the added '_T ='):
   
      DEFINE_GUARD(cred, const struct cred *, _T = override_creds_light(_T),
                  revert_creds_light(_T));

 - Allow "const" arguments to be used with these kind of guards;

Some comments:
 - If patch 1/4 is not a good idea (adding the cast), the alternative
   I can see is using some kind of container for the credentials;
 - The only user for the backing file ops is overlayfs, so these
   changes make sense, but may not make sense in the most general
   case;

For the numbers, some from 'perf c2c', before this series:
(edited to fit)

#
#        ----- HITM -----                                        Shared                          
#   Num  RmtHitm  LclHitm                      Symbol            Object         Source:Line  Node
# .....  .......  .......  ..........................  ................  ..................  ....
#
  -------------------------
      0      412     1028  
  -------------------------
          41.50%   42.22%  [k] revert_creds            [kernel.vmlinux]  atomic64_64.h:39     0  1
          15.05%   10.60%  [k] override_creds          [kernel.vmlinux]  atomic64_64.h:25     0  1
           0.73%    0.58%  [k] init_file               [kernel.vmlinux]  atomic64_64.h:25     0  1
           0.24%    0.10%  [k] revert_creds            [kernel.vmlinux]  cred.h:266           0  1
          32.28%   37.16%  [k] generic_permission      [kernel.vmlinux]  mnt_idmapping.h:81   0  1
           9.47%    8.75%  [k] generic_permission      [kernel.vmlinux]  mnt_idmapping.h:81   0  1
           0.49%    0.58%  [k] inode_owner_or_capable  [kernel.vmlinux]  mnt_idmapping.h:81   0  1
           0.24%    0.00%  [k] generic_permission      [kernel.vmlinux]  namei.c:354          0

  -------------------------
      1       50      103  
  -------------------------
         100.00%  100.00%  [k] update_cfs_group  [kernel.vmlinux]  atomic64_64.h:15   0  1

  -------------------------
      2       50       98  
  -------------------------
          96.00%   96.94%  [k] update_cfs_group  [kernel.vmlinux]  atomic64_64.h:15   0  1
           2.00%    1.02%  [k] update_load_avg   [kernel.vmlinux]  atomic64_64.h:25   0  1
           0.00%    2.04%  [k] update_load_avg   [kernel.vmlinux]  fair.c:4118        0
           2.00%    0.00%  [k] update_cfs_group  [kernel.vmlinux]  fair.c:3932        0  1

after this series:

#
#        ----- HITM -----                                   Shared                        
#   Num  RmtHitm  LclHitm                 Symbol            Object       Source:Line  Node
# .....  .......  .......   ....................  ................  ................  ....
#
  -------------------------
      0       54       88  
  -------------------------
         100.00%  100.00%   [k] update_cfs_group  [kernel.vmlinux]  atomic64_64.h:15   0  1

  -------------------------
      1       48       83  
  -------------------------
          97.92%   97.59%   [k] update_cfs_group  [kernel.vmlinux]  atomic64_64.h:15   0  1
           2.08%    1.20%   [k] update_load_avg   [kernel.vmlinux]  atomic64_64.h:25   0  1
           0.00%    1.20%   [k] update_load_avg   [kernel.vmlinux]  fair.c:4118        0  1

  -------------------------
      2       28       44  
  -------------------------
          85.71%   79.55%   [k] generic_permission      [kernel.vmlinux]  mnt_idmapping.h:81   0  1
          14.29%   20.45%   [k] generic_permission      [kernel.vmlinux]  mnt_idmapping.h:81   0  1


The contention is practically gone.

Link: https://lore.kernel.org/all/20231018074553.41333-1-hu1.chen@intel.com/

Vinicius Costa Gomes (4):
  cleanup: Fix discarded const warning when defining guards
  cred: Add a light version of override/revert_creds()
  overlayfs: Optimize credentials usage
  fs: Optimize credentials reference count for backing file ops

 fs/backing-file.c       | 124 +++++++++++++++++++---------------------
 fs/overlayfs/copy_up.c  |   4 +-
 fs/overlayfs/dir.c      |  22 +++----
 fs/overlayfs/file.c     |  70 ++++++++++++-----------
 fs/overlayfs/inode.c    |  60 ++++++++++---------
 fs/overlayfs/namei.c    |  21 ++++---
 fs/overlayfs/readdir.c  |  18 +++---
 fs/overlayfs/util.c     |  23 ++++----
 fs/overlayfs/xattrs.c   |  34 +++++------
 include/linux/cleanup.h |   2 +-
 include/linux/cred.h    |  21 +++++++
 kernel/cred.c           |   6 +-
 12 files changed, 215 insertions(+), 190 deletions(-)

-- 
2.43.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ