Message-ID: <CAOi1vP9q0LAtARP-cyLD3rkChqqQV=LfZARSySSJMGzpJRz0uw@mail.gmail.com>
Date: Sun, 1 Jun 2025 18:03:01 +0200
From: Ilya Dryomov <idryomov@...il.com>
To: Viacheslav Dubeyko <Slava.Dubeyko@....com>
Cc: "ceph-devel@...r.kernel.org" <ceph-devel@...r.kernel.org>, Xiubo Li <xiubli@...hat.com>, 
	"twelho@...ho.tech" <twelho@...ho.tech>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "zohar@...ux.ibm.com" <zohar@...ux.ibm.com>, 
	"roberto.sassu@...wei.com" <roberto.sassu@...wei.com>, 
	"eric.snowberg@...cle.com" <eric.snowberg@...cle.com>, 
	"linux-integrity@...r.kernel.org" <linux-integrity@...r.kernel.org>, 
	"dmitry.kasatkin@...il.com" <dmitry.kasatkin@...il.com>, "serge@...lyn.com" <serge@...lyn.com>, 
	"linux-security-module@...r.kernel.org" <linux-security-module@...r.kernel.org>, 
	"jmorris@...ei.org" <jmorris@...ei.org>, "paul@...l-moore.com" <paul@...l-moore.com>
Subject: Re: [PATCH] fs/ceph: set superblock s_magic for IMA fsmagic matching:
 up to 60x speedup

On Thu, May 29, 2025 at 8:03 PM Viacheslav Dubeyko
<Slava.Dubeyko@....com> wrote:
>
> On Thu, 2025-05-29 at 17:45 +0000, Dennis Marttinen wrote:
> > The CephFS kernel driver forgets to set the filesystem magic signature in
> > its superblock. As a result, IMA policy rules based on fsmagic matching do
> > not apply as intended. This causes a major performance regression in Talos
> > Linux [1] when mounting CephFS volumes, such as when deploying Rook Ceph
> > [2]. Talos Linux ships a hardened kernel with the following IMA policy
> > (irrelevant lines omitted):
> >
> > # cat /sys/kernel/security/integrity/ima/policy
> > [...]
> > dont_measure fsmagic=0xc36400 # CEPH_SUPER_MAGIC
> > [...]
> > measure func=FILE_CHECK mask=^MAY_READ euid=0
> > measure func=FILE_CHECK mask=^MAY_READ uid=0
> > [...]
> >
> > Currently, IMA compares 0xc36400 == 0x0 for CephFS files, resulting in all
> > files opened with O_RDONLY or O_RDWR getting measured with SHA512 on every
> > open(2):
> >
> > # cat /data/cephfs/test-file
> > # tail -1 /sys/kernel/security/integrity/ima/ascii_runtime_measurements
> > 10 69990c87e8af323d47e2d6ae4... ima-ng sha512:<hash> /data/cephfs/test-file
> >
> > Since O_WRONLY is rare, this results in an order of magnitude lower
> > performance than expected for practically all file operations. Properly
> > setting CEPH_SUPER_MAGIC in the CephFS superblock resolves the regression.
> >
> > Tests performed on a 3x replicated Ceph v19.3.0 cluster across three
> > i5-7200U nodes each equipped with one Micron 7400 MAX M.2 disk (BlueStore)
> > and Gigabit ethernet, on Talos Linux v1.10.2:
> >
> > FS-Mark 3.3
> > Test: 500 Files, Empty
> > Files/s > Higher Is Better
> > 6.12.27-talos . 16.6  |====
> > +twelho patch . 208.4 |====================================================
> >
> > FS-Mark 3.3
> > Test: 500 Files, 1KB Size
> > Files/s > Higher Is Better
> > 6.12.27-talos . 15.6  |=======
> > +twelho patch . 118.6 |====================================================
> >
> > FS-Mark 3.3
> > Test: 500 Files, 32 Sub Dirs, 1MB Size
> > Files/s > Higher Is Better
> > 6.12.27-talos . 12.7 |===============
> > +twelho patch . 44.7 |=====================================================
> >
> > IO500 [3] 2fcd6d6 results (benchmarks within variance omitted):
> >
> > | IO500 benchmark   | 6.12.27-talos  | +twelho patch  | Speedup   |
> > |-------------------|----------------|----------------|-----------|
> > | mdtest-easy-write | 0.018524 kIOPS | 1.135027 kIOPS | 6027.33 % |
> > | mdtest-hard-write | 0.018498 kIOPS | 0.973312 kIOPS | 5161.71 % |
> > | ior-easy-read     | 0.064727 GiB/s | 0.155324 GiB/s | 139.97 %  |
> > | mdtest-hard-read  | 0.018246 kIOPS | 0.780800 kIOPS | 4179.29 % |
> >
> > This applies outside of synthetic benchmarks as well. For example, the time
> > to rsync a 55 MiB directory of ~12k mostly small files drops from an
> > unusable 10m5s to a reasonable 26s (23x the throughput).
> >
> > [1]: https://www.talos.dev/
> > [2]: https://www.talos.dev/v1.10/kubernetes-guides/configuration/ceph-with-rook/
> > [3]: https://github.com/IO500/io500
> >
> > Signed-off-by: Dennis Marttinen <twelho@...ho.tech>
> > ---
> > It took me a year to hunt this down: profiling distributed filesystems is
> > non-trivial. Since the regression is associated with IMA use, I received a
> > hint to CC the folks associated with IMA code. The patch targets the 6.12
> > kernel series currently used by Talos Linux, but should apply on top of
> > master as well. Please note that this is an independent contribution -
> > I am not affiliated with any company or organization.
> >
> >  fs/ceph/super.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> > index 73f321b52895e..9549f97233a9e 100644
> > --- a/fs/ceph/super.c
> > +++ b/fs/ceph/super.c
> > @@ -1217,6 +1217,7 @@ static int ceph_set_super(struct super_block *s, struct fs_context *fc)
> >       s->s_time_min = 0;
> >       s->s_time_max = U32_MAX;
> >       s->s_flags |= SB_NODIRATIME | SB_NOATIME;
> > +     s->s_magic = CEPH_SUPER_MAGIC;
> >
>
> Yeah, makes sense. Thanks a lot for the fix. It's a really non-trivial issue.
>
> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@....com>

Applied.

Thanks,

                Ilya
