Date:   Tue, 4 Jan 2022 10:49:04 +0100
From:   Bastian Blank <bastian.blank@...dativ.de>
To:     Jeff Layton <jlayton@...nel.org>, Ilya Dryomov <idryomov@...il.com>
Cc:     ceph-devel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: PROBLEM: SLAB use-after-free with ceph(fs)

Hi,

A customer reported panics inside memory management.  Before every
occurrence, the log contains reports of a SLAB cache mismatch.  The
"crash" tool shows freelist corruption in the memory dump.  This points
to a use-after-free somewhere inside the ceph module.

The crashes happen under high load, while copying data between two
CephFS file systems.

| [152791.777454] ceph:  dropping dirty+flushing - state for 00000000c039d4cc 1099526092092
| [152791.777457] ------------[ cut here ]------------
| [152791.777458] cache_from_obj: Wrong slab cache. jbd2_journal_handle but object is from kmalloc-256
| [152791.777473] WARNING: CPU: 76 PID: 2676615 at mm/slab.h:521 kmem_cache_free+0x260/0x2b0
[…]
| [152791.777530] CPU: 76 PID: 2676615 Comm: kworker/76:2 Kdump: loaded Not tainted 5.4.0-81-generic #91-Ubuntu
| [152791.777531] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 10/28/2021
| [152791.777540] Workqueue: ceph-msgr ceph_con_workfn [libceph]
| [152791.777542] RIP: 0010:kmem_cache_free+0x260/0x2b0
[…]
| [152791.777550] Call Trace:
| [152791.777562]  ceph_free_cap_flush+0x1d/0x20 [ceph]
| [152791.777568]  remove_session_caps_cb+0xcf/0x4b0 [ceph]
| [152791.777573]  ceph_iterate_session_caps+0xc8/0x2a0 [ceph]
| [152791.777578]  ? wake_up_session_cb+0xe0/0xe0 [ceph]
| [152791.777583]  remove_session_caps+0x55/0x190 [ceph]
| [152791.777587]  ? cleanup_session_requests+0x104/0x130 [ceph]
| [152791.777592]  handle_session+0x4c7/0x5e0 [ceph]
| [152791.777597]  dispatch+0x279/0x610 [ceph]
| [152791.777602]  try_read+0x566/0x8c0 [libceph]
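
To check whether every panic is preceded by the same mismatch, the logs can be scanned for the warning shown above. The following is only a minimal sketch, assuming the message always has the form seen in this trace; the function names are hypothetical, and reading the first cache name as the one passed to kmem_cache_free and the second as the cache the object actually came from is an interpretation of the message text, not something verified here.

```python
import re
from collections import Counter

# Matches the message format seen in the trace above. The leading
# "| " quoting prefix, if present, is simply ignored by search().
WRONG_CACHE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+cache_from_obj: Wrong slab cache\. "
    r"(?P<expected>\S+) but object is from (?P<actual>\S+)"
)


def find_mismatches(lines):
    """Return (timestamp, expected_cache, actual_cache) for each matching line."""
    hits = []
    for line in lines:
        m = WRONG_CACHE.search(line)
        if m:
            hits.append(
                (float(m.group("ts")), m.group("expected"), m.group("actual"))
            )
    return hits


def summarize(hits):
    """Count how often each (expected, actual) cache pair occurs."""
    return Counter((expected, actual) for _, expected, actual in hits)
```

Passing an open log file (any iterable of lines) to find_mismatches and the result to summarize gives a count per cache pair, which would show whether the mismatch always involves the same two caches.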

They reported the same on all kernels they tested, from 5.4 up to
5.15.5 or so.  Sadly, I have no test results for newer builds.

Any ideas how I can debug this further?

Regards,
Bastian

-- 
Bastian Blank
Consultant
Phone: +49 2166 9901-194
E-Mail: bastian.blank@...dativ.de
credativ GmbH, HRB Mönchengladbach 12080, VAT ID: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Geoff Richardson, Peter Lilley
Our handling of personal data is subject to the
following provisions: https://www.credativ.de/datenschutz
