Message-ID: <1c6b3cd74e303fa8ab8b4853986fd4cb8c7c8541.camel@ibm.com>
Date: Wed, 17 Sep 2025 17:34:29 +0000
From: Viacheslav Dubeyko <Slava.Dubeyko@....com>
To: "seanjc@...gle.com" <seanjc@...gle.com>,
 "lyican53@...il.com" <lyican53@...il.com>
CC: "jejb@...ux.ibm.com" <jejb@...ux.ibm.com>, Xiubo Li <xiubli@...hat.com>,
 "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
 "sboyd@...nel.org" <sboyd@...nel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "ceph-devel@...r.kernel.org" <ceph-devel@...r.kernel.org>,
 Paolo Bonzini <pbonzini@...hat.com>,
 "idryomov@...il.com" <idryomov@...il.com>,
 "martin.petersen@...cle.com" <martin.petersen@...cle.com>,
 "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
 "mturquette@...libre.com" <mturquette@...libre.com>,
 "linux-clk@...r.kernel.org" <linux-clk@...r.kernel.org>
Subject: RE: [RFC] Fix potential undefined behavior in __builtin_clz usage
 with GCC 11.1.0

On Wed, 2025-09-17 at 18:04 +0800, 陈华昭(Lyican) wrote:
>
> Hi Slava and Sean,
>
> Thank you for the valuable feedback!
>
> CEPH FORMAL PATCH:
> =================
>
> As requested by Slava, I've prepared a formal patch for the Ceph case.
> The patch adds proper zero checking before __builtin_clz() to prevent
> undefined behavior. Please find it attached as ceph_patch.patch.
>
> PROOF-OF-CONCEPT TEST CASE:
> ==========================
>
> I've also created a proof-of-concept test case that demonstrates the
> problematic input values that could trigger this bug. The test identifies
> specific input values where (x & 0x1FFFF) becomes zero after the increment
> and condition check.
>
> Key findings from the test:
> - Inputs such as 0x7FFFF, 0x9FFFF, 0xBFFFF, 0xDFFFF, and 0xFFFFF can trigger the bug
> - For each of these inputs x, the incremented value satisfies both
>   ((x+1) & 0x18000) == 0 and ((x+1) & 0x1FFFF) == 0
>
> The test can be integrated into Ceph's existing test framework or adapted
> for KUnit testing as you suggested. Please find it as ceph_poc_test.c.
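
For readers without the attachment, the condition described above can be sketched as follows. This is a minimal illustration, not the attached ceph_poc_test.c: the helper names clz_arg_is_zero() and count_trigger_inputs() are hypothetical, and the only things taken from the message are the increment and the two masks (0x18000 and 0x1FFFF).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): returns 1 when, for input xin,
 * the value that would be handed to __builtin_clz() is zero.
 * It only mirrors the increment and the two masks quoted in the message.
 */
static int clz_arg_is_zero(uint32_t xin)
{
	uint32_t x = xin + 1;		/* the input is incremented first */

	if (x & 0x18000)		/* the clz branch is not taken */
		return 0;
	return (x & 0x1FFFF) == 0;	/* zero here means clz(0): undefined */
}

/* Counts the inputs in [0, max] for which the clz argument would be zero. */
static unsigned int count_trigger_inputs(uint32_t max)
{
	unsigned int hits = 0;

	for (uint32_t xin = 0; ; xin++) {
		hits += clz_arg_is_zero(xin);
		if (xin == max)		/* explicit exit avoids wrap-around */
			break;
	}
	return hits;
}
```

Under these assumptions, every input whose incremented value is a multiple of 0x20000 satisfies the condition, so 0x1FFFF, 0x3FFFF and 0x5FFFF qualify in addition to the five values listed, and count_trigger_inputs(0xFFFFF) finds eight such inputs in total.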
>
> KVM CASE CLARIFICATION:
> ======================
>
> Thank you Sean for the detailed explanation about the KVM case. You're
> absolutely right that pages and test_dirty_ring_count are guaranteed to
> be non-zero in practice. I'll remove this from my analysis and focus on
> the genuine issues.
>
> BITOPS WRAPPER DISCUSSION:
> =========================
>
> I appreciate you bringing Yuri into the discussion. The idea of using
> existing fls()/fls64() functions or creating new fls8()/fls16() variants
> sounds promising. Many __builtin_clz() calls in the kernel could indeed
> benefit from these safer alternatives.
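
To illustrate why an fls()-style interface is safer, here is a rough userspace sketch. It is not the kernel implementation: fls32_sketch() and clz32_sketch() are hypothetical names, and the code assumes a 32-bit unsigned int. It relies only on the documented fls() convention that the result is the 1-based position of the most significant set bit, and 0 when no bit is set.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for an fls()-style helper: 1-based index
 * of the most significant set bit, and 0 when no bit is set. The explicit
 * zero test keeps __builtin_clz() away from its undefined input.
 * Assumes unsigned int is 32 bits wide.
 */
static inline int fls32_sketch(uint32_t x)
{
	return x ? 32 - __builtin_clz(x) : 0;
}

/* Leading-zero count that is defined for every input; yields 32 for zero. */
static inline int clz32_sketch(uint32_t x)
{
	return 32 - fls32_sketch(x);
}
```

A caller that currently writes __builtin_clz(v) - 16 could use clz32_sketch(v) - 16 instead, but it would still have to decide what a result of 32 means for its own arithmetic, so the wrapper removes the undefined behavior without removing the need to handle the zero case.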
>
> STATUS UPDATE:
> =============
>
> 1. Ceph: Formal patch and test case ready for review
> 2. KVM: Confirmed not an issue in practice (thanks Sean)
> 3. SCSI: Still investigating the drivers/scsi/elx/libefc_sli/sli4.h case
> 4. Bitops: Awaiting input from Yuri on kernel-wide improvements
>
> NEXT STEPS:
> ==========
>
> 1. Please review the Ceph patch and test case (Slava)
> 2. Happy to work with Yuri on bitops improvements if there's interest
> 3. For SCSI maintainers: would you like me to prepare a similar analysis for the sli_convert_mask_to_count() function?
> 4. Can prepare additional patches for any other confirmed cases
>
> Questions for maintainers:
> - Slava: Should the Ceph patch go through ceph-devel first, or directly to you?

Could you please send the patch to ceph-devel? You can add me to cc.
I don't review the attachments. :)

Thanks,
Slava.

> - Any specific requirements for the test case integration?
> - SCSI maintainers: Is the drivers/scsi/elx/libefc_sli/sli4.h case worth investigating further?
>
> Best regards,
> Huazhao Chen
> lyican53@...il.com
>
> ---
>
> Attachments:
> - ceph_patch.patch: Formal patch for net/ceph/crush/mapper.c
> - ceph_poc_test.c: Proof-of-concept test case demonstrating the issue
>
>