Message-ID: <4c85df85-58f7-4e44-8201-2f0562f93439@linux.ibm.com>
Date: Fri, 9 Jan 2026 19:53:00 +0530
From: Venkat Rao Bagalkote <venkat88@...ux.ibm.com>
To: Ming Lei <ming.lei@...hat.com>
Cc: Christoph Hellwig <hch@...radead.org>, linux-block@...r.kernel.org,
        linux-scsi@...r.kernel.org, Jens Axboe <axboe@...nel.dk>,
        James.Bottomley@...senpartnership.com, leonro@...dia.com,
        kch@...dia.com, LKML <linux-kernel@...r.kernel.org>,
        Madhavan Srinivasan <maddy@...ux.ibm.com>, riteshh@...ux.ibm.com,
        ojaswin@...ux.ibm.com
Subject: Re: [next-20260108]kernel BUG at drivers/scsi/scsi_lib.c:1173!


On 09/01/26 7:35 pm, Ming Lei wrote:
> On Fri, Jan 09, 2026 at 07:26:01PM +0530, Venkat Rao Bagalkote wrote:
>> On 09/01/26 6:28 pm, Ming Lei wrote:
>>> On Fri, Jan 09, 2026 at 05:51:15PM +0530, Venkat Rao Bagalkote wrote:
>>>> On 09/01/26 5:25 pm, Ming Lei wrote:
>>>>> On Fri, Jan 09, 2026 at 05:14:36PM +0530, Venkat Rao Bagalkote wrote:
>>>>>> On 09/01/26 12:19 pm, Ming Lei wrote:
>>>>>>> On Thu, Jan 08, 2026 at 09:56:39PM -0800, Christoph Hellwig wrote:
>>>>>>>> I've seen the same when running xfstests on xfs, and bisected it to:
>>>>>>>>
>>>>>>>> commit ee623c892aa59003fca173de0041abc2ccc2c72d
>>>>>>>> Author: Ming Lei <ming.lei@...hat.com>
>>>>>>>> Date:   Wed Dec 31 11:00:55 2025 +0800
>>>>>>>>
>>>>>>>>         block: use bvec iterator helper for bio_may_need_split()
>>>>>>>>
>>>>>>> Hi Christoph and Venkat Rao Bagalkote,
>>>>>>>
>>>>>>> Unfortunately I can't duplicate the issue in my environment, can you test
>>>>>>> the following patch?
>>>>>>>
>>>>>>> diff --git a/block/blk.h b/block/blk.h
>>>>>>> index 98f4dfd4ec75..980eef1f5690 100644
>>>>>>> --- a/block/blk.h
>>>>>>> +++ b/block/blk.h
>>>>>>> @@ -380,7 +380,7 @@ static inline bool bio_may_need_split(struct bio *bio,
>>>>>>>                     return true;
>>>>>>>             bv = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
>>>>>>> -       if (bio->bi_iter.bi_size > bv->bv_len)
>>>>>>> +       if (bio->bi_iter.bi_size > bv->bv_len - bio->bi_iter.bi_bvec_done)
>>>>>>>                     return true;
>>>>>>>             return bv->bv_len + bv->bv_offset > lim->max_fast_segment_size;
>>>>>>>      }
>>>>>> Hello Ming,
>>>>>>
>>>>>>
>>>>>> This is not helping. I am hitting this issue, during kernel build itself.
>>>>> Can you confirm if it can fix the blktests ext4/056 first?
>>>>>
>>>>> If kernel building is running over new patched kernel, please provide the
>>>>> dmesg log. And if it is reproducible, can you confirm if it can be fixed
>>>>> by reverting ee623c892aa59003 (block: use bvec iterator helper for bio_may_need_split())?
>>>> Unfortunately, even with revert, build fails.
>>>>
>>>>
>>>>
>>>> commit c64b2ee9cddcb31546c8622ef018d344544a9388 (HEAD)
>>>> Author: Super User <root@...-zzci-1.ltc.tadn.ibm.com>
>>>> Date:   Fri Jan 9 06:51:19 2026 -0600
>>>>
>>>>       Revert "block: use bvec iterator helper for bio_may_need_split()"
>>>>
>>>>       This reverts commit ee623c892aa59003fca173de0041abc2ccc2c72d.
>>> OK, then your issue isn't related with the above change.
>>>
>>> Can you reproduce & collect dmesg log with the bad sg/rq/bio/bvec info by
>>> applying the attached debug patch?
>>>
>>> Also if possible, please collect your scsi queue's limit info before
>>> reproducing the issue:
>>>
>>> 	(cd /sys/block/$SD/queue && find . -type f -exec grep -aH . {} \;)
>> Hello Ming,
>>
>> Even after applying the patch shared as an attachment, I still see the build failure.
>>
>> I have attached the kernel config file.
>>
>>
>> git diff
>> diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
>> index 752060d7261c..33c1b6a0a738 100644
>> --- a/block/blk-mq-dma.c
>> +++ b/block/blk-mq-dma.c
>> @@ -4,8 +4,75 @@
>>    */
>>   #include <linux/blk-integrity.h>
>>   #include <linux/blk-mq-dma.h>
>> +#include <linux/scatterlist.h>
>>   #include "blk.h"
> Hi Venkat,
>
> Thanks for your test.
>
> But you didn't apply the whole debug patch in the following link:
>
> https://lore.kernel.org/linux-block/aWD7j3NR_m6EyZv1@fedora/
>
> otherwise something like "=== __blk_rq_map_sg DEBUG DUMP ===" will be
> dumped in dmesg log.
>
>> make -j 48 -s && make modules_install && make install
>> [ 5625.770436] ------------[ cut here ]------------
>> [ 5625.770476] WARNING: block/blk-mq-dma.c:309 at
> If the whole debug patch is applied correctly, the above line number should
> have become 378 instead of original 309.
>
> Please re-apply the debug patch & reproduce again.
>
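
The expected shift from 309 to 378 can be checked against the hunk headers of the debug patch (`@@ -4,8 +4,75 @@` and `@@ -306,6 +373,8 @@` in the git diff in this message). A minimal sketch of that arithmetic in userspace C follows. The function name `shifted_line` and its parameters are hypothetical, and it assumes the hunk only adds lines ahead of the line of interest (no removals).

```c
#include <assert.h>

/*
 * Map a line number in the old file to its position in the patched file,
 * given one unified-diff hunk header "@@ -old_start,... +new_start,... @@".
 * added_before is the number of '+' lines in this hunk that precede the
 * line of interest. Assumes no '-' lines precede it in the hunk.
 */
static int shifted_line(int old_start, int new_start, int old_line,
			int added_before)
{
	assert(old_line >= old_start);
	return new_start + (old_line - old_start) + added_before;
}
```

With the second hunk, `shifted_line(306, 373, 309, 2)` gives 378: earlier additions move the hunk down by 67 lines (373 - 306), and the two lines added directly above the WARN_ON() account for the rest. A warning still reported at line 309 is therefore what an unpatched blk-mq-dma.c would produce.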

Hello Ming,


Apologies for the back and forth, but I did apply the whole patch. Below is
the git diff from my machine. Let me know if I am missing anything.


  git diff
diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index 752060d7261c..33c1b6a0a738 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -4,8 +4,75 @@
   */
  #include <linux/blk-integrity.h>
  #include <linux/blk-mq-dma.h>
+#include <linux/scatterlist.h>
  #include "blk.h"

+static void dump_rq_mapping_debug(struct request *rq, struct scatterlist *sglist,
+                                 int nsegs)
+{
+       struct scatterlist *sg;
+       struct bio *bio;
+       struct bvec_iter iter;
+       struct bio_vec bv;
+       int i;
+
+       pr_err("=== __blk_rq_map_sg DEBUG DUMP ===\n");
+       pr_err("DISK: %s\n", rq->q->disk ? rq->q->disk->disk_name : "(null)");
+
+       /* Dump nsegs vs expected */
+       pr_err("nsegs=%d nr_phys_segments=%u\n",
+              nsegs, blk_rq_nr_phys_segments(rq));
+
+       /* Dump request info */
+       pr_err("REQUEST: __data_len=%u __sector=%llu cmd_flags=0x%x "
+              "rq_flags=0x%x nr_phys_segments=%u phys_gap_bit=%u\n",
+              rq->__data_len, (unsigned long long)rq->__sector,
+              rq->cmd_flags, (__force unsigned int)rq->rq_flags,
+              rq->nr_phys_segments, rq->phys_gap_bit);
+
+       /* Dump each SG element */
+       pr_err("--- SG LIST (%d entries) ---\n", nsegs);
+       for_each_sg(sglist, sg, nsegs, i) {
+               pr_err("  sg[%d]: pfn=0x%lx offset=%u len=%u dma_addr=0x%llx\n",
+                      i, page_to_pfn(sg_page(sg)), sg->offset, sg->length,
+                      (unsigned long long)sg_dma_address(sg));
+       }
+
+       /* Dump each bio */
+       pr_err("--- BIO LIST ---\n");
+       for (bio = rq->bio; bio; bio = bio->bi_next) {
+               pr_err("  BIO %p: bi_iter={sector=%llu size=%u idx=%u bvec_done=%u} "
+                      "bi_flags=0x%x bi_opf=0x%x bi_vcnt=%u bi_bvec_gap_bit=%u\n",
+                      bio,
+                      (unsigned long long)bio->bi_iter.bi_sector,
+                      bio->bi_iter.bi_size, bio->bi_iter.bi_idx,
+                      bio->bi_iter.bi_bvec_done,
+                      bio->bi_flags, bio->bi_opf, bio->bi_vcnt,
+                      bio->bi_bvec_gap_bit);
+
+               /* Dump each bvec in this bio */
+               pr_err("    --- BVECS (bi_vcnt=%u) ---\n", bio->bi_vcnt);
+               for (i = 0; i < bio->bi_vcnt; i++) {
+                       struct bio_vec *bvp = &bio->bi_io_vec[i];
+
+                       pr_err("      bvec[%d]: pfn=0x%lx len=%u offset=%u\n",
+                              i, page_to_pfn(bvp->bv_page), bvp->bv_len,
+                              bvp->bv_offset);
+               }
+
+               /* Also dump effective bvecs via iterator */
+               pr_err("    --- EFFECTIVE BVECS (via iter) ---\n");
+               i = 0;
+               bio_for_each_bvec(bv, bio, iter) {
+                       pr_err("      eff_bvec[%d]: pfn=0x%lx len=%u offset=%u\n",
+                              i++, page_to_pfn(bv.bv_page), bv.bv_len,
+                              bv.bv_offset);
+               }
+       }
+
+       pr_err("=== END DEBUG DUMP ===\n");
+}
+
  static bool __blk_map_iter_next(struct blk_map_iter *iter)
  {
         if (iter->iter.bi_size)
@@ -306,6 +373,8 @@ int __blk_rq_map_sg(struct request *rq, struct scatterlist *sglist,
          * Something must have been wrong if the figured number of
          * segment is bigger than number of req's physical segments
          */
+       if (nsegs > blk_rq_nr_phys_segments(rq))
+               dump_rq_mapping_debug(rq, sglist, nsegs);
         WARN_ON(nsegs > blk_rq_nr_phys_segments(rq));

         return nsegs;
diff --git a/block/blk.h b/block/blk.h
index 98f4dfd4ec75..980eef1f5690 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -380,7 +380,7 @@ static inline bool bio_may_need_split(struct bio *bio,
                 return true;

         bv = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
-       if (bio->bi_iter.bi_size > bv->bv_len)
+       if (bio->bi_iter.bi_size > bv->bv_len - bio->bi_iter.bi_bvec_done)
                 return true;
         return bv->bv_len + bv->bv_offset > lim->max_fast_segment_size;
  }
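
To illustrate what the one-line blk.h change in the diff above alters, here is a simplified userspace model of the two comparisons. It is only a sketch: the function names are hypothetical, plain integers stand in for `bio->bi_iter.bi_size`, `bv->bv_len` and `bio->bi_iter.bi_bvec_done`, and it assumes `bvec_done <= bv_len`. It does not claim to explain the failure reported in this thread, which persisted even with the original commit reverted.

```c
#include <assert.h>
#include <stdbool.h>

/* Old check: compare the remaining bio size with the full bvec length. */
static bool exceeds_bvec_old(unsigned int bi_size, unsigned int bv_len)
{
	return bi_size > bv_len;
}

/*
 * Patched check: compare the remaining bio size with what is left of the
 * current bvec after bvec_done bytes have already been consumed.
 */
static bool exceeds_bvec_new(unsigned int bi_size, unsigned int bv_len,
			     unsigned int bvec_done)
{
	assert(bvec_done <= bv_len);	/* assumption of this model */
	return bi_size > bv_len - bvec_done;
}
```

For example, with bv_len = 4096, bvec_done = 1024 and bi_size = 4096, the old comparison is false (4096 > 4096) while the patched one is true (4096 > 3072), so only the patched check notices that the remaining data extends past the current bvec. When bvec_done is 0 the two checks agree.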


Regards,

Venkat.

> Thanks,
> Ming
>
>
