Message-ID: <aWETXSLwAYOVdB9J@fedora>
Date: Fri, 9 Jan 2026 22:40:29 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Venkat Rao Bagalkote <venkat88@...ux.ibm.com>
Cc: Christoph Hellwig <hch@...radead.org>, linux-block@...r.kernel.org,
linux-scsi@...r.kernel.org, Jens Axboe <axboe@...nel.dk>,
James.Bottomley@...senpartnership.com, leonro@...dia.com,
kch@...dia.com, LKML <linux-kernel@...r.kernel.org>,
Madhavan Srinivasan <maddy@...ux.ibm.com>, riteshh@...ux.ibm.com,
ojaswin@...ux.ibm.com
Subject: Re: [next-20260108]kernel BUG at drivers/scsi/scsi_lib.c:1173!
On Fri, Jan 09, 2026 at 07:53:00PM +0530, Venkat Rao Bagalkote wrote:
>
> On 09/01/26 7:35 pm, Ming Lei wrote:
> > On Fri, Jan 09, 2026 at 07:26:01PM +0530, Venkat Rao Bagalkote wrote:
> > > On 09/01/26 6:28 pm, Ming Lei wrote:
> > > > On Fri, Jan 09, 2026 at 05:51:15PM +0530, Venkat Rao Bagalkote wrote:
> > > > > On 09/01/26 5:25 pm, Ming Lei wrote:
> > > > > > On Fri, Jan 09, 2026 at 05:14:36PM +0530, Venkat Rao Bagalkote wrote:
> > > > > > > On 09/01/26 12:19 pm, Ming Lei wrote:
> > > > > > > > On Thu, Jan 08, 2026 at 09:56:39PM -0800, Christoph Hellwig wrote:
> > > > > > > > > I've seen the same when running xfstests on xfs, and bisected it to:
> > > > > > > > >
> > > > > > > > > commit ee623c892aa59003fca173de0041abc2ccc2c72d
> > > > > > > > > Author: Ming Lei <ming.lei@...hat.com>
> > > > > > > > > Date: Wed Dec 31 11:00:55 2025 +0800
> > > > > > > > >
> > > > > > > > > block: use bvec iterator helper for bio_may_need_split()
> > > > > > > > >
> > > > > > > > Hi Christoph and Venkat Rao Bagalkote,
> > > > > > > >
> > > > > > > > Unfortunately I can't duplicate the issue in my environment, can you test
> > > > > > > > the following patch?
> > > > > > > >
> > > > > > > > diff --git a/block/blk.h b/block/blk.h
> > > > > > > > index 98f4dfd4ec75..980eef1f5690 100644
> > > > > > > > --- a/block/blk.h
> > > > > > > > +++ b/block/blk.h
> > > > > > > > @@ -380,7 +380,7 @@ static inline bool bio_may_need_split(struct bio *bio,
> > > > > > > > return true;
> > > > > > > > bv = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
> > > > > > > > - if (bio->bi_iter.bi_size > bv->bv_len)
> > > > > > > > + if (bio->bi_iter.bi_size > bv->bv_len - bio->bi_iter.bi_bvec_done)
> > > > > > > > return true;
> > > > > > > > return bv->bv_len + bv->bv_offset > lim->max_fast_segment_size;
> > > > > > > > }
> > > > > > > Hello Ming,
> > > > > > >
> > > > > > >
> > > > > > > This is not helping. I am hitting this issue, during kernel build itself.
> > > > > > Can you confirm if it can fix the blktests ext4/056 first?
> > > > > >
> > > > > > If kernel building is running over new patched kernel, please provide the
> > > > > > dmesg log. And if it is reproducible, can you confirm if it can be fixed
> > > > > > by reverting ee623c892aa59003 (block: use bvec iterator helper for bio_may_need_split())?
> > > > > Unfortunately, even with revert, build fails.
> > > > >
> > > > >
> > > > >
> > > > > commit c64b2ee9cddcb31546c8622ef018d344544a9388 (HEAD)
> > > > > Author: Super User <root@...-zzci-1.ltc.tadn.ibm.com>
> > > > > Date: Fri Jan 9 06:51:19 2026 -0600
> > > > >
> > > > > Revert "block: use bvec iterator helper for bio_may_need_split()"
> > > > >
> > > > > This reverts commit ee623c892aa59003fca173de0041abc2ccc2c72d.
> > > > OK, then your issue isn't related with the above change.
> > > >
> > > > Can you reproduce & collect dmesg log with the bad sg/rq/bio/bvec info by
> > > > applying the attached debug patch?
> > > >
> > > > Also if possible, please collect your scsi queue's limit info before
> > > > reproducing the issue:
> > > >
> > > > (cd /sys/block/$SD/queue && find . -type f -exec grep -aH . {} \;)
> > > Hello Ming,
> > >
> > > After applying the patch shared via attachment also, I see build failure.
> > >
> > > I have attached the kernel config file.
> > >
> > >
> > > git diff
> > > diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> > > index 752060d7261c..33c1b6a0a738 100644
> > > --- a/block/blk-mq-dma.c
> > > +++ b/block/blk-mq-dma.c
> > > @@ -4,8 +4,75 @@
> > > */
> > > #include <linux/blk-integrity.h>
> > > #include <linux/blk-mq-dma.h>
> > > +#include <linux/scatterlist.h>
> > > #include "blk.h"
> > Hi Venkat,
> >
> > Thanks for your test.
> >
> > But you didn't apply the whole debug patch in the following link:
> >
> > https://lore.kernel.org/linux-block/aWD7j3NR_m6EyZv1@fedora/
> >
> > otherwise something like "=== __blk_rq_map_sg DEBUG DUMP ===" will be
> > dumped in dmesg log.
> >
> > > make -j 48 -s && make modules_install && make install
> > > [ 5625.770436] ------------[ cut here ]------------
> > > [ 5625.770476] WARNING: block/blk-mq-dma.c:309 at
> > If the whole debug patch is applied correctly, the above line number should
> > have become 378 instead of original 309.
> >
> > Please re-apply the debug patch & reproduce again.
> >
>
> Hello Ming,
>
>
> Apologies for back and forth. But I did apply the whole patch. Below is the
> git diff from my machine. Let me know, if I am missing anything.
OK, the patch is correct.

But you need to boot a known-good kernel first (for example, the one shipped
by your distribution), and build the new test kernel from the -next tree with
this patch from there.

Once the new test kernel is built and installed, reboot into it and start your
kernel build workload. The issue will then be triggered and the log can be
collected.

When the issue is triggered, `WARNING: block/blk-mq-dma.c:378` should be
shown in the dmesg log, which confirms that you are running the test kernel
with the debug patch applied.
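
The check described above could be scripted roughly as follows. This is only a
sketch, assuming a POSIX shell; the helper name `check_debug_log` is
hypothetical, and the only strings it relies on are the two line numbers (309
and 378) and the dump banner already mentioned in this thread.

```shell
# Hypothetical helper: classify a dmesg log read from stdin.
# The matched strings are taken from this thread; nothing else is assumed
# about the debug patch's output.
check_debug_log() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -qF 'WARNING: block/blk-mq-dma.c:378'; then
        # Line 378 means the debug patch is in the running kernel.
        if printf '%s\n' "$log" | grep -qF '=== __blk_rq_map_sg DEBUG DUMP ==='; then
            echo "debug kernel: warning at line 378, dump present"
        else
            echo "debug kernel: warning at line 378, dump missing"
        fi
    elif printf '%s\n' "$log" | grep -qF 'WARNING: block/blk-mq-dma.c:309'; then
        # Line 309 is the original location, so the running kernel is unpatched.
        echo "unpatched kernel: warning still at line 309"
    else
        echo "warning not triggered yet"
    fi
}
```

It would typically be fed live output, e.g. `dmesg | check_debug_log`, after
rebooting into the test kernel and reproducing the issue.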

Please let me know if anything is unclear.

Thanks,
Ming