Message-ID: <20200917230842.GA1139137@T590>
Date: Fri, 18 Sep 2020 07:08:42 +0800
From: Ming Lei <ming.lei@...hat.com>
To: "Theodore Y. Ts'o" <tytso@....edu>
Cc: Jens Axboe <axboe@...nel.dk>, linux-ext4@...r.kernel.org,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
linux-block@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling
into blk_mq_get_driver_tag
On Thu, Sep 17, 2020 at 10:30:12AM -0400, Theodore Y. Ts'o wrote:
> On Thu, Sep 17, 2020 at 10:20:51AM +0800, Ming Lei wrote:
> >
> > Obviously there is other more serious issue, since 568f27006577 is
> > completely reverted in your test, and you still see list corruption
> > issue.
> >
> > So I'd suggest to find the big issue first. Once it is fixed, maybe
> > everything becomes fine.
> > ...
> > Looks it is more like a memory corruption issue, is there any helpful log
> > dumped when running kernel with kasan?
>
> Last night, I ran six VM's using -rc4 with and without KASAN; without
> Kasan, half of them hung. With KASAN enabled, all of the test VM's
> ran to completion.
From your last email, when you run -rc4 with revert of 568f27006577, you
can observe list corruption easily.
So can you enable KASAN on -rc4 with revert of 568f27006577 and see if
it makes a difference?
>
> This strongly suggests whatever the problem is, it's timing related.
> I'll run a larger set of test runs to see if this pattern is confirmed
> today.
It looks like you have enabled lots of other debug options, such as
lockdep, which add a very heavy runtime load. Maybe you can disable all
non-KASAN debug options (other memory debug options, lockdep, ...),
keep only KASAN enabled, and see if you get lucky.
Thanks,
Ming