Date:   Tue, 15 Sep 2020 15:33:03 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     "Theodore Y. Ts'o" <tytso@....edu>
Cc:     Jens Axboe <axboe@...nel.dk>, linux-ext4@...r.kernel.org,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        linux-block@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling
 into blk_mq_get_driver_tag

Hi Theodore,

On Tue, Sep 15, 2020 at 12:45:19AM -0400, Theodore Y. Ts'o wrote:
> On Thu, Sep 03, 2020 at 11:55:28PM -0400, Theodore Y. Ts'o wrote:
> > Worse, right now, -rc1 and -rc2 is causing random crashes in my
> > gce-xfstests framework.  Sometimes it happens before we've run even a
> > single xfstests; sometimes it happens after we have successfully
> > completed all of the tests, and we're doing a shutdown of the VM under
> > test.  Other times it happens in the middle of a test run.  Given that
> > I'm seeing this at -rc1, which is before my late ext4 pull request to
> > Linus, it's probably not an ext4 related bug.  But it also means that
> > I'm partially blind in terms of my kernel testing at the moment.  So I
> > can't even tell Linus that I've run lots of tests and I'm 100%
> > confident your one-line change is 100% safe.
> 
> I was finally able to bisect it down to the commit:
> 
> 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

37f4a24c2469 has been reverted in:

	4e2f62e566b5 Revert "blk-mq: put driver tag when this request is completed"

The patch was later fixed and recommitted as:

	568f27006577 blk-mq: centralise related handling into blk_mq_get_driver_tag

So can you reproduce the issue when running a kernel built from commit
568f27006577? If yes, does reverting 568f27006577 fix the issue?

> 
> (See below for [1] Bisect log.)
> 
> The previous commit allows the tests to run to completion.  With
> commit 37f4a24c2469 and later all 11 test scenarios (4k blocks, 1k
> blocks, ext3 compat, ext4 w/ fscrypt, nojournal mode, data=journal,
> bigalloc, etc.) the VM will get stuck.

Can you share the exact mount command line used to set up the environment,
and the exact xfstests test item?



Thanks,
Ming
