Message-ID: <20220802165002.GA21797@1wt.eu>
Date:   Tue, 2 Aug 2022 18:50:02 +0200
From:   Willy Tarreau <w@....eu>
To:     Dipanjan Das <mail.dipanjan.das@...il.com>
Cc:     Lukas Bulwahn <lukas.bulwahn@...il.com>,
        Denis Efremov <efremov@...ux.com>,
        Jens Axboe <axboe@...nel.dk>, linux-block@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        syzkaller <syzkaller@...glegroups.com>,
        fleischermarius@...glemail.com, its.priyanka.bose@...il.com
Subject: Re: INFO: task hung in __floppy_read_block_0

Hi,

On Mon, Aug 01, 2022 at 10:04:46PM -0700, Dipanjan Das wrote:
> On Sun, Jul 31, 2022 at 2:53 AM Willy Tarreau <w@....eu> wrote:
> >
> > Thus I'm a bit confused about what to look for. It's very likely that
> > there are still bugs left in this driver, but trying to identify them
> > and to validate a fix will be difficult if they cannot be reproduced.
> > Maybe they only happen under emulation due to timing issues.
> >
> > As such, any hint about the exact setup and how long to wait to get
> > the error would be much appreciated.
> 
> We can confirm that we were able to trigger the issue on the latest
> 5.19 (commit: 3d7cb6b04c3f3115719235cc6866b10326de34cd) with the
> C-repro within a VM. We use this:
> https://syzkaller.appspot.com/text?tag=KernelConfig&x=cd73026ceaed1402
>  config to build the kernel. The issue triggers after around 143
> seconds. For all the five times we tried, we were able to reproduce
> the issue deterministically every time. Please let us know if you need
> any other information.

Yep, I could reproduce it under qemu as well. I've added traces, and
ugly things are happening with the lock (but I haven't understood what
yet). What I saw was that process_fd_request() is first called under
lock, then we drop the lock, then __floppy_read_block_0() is called
under lock, which calls process_fd_request(), then the lock is dropped,
wait_for_completion() is called, then process_fd_request() is called
again without lock this time, and from there we're looping in
fd_wait_for_completion(). I need to dig into more details but it doesn't
seem right to me that process_fd_request() is sometimes called under a
lock and sometimes without, and that __floppy_read_block_0() is called with
a lock held which is then released inside it. I could have missed certain things
due to the concurrent accesses but in any case I should probably not be
observing this.
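
For illustration only, the call order described above can be replayed in a
small user-space C model. This is a sketch under stated assumptions, not code
from the floppy driver: struct lock_model, model_lock(), model_unlock(),
traced_call() and replay_observed_sequence() are hypothetical names, the lock
is reduced to a single flag, and the traced function does nothing except
record whether that flag was set when it was entered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space model; none of this is the driver's real code. */
struct lock_model {
    bool held;          /* is the modelled lock currently held? */
    int calls_locked;   /* traced calls entered with the lock held */
    int calls_unlocked; /* traced calls entered without the lock */
};

static void model_lock(struct lock_model *m)
{
    assert(!m->held);   /* the model does not allow recursive locking */
    m->held = true;
}

static void model_unlock(struct lock_model *m)
{
    assert(m->held);    /* unlocking a lock that is not held is a bug */
    m->held = false;
}

/* Stand-in for a traced function: it only records the lock state at entry. */
static void traced_call(struct lock_model *m, const char *name)
{
    if (m->held)
        m->calls_locked++;
    else
        m->calls_unlocked++;
    printf("%s: lock %s\n", name, m->held ? "held" : "not held");
}

/*
 * Replays the order of events reported in the trace above.
 * Returns true if the traced function was entered both with and
 * without the lock, which is the inconsistency being described.
 */
static bool replay_observed_sequence(struct lock_model *m)
{
    model_lock(m);
    traced_call(m, "process_fd_request");  /* first call, under lock */
    model_unlock(m);

    model_lock(m);                          /* __floppy_read_block_0() entered under lock */
    traced_call(m, "process_fd_request");  /* called from it, still under lock */
    model_unlock(m);                        /* lock dropped before waiting */

    /* wait_for_completion() would sit here in the reported sequence */
    traced_call(m, "process_fd_request");  /* third call, no lock this time */

    return m->calls_locked > 0 && m->calls_unlocked > 0;
}
```

Calling replay_observed_sequence() on a zero-initialised struct lock_model
counts how often the traced function is entered in each lock state. The model
says nothing about why the driver behaves this way; it only makes the reported
ordering explicit.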

I'll try to dig deeper. I really don't know that area and I must confess
it's not the most exciting to rediscover each time :-)

Thanks,
Willy
