linux-kernel - Re: INFO: task hung in wdm


Message-ID: <CACT4Y+YgLm2m0JG6qKKn9OpyXT9kKEPeyLSVGSfLbUukoCnB+g@mail.gmail.com>
Date:   Sat, 23 Nov 2019 07:52:25 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Bjørn Mork <bjorn@...k.no>
Cc:     Oliver Neukum <oneukum@...e.de>,
        syzbot <syzbot+854768b99f19e89d7f81@...kaller.appspotmail.com>,
        Andrey Konovalov <andreyknvl@...gle.com>,
        Jia-Ju Bai <baijiaju1990@...il.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Colin King <colin.king@...onical.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        USB list <linux-usb@...r.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        yuehaibing@...wei.com
Subject: Re: INFO: task hung in wdm_flush

On Tue, Nov 19, 2019 at 12:34 PM Bjørn Mork <bjorn@...k.no> wrote:
>
> Oliver Neukum <oneukum@...e.de> writes:
> > Am Dienstag, den 19.11.2019, 10:14 +0100 schrieb Bjørn Mork:
> >
> >> Anyway, I believe this is not a bug.
> >>
> >> wdm_flush will wait forever for the IN_USE flag to be cleared or the
> >
> > Damn. Too obvious. So you think we simply have pending output that
> > just does not complete?
>
> I do miss a lot of stuff so I might be wrong, but I can't see any other
> way this can happen.  The out_callback will unconditionally clear the
> IN_USE flag and wake up the wait_queue.
>
> >> DISCONNECTING flag to be set. The only way you can avoid this is by
> >> creating a device that works normally up to a point and then completely
> >> ignores all messages,
> >
> > Devices may crash. I don't think we can ignore that case.
>
> Sure, but I've never seen that happen without the device falling off the
> bus.  Which is a disconnect.
>
> But I am all for handling this *if* someone reproduces it with a real
> device.  I just don't think it's worth the effort if it's only a
> theoretical problem.
>
> >>  but without resetting or disconnecting. It is
> >> obviously possible to create such a device. But I think the current
> >> error handling is more than sufficient, unless you show me some way to
> >> abuse this or reproduce the issue with a real device.
> >
> > Malicious devices are real. Potentially at least.
> > But you are right, we need not bend over to handle them well, but we
> > ought to be able to handle them.
>
> Sure, we need to handle malicious devices.  But only if they can be used
> for real harm.
>
> This warning requires physical access and is only slightly annoying.
> Like a USB device making loud farting sounds.  You'd just disconnect the
> device.  No need for Linux to detect the sound and handle it
> automatically, I think.

Hi Bjørn,

Besides the production use you are referring to, there are 2 cases we
should take into account as well:
1. Testing.
Any kernel testing system needs a binary criterion for detecting kernel
bugs. It seems right to treat unkillable hung tasks as kernel bugs,
which means that we need to resolve this in some way regardless of the
production scenario.
2. Reliable killing of processes.
It's a very important property that an admin or script can reliably
kill whatever process/container they need to kill for whatever reason.
This case results in an unkillable process, which means scripts will
fail, automated systems will misbehave, admins will waste time (if
they are qualified to resolve this at all).
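
To make the failure mode concrete, here is a minimal user-space sketch
in Python. It is not the cdc-wdm driver code: the FakeWdm class and its
timeout parameter are hypothetical, and only the flag names (IN_USE,
DISCONNECTING) and out_callback come from the discussion above. It
models a flush that waits for IN_USE to be cleared or DISCONNECTING to
be set, and shows that a device which never completes its output leaves
an unbounded wait stuck, while a bounded wait returns control to the
caller. A timeout is just one assumed way out, used here for
illustration.

```python
import threading


class FakeWdm:
    """Hypothetical user-space model of the flags discussed in this thread."""

    def __init__(self):
        self.cond = threading.Condition()
        self.in_use = True           # a write has been submitted and is pending
        self.disconnecting = False   # set when the device goes away

    def out_callback(self):
        # Completion handler: unconditionally clear IN_USE and wake waiters.
        with self.cond:
            self.in_use = False
            self.cond.notify_all()

    def flush(self, timeout=None):
        # timeout=None models the wait as described above: block until
        # IN_USE is cleared or DISCONNECTING is set, however long it takes.
        with self.cond:
            done = self.cond.wait_for(
                lambda: not self.in_use or self.disconnecting, timeout
            )
        if not done:
            return "timeout"
        return "disconnected" if self.disconnecting else "ok"


def demo():
    # Device that silently ignores the write: out_callback never runs.
    stuck = FakeWdm()
    stuck_result = stuck.flush(timeout=0.1)

    # Well-behaved device: the completion arrives shortly after the write.
    good = FakeWdm()
    timer = threading.Timer(0.05, good.out_callback)
    timer.start()
    good_result = good.flush(timeout=2.0)
    timer.join()
    return stuck_result, good_result
```

With timeout=None the first flush() call in demo() would never return,
which is the user-space analogue of the hung task in the report.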