linux-kernel - Re: BLKSECDISCARD ioctl and hung tasks


Message-ID: <CAJmaN=kYvGWs=e_ee-DgRs2yW1UFgypKGxOTW2u1MSz1zkmHgQ@mail.gmail.com>
Date:   Wed, 12 Feb 2020 17:24:19 -0800
From:   Jesse Barnes <jsbarnes@...gle.com>
To:     Salman Qazi <sqazi@...gle.com>
Cc:     "Theodore Y. Ts'o" <tytso@....edu>, Jens Axboe <axboe@...nel.dk>,
        Ming Lei <ming.lei@...hat.com>,
        Bart Van Assche <bvanassche@....org>,
        Christoph Hellwig <hch@....de>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-block@...r.kernel.org, Gwendal Grignou <gwendal@...gle.com>
Subject: Re: BLKSECDISCARD ioctl and hung tasks

On Wed, Feb 12, 2020 at 5:20 PM Salman Qazi <sqazi@...gle.com> wrote:
>
> On Wed, Feb 12, 2020 at 3:07 PM Theodore Y. Ts'o <tytso@....edu> wrote:
> >
> > This is a problem we've been struggling with in other contexts.  For
> > example, if you have the hung task timer set to 2 minutes, and the
> > system to panic if the hung task timer exceeds that, and an NFS server
> > which the client is writing to crashes, and it takes longer for the
> > NFS server to come back, that might be a situation where we might want
> > to exempt the hung task warning from panic'ing the system.  On the
> > other hand, if the process is failing to schedule for other reasons,
> > maybe we would still want the hung task timeout to go off.
> >
> > So I've been meditating over whether the right answer is to just
> > globally configure the hung task timer to something like 5 or 10
> > minutes (which would require no kernel changes, yay?), or have some
> > way of telling the hung task timeout logic that it shouldn't apply, or
> > should have a different timeout, when we're waiting for I/O to
> > complete.
>
> The problem that I anticipate in our space is that a generous timeout
> will make impatient people reboot their chromebooks, losing us
> information about hangs.  But, this can be worked around by having
> multiple different timeouts.  For instance, a thread that is expecting
> to do something slow, can set a flag to indicate that it wishes to be
> held against the more generous criteria.  This is something I am
> tempted to do on older kernels where we might not feel comfortable
> backporting io_uring.

I was going to reply along the same lines when I got distracted by a
mtg.  If anything I'd like to see a LOWER hung task timeout, generally
speaking.  And maybe that means having more operations be asynchronous
like Ted suggests (I'm generally a fan of that anyway).

[snipped good suggestion about async interface]

Jesse
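
Illustrative sketch (not part of the original message): the "multiple
different timeouts" idea quoted above, where a thread expecting slow I/O
opts in to a more generous limit, could be modelled roughly as follows.
Nothing here is kernel code. The Task type, the expects_slow_io flag, and
both constants are hypothetical names chosen for illustration; the 120
and 600 second values simply echo the "2 minutes" and "5 or 10 minutes"
figures mentioned in the thread.

```python
from dataclasses import dataclass

# Hypothetical values, taken from the figures discussed in the thread.
DEFAULT_TIMEOUT_SECS = 120    # strict limit for ordinary tasks
GENEROUS_TIMEOUT_SECS = 600   # relaxed limit for tasks that opted in


@dataclass
class Task:
    """Toy stand-in for a blocked task; not a real kernel structure."""
    name: str
    blocked_secs: int              # time spent blocked so far
    expects_slow_io: bool = False  # hypothetical per-thread opt-in flag


def timeout_for(task: Task) -> int:
    """Pick the timeout the task should be held against."""
    if task.expects_slow_io:
        return GENEROUS_TIMEOUT_SECS
    return DEFAULT_TIMEOUT_SECS


def is_hung(task: Task) -> bool:
    """A task counts as hung once it exceeds its own timeout."""
    return task.blocked_secs > timeout_for(task)


if __name__ == "__main__":
    discard = Task("secure-discard", blocked_secs=300, expects_slow_io=True)
    stuck = Task("ui-thread", blocked_secs=300)
    for t in (discard, stuck):
        print(t.name, "hung" if is_hung(t) else "ok")
```

With this split, the default limit can stay short (or be lowered further,
as the reply above prefers) while known-slow operations are judged
against the longer one.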