netdev - Re: [PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKv+Gu8iLCB1rS7F5xcGN+6K-+kxYgfPcphX8vULWpvNfv8tpA@mail.gmail.com>
Date:   Wed, 18 Oct 2017 20:32:05 +0100
From:   Ard Biesheuvel <ard.biesheuvel@...aro.org>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "<netdev@...r.kernel.org>" <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql_complete()

On 18 October 2017 at 19:45, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Wed, 2017-10-18 at 18:57 +0100, Ard Biesheuvel wrote:
>> On 18 October 2017 at 17:29, Eric Dumazet <eric.dumazet@...il.com> wrote:
>> > On Wed, 2017-10-18 at 16:45 +0100, Ard Biesheuvel wrote:
>> >> Even though calling dql_completed() with a count that exceeds the
>> >> queued count is a serious error, it still does not justify bringing
>> >> down the entire kernel with a BUG_ON(). So relax it to a WARN_ON()
>> >> instead.
>> >>
>> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@...aro.org>
>> >> ---
>> >>  lib/dynamic_queue_limits.c | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/lib/dynamic_queue_limits.c b/lib/dynamic_queue_limits.c
>> >> index f346715e2255..24ce495d78f3 100644
>> >> --- a/lib/dynamic_queue_limits.c
>> >> +++ b/lib/dynamic_queue_limits.c
>> >> @@ -23,7 +23,7 @@ void dql_completed(struct dql *dql, unsigned int count)
>> >>       num_queued = ACCESS_ONCE(dql->num_queued);
>> >>
>> >>       /* Can't complete more than what's in queue */
>> >> -     BUG_ON(count > num_queued - dql->num_completed);
>> >> +     WARN_ON(count > num_queued - dql->num_completed);
>> >>
>> >>       completed = dql->num_completed + count;
>> >>       limit = dql->limit;
>> >
>> > So instead fixing the faulty driver, you'll have strange lockups, and
>> > force your users to reboot anyway, after annoying periods where
>> > "Internet does not work"
>> >
>> > These kinds of errors should be found when testing a new device driver
>> > or new kernel.
>> >
>> > Have you found the root cause ?
>> >
>>
>> Not yet, and I don't intend to send out any patches for this
>> particular hardware until this is fixed.
>>
>> But that still doesn't mean you should crash hard. As Linus puts it,
>> it is better to 'limp on' if you can (unless we're likely to corrupt
>> any non-volatile data, e.g., files on disk etc)
>
> How many BUG() do you plan to change to WARN() exactly ?
>

How is that relevant?

> If you want to comply to Linus wish, just compile your kernel
> with appropriate option.
>
> CONFIG_BUG=n
>

If it is essential that we crash hard in this location, without *any*
opportunity whatsoever to shutdown cleanly or perform any diagnosis on
the system while it is still up, then please disregard this patch.