Date:   Tue, 31 Oct 2017 13:21:00 +0300
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     syzbot 
        <bot+2af19c9e1ffe4d4ee1d16c56ae7580feaee75765@...kaller.appspotmail.com>,
        dvhart@...radead.org, LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        syzkaller-bugs@...glegroups.com,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: WARNING in get_pi_state

On Tue, Oct 31, 2017 at 1:08 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
>> I understand your sentiment, but it's definitely not _at all_. The
>> system compiled this exact code, ran it and triggered the bug on it.
>> Do you have suggestions on how to make this code more portable? How
>> would this setup look on your system?
>
> So I don't see the point of that tun stuff; what was it supposed to do?
>
> All it ever did after creation was flush_tun(), which reads until empty.
> But given nobody would ever write into it, that's an 'expensive' NO-OP.

See the text below.
The bot does try to minimize both the programs and the features used
(including these clunky NONFAILING macros and the filesystem business).
But when a bug takes 100 seconds to reproduce, minimization is hard.
Consider bisecting such a bug: that is also hard and unreliable, and
you can end up with the wrong commit.
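
To illustrate why a flaky reproducer defeats minimization, here is a
rough sketch. It is not syzkaller's actual code: the feature names, the
per-attempt hit probability and the attempt budget are all made up for
illustration. It drops one feature at a time and keeps the drop only if
the bug still reproduces within the budget.

```python
import random


def minimize(features, reproduces):
    """Greedy minimization: try dropping each feature in turn and keep
    the drop only if the bug still reproduces without it."""
    kept = list(features)
    for feature in list(features):
        candidate = [f for f in kept if f != feature]
        if reproduces(candidate):
            kept = candidate
    return kept


def make_oracle(required, hit_probability, attempts, rng):
    """Hypothetical reproduction check for a racy bug: even when every
    required feature is present, each attempt only hits the race with
    probability hit_probability."""
    def reproduces(candidate):
        if not required.issubset(candidate):
            return False
        return any(rng.random() < hit_probability for _ in range(attempts))
    return reproduces


def false_keep_probability(hit_probability, attempts):
    """Chance that an unnecessary feature is kept because every attempt
    without it happened to miss the race."""
    return (1.0 - hit_probability) ** attempts


if __name__ == "__main__":
    features = ["tun", "nonfailing", "futex", "threads"]  # made-up names
    required = {"futex", "threads"}
    rng = random.Random(0)
    reliable = make_oracle(required, 1.0, 1, rng)
    flaky = make_oracle(required, 0.05, 3, rng)
    print("reliable:", minimize(features, reliable))
    print("flaky:   ", minimize(features, flaky))
    print("P(keep an unneeded feature):", false_keep_probability(0.05, 3))
```

With a reliable check only the required features survive. With a low
hit probability and a small budget, unneeded features are likely to be
kept, because a missed race looks the same as a necessary feature.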

See this for an example of a much tidier reproducer:
https://groups.google.com/forum/#!topic/syzkaller-bugs/9nYn7hpNpEk
But that's a single-threaded bug that triggers instantly each time you
run the program.


>> We do try hard to get rid of unnecessary stuff in reproducers. I think
>> what happened in this case is the following. This is a hard to
>> reproduce race. The bot was able to reproduce the crash on the initial
>> program that uses tun, then tried to get rid of the tun code and
>> re-reproduce it, but it did not reproduce this time, so it concluded
>> that the tun code is somehow necessary here. That's an unfortunate
>> consequence of testing complex concurrent code. It may become somewhat
>> better once we have KTSAN, the race detector.
>
> I ripped out the tun bits and it reproduced in ~100 seconds. I've now
> got it running for well over 30m on the fixed kernel while I'm trying to
> come up with a comprehensible Changelog ;-)
