Message-ID: <CACT4Y+aEaewxA9t68qpZvqntRY8eveHkXe7TXY_YFoectRHCHg@mail.gmail.com>
Date:   Tue, 31 Oct 2017 13:23:13 +0300
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     syzbot 
        <bot+2af19c9e1ffe4d4ee1d16c56ae7580feaee75765@...kaller.appspotmail.com>,
        dvhart@...radead.org, LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        syzkaller-bugs@...glegroups.com,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: WARNING in get_pi_state

On Tue, Oct 31, 2017 at 1:21 PM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
> On Tue, Oct 31, 2017 at 1:08 PM, Peter Zijlstra <peterz@...radead.org> wrote:
>> On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
>>> I understand your sentiment, but it's definitely not _at all_. The
>>> system compiled this exact code, ran it and triggered the bug on it.
>>> Do you have suggestions on how to make this code more portable? How
>>> would this setup look on your system?
>>
>> So I don't see the point of that tun stuff; what was it supposed to do?
>>
>> All it ever did after creation was flush_tun(), which reads until empty.
>> But given nobody would ever write into it, that's an 'expensive' NO-OP.
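For reference, the flush helper in these generated reproducers is roughly
of this shape (a sketch, assuming the tun fd is opened non-blocking; the
buffer size and exact error handling are illustrative, not the generated
code verbatim):

#include <unistd.h>

/* Drain a non-blocking tun fd until there is nothing left to read.
 * With nobody writing into the device, read() fails with EAGAIN
 * right away, which is why this amounts to an 'expensive' NO-OP. */
static void flush_tun(int tun_fd)
{
	char buf[1000];

	while (read(tun_fd, buf, sizeof(buf)) > 0)
		;
}
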
>
> See the text below.
> The bot does try to minimize both the program and the features used
> (e.g. those clunky NONFAILING macros and the filesystem business). But
> if it takes 100 seconds to reproduce, then minimization is hard to do.
> Consider bisecting such a bug: that will also be hard and unreliable,
> and you can end up with the wrong commit.
>
> See this for an example of a much tidier reproducer:
> https://groups.google.com/forum/#!topic/syzkaller-bugs/9nYn7hpNpEk
> But that's a single-threaded bug that triggers instantly each time you
> run the program.


But having said that, the tun code is not supposed to break the
reproducer either. E.g. on our systems it just sets up tun successfully
and then proceeds to the actual code that triggers the problem. What's
the failure mode of the tun code on your system? If we make it more
portable, then such repros will work on your system as well.
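
For context, the tun setup in these reproducers boils down to roughly
the following (a minimal sketch; the interface name, flags and the
helper name are illustrative assumptions, and the generated code
additionally configures the interface, which is omitted here):

#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Open /dev/net/tun and attach a tun interface to the fd. The usual
 * failure points are open() (no /dev/net/tun, e.g. CONFIG_TUN=n) and
 * the TUNSETIFF ioctl (insufficient privileges in the test setup). */
static int setup_tun(void)
{
	struct ifreq ifr;
	int fd = open("/dev/net/tun", O_RDWR | O_NONBLOCK);

	if (fd < 0)
		return -1;
	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, "syz_tun", IFNAMSIZ - 1); /* hypothetical name */
	ifr.ifr_flags = IFF_TUN | IFF_NO_PI;
	if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

Knowing which of those steps fails on your setup would tell us what to
make more portable.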



>>> We do try hard to get rid of unnecessary stuff in reproducers. I think
>>> what happened in this case is the following. This is a hard-to-reproduce
>>> race. The bot was able to reproduce the crash on the initial program
>>> that uses tun, then tried to get rid of the tun code and re-reproduce
>>> it, but the crash did not reproduce this time, so it concluded that the
>>> tun code is somehow necessary here. That's an unfortunate consequence
>>> of testing complex concurrent code. It may become somewhat better once
>>> we have KTSAN, the race detector.
>>
>> I ripped out the tun bits and it reproduced in ~100 seconds. I've now
>> got it running for well over 30m on the fixed kernel while I'm trying to
>> come up with a comprehensible Changelog ;-)
