[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+Yy-bF07F7F8DoFY8=4LtLURRn1WsZzNZ9LN+N=vn7Tpw@mail.gmail.com>
Date: Mon, 7 Jan 2019 12:12:53 +0100
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc: syzbot <syzbot+ea7d9cb314b4ab49a18a@...kaller.appspotmail.com>,
David Miller <davem@...emloft.net>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
LKML <linux-kernel@...r.kernel.org>,
netdev <netdev@...r.kernel.org>,
syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Linux-MM <linux-mm@...ck.org>, Shakeel Butt <shakeelb@...gle.com>
Subject: Re: INFO: rcu detected stall in ndisc_alloc_skb
On Sun, Jan 6, 2019 at 2:47 PM Tetsuo Handa
<penguin-kernel@...ove.sakura.ne.jp> wrote:
>
> On 2019/01/06 22:24, Dmitry Vyukov wrote:
> >> A report at 2019/01/05 10:08 from "no output from test machine (2)"
> >> ( https://syzkaller.appspot.com/text?tag=CrashLog&x=1700726f400000 )
> >> says that there are flood of memory allocation failure messages.
> >> Since continuous memory allocation failure messages itself is not
> >> recognized as a crash, we might be misunderstanding that this problem
> >> is not occurring recently. It will be nice if we can run testcases
> >> which are executed on bpf-next tree.
> >
> > What exactly do you mean by running test cases on bpf-next tree?
> > syzbot tests bpf-next, so it executes lots of test cases on that tree.
> > One can also ask for patch testing on bpf-next tree to test a specific
> > test case.
>
> syzbot ran "some tests" before getting this report, but we can't find from
> this report what the "some tests" are. If we could record all tests executed
> in syzbot environments before getting this report, we could rerun the tests
> (with manually examining where the source of memory consumption is) in local
> environments.
Filed https://github.com/google/syzkaller/issues/917 for this.
> Since syzbot is now using memcg, maybe we can test with sysctl_panic_on_oom == 1.
> Any memory consumption that triggers global OOM killer could be considered as
> a problem (e.g. memory leak or uncontrolled memory allocation).
Interesting idea. This will also alleviate the previous problem as I
think only a stream of OOMs currently produces 1+MB of output.
+Shakeel who was interested in catching more memcg-escaping allocations.
To do this we need a buy-in from kernel community to consider this as
a bug/something to fix in kernel. Systematic testing can't work gray
checks requiring humans to look at each case and some cases left as
being working-as-intended.
There are also 2 interesting points:
- testing of kernel without memcg-enabled (some kernel users
obviously do this); it's doable, but currently syzkaller have no
precedents/infrastructure to consider some output patterns as bugs or
not depending on kernel features
- false positives for minimized C reproducers that have memcg code
stripped off (people complain that reproducers are too large/complex
otherwise)
Powered by blists - more mailing lists