Date:   Thu, 27 Sep 2018 12:24:03 +0200
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Network Development <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...nel.org>,
        John Johansen <john.johansen@...onical.com>
Subject: Re: bpf: Massive skbuff_head_cache memory leak?

On Wed, Sep 26, 2018 at 11:09 PM, Tetsuo Handa
<penguin-kernel@...ove.sakura.ne.jp> wrote:
> Hello, Alexei and Daniel.
>
> Can you show us how to run the testcases you are testing?
>
> On 2018/09/22 22:25, Tetsuo Handa wrote:
>> Hello.
>>
>> syzbot is reporting many lockup problems on the bpf.git / bpf-next.git / net.git / net-next.git trees.
>>
>>   INFO: rcu detected stall in br_multicast_port_group_expired (2)
>>   https://syzkaller.appspot.com/bug?id=15c7ad8cf35a07059e8a697a22527e11d294bc94
>>
>>   INFO: rcu detected stall in tun_chr_close
>>   https://syzkaller.appspot.com/bug?id=6c50618bde03e5a2eefdd0269cf9739c5ebb8270
>>
>>   INFO: rcu detected stall in discover_timer
>>   https://syzkaller.appspot.com/bug?id=55da031ddb910e58ab9c6853a5784efd94f03b54
>>
>>   INFO: rcu detected stall in ret_from_fork (2)
>>   https://syzkaller.appspot.com/bug?id=c83129a6683b44b39f5b8864a1325893c9218363
>>
>>   INFO: rcu detected stall in addrconf_rs_timer
>>   https://syzkaller.appspot.com/bug?id=21c029af65f81488edbc07a10ed20792444711b6
>>
>>   INFO: rcu detected stall in kthread (2)
>>   https://syzkaller.appspot.com/bug?id=6accd1ed11c31110fed1982f6ad38cc9676477d2
>>
>>   INFO: rcu detected stall in ext4_filemap_fault
>>   https://syzkaller.appspot.com/bug?id=817e38d20e9ee53390ac361bf0fd2007eaf188af
>>
>>   INFO: rcu detected stall in run_timer_softirq (2)
>>   https://syzkaller.appspot.com/bug?id=f5a230a3ff7822f8d39fddf8485931bd06ae47fe
>>
>>   INFO: rcu detected stall in bpf_prog_ADDR
>>   https://syzkaller.appspot.com/bug?id=fb4911fd0e861171cc55124e209f810a0dd68744
>>
>>   INFO: rcu detected stall in __run_timers (2)
>>   https://syzkaller.appspot.com/bug?id=65416569ddc8d2feb8f19066aa761f5a47f7451a
>>
>> The cause of the lockups seems to be a flood of printk() messages from memory
>> allocation failures, and one of the out_of_memory() messages indicates that
>> skbuff_head_cache usage is large enough to suspect an in-kernel memory leak.
>>
>>   [ 1554.547011] skbuff_head_cache    1847887KB    1847887KB
>>
>> Unfortunately, we cannot tell from the logs what syzbot is trying to do,
>> because the constant printk() messages flood away the syzkaller messages.
>> Can you try running your testcases with kmemleak enabled?
>>
>
> On 2018/09/27 2:35, Dmitry Vyukov wrote:
>> I also started suspecting AppArmor. We switched to AppArmor on Aug 30:
>> https://groups.google.com/d/msg/syzkaller-bugs/o73lO4KGh0w/j9pcH2tSBAAJ
>> The instances that use SELinux and Smack now contain that explicitly in
>> their names, but the rest use AppArmor.
>> Aug 30 roughly matches these assorted "task hung" reports. Perhaps
>> some AppArmor hook leaks a reference to skbs?
>
> Maybe. They have CONFIG_DEFAULT_SECURITY="apparmor". But I'm wondering why
> this problem is not occurring on linux-next.git while it is occurring on the
> bpf.git / bpf-next.git / net.git / net-next.git trees. Is syzbot running
> different testcases depending on which git tree is targeted?


Yes, this is strange. The net/bpf instances run a _subset_ of the tests. That
is, they are more concentrated on the corresponding subsystems, but the
other instances can run all of these tests too, just with lower
probability.

The bpf instances are restricted to this set of syscalls:

"enable_syscalls": [
    "bpf", "mkdir", "mount$bpf", "unlink", "close",
    "perf_event_open", "ioctl$PERF*", "getpid", "gettid",
    "socketpair", "sendmsg", "recvmsg", "setsockopt$sock_attach_bpf",
    "socket$kcm", "ioctl$sock_kcm*",
    "mkdirat$cgroup*", "openat$cgroup*", "write$cgroup*",
    "openat$tun", "write$tun", "ioctl$TUN*", "ioctl$SIOCSIFHWADDR"
]

And the net instances to this set:

"enable_syscalls": [
    "accept", "accept4", "bind", "close", "connect", "epoll_create",
    "epoll_create1", "epoll_ctl", "epoll_pwait", "epoll_wait",
    "getpeername", "getsockname", "getsockopt", "ioctl", "listen",
    "mmap", "poll", "ppoll", "pread64", "preadv", "pselect6",
    "pwrite64", "pwritev", "read", "readv", "recvfrom", "recvmmsg",
    "recvmsg", "select", "sendfile", "sendmmsg", "sendmsg", "sendto",
    "setsockopt", "shutdown", "socket", "socketpair", "splice",
    "vmsplice", "write", "writev", "tee", "bpf", "getpid",
    "getgid", "getuid", "gettid", "unshare", "pipe",
    "syz_emit_ethernet", "syz_extract_tcp_res",
    "syz_genetlink_get_family_id", "syz_init_net_socket",
    "mkdirat$cgroup*", "openat$cgroup*", "write$cgroup*",
    "clock_gettime", "bpf"
]
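
For reference, these lists come from the per-instance syz-manager
configuration; the sketch below shows roughly where "enable_syscalls" sits in
such a config. The paths, VM settings and the truncated syscall list are
illustrative placeholders, not the actual syzbot configuration:

{
    "target": "linux/amd64",
    "http": "127.0.0.1:56741",
    "workdir": "/path/to/workdir",
    "kernel_obj": "/path/to/linux",
    "image": "/path/to/image.img",
    "sshkey": "/path/to/image.key",
    "syzkaller": "/path/to/syzkaller",
    "procs": 8,
    "type": "qemu",
    "vm": {
        "count": 4,
        "kernel": "/path/to/linux/arch/x86/boot/bzImage",
        "cpu": 2,
        "mem": 2048
    },
    "enable_syscalls": [
        "bpf", "mkdir", "mount$bpf", "unlink", "close"
    ]
}

With such a config the fuzzer only generates programs built from the listed
calls (wildcards like "ioctl$PERF*" match whole families of specialized
variants), which is why each instance concentrates on its own subsystem.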
