linux-kernel - Re: kernel panic: Attempted to kill init!


Message-ID: <CAADnVQKzRS6HcbOPuJRJ=8SOXDDDdy2EBN-LP6vSgB9tLb27Ug@mail.gmail.com>
Date:   Tue, 3 Jan 2023 10:33:50 -0800
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Hao Sun <sunhao.th@...il.com>
Cc:     Yonghong Song <yhs@...a.com>, bpf <bpf@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        John Fastabend <john.fastabend@...il.com>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Song Liu <song@...nel.org>, Yonghong Song <yhs@...com>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
        David Miller <davem@...emloft.net>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: kernel panic: Attempted to kill init!

On Tue, Jan 3, 2023 at 4:46 AM Hao Sun <sunhao.th@...il.com> wrote:
>
>
>
> > On 31 Dec 2022, at 12:55 AM, Alexei Starovoitov <alexei.starovoitov@...il.com> wrote:
> >
> > On Fri, Dec 30, 2022 at 1:54 AM Hao Sun <sunhao.th@...il.com> wrote:
> >>
> >>
> >>
> >>> On 28 Dec 2022, at 2:35 PM, Yonghong Song <yhs@...a.com> wrote:
> >>>
> >>>
> >>>
> >>> On 12/21/22 8:35 PM, Hao Sun wrote:
> >>>> Hi,
> > >>>> This crash can be triggered by running the C reproducer
> > >>>> multiple times; it just keeps loading the following prog as
> > >>>> a raw tracepoint on kmem_cache_free().
> > >>>> The prog sends SIGSEGV to current via bpf_send_signal_thread().
> > >>>> Once it is loaded, whoever tries to free memory triggers it, and
> > >>>> the kernel crashes when that happens to be init.
> > >>>> It seems we should filter init out in bpf_send_signal_common() with
> > >>>> is_global_init(current), or maybe we should check this in the
> > >>>> verifier?
> >>>
> > >>> The helper just sends a particular signal to the *current*
> > >>> thread. In the typical use case, it is never a good idea to send
> > >>> the signal to a *random* thread. In certain cases, maybe the user
> > >>> indeed wants to send the signal to the init thread to observe
> > >>> something. Note that such destructive side effects already
> > >>> exist in bpf land. For example, an xdp program
> > >>> could drop all packets and make the machine unresponsive
> > >>> to ssh etc. Therefore, I recommend keeping the existing
> > >>> bpf_send_signal_common() helper behavior.
> >>
> > >> It sounds like the two are different cases. Unresponsiveness under XDP
> > >> seems like intended behaviour, whereas a panic caused by killing init is
> > >> a bug. If the last thread of global init is killed, the kernel panics
> > >> immediately.
> >
> > I don't get it. How was it possible that this prog was
> > executed with current == pid 1 ?
>
> The prog is a raw tracepoint prog attached to the ‘kmem_cache_free’ event.
> When init triggers the event, the prog is executed with pid 1 as current.
> But the reason for this crash is not very clear to me, because it’s
> really hard to debug with the original C reproducer.
>
> The following is the corresponding Syz prog:
>
> # {Threaded:true Repeat:true RepeatTimes:0 Procs:1 Slowdown:1 Sandbox:none SandboxArg:0 Leak:false NetInjection:true NetDevices:true NetReset:true Cgroups:true BinfmtMisc:true CloseFDs:true KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false IEEE802154:true Sysctl:true UseTmpDir:true HandleSegv:true Repro:false Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
> r0 = bpf$BPF_PROG_RAW_TRACEPOINT_LOAD(0x5, &(0x7f0000000000)={0x11, 0xe, &(0x7f0000000400)=ANY=[@ANYBLOB="18000000000000000000000000000000180600000000000000000000000000001807000000000000000000000000000018080000000000000000000000000000180900000000000000000000000000002d00020000000000b70100000b000000850000007500000095"], &(0x7f00000000c0)}, 0x80)
> bpf$BPF_RAW_TRACEPOINT_OPEN(0x11, &(0x7f0000000100)={&(0x7f0000000080)='kmem_cache_free\x00', r0}, 0x10)

Is syzbot running without any user space?
Is syzbot itself pid 1, and the only process?
If so, the error would make sense.
I guess we can add a safety check to bpf_send_signal_common()
to prevent syzbot from killing itself.
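
For reference, the instruction bytes in the quoted syz prog can be decoded to check what it actually does. The following is a rough sketch, not part of the original report: it assumes the standard little-endian 8-byte eBPF instruction layout (opcode, dst/src register nibbles, 16-bit offset, 32-bit immediate) and assumes the truncated final byte is followed by zeros, as the declared count of 0xe instructions suggests. The names BLOB and decode are made up for illustration.

```python
import struct

# Hex payload copied from the @ANYBLOB in the quoted syz prog.
BLOB = (
    "18000000000000000000000000000000"  # ld_imm64 r0, 0 (two slots)
    "18060000000000000000000000000000"  # ld_imm64 r6, 0
    "18070000000000000000000000000000"  # ld_imm64 r7, 0
    "18080000000000000000000000000000"  # ld_imm64 r8, 0
    "18090000000000000000000000000000"  # ld_imm64 r9, 0
    "2d00020000000000"                  # if r0 > r0 goto +2 (never taken)
    "b70100000b000000"                  # r1 = 11 (SIGSEGV on Linux)
    "8500000075000000"                  # call helper 117
    "95"                                # exit (only the opcode byte is given)
)


def decode(hex_blob):
    """Split a hex string into (opcode, dst, src, off, imm) tuples."""
    raw = bytes.fromhex(hex_blob)
    # Assumption: the missing tail of the last instruction is zero-filled.
    raw += b"\x00" * (-len(raw) % 8)
    insns = []
    for pos in range(0, len(raw), 8):
        code, regs, off, imm = struct.unpack_from("<BBhi", raw, pos)
        # On little-endian, dst_reg is the low nibble and src_reg the high one.
        insns.append((code, regs & 0x0F, regs >> 4, off, imm))
    return insns


if __name__ == "__main__":
    for idx, insn in enumerate(decode(BLOB)):
        print(idx, "code=0x%02x dst=r%d src=r%d off=%d imm=%d" % insn)
```

Slot 11 loads 11 into r1 and slot 12 calls helper 117, which is bpf_send_signal_thread in the UAPI helper list, so the decoded bytes agree with the description that the prog sends SIGSEGV to current on every kmem_cache_free event.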