Message-ID: <CAADnVQLMPPavJQR6JFsi3dtaaLHB816JN4HCV_TFWohJ61D+wQ@mail.gmail.com>
Date: Mon, 5 Aug 2024 10:00:40 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Jiri Olsa <olsajiri@...il.com>
Cc: Juri Lelli <juri.lelli@...hat.com>, bpf <bpf@...r.kernel.org>, 
	LKML <linux-kernel@...r.kernel.org>, Artem Savkov <asavkov@...hat.com>
Subject: Re: NULL pointer deref when running BPF monitor program (6.11.0-rc1)

On Mon, Aug 5, 2024 at 9:50 AM Jiri Olsa <olsajiri@...il.com> wrote:
>
> On Mon, Aug 05, 2024 at 11:20:11AM +0200, Juri Lelli wrote:
>
> SNIP
>
> > [  154.566882] BUG: kernel NULL pointer dereference, address: 000000000000040c
> > [  154.573844] #PF: supervisor read access in kernel mode
> > [  154.578982] #PF: error_code(0x0000) - not-present page
> > [  154.584122] PGD 146fff067 P4D 146fff067 PUD 10fc00067 PMD 0
> > [  154.589780] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [  154.594659] CPU: 28 UID: 0 PID: 2234 Comm: thread0-13 Kdump: loaded Not tainted 6.11.0-rc1 #8
> > [  154.603179] Hardware name: Dell Inc. PowerEdge R740/04FC42, BIOS 2.10.2 02/24/2021
> > [  154.610744] RIP: 0010:bpf_prog_ec8173ca2868eb50_handle__sched_pi_setprio+0x22/0xd7
> > [  154.618310] Code: cc cc cc cc cc cc cc cc 0f 1f 44 00 00 66 90 55 48 89 e5 48 81 ec 30 00 00 00 53 41 55 41 56 48 89 fb 4c 8b 6b 00 4c 8b 73 08 <41> 8b be 0c 04 00 00 48 83 ff 06 0f 85 9b 00 00 00 41 8b be c0 09
> > [  154.637052] RSP: 0018:ffffabac60aebbc0 EFLAGS: 00010086
> > [  154.642278] RAX: ffffffffc03fba5c RBX: ffffabac60aebc28 RCX: 000000000000001f
> > [  154.649411] RDX: ffff95a90b4e4180 RSI: ffffabac4e639048 RDI: ffffabac60aebc28
> > [  154.656544] RBP: ffffabac60aebc08 R08: 00000023fce7674a R09: ffff95a91d85af38
> > [  154.663674] R10: ffff95a91d85a0c0 R11: 000000003357e518 R12: 0000000000000000
> > [  154.670807] R13: ffff95a90b4e4180 R14: 0000000000000000 R15: 0000000000000001
> > [  154.677939] FS:  00007ffa6d600640(0000) GS:ffff95c01bf00000(0000) knlGS:0000000000000000
> > [  154.686026] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  154.691769] CR2: 000000000000040c CR3: 000000014b9f2005 CR4: 00000000007706f0
> > [  154.698903] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  154.706035] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [  154.713168] PKRU: 55555554
> > [  154.715879] Call Trace:
> > [  154.718332]  <TASK>
> > [  154.720439]  ? __die+0x20/0x70
> > [  154.723498]  ? page_fault_oops+0x75/0x170
> > [  154.727508]  ? sysvec_irq_work+0xb/0x90
> > [  154.731348]  ? exc_page_fault+0x64/0x140
> > [  154.735275]  ? asm_exc_page_fault+0x22/0x30
> > [  154.739461]  ? 0xffffffffc03fba5c
> > [  154.742780]  ? bpf_prog_ec8173ca2868eb50_handle__sched_pi_setprio+0x22/0xd7
>
> hi,
> reproduced.. AFAICS looks like the bpf program somehow lost the booster != NULL
> check and just load the policy field without it and crash when booster is rubbish
>
> int handle__sched_pi_setprio(u64 * ctx):
> ; int handle__sched_pi_setprio(u64 *ctx)
>    0: (bf) r6 = r1
> ; struct task_struct *boosted = (void *) ctx[0];
>    1: (79) r7 = *(u64 *)(r6 +0)
> ; struct task_struct *booster = (void *) ctx[1];
>    2: (79) r8 = *(u64 *)(r6 +8)
> ; if (booster->policy != SCHED_DEADLINE)
>
> curious why the check disappeared, because object file has it, so I guess verifier
> took it out for some reason, will check

Juri,

Thanks for flagging!

Jiri,

the verifier removes the check because it assumes that pointers
passed by the kernel into a tracepoint are valid and trusted.
In this case:
        trace_sched_pi_setprio(p, pi_task);

pi_task can be NULL.
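
To make the failure concrete, here is a minimal userspace sketch in plain C
(not a BPF program and not kernel code). It models why the faulting address in
the oops is 0x40c: R14 (booster) is 0 and the load is at offset 0x40c, as the
"8b be 0c 04 00 00" bytes in the Code: line show. The names struct fake_task,
POLICY_OFFSET and policy_load_address() are hypothetical and exist only for
this illustration; the offset is specific to the reporter's kernel build, and
the sketch assumes a platform where a NULL pointer converts to integer 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCHED_DEADLINE 6      /* policy value from the Linux UAPI headers */
#define POLICY_OFFSET  0x40c  /* assumption: taken from CR2 in the oops above */

/* Hypothetical stand-in for task_struct; only the policy offset matters. */
struct fake_task {
	unsigned char pad[POLICY_OFFSET];
	unsigned int policy;
};

_Static_assert(offsetof(struct fake_task, policy) == POLICY_OFFSET,
	       "policy must sit at the offset seen in the oops");

/* Address an unguarded booster->policy load would touch: base + offset. */
static uintptr_t policy_load_address(const struct fake_task *booster)
{
	return (uintptr_t)booster + POLICY_OFFSET;
}

/*
 * Model of the handler with the NULL check kept, i.e. what the object
 * file contains before the verifier drops the check.
 * Returns 1 if booster exists and is SCHED_DEADLINE, 0 otherwise.
 */
static int handle_sched_pi_setprio(const uint64_t *ctx)
{
	const struct fake_task *booster = (const void *)(uintptr_t)ctx[1];

	if (!booster)	/* pi_task may be NULL at the tracepoint call site */
		return 0;

	return booster->policy == SCHED_DEADLINE;
}
```

With booster == NULL, policy_load_address() yields 0x40c, matching CR2, while
the guarded handler returns 0 instead of dereferencing the pointer.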

We cannot make all tracepoint pointers PTR_TRUSTED | PTR_MAYBE_NULL
by default, since that would break a bunch of progs.
Instead we can annotate this tracepoint arg as __nullable and
teach the verifier to recognize such special tracepoint arguments.

Let's think about how to work around the verifier's eagerness to remove the != NULL check.
