Message-ID: <dd139846-830e-9363-91d3-1dc31be7702c@oracle.com>
Date:   Tue, 21 Dec 2021 13:17:20 -0500
From:   George Kennedy <george.kennedy@...cle.com>
To:     Daniel Borkmann <daniel@...earbox.net>, sdf@...gle.com,
        ast@...nel.org
Cc:     netdev@...r.kernel.org, bpf@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] bpf: check size before calling kvmalloc



On 12/20/2021 8:50 AM, George Kennedy wrote:
>
>
> On 12/17/2021 5:45 PM, Daniel Borkmann wrote:
>> On 12/17/21 7:48 PM, George Kennedy wrote:
>>> ZERO_SIZE_PTR ((void *)16) is returned by kvmalloc() instead of NULL
>>> if size is zero. Currently, return values from kvmalloc() are only
>>> checked for NULL. Before calling kvmalloc() check for size of zero
>>> and return error if size is zero to avoid the following crash.
>>>
>>> BUG: kernel NULL pointer dereference, address: 0000000000000000
>>> PGD 1030bd067 P4D 1030bd067 PUD 103497067 PMD 0
>>> Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
>>> CPU: 1 PID: 15094 Comm: syz-executor344 Not tainted 5.16.0-rc1-syzk #1
>>> Hardware name: Red Hat KVM, BIOS
>>> RIP: 0010:0x0
>>> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
>>> RSP: 0018:ffff888017627b78 EFLAGS: 00010246
>>> RAX: 0000000000000000 RBX: ffff8880215d0780 RCX: ffffffff81b63c60
>>> RDX: 0000000000000010 RSI: 0000000000000000 RDI: ffff8881035db400
>>> RBP: ffff888017627f08 R08: ffffed1003697209 R09: ffffed1003697209
>>> R10: ffff88801b4b9043 R11: ffffed1003697208 R12: ffffffff8f15d580
>>> R13: 1ffff11002ec4f77 R14: ffff8881035db400 R15: 0000000000000000
>>> FS:  00007f62bca78740(0000) GS:ffff888107880000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: ffffffffffffffd6 CR3: 000000002282a000 CR4: 00000000000006e0
>>> Call Trace:
>>>   <TASK>
>>>   map_get_next_key kernel/bpf/syscall.c:1279 [inline]
>>>   __sys_bpf+0x384d/0x5b30 kernel/bpf/syscall.c:4612
>>>   __do_sys_bpf kernel/bpf/syscall.c:4722 [inline]
>>>   __se_sys_bpf kernel/bpf/syscall.c:4720 [inline]
>>>   __x64_sys_bpf+0x7a/0xc0 kernel/bpf/syscall.c:4720
>>>   do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>>   do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80
>>>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>>>
>>> Reported-by: syzkaller <syzkaller@...glegroups.com>
>>> Signed-off-by: George Kennedy <george.kennedy@...cle.com>
>>
>> Could you provide some more details, e.g. which map type is this where we
>> have to assume zero-sized keys everywhere?
>>
>> (Or link to syzkaller report could also work alternatively if public.)
>
> I don't think the report is public. Here's the report and C reproducer:
>
> #ifdef REF
> Syzkaller hit 'BUG: unable to handle kernel NULL pointer dereference in bpf' bug.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 1030bd067 P4D 1030bd067 PUD 103497067 PMD 0
> Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> CPU: 1 PID: 15094 Comm: syz-executor344 Not tainted 5.16.0-rc1-syzk #1
> Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7860+a7792d29 04/01/2014
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
> RSP: 0018:ffff888017627b78 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff8880215d0780 RCX: ffffffff81b63c60
> RDX: 0000000000000010 RSI: 0000000000000000 RDI: ffff8881035db400
> RBP: ffff888017627f08 R08: ffffed1003697209 R09: ffffed1003697209
> R10: ffff88801b4b9043 R11: ffffed1003697208 R12: ffffffff8f15d580
> R13: 1ffff11002ec4f77 R14: ffff8881035db400 R15: 0000000000000000
> FS:  00007f62bca78740(0000) GS:ffff888107880000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 000000002282a000 CR4: 00000000000006e0
> Call Trace:
>  <TASK>
>  map_get_next_key kernel/bpf/syscall.c:1279 [inline]
>  __sys_bpf+0x384d/0x5b30 kernel/bpf/syscall.c:4612
>  __do_sys_bpf kernel/bpf/syscall.c:4722 [inline]
>  __se_sys_bpf kernel/bpf/syscall.c:4720 [inline]
>  __x64_sys_bpf+0x7a/0xc0 kernel/bpf/syscall.c:4720
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f62bc36f289
> Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b7 db 2c 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffccaa211e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f62bc36f289
> RDX: 0000000000000020 RSI: 0000000020000080 RDI: 0000000000000004
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004006d0
> R13: 00007ffccaa212d0 R14: 0000000000000000 R15: 0000000000000000
>  </TASK>
> Modules linked in:
> CR2: 0000000000000000
> ---[ end trace d203e5a1836d64aa ]---
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
> RSP: 0018:ffff888017627b78 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff8880215d0780 RCX: ffffffff81b63c60
> RDX: 0000000000000010 RSI: 0000000000000000 RDI: ffff8881035db400
> RBP: ffff888017627f08 R08: ffffed1003697209 R09: ffffed1003697209
> R10: ffff88801b4b9043 R11: ffffed1003697208 R12: ffffffff8f15d580
> R13: 1ffff11002ec4f77 R14: ffff8881035db400 R15: 0000000000000000
> FS:  00007f62bca78740(0000) GS:ffff888107880000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 000000002282a000 CR4: 00000000000006e0
>
>
> Syzkaller reproducer:
> # {Threaded:false Collide:false Repeat:false RepeatTimes:0 Procs:1 Slowdown:1 Sandbox: Fault:false FaultCall:-1 FaultNth:0 Leak:false NetInjection:false NetDevices:false NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false KCSAN:false DevlinkPCI:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:false UseTmpDir:false HandleSegv:false Repro:false Trace:false}
> r0 = bpf$MAP_CREATE(0x0, &(0x7f0000001480)={0x1e, 0x0, 0x2, 0x2, 0x0, 0x1}, 0x40)
> bpf$MAP_GET_NEXT_KEY(0x4, &(0x7f0000000080)={r0, 0x0, 0x0}, 0x20)
>
>
> C reproducer:
> #endif /* REF */
> // autogenerated by syzkaller (https://github.com/google/syzkaller)
>
> #define _GNU_SOURCE
>
> #include <endian.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/syscall.h>
> #include <sys/types.h>
> #include <unistd.h>
>
> #ifndef __NR_bpf
> #define __NR_bpf 321
> #endif
>
> uint64_t r[1] = {0xffffffffffffffff};
>
> int main(void)
> {
>     syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
>     syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
>     syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
>     intptr_t res = 0;
>     *(uint32_t*)0x20001480 = 0x1e;
>     *(uint32_t*)0x20001484 = 0;
>     *(uint32_t*)0x20001488 = 2;
>     *(uint32_t*)0x2000148c = 2;
>     *(uint32_t*)0x20001490 = 0;
>     *(uint32_t*)0x20001494 = 1;
>     *(uint32_t*)0x20001498 = 0;
>     memset((void*)0x2000149c, 0, 16);
>     *(uint32_t*)0x200014ac = 0;
>     *(uint32_t*)0x200014b0 = -1;
>     *(uint32_t*)0x200014b4 = 0;
>     *(uint32_t*)0x200014b8 = 0;
>     *(uint32_t*)0x200014bc = 0;
>     res = syscall(__NR_bpf, 0ul, 0x20001480ul, 0x40ul);
>     if (res != -1)
>         r[0] = res;
>     *(uint32_t*)0x20000080 = r[0];
>     *(uint64_t*)0x20000088 = 0;
>     *(uint64_t*)0x20000090 = 0;
>     *(uint64_t*)0x20000098 = 0;
>     syscall(__NR_bpf, 4ul, 0x20000080ul, 0x20ul);
>     return 0;
> }
>
> George
>
Hi Daniel,

I missed another set of kvmallocs. Here's another report and reproducer:

Syzkaller hit 'WARNING: kmalloc bug in bpf' bug.

------------[ cut here ]------------
WARNING: CPU: 1 PID: 15091 at mm/util.c:597 kvmalloc_node+0x11d/0x130 mm/util.c:597
Modules linked in:
CPU: 1 PID: 15091 Comm: syz-executor949 Not tainted 5.16.0-rc5-syzk #1
Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7860+a7792d29 04/01/2014
RIP: 0010:kvmalloc_node+0x11d/0x130 mm/util.c:597
Code: 01 00 00 00 48 89 df e8 01 4f 0c 00 49 89 c5 e9 68 ff ff ff e8 b4 82 ca ff 45 89 e5 41 81 cd 00 20 01 00 eb 95 e8 a3 82 ca ff <0f> 0b e9 4b ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44
RSP: 0018:ffff888017687b50 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000080000001 RCX: ffffffff81b63b8a
RDX: 0000000000000000 RSI: ffff888101916500 RDI: 0000000000000002
RBP: ffff888017687b70 R08: 0000000000112cc0 R09: 00000000ffffffff
R10: 0000000000000000 R11: ffffed1004a71db0 R12: 0000000000102cc0
R13: 0000000000000000 R14: 00000000ffffffff R15: ffff888025092800
FS:  00007f0794bc3740(0000) GS:ffff888107880000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000500 CR3: 00000000299d0000 CR4: 00000000000006e0
Call Trace:
  <TASK>
  kvmalloc include/linux/slab.h:741 [inline]
  map_lookup_elem kernel/bpf/syscall.c:1099 [inline]
  __sys_bpf+0x415b/0x5a80 kernel/bpf/syscall.c:4618
  __do_sys_bpf kernel/bpf/syscall.c:4737 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:4735 [inline]
  __x64_sys_bpf+0x7a/0xc0 kernel/bpf/syscall.c:4735
  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
  do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f07944ba289
Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b7 db 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc3a07dcd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f07944ba289
RDX: 0000000000000020 RSI: 0000000020000240 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004006d0
R13: 00007ffc3a07ddc0 R14: 0000000000000000 R15: 0000000000000000
  </TASK>
---[ end trace 67ed3be15b904c13 ]---


Syzkaller reproducer:
# {Threaded:false Collide:false Repeat:false RepeatTimes:0 Procs:1 Slowdown:1 Sandbox: Fault:false FaultCall:-1 FaultNth:0 Leak:false NetInjection:false NetDevices:false NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false KCSAN:false DevlinkPCI:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:false UseTmpDir:false HandleSegv:false Repro:false Trace:false}
r0 = bpf$MAP_CREATE(0x0, &(0x7f0000000500)={0x1e, 0x0, 0x80000001, 0x1, 0x0, 0x1}, 0x40)
bpf$MAP_LOOKUP_ELEM(0x1, &(0x7f0000000240)={r0, 0x0, 0x0}, 0x20)
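For reference, the value_size passed to MAP_CREATE here is 0x80000001, which is larger than INT_MAX. The sketch below is a user-space illustration only, not kernel code. check_alloc_size() is a hypothetical helper, the error codes are arbitrary choices, and it assumes that the WARNING at mm/util.c:597 is the guard in kvmalloc_node() that rejects requests larger than INT_MAX.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical pre-allocation check (user-space sketch).
 * Rejects the two sizes seen in the reports in this thread:
 * zero, and anything above INT_MAX.
 */
static int check_alloc_size(size_t size)
{
	if (size == 0)
		return -EINVAL;	/* allocator would hand back ZERO_SIZE_PTR, not NULL */
	if (size > (size_t)INT_MAX)
		return -E2BIG;	/* assumed to be what trips the WARNING */
	return 0;
}

/* Small self-check using the value_size from the reproducer above. */
static void check_alloc_size_demo(void)
{
	assert(check_alloc_size((size_t)0x80000001u) == -E2BIG);
	assert(check_alloc_size(2) == 0);
}
```

A caller would run such a check on the user-supplied size and return the error before reaching the allocator.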


C reproducer:
// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_bpf
#define __NR_bpf 321
#endif

uint64_t r[1] = {0xffffffffffffffff};

int main(void)
{
	syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
	syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
	syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
	intptr_t res = 0;
	*(uint32_t*)0x20000500 = 0x1e;
	*(uint32_t*)0x20000504 = 0;
	*(uint32_t*)0x20000508 = 0x80000001;
	*(uint32_t*)0x2000050c = 1;
	*(uint32_t*)0x20000510 = 0;
	*(uint32_t*)0x20000514 = 1;
	*(uint32_t*)0x20000518 = 0;
	memset((void*)0x2000051c, 0, 16);
	*(uint32_t*)0x2000052c = 0;
	*(uint32_t*)0x20000530 = -1;
	*(uint32_t*)0x20000534 = 0;
	*(uint32_t*)0x20000538 = 0;
	*(uint32_t*)0x2000053c = 0;
	/* cmd 0: bpf$MAP_CREATE in the syzkaller program above */
	res = syscall(__NR_bpf, 0ul, 0x20000500ul, 0x40ul);
	if (res != -1)
		r[0] = res;
	*(uint32_t*)0x20000240 = r[0];
	*(uint64_t*)0x20000248 = 0;
	*(uint64_t*)0x20000250 = 0;
	*(uint64_t*)0x20000258 = 0;
	/* cmd 1: bpf$MAP_LOOKUP_ELEM in the syzkaller program above */
	syscall(__NR_bpf, 1ul, 0x20000240ul, 0x20ul);
	return 0;
}
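
To make the raw stores in the reproducer easier to read, here is a sketch that maps them onto named fields. struct map_create_attr is a hypothetical stand-in that mirrors only the leading BPF_MAP_CREATE members of union bpf_attr (field names as in include/uapi/linux/bpf.h); it is not the real union and is not meant to be passed to the syscall. map_type 0x1e appears to correspond to BPF_MAP_TYPE_BLOOM_FILTER in the 5.16 uapi header, but treat that as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the BPF_MAP_CREATE part of union bpf_attr. */
struct map_create_attr {
	uint32_t map_type;		/* 0x20000500: 0x1e */
	uint32_t key_size;		/* 0x20000504: 0 */
	uint32_t value_size;		/* 0x20000508: 0x80000001 */
	uint32_t max_entries;		/* 0x2000050c: 1 */
	uint32_t map_flags;		/* 0x20000510: 0 */
	uint32_t inner_map_fd;		/* 0x20000514: 1 */
	uint32_t numa_node;		/* 0x20000518: 0 */
	char     map_name[16];		/* 0x2000051c: zeroed by memset() */
	uint32_t map_ifindex;		/* 0x2000052c: 0 */
	uint32_t btf_fd;		/* 0x20000530: -1 */
	uint32_t btf_key_type_id;	/* 0x20000534: 0 */
	uint32_t btf_value_type_id;	/* 0x20000538: 0 */
	uint32_t btf_vmlinux_value_type_id; /* 0x2000053c: 0 */
};

/* Builds the same attribute values the reproducer writes by address. */
static struct map_create_attr repro_attr(void)
{
	struct map_create_attr a = {0};

	a.map_type = 0x1e;
	a.value_size = 0x80000001u;
	a.max_entries = 1;
	a.inner_map_fd = 1;
	a.btf_fd = (uint32_t)-1;
	return a;
}
```

The struct is 0x40 bytes, which matches the size argument of the first bpf() call, and value_size sits at offset 8, matching the store to 0x20000508.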


kvmalloc() and its friends appear to be called with no size check in
many places throughout the kernel. Perhaps the commit that made a
zero-size request return ZERO_SIZE_PTR ((void *)16) should be backed out.
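
To illustrate why a NULL-only check is not enough, here is a small user-space sketch. The two macros are written the way include/linux/slab.h defines them; mock_kvmalloc(), mock_kvfree() and passes_null_check() are hypothetical stand-ins built on malloc(), not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Written the way include/linux/slab.h defines them. */
#define ZERO_SIZE_PTR ((void *)16)
#define ZERO_OR_NULL_PTR(x) \
	((unsigned long)(x) <= (unsigned long)ZERO_SIZE_PTR)

/* Hypothetical stand-in for kvmalloc(): size 0 yields ZERO_SIZE_PTR. */
static void *mock_kvmalloc(size_t size)
{
	if (size == 0)
		return ZERO_SIZE_PTR;
	return malloc(size);
}

/* Hypothetical stand-in for kvfree(): ignores NULL and ZERO_SIZE_PTR. */
static void mock_kvfree(void *p)
{
	if (!ZERO_OR_NULL_PTR(p))
		free(p);
}

/* Returns 1 when a plain "if (!ptr)" test would treat ptr as valid. */
static int passes_null_check(const void *p)
{
	return p != NULL;
}
```

With these helpers, passes_null_check(mock_kvmalloc(0)) is 1 even though the pointer must never be dereferenced, while ZERO_OR_NULL_PTR() on the same pointer is true.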

Should I send out a v2 of the patch including the other kvmalloc
calls or do you have a suggested fix?

Thanks,
George

>>
>> Thanks,
>> Daniel
>
