lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <87plpwe6kx.ffs@tglx>
Date: Sun, 25 Aug 2024 21:38:22 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Xingyu Li <xli399@....edu>, anna-maria@...utronix.de,
 frederic@...nel.org, linux-kernel@...r.kernel.org
Cc: netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>, Jakub
 Kicinski <kuba@...nel.org>
Subject: Re: BUG: general protection fault in hrtimer_try_to_cancel

On Sat, Aug 24 2024 at 21:50, Xingyu Li wrote:

Cc=: network folks.

> We found a bug in Linux 6.10. It is probably a null pointer dereference bug.
> The reason is probably that in line 1615 of kernel/time/hrtimer.c,
> before "seq = raw_read_seqcount_begin(&base->seq);", there is no null
> pointer check for 'base'.

So something in the network code invokes hrtimer_cancel() in a teardown
operation on a non-initialized hrtimer and hrtimers contrary to the timer
wheel does not check for initialization. That's trivial to fix, see
below.

But that does not explain the actual root cause. That NULL pointer
dereference is just the messenger.

This is gro_cells related:

>  hrtimer_cancel+0x12/0x50 kernel/time/hrtimer.c:1447
>  napi_disable+0x1b6/0x210 net/core/dev.c:6648
>  gro_cells_destroy+0x12a/0x3d0 net/core/gro_cells.c:116

gro_cells_init() invokes netif_napi_add() for each per cpu
cell. netif_napi_add() invokes hrtimer_init().

So how can gro_cells_destroy() have a non-initialized hrtimer?

I defer that question to the network people

Thanks,

        tglx

> The bug report is as follows, but unfortunately there is no generated
> syzkaller reproducer.
>
> bridge0: port 2(bridge_slave_1) entered disabled state
> bridge_slave_0: left allmulticast mode
> bridge_slave_0: left promiscuous mode
> bridge0: port 1(bridge_slave_0) entered disabled state
> Oops: general protection fault, probably for non-canonical address
> 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI
> KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
> CPU: 0 PID: 29 Comm: kworker/u4:2 Not tainted 6.10.0 #13
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Workqueue: netns cleanup_net
> RIP: 0010:__seqprop_raw_spinlock_sequence include/linux/seqlock.h:226 [inline]
> RIP: 0010:hrtimer_active kernel/time/hrtimer.c:1615 [inline]
> RIP: 0010:hrtimer_try_to_cancel+0x7c/0x410 kernel/time/hrtimer.c:1332
> Code: 2f 12 00 48 8b 5c 24 10 48 8b 44 24 08 42 80 3c 30 00 74 08 48
> 89 df e8 02 3b 75 00 4c 8b 3b 4d 8d 67 10 4c 89 e3 48 c1 eb 03 <42> 8a
> 04 33 84 c0 0f 85 f4 00 00 00 41 8b 2c 24 89 ee 83 e6 01 31
> RSP: 0018:ffffc9000080f7a0 EFLAGS: 00010202
> RAX: 1ffffd1ffff88a14 RBX: 0000000000000002 RCX: 1ffffd1ffff88a15
> RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffe8ffffc45070
> RBP: ffffe8ffffc45070 R08: ffffc9000080f867 R09: 1ffff92000101f0c
> R10: dffffc0000000000 R11: fffff52000101f0d R12: 0000000000000010
> R13: dffffc0000000000 R14: dffffc0000000000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff888063a00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f09772124b0 CR3: 000000001f978000 CR4: 0000000000350ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  hrtimer_cancel+0x12/0x50 kernel/time/hrtimer.c:1447
>  napi_disable+0x1b6/0x210 net/core/dev.c:6648
>  gro_cells_destroy+0x12a/0x3d0 net/core/gro_cells.c:116
>  unregister_netdevice_many_notify+0x10a5/0x16d0 net/core/dev.c:11239
>  cleanup_net+0x764/0xcd0 net/core/net_namespace.c:635
>  process_one_work kernel/workqueue.c:3248 [inline]
>  process_scheduled_works+0x977/0x1410 kernel/workqueue.c:3329
>  worker_thread+0xaa0/0x1020 kernel/workqueue.c:3409
>  kthread+0x2eb/0x380 kernel/kthread.c:389
>  ret_from_fork+0x49/0x80 arch/x86/kernel/process.c:147
>  ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244
>  </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:__seqprop_raw_spinlock_sequence include/linux/seqlock.h:226 [inline]
> RIP: 0010:hrtimer_active kernel/time/hrtimer.c:1615 [inline]
> RIP: 0010:hrtimer_try_to_cancel+0x7c/0x410 kernel/time/hrtimer.c:1332
> Code: 2f 12 00 48 8b 5c 24 10 48 8b 44 24 08 42 80 3c 30 00 74 08 48
> 89 df e8 02 3b 75 00 4c 8b 3b 4d 8d 67 10 4c 89 e3 48 c1 eb 03 <42> 8a
> 04 33 84 c0 0f 85 f4 00 00 00 41 8b 2c 24 89 ee 83 e6 01 31
> RSP: 0018:ffffc9000080f7a0 EFLAGS: 00010202
> RAX: 1ffffd1ffff88a14 RBX: 0000000000000002 RCX: 1ffffd1ffff88a15
> RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffe8ffffc45070
> RBP: ffffe8ffffc45070 R08: ffffc9000080f867 R09: 1ffff92000101f0c
> R10: dffffc0000000000 R11: fffff52000101f0d R12: 0000000000000010
> R13: dffffc0000000000 R14: dffffc0000000000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff888063a00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000563cd2cf2058 CR3: 000000001d166000 CR4: 0000000000350ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> ----------------
> Code disassembly (best guess), 1 bytes skipped:
>    0: 12 00                 adc    (%rax),%al
>    2: 48 8b 5c 24 10       mov    0x10(%rsp),%rbx
>    7: 48 8b 44 24 08       mov    0x8(%rsp),%rax
>    c: 42 80 3c 30 00       cmpb   $0x0,(%rax,%r14,1)
>   11: 74 08                 je     0x1b
>   13: 48 89 df             mov    %rbx,%rdi
>   16: e8 02 3b 75 00       call   0x753b1d
>   1b: 4c 8b 3b             mov    (%rbx),%r15
>   1e: 4d 8d 67 10           lea    0x10(%r15),%r12
>   22: 4c 89 e3             mov    %r12,%rbx
>   25: 48 c1 eb 03           shr    $0x3,%rbx
> * 29: 42 8a 04 33           mov    (%rbx,%r14,1),%al <-- trapping instruction
>   2d: 84 c0                 test   %al,%al
>   2f: 0f 85 f4 00 00 00     jne    0x129
>   35: 41 8b 2c 24           mov    (%r12),%ebp
>   39: 89 ee                 mov    %ebp,%esi
>   3b: 83 e6 01             and    $0x1,%esi
>   3e: 31                   .byte 0x31

---
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index a023946f8558..448bce5e6a05 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -485,6 +485,11 @@ static inline void debug_deactivate(struct hrtimer *timer)
 	trace_hrtimer_cancel(timer);
 }
 
+static inline bool hrtimer_initialized(const struct hrtimer *timer)
+{
+	return timer->base && timer->function;
+}
+
 static struct hrtimer_clock_base *
 __next_base(struct hrtimer_cpu_base *cpu_base, unsigned int *active)
 {
@@ -1285,7 +1290,7 @@ void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 	struct hrtimer_clock_base *base;
 	unsigned long flags;
 
-	if (WARN_ON_ONCE(!timer->function))
+	if (WARN_ON_ONCE(!hrtimer_initialized(timer)))
 		return;
 	/*
 	 * Check whether the HRTIMER_MODE_SOFT bit and hrtimer.is_soft
@@ -1612,6 +1617,9 @@ bool hrtimer_active(const struct hrtimer *timer)
 	struct hrtimer_clock_base *base;
 	unsigned int seq;
 
+	if (WARN_ON_ONCE(!hrtimer_initialized(timer)))
+		return false;
+
 	do {
 		base = READ_ONCE(timer->base);
 		seq = raw_read_seqcount_begin(&base->seq);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ