netdev - Re: RCU callback crashes

Message-ID: <a6840ce4-d4a8-19c5-f8a5-3dfc00aa7e4b@gmail.com>
Date:   Wed, 20 Dec 2017 10:04:17 -0800
From:   John Fastabend <john.fastabend@...il.com>
To:     Jakub Kicinski <kubakici@...pl>, Jiri Pirko <jiri@...nulli.us>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Cong Wang <xiyou.wangcong@...il.com>
Subject: Re: RCU callback crashes

On 12/19/2017 10:34 PM, Jakub Kicinski wrote:
> On Tue, 19 Dec 2017 22:22:27 -0800, Jakub Kicinski wrote:
>>>> I get this:    
>>>
>>> Could you try to run it with kasan on?  
>>
>> I didn't manage to reproduce it with KASAN on so far :(  Even enabling
>> object debugging to get the second splat in my email (which is more
>> useful) actually makes the crash go away, I only see the warning...
> 
> Ah, no object debug but KASAN on produces this:
> 

@Jakub, this is with mq and pfifo_fast, I guess?

> [   39.268209] BUG: KASAN: use-after-free in cpu_needs_another_gp+0x246/0x2b0
> [   39.275965] Read of size 8 at addr ffff8803aa64f138 by task swapper/13/0
> [   39.283524] 
> [   39.285256] CPU: 13 PID: 0 Comm: swapper/13 Not tainted 4.15.0-rc3-perf-00955-g1d0b01347dd5-dirty #8
> [   39.295535] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
> [   39.303969] Call Trace:
> [   39.306769]  <IRQ>
> [   39.309088]  dump_stack+0xa6/0x118
> [   39.312957]  ? _atomic_dec_and_lock+0xe8/0xe8
> [   39.317895]  ? cpu_needs_another_gp+0x246/0x2b0
> [   39.323030]  print_address_description+0x6a/0x270
> [   39.328380]  ? cpu_needs_another_gp+0x246/0x2b0
> [   39.333510]  kasan_report+0x23f/0x350
> [   39.337672]  cpu_needs_another_gp+0x246/0x2b0
> ...
> [   39.383026]  rcu_process_callbacks+0x1a0/0x620
> ...
> [   39.426713]  __do_softirq+0x17f/0x4de
> ...
> [   39.463841]  irq_exit+0xe1/0xf0
> [   39.467437]  smp_apic_timer_interrupt+0xd9/0x290
> [   39.472685]  ? smp_call_function_single_interrupt+0x230/0x230
> [   39.479195]  ? smp_reschedule_interrupt+0x240/0x240
> [   39.484736]  apic_timer_interrupt+0x8c/0xa0
> [   39.489497]  </IRQ>
> [   39.491929] RIP: 0010:cpuidle_enter_state+0x12a/0x510
> [   39.497660] RSP: 0018:ffff88086bf9fd08 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
> [   39.506228] RAX: 0000000000000000 RBX: ffffe8ffffb060e0 RCX: ffffffff921329f5
> [   39.514291] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff88086f3246e8
> [   39.522354] RBP: 1ffff1010d7f3fa6 R08: fffffbfff2742768 R09: fffffbfff2742768
> [   39.530418] R10: ffff88086bf9fcc8 R11: fffffbfff2742767 R12: 0000000924148b4b
> [   39.538480] R13: 0000000000000004 R14: 0000000000000004 R15: ffffffff9383eb80
> [   39.546545]  ? sched_idle_set_state+0x25/0x30
> [   39.551502]  ? cpuidle_enter_state+0x106/0x510
> [   39.556556]  ? cpuidle_enter_s2idle+0x130/0x130
> [   39.561706]  ? rcu_eqs_enter_common.constprop.62+0xd1/0x1e0
> [   39.568037]  ? rcu_gp_init+0xf70/0xf70
> [   39.572331]  ? sched_set_stop_task+0x160/0x160
> [   39.577384]  do_idle+0x1af/0x200
> [   39.581076]  cpu_startup_entry+0xd2/0xe0
> [   39.585545]  ? cpu_in_idle+0x20/0x20
> [   39.589626]  ? _raw_spin_trylock+0xe0/0xe0
> [   39.594292]  ? memcpy+0x34/0x50
> [   39.597890]  start_secondary+0x271/0x2b0
> [   39.602361]  ? set_cpu_sibling_map+0x840/0x840
> [   39.607416]  secondary_startup_64+0xa5/0xb0
> [   39.612180] 
> [   39.613929] Allocated by task 1358:
> [   39.617914]  __kmalloc_node+0x183/0x2c0
> [   39.622290]  qdisc_alloc+0xbd/0x3f0
> [   39.626274]  qdisc_create+0xd8/0x720
> [   39.630355]  tc_modify_qdisc+0x657/0x910
> [   39.634826]  rtnetlink_rcv_msg+0x37c/0x7e0
> [   39.639491]  netlink_rcv_skb+0x122/0x230
> [   39.643960]  netlink_unicast+0x2ae/0x360
> [   39.648443]  netlink_sendmsg+0x5d5/0x620
> [   39.652915]  sock_sendmsg+0x64/0x80
> [   39.656900]  ___sys_sendmsg+0x4a8/0x500
> [   39.661272]  __sys_sendmsg+0xa9/0x140
> [   39.665450]  entry_SYSCALL_64_fastpath+0x1e/0x81
> [   39.670695] 
> [   39.672441] Freed by task 1370:
> [   39.676052]  kfree+0x8d/0x1c0
> [   39.679454]  qdisc_graft+0x208/0x670
> [   39.683535]  tc_get_qdisc+0x229/0x350
> [   39.687713]  rtnetlink_rcv_msg+0x37c/0x7e0
> [   39.692411]  netlink_rcv_skb+0x122/0x230
> [   39.696881]  netlink_unicast+0x2ae/0x360
> [   39.701350]  netlink_sendmsg+0x5d5/0x620
> [   39.705819]  sock_sendmsg+0x64/0x80
> [   39.709801]  ___sys_sendmsg+0x4a8/0x500
> [   39.714172]  __sys_sendmsg+0xa9/0x140
> [   39.718351]  entry_SYSCALL_64_fastpath+0x1e/0x81
> [   39.723597] 
> [   39.725347] The buggy address belongs to the object at ffff8803aa64ef80
> [   39.725347]  which belongs to the cache kmalloc-512 of size 512
> [   39.739453] The buggy address is located 440 bytes inside of
> [   39.739453]  512-byte region [ffff8803aa64ef80, ffff8803aa64f180)
> [   39.752684] The buggy address belongs to the page:
> [   39.758127] page:0000000042b3124b count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
> [   39.769222] flags: 0x2ffff0000008100(slab|head)
> [   39.774365] raw: 02ffff0000008100 0000000000000000 0000000000000000 0000000180190019
> [   39.783129] raw: dead000000000100 dead000000000200 ffff8803afc0ed80 0000000000000000
> [   39.791986] page dumped because: kasan: bad access detected
> [   39.798300] 
> [   39.800063] Memory state around the buggy address:
> [   39.805503]  ffff8803aa64f000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [   39.813684]  ffff8803aa64f080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [   39.821866] >ffff8803aa64f100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [   39.830045]                                         ^
> [   39.835778]  ffff8803aa64f180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   39.843958]  ffff8803aa64f200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 

So with lockless qdisc support we really do need to wait for an RCU
grace period before freeing the qdisc. I missed this initially in the
lockless qdisc patch set, but we need to revert this commit:

commit 752fbcc33405d6f8249465e4b2c4e420091bb825
Author: Cong Wang <xiyou.wangcong@...il.com>
Date:   Tue Sep 19 13:15:42 2017 -0700

    net_sched: no need to free qdisc in RCU callback
    
    gen estimator has been rewritten in commit 1c0d32fde5bd
    ("net_sched: gen_estimator: complete rewrite of rate estimators"),
    the caller no longer needs to wait for a grace period. So this
    patch gets rid of it.
    
    Cc: Jamal Hadi Salim <jhs@...atatu.com>
    Cc: Eric Dumazet <edumazet@...gle.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@...il.com>
    Acked-by: Eric Dumazet <edumazet@...gle.com>
    Signed-off-by: David S. Miller <davem@...emloft.net>
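
To illustrate why the grace period matters: in the "Freed by" trace above
the kfree() runs directly from the qdisc_graft() path, so the memory is gone
while something may still hold a pointer to it. Below is a rough userspace
sketch of the deferred-free pattern, not the actual revert. struct rcu_head
and call_rcu() are real kernel interfaces, but everything here
(mock_call_rcu, mock_grace_period_elapsed, toy_qdisc and its helpers) is a
hypothetical stand-in so the idea compiles outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct rcu_head. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

static struct rcu_head *pending;	/* callbacks waiting for a grace period */
static int freed_count;			/* how many toy qdiscs were really freed */

/* Stand-in for call_rcu(): queue the callback, do not run it yet. */
static void mock_call_rcu(struct rcu_head *head,
			  void (*func)(struct rcu_head *head))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/*
 * Pretend all readers have left their read-side sections, then run the
 * queued callbacks. Returns the number of callbacks that ran.
 */
static int mock_grace_period_elapsed(void)
{
	int ran = 0;

	while (pending) {
		struct rcu_head *head = pending;

		pending = head->next;
		head->func(head);
		ran++;
	}
	return ran;
}

/* Toy object; a real struct Qdisc is far larger. */
struct toy_qdisc {
	unsigned int handle;
	struct rcu_head rcu;
};

/* RCU callback: recover the enclosing object and free it. */
static void toy_qdisc_rcu_free(struct rcu_head *head)
{
	struct toy_qdisc *q = (struct toy_qdisc *)
		((char *)head - offsetof(struct toy_qdisc, rcu));

	free(q);
	freed_count++;
}

/* Destroy path: defer the free instead of calling free() right away. */
static void toy_qdisc_destroy(struct toy_qdisc *q)
{
	mock_call_rcu(&q->rcu, toy_qdisc_rcu_free);
}

/* Small walk-through; returns the number of callbacks that ran. */
int toy_demo(void)
{
	int before = freed_count;
	struct toy_qdisc *q = malloc(sizeof(*q));
	int ran;

	assert(q != NULL);
	q->handle = 0x10000;

	toy_qdisc_destroy(q);
	/* Not freed yet, so a late reader can still look at the object. */
	assert(freed_count == before);
	assert(q->handle == 0x10000);

	ran = mock_grace_period_elapsed();
	assert(freed_count == before + 1);
	return ran;
}
```

Calling toy_demo() from a main() exercises the sequence: the destroy call
only queues the free, and the memory is released once the simulated grace
period has passed. In the kernel the grace period is tracked by RCU itself,
not by an explicit call like this.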

Thanks,
John