Date:   Thu, 2 May 2019 01:16:42 +0900
From:   Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
To:     David Ahern <dsahern@...il.com>,
        Eric Dumazet <eric.dumazet@...il.com>,
        "David S. Miller" <davem@...emloft.net>
Cc:     Julian Anastasov <ja@....bg>, Cong Wang <xiyou.wangcong@...il.com>,
        syzbot <syzbot+30209ea299c09d8785c9@...kaller.appspotmail.com>,
        ddstreet@...e.org, dvyukov@...gle.com,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        syzkaller-bugs@...glegroups.com
Subject: Re: unregister_netdevice: waiting for DEV to become free (2)

On 2019/05/01 23:52, David Ahern wrote:
> On 5/1/19 7:38 AM, Tetsuo Handa wrote:
>> On 2019/04/30 3:43, David Ahern wrote:
>>>> The attached patch adds a tracepoint to notifier_call_chain. If you have
>>>> KALLSYMS enabled it will show the order of the function handlers:
>>>>
>>>> perf record -e notifier:* -a -g &
>>>>
>>>> ip netns del <NAME>
>>>> <wait a few seconds>
>>>>
>>>> fg
>>>> <ctrl-c on perf-record>
>>>>
>>>> perf script
>>>>
>>>
>>> with the header file this time.
>>>
>>
>> What is the intent of your patch? I can see that many notifiers are called. But
>> how does this help identify which event is responsible for dropping the refcount?
>>
> 
> In a previous response you stated: "Since I'm not a netdev person, I
> appreciate if you can explain that shutdown sequence using a flow chart."

Yes, I did say that. But:

> 
> The notifier sequence tells you the order of cleanup handlers and what
> happens when a namespace is destroyed.
> 
> The dev_hold / dev_put tracepoint helps find the refcnt leak but
> requires some time analyzing the output to match up hold / put stack traces.
> 
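(As a side note for readers following along: the kind of tracepoint
described above might look roughly like the sketch below. This is an
illustration only, not the actual attached patch; a real trace header
also needs the usual TRACE_SYSTEM / CREATE_TRACE_POINTS boilerplate,
and notifier_call_chain() in kernel/notifier.c would call the
generated trace_notifier_call_chain() hook per handler. With KALLSYMS
enabled, the %ps format resolves the handler pointer to its symbol
name, which is what makes the handler order visible in perf script.)

/* Illustrative sketch of a notifier tracepoint, e.g. in
 * include/trace/events/notifier.h (not the actual attached patch). */
TRACE_EVENT(notifier_call_chain,

	TP_PROTO(void *func, unsigned long val),

	TP_ARGS(func, val),

	TP_STRUCT__entry(
		__field(void *, func)
		__field(unsigned long, val)
	),

	TP_fast_assign(
		__entry->func = func;
		__entry->val = val;
	),

	/* %ps prints the symbol name when KALLSYMS is enabled. */
	TP_printk("func=%ps val=%lu", __entry->func, __entry->val)
);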

I have already observed that fib_netdev_event() calls rt_flush_dev(), which
becomes a no-op once a device's refcount has been moved to the loopback
device in that namespace. And I see no event that could drop that reference
from the loopback device in that namespace afterwards.
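The traces below come from ad-hoc instrumentation. A minimal sketch of
what such a debug change to dev_hold() in net/core/dev.c could look
like follows (an assumption for illustration, not necessarily the
exact patch used here; note that %p prints hashed pointers, which is
why the device appears as e.g. 00000000d9f4ea20):

/* Illustrative debug version of dev_hold(); the stock function only
 * does this_cpu_inc().  netdev_refcnt_read() sums the per-CPU
 * refcount; the read/increment pair is racy, which is acceptable for
 * debugging.  rt_flush_dev() and dev_put() can be instrumented the
 * same way to produce the matching lines. */
void dev_hold(struct net_device *dev)
{
	int old = netdev_refcnt_read(dev);

	this_cpu_inc(*dev->pcpu_refcnt);
	pr_info("dev_hold(%p): %d->%d\n", dev, old, old + 1);
	dump_stack();
}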

[   71.388104][ T7620] rt_flush_dev(00000000cd35e96a)->(00000000d9f4ea20)
[   71.391757][ T7620] dev_hold(00000000d9f4ea20): 7->8
[   71.394725][ T7620] CPU: 4 PID: 7620 Comm: a.out Not tainted 5.1.0-rc5+ #177
[   71.398094][ T7620] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[   71.403711][ T7620] Call Trace:
[   71.405912][ T7620]  dump_stack+0xaa/0xd8
[   71.408252][ T7620]  rt_flush_dev+0x177/0x1b0
[   71.410802][ T7620]  fib_netdev_event+0x150/0x1b0
[   71.413270][ T7620]  notifier_call_chain+0x47/0xd0
[   71.415849][ T7620]  raw_notifier_call_chain+0x2d/0x40
[   71.418491][ T7620]  ? tun_show_group+0x90/0x90
[   71.421108][ T7620]  call_netdevice_notifiers_info+0x32/0x70
[   71.423854][ T7620]  rollback_registered_many+0x421/0x680
[   71.426583][ T7620]  rollback_registered+0x68/0xb0
[   71.429244][ T7620]  unregister_netdevice_queue+0xa5/0x100
[   71.432191][ T7620]  __tun_detach+0x576/0x590
[   71.435533][ T7620]  tun_chr_close+0x41/0x80
[   71.437957][ T7620]  ? __tun_detach+0x590/0x590
[   71.440500][ T7620]  __fput+0xeb/0x2d0
[   71.442816][ T7620]  ____fput+0x15/0x20
[   71.445090][ T7620]  task_work_run+0xa9/0xd0
[   71.447467][ T7620]  do_exit+0x37a/0xf40
[   71.449623][ T7620]  do_group_exit+0x57/0xe0
[   71.451826][ T7620]  get_signal+0x114/0x950
[   71.453989][ T7620]  do_signal+0x2f/0x700
[   71.456126][ T7620]  ? handle_mm_fault+0x1a8/0x360
[   71.458323][ T7620]  ? __x64_sys_futex+0x179/0x210
[   71.460620][ T7620]  exit_to_usermode_loop+0x159/0x180
[   71.462956][ T7620]  do_syscall_64+0x15d/0x180
[   71.465110][ T7620]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[   89.873350][ T4478] rt_flush_dev(00000000d9f4ea20)->(00000000d9f4ea20)
[   89.876311][ T4478] dev_hold(00000000d9f4ea20): 34->35
[   89.878712][ T4478] CPU: 2 PID: 4478 Comm: kworker/u128:28 Not tainted 5.1.0-rc5+ #177
[   89.881981][ T4478] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[   89.887273][ T4478] Workqueue: netns cleanup_net
[   89.889712][ T4478] Call Trace:
[   89.891737][ T4478]  dump_stack+0xaa/0xd8
[   89.894127][ T4478]  rt_flush_dev+0x177/0x1b0
[   89.896477][ T4478]  fib_netdev_event+0x150/0x1b0
[   89.898810][ T4478]  notifier_call_chain+0x47/0xd0
[   89.901348][ T4478]  raw_notifier_call_chain+0x2d/0x40
[   89.903974][ T4478]  call_netdevice_notifiers_info+0x32/0x70
[   89.906450][ T4478]  rollback_registered_many+0x421/0x680
[   89.909125][ T4478]  unregister_netdevice_many.part.119+0x17/0x90
[   89.911833][ T4478]  default_device_exit_batch+0x1a1/0x1d0
[   89.914287][ T4478]  ? do_wait_intr_irq+0xb0/0xb0
[   89.916720][ T4478]  ? unregister_netdevice_many+0x30/0x30
[   89.919258][ T4478]  ? dev_change_net_namespace+0x4e0/0x4e0
[   89.921759][ T4478]  ops_exit_list.isra.6+0x75/0x90
[   89.924396][ T4478]  cleanup_net+0x20d/0x380
[   89.926632][ T4478]  process_one_work+0x202/0x4f0
[   89.929045][ T4478]  worker_thread+0x3c/0x4b0
[   89.931398][ T4478]  kthread+0x139/0x160
[   89.933448][ T4478]  ? process_one_work+0x4f0/0x4f0
[   89.935887][ T4478]  ? kthread_destroy_worker+0x70/0x70
[   89.938243][ T4478]  ret_from_fork+0x35/0x40
[   89.940530][ T4478] dev_put(00000000d9f4ea20): 35->34
[   89.943031][ T4478] CPU: 2 PID: 4478 Comm: kworker/u128:28 Not tainted 5.1.0-rc5+ #177
[   89.946064][ T4478] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[   89.951324][ T4478] Workqueue: netns cleanup_net
[   89.953646][ T4478] Call Trace:
[   89.955652][ T4478]  dump_stack+0xaa/0xd8
[   89.957857][ T4478]  rt_flush_dev+0x19f/0x1b0
[   89.960187][ T4478]  fib_netdev_event+0x150/0x1b0
[   89.962706][ T4478]  notifier_call_chain+0x47/0xd0
[   89.965044][ T4478]  raw_notifier_call_chain+0x2d/0x40
[   89.967503][ T4478]  call_netdevice_notifiers_info+0x32/0x70
[   89.970139][ T4478]  rollback_registered_many+0x421/0x680
[   89.972618][ T4478]  unregister_netdevice_many.part.119+0x17/0x90
[   89.975364][ T4478]  default_device_exit_batch+0x1a1/0x1d0
[   89.977827][ T4478]  ? do_wait_intr_irq+0xb0/0xb0
[   89.980066][ T4478]  ? unregister_netdevice_many+0x30/0x30
[   89.982761][ T4478]  ? dev_change_net_namespace+0x4e0/0x4e0
[   89.985231][ T4478]  ops_exit_list.isra.6+0x75/0x90
[   89.987756][ T4478]  cleanup_net+0x20d/0x380
[   89.990090][ T4478]  process_one_work+0x202/0x4f0
[   89.992384][ T4478]  worker_thread+0x3c/0x4b0
[   89.994702][ T4478]  kthread+0x139/0x160
[   89.996749][ T4478]  ? process_one_work+0x4f0/0x4f0
[   89.999116][ T4478]  ? kthread_destroy_worker+0x70/0x70
[   90.001580][ T4478]  ret_from_fork+0x35/0x40
