Date:   Tue, 9 Apr 2019 14:33:28 +0800
From:   Rundong Ge <rdong.ge@...il.com>
To:     Pablo Neira Ayuso <pablo@...filter.org>
Cc:     kadlec@...ckhole.kfki.hu, fw@...len.de,
        Roopa Prabhu <roopa@...ulusnetworks.com>,
        nikolay@...ulusnetworks.com, davem@...emloft.net,
        netfilter-devel@...r.kernel.org, coreteam@...filter.org,
        bridge@...ts.linux-foundation.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] netfilter:bridge: Hold bridge dev for fake_rtable to
 avoid the dangling pointer

Hi Pablo

I've tested mainline v5.0, and the dangling pointer access to fake_rtable
still occurs there.

My environment looks like this: client0--box--client1.
The box runs Ubuntu with a v5.0 kernel, with the br_netfilter and
nfnetlink_queue modules loaded.

Steps to reproduce:
1. Create a bridge on the box.
2. echo 1 >/proc/sys/net/bridge/bridge-nf-call-iptables
3. Add a netfilter hook function that queues packets to nfqueue number 0.
   The hook point must be between <NF_BR_PRE_ROUTING,NF_BR_PRI_BRNF> and
   <NF_BR_FORWARD,NF_BR_PRI_BRNF - 1>.
4. Run a userspace process "nfqueue_rcv" that continuously reads packets
   from queue 0 and sets the verdict "NF_ACCEPT" on them.
5. Continuously ping client1 from client0.
6. Send "Ctrl + Z" to pause "nfqueue_rcv", simulating queue congestion.
7. Delete the bridge with "ifconfig br0 down&&brctl delbr br0".
8. At this point the _skb_refdst of the skbs in the nfqueue becomes a
   dangling pointer. If we send "fg" to resume "nfqueue_rcv", the kernel
   may try to access the freed memory.

Debug log:
I added debug logs to "netdev_freemem" and "dst_release" to demonstrate
the access to freed memory. As the log shows, "dst_release" accessed the
bridge's fake_rtable after the bridge was freed.

Apr  8 22:25:14 raydon kernel: [62139.005062] netdev_freemem name:br0, fake_rtable:000000009d76cef0
Apr  8 22:25:21 raydon kernel: [62145.967133] dst_release dst:000000009d76cef0 dst->dev->name: řKU¡TH
Apr  8 22:25:21 raydon kernel: [62145.967154] dst_release dst:000000009d76cef0 dst->dev->name: řKU¡TH
Apr  8 22:25:21 raydon kernel: [62145.967180] dst_release dst:000000009d76cef0 dst->dev->name: řKU¡TH
Apr  8 22:25:21 raydon kernel: [62145.967197] dst_release dst:000000009d76cef0 dst->dev->name: řKU¡TH



The hook point has to be after <NF_BR_PRE_ROUTING,NF_BR_PRI_BRNF>
because skbs start referencing the bridge's fake_rtable in
"br_nf_pre_routing_finish", hooked at <NF_BR_PRE_ROUTING,NF_BR_PRI_BRNF>.

The hook point has to be before <NF_BR_FORWARD,NF_BR_PRI_BRNF - 1>
because "br_nf_forward_ip" sets state.out to the bridge dev. From that
hook point on, the "nfqnl_dev_drop" triggered by the bridge's
NETDEV_DOWN event can flush the queued skbs before the bridge's memory
is freed, because state.out now matches the bridge's dev.

So the root cause is that "nfqnl_dev_drop" does not flush skbs that were
queued between <NF_BR_PRE_ROUTING,NF_BR_PRI_BRNF> and <NF_BR_FORWARD,
NF_BR_PRI_BRNF - 1>.
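
The missed flush can be illustrated with a small userspace model in C.
This is only a sketch. It assumes the flush decides per queued entry by
comparing the ifindex of the device going down with state.in and
state.out, and it leaves out any further comparisons the real code
performs. All names below (model_dev, model_entry, model_dev_cmp,
model_dev_drop, model_stale_after_bridge_down) and the ifindex values
are made up for illustration and are not kernel symbols.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for kernel objects; hypothetical, not kernel types. */
struct model_dev {
    int ifindex;
};

struct model_entry {
    const struct model_dev *in;      /* models state.in */
    const struct model_dev *out;     /* models state.out, NULL until set */
    const struct model_dev *dst_dev; /* device that owns the skb's dst */
    bool queued;                     /* still sitting in the queue? */
};

/* Assumed match rule: an entry belongs to a device only via in/out. */
static bool model_dev_cmp(const struct model_entry *e, int ifindex)
{
    if (e->in && e->in->ifindex == ifindex)
        return true;
    if (e->out && e->out->ifindex == ifindex)
        return true;
    return false;
}

/* Models the flush on NETDEV_DOWN: drop every entry that matches. */
static int model_dev_drop(struct model_entry *q, size_t n, int ifindex)
{
    int dropped = 0;

    for (size_t i = 0; i < n; i++) {
        if (q[i].queued && model_dev_cmp(&q[i], ifindex)) {
            q[i].queued = false;
            dropped++;
        }
    }
    return dropped;
}

/*
 * Queue one packet that entered through a bridge port, take the bridge
 * down, and count entries still queued whose dst points at the bridge.
 * past_forward selects whether state.out was already set to the bridge.
 */
static int model_stale_after_bridge_down(bool past_forward)
{
    const struct model_dev port = { .ifindex = 2 };   /* e.g. a bridge port */
    const struct model_dev bridge = { .ifindex = 5 }; /* e.g. br0 */
    struct model_entry q[1] = {
        { .in = &port, .out = past_forward ? &bridge : NULL,
          .dst_dev = &bridge, .queued = true },
    };
    int stale = 0;

    model_dev_drop(q, 1, bridge.ifindex);
    for (size_t i = 0; i < 1; i++)
        if (q[i].queued && q[i].dst_dev->ifindex == bridge.ifindex)
            stale++;
    return stale;
}
```

In this model, model_stale_after_bridge_down(false) returns 1: the entry
survives the flush while its dst still refers to the bridge. With
past_forward set to true it returns 0, because state.out matches the
bridge and the entry is dropped.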

As you mentioned, holding the bridge dev for these skbs is not a proper
solution. I will send another patch that lets "nfqnl_dev_drop" flush
these skbs.

Thanks
Rundong


Pablo Neira Ayuso <pablo@...filter.org> wrote on Thu, Apr 4, 2019 at 1:44 AM:
>
> On Tue, Apr 02, 2019 at 12:56:09PM +0000, Rundong Ge wrote:
> > Problem:
> > When bridge-nf-call-iptables is enabled, skb_dst(skb) of packets that
> > in the nfqueue may be a dangling pointer if user delete the bridge.
> > Because packets go through the br_nf_pre_routing_finish will set the dst
> > pointer to the br->fake_rtable. But the br struct will be freed
> > without the reference check for these skbs.
> >
> > User impact:
> > Kernel panic may happen when user delete the bridge if there are
> > continuous traffics go through the nfqueue.
> > Here is a panic in my device which using kernel v3.10.
>
> This kernel is _very old_.
>
> Could you provide the steps to reproduce this issue?
>
> Holding the device doesn't seem the way to go to me, we have a of
> netdevice_notifier that is dropping packets for an interface that is
> gone in nfnetlink_queue. We also drop packets whenever a hook in gone.
>
> So I wonder if this is still a problem in mainline kernels.
