Message-ID: <72692f32471b5d2eeef9514bb2c9ba51@linux.vnet.ibm.com>
Date: Wed, 03 Jun 2020 23:00:18 -0700
From: dwilder <dwilder@...ibm.com>
To: Florian Westphal <fw@...len.de>
Cc: netdev@...r.kernel.org, netfilter-devel@...r.kernel.org,
wilder@...ibm.com, mkubecek@...e.com
Subject: RE: [(RFC) PATCH ] NULL pointer dereference on rmmod iptable_mangle.
On 2020-06-03 15:05, Florian Westphal wrote:
> David Wilder <dwilder@...ibm.com> wrote:
>> This crash happened on a ppc64le system running ltp network tests when
>> ltp script ran "rmmod iptable_mangle".
>>
>> [213425.602369] BUG: Kernel NULL pointer dereference at 0x00000010
>> [213425.602388] Faulting instruction address: 0xc008000000550bdc
> [..]
>
>> In the crash we find in iptable_mangle_hook() that
>> state->net->ipv4.iptable_mangle=NULL causing a NULL pointer
>> dereference. net->ipv4.iptable_mangle is set to NULL in
>> iptable_mangle_net_exit(), which is called when the iptable_mangle
>> module is unloaded. A rmmod task was found in the crash dump. A 2nd crash
>> showed the same problem when running "rmmod iptable_filter"
>> (net->ipv4.iptable_filter=NULL).
>>
>> Once a hook is registered, packets will pick up a pointer from
>> net->ipv4.iptable_$table. The patch adds a call to synchronize_net()
>> in ipt_unregister_table() to ensure no packets are in flight that have
>> picked up the pointer before completing the unregister.
>>
>> This change has prevented the problem in our testing. However, we
>> have concerns with this change as it would mean that on netns cleanup,
>> we would need one synchronize_net() call for every table in use. Also,
>> on module unload, there would be one synchronize_net() for every
>> existing netns.
>
> Yes, I agree with the analysis.
>
>> Signed-off-by: David Wilder <dwilder@...ibm.com>
>> ---
>> net/ipv4/netfilter/ip_tables.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/netfilter/ip_tables.c
>> b/net/ipv4/netfilter/ip_tables.c
>> index c2670ea..97c4121 100644
>> --- a/net/ipv4/netfilter/ip_tables.c
>> +++ b/net/ipv4/netfilter/ip_tables.c
>> @@ -1800,8 +1800,10 @@ int ipt_register_table(struct net *net, const
>> struct xt_table *table,
>> void ipt_unregister_table(struct net *net, struct xt_table *table,
>> const struct nf_hook_ops *ops)
>> {
>> - if (ops)
>> + if (ops) {
>> nf_unregister_net_hooks(net, ops, hweight32(table->valid_hooks));
>> + synchronize_net();
>> + }
>
> I'd wager ebtables, arptables and ip6tables have the same bug.
>
> The extra synchronize_net() isn't ideal. We could probably do it this
> way and then improve in a second patch.
>
> One way to fix this without a new synchronize_net() is to switch all
> iptable_foo.c to use ".pre_exit" hook as well.
>
> pre_exit would unregister the underlying hook and .exit would do the
> table freeing.
>
> Since the netns core already does an unconditional synchronize_rcu()
> after the pre_exit hooks, this would avoid the problem as well.
Something like this? (untested)
diff --git a/net/ipv4/netfilter/iptable_mangle.c b/net/ipv4/netfilter/iptable_mangle.c
index bb9266ea3785..0d448e4d5213 100644
--- a/net/ipv4/netfilter/iptable_mangle.c
+++ b/net/ipv4/netfilter/iptable_mangle.c
@@ -100,15 +100,26 @@ static int __net_init iptable_mangle_table_init(struct net *net)
return ret;
}
+static void __net_exit iptable_mangle_net_pre_exit(struct net *net)
+{
+ struct xt_table *table = net->ipv4.iptable_mangle;
+
+ if (table && mangle_ops)
+ nf_unregister_net_hooks(net, mangle_ops,
+ hweight32(table->valid_hooks));
+}
+
+
static void __net_exit iptable_mangle_net_exit(struct net *net)
{
if (!net->ipv4.iptable_mangle)
return;
- ipt_unregister_table(net, net->ipv4.iptable_mangle, mangle_ops);
+ ipt_unregister_table(net, net->ipv4.iptable_mangle, NULL);
net->ipv4.iptable_mangle = NULL;
}
static struct pernet_operations iptable_mangle_net_ops = {
+ .pre_exit = iptable_mangle_net_pre_exit,
.exit = iptable_mangle_net_exit,
};
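
For what it's worth, the ordering that the .pre_exit approach relies on can be modelled in plain userspace C. This is only an illustrative sketch, not kernel code: every model_* name is hypothetical, the hook registration is reduced to a boolean, and the grace period is a counter rather than a real synchronize_rcu(). It assumes, as described above, that the netns core runs all pre_exit callbacks, waits once, and only then runs the exit callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct model_table {
	unsigned int valid_hooks;
};

struct model_net {
	struct model_table *mangle;  /* stands in for net->ipv4.iptable_mangle */
	bool hook_registered;        /* stands in for the netfilter hook */
	unsigned int grace_periods;  /* counts simulated grace periods */
};

struct model_pernet_ops {
	void (*pre_exit)(struct model_net *net);
	void (*exit)(struct model_net *net);
};

/* Set up the table and "register" the hook. Returns 0 on success. */
int model_net_init(struct model_net *net)
{
	net->mangle = calloc(1, sizeof(*net->mangle));
	if (!net->mangle)
		return -1;
	net->mangle->valid_hooks = 0x1f;
	net->hook_registered = true;
	net->grace_periods = 0;
	return 0;
}

/* Phase 1: only unhook, so no new packet can reach the table. */
void model_mangle_pre_exit(struct model_net *net)
{
	if (net->mangle)
		net->hook_registered = false;
}

/* Phase 2: free the table; the hook must already be gone. */
void model_mangle_exit(struct model_net *net)
{
	if (!net->mangle)
		return;
	assert(!net->hook_registered);
	free(net->mangle);
	net->mangle = NULL;
}

/*
 * Packet path: 1 if the hook ran with a valid table, 0 if no hook is
 * registered, -1 if the hook would have dereferenced a NULL table.
 */
int model_deliver_packet(const struct model_net *net)
{
	if (!net->hook_registered)
		return 0;
	if (!net->mangle)
		return -1;
	return 1;
}

/* All pre_exit callbacks, one shared grace period, then all exit callbacks. */
void model_cleanup_net(struct model_net *net,
		       const struct model_pernet_ops *ops, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (ops[i].pre_exit)
			ops[i].pre_exit(net);

	net->grace_periods++;  /* stands in for the single synchronize_rcu() */

	for (i = 0; i < n; i++)
		if (ops[i].exit)
			ops[i].exit(net);
}
```

In this model the number of simulated grace periods stays at one no matter how many tables register a pre_exit callback, which is the cost difference compared with calling synchronize_net() from every ipt_unregister_table().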