netdev - RE: [(RFC) PATCH ] NULL pointer dereference on rmmod iptable

Message-ID: <72692f32471b5d2eeef9514bb2c9ba51@linux.vnet.ibm.com>
Date:   Wed, 03 Jun 2020 23:00:18 -0700
From:   dwilder <dwilder@...ibm.com>
To:     Florian Westphal <fw@...len.de>
Cc:     netdev@...r.kernel.org, netfilter-devel@...r.kernel.org,
        wilder@...ibm.com, mkubecek@...e.com
Subject: RE: [(RFC) PATCH ] NULL pointer dereference on rmmod iptable_mangle.

On 2020-06-03 15:05, Florian Westphal wrote:
> David Wilder <dwilder@...ibm.com> wrote:
>> This crash happened on a ppc64le system running ltp network tests when 
>> ltp script ran "rmmod iptable_mangle".
>> 
>> [213425.602369] BUG: Kernel NULL pointer dereference at 0x00000010
>> [213425.602388] Faulting instruction address: 0xc008000000550bdc
> [..]
> 
>> In the crash we find in iptable_mangle_hook() that 
>> state->net->ipv4.iptable_mangle=NULL causing a NULL pointer 
>> dereference. net->ipv4.iptable_mangle is set to NULL in
>> iptable_mangle_net_exit(), which is called when the iptable_mangle
>> module is unloaded. A rmmod task was found in the crash dump.  A 2nd crash
>> showed the same problem when running "rmmod iptable_filter" 
>> (net->ipv4.iptable_filter=NULL).
>> 
>> Once a hook is registered, packets will pick up a pointer from
>> net->ipv4.iptable_$table. The patch adds a call to synchronize_net()
>> in ipt_unregister_table() to ensure no packets are in flight that have
>> picked up the pointer before completing the un-register.
>> 
>> This change has prevented the problem in our testing.  However, we
>> have concerns with this change as it would mean that on netns cleanup, 
>> we would need one synchronize_net() call for every table in use. Also, 
>> on module unload, there would be one synchronize_net() for every 
>> existing netns.
> 
> Yes, I agree with the analysis.
> 
>> Signed-off-by: David Wilder <dwilder@...ibm.com>
>> ---
>>  net/ipv4/netfilter/ip_tables.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>> 
>> diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
>> index c2670ea..97c4121 100644
>> --- a/net/ipv4/netfilter/ip_tables.c
>> +++ b/net/ipv4/netfilter/ip_tables.c
>> @@ -1800,8 +1800,10 @@ int ipt_register_table(struct net *net, const struct xt_table *table,
>>  void ipt_unregister_table(struct net *net, struct xt_table *table,
>>  			  const struct nf_hook_ops *ops)
>>  {
>> -	if (ops)
>> +	if (ops) {
>>  		nf_unregister_net_hooks(net, ops, hweight32(table->valid_hooks));
>> +		synchronize_net();
>> +	}
> 
> I'd wager ebtables, arptables and ip6tables have the same bug.
> 
> The extra synchronize_net() isn't ideal.  We could probably do it this
> way and then improve in a second patch.
> 
> One way to fix this without a new synchronize_net() is to switch all
> iptable_foo.c to use ".pre_exit" hook as well.
> 
> pre_exit would unregister the underlying hook and .exit would do the
> table freeing.
> 
> Since the netns core already does an unconditional synchronize_rcu after
> the pre_exit hooks this would avoid the problem as well.

Something like this?  (un-tested)

diff --git a/net/ipv4/netfilter/iptable_mangle.c b/net/ipv4/netfilter/iptable_mangle.c
index bb9266ea3785..0d448e4d5213 100644
--- a/net/ipv4/netfilter/iptable_mangle.c
+++ b/net/ipv4/netfilter/iptable_mangle.c
@@ -100,15 +100,26 @@ static int __net_init iptable_mangle_table_init(struct net *net)
         return ret;
  }

+static void __net_exit iptable_mangle_net_pre_exit(struct net *net)
+{
+       struct xt_table *table = net->ipv4.iptable_mangle;
+
+       if (mangle_ops)
+               nf_unregister_net_hooks(net, mangle_ops,
+                       hweight32(table->valid_hooks));
+}
+
+
  static void __net_exit iptable_mangle_net_exit(struct net *net)
  {
         if (!net->ipv4.iptable_mangle)
                 return;
-       ipt_unregister_table(net, net->ipv4.iptable_mangle, mangle_ops);
+       ipt_unregister_table(net, net->ipv4.iptable_mangle, NULL);
         net->ipv4.iptable_mangle = NULL;
  }

  static struct pernet_operations iptable_mangle_net_ops = {
+       .pre_exit = iptable_mangle_net_pre_exit,
         .exit = iptable_mangle_net_exit,
  };