Message-ID: <m1pqeov4u1.fsf@fess.ebiederm.org>
Date:	Thu, 12 Jan 2012 16:11:50 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Francesco Ruggeri <fruggeri@...stanetworks.com>,
	netdev@...r.kernel.org
Subject: Re: Race condition in ipv6 code

Eric Dumazet <eric.dumazet@...il.com> writes:

> On Wednesday, 11 January 2012 at 18:13 -0800, Francesco Ruggeri wrote:
>> We have hit a race condition in ipv6 code when setting
>> /proc/sys/net/ipv6/conf/*/forwarding. This happens when the syscall
>> has to be restarted.
>> 
>> I wonder if anyone else has run into the same issue.
>> 
>> The current sequence in addrconf_sysctl_forward() and
>> addrconf_fixup_forwarding()  is as follows:
>> - change the parameter in idev->cnf.forwarding (using proc_dointvec())
>> - try to get the rtnl lock
>> - if the lock cannot be acquired, restore the original value in
>> idev->cnf.forwarding and restart the syscall.
>> 
>> While this is going on, the ipv6 code may access idev->cnf.forwarding
>> and get an incorrect value.
>> In our case we were in addrconf_ifdown (holding the rtnl lock)  and
>> calling __ipv6_ifa_notify(RTM_DELADDR, ifa) on the idev->addr_list
>> entries.
>> __ipv6_ifa_notify() only invokes addrconf_leave_anycast() if
>> idev->cnf.forwarding is set. Because a process trying to set
>> forwarding to 0 was stuck in the restart_syscall sequence above
>> flipping the flag on and off, we erroneously read the flag as 0, with
>> the result that addrconf_leave_anycast() was not invoked, some
>> idev->ac_list entries were never released, idev was never freed and
>> kept a reference to its net_device, and the net_device was never freed
>> and caused the "unregister_netdevice: waiting for xxx to become free"
>> message forever. In our case this was a vlan interface that was being
>> deleted, so we ended up getting stuck in vlan_ioctl_handler() holding
>> vlan_ioctl_mutex with further bad consequences.
>> The following diffs (for 2.6.38, but the same logic seems to be used
>> in 3.2) address the issue by modifying idev->cnf.forwarding only after
>> the rtnl lock is acquired. There is a similar situation for
>> disable_ipv6.
>> Any comments are appreciated.
>> 
>> Francesco Ruggeri
>
> The real question is: why are we using this horrible thing at all?
>
> if (!rtnl_trylock())
> 	return restart_syscall();

Because the rtnl_lock is broad and we get ABBA deadlocks if we don't. In
particular, we hold the rtnl_lock over sysctl registration and removal.
Sysctl removal blocks until all of the callers into the sysctl methods,
namely addrconf_sysctl_forward in this case, finish executing.

CPU 0                                         CPU 1

rtnl_lock()                                   use_count++
unregister_netdevice()                        addrconf_sysctl_forward()
  unregister_sysctl_table()                     rtnl_lock()
    wait for use_count of
      addrconf_sysctl_forward to == 0
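
To make the cycle concrete, here is a rough userspace model of why the
trylock breaks it. Everything in it (big_lock, use_count, handler_write,
unregister_can_finish) is a hypothetical stand-in I made up for the
sketch, not kernel code, and it is driven from a single thread so the
outcome is deterministic. The point is only that a handler which backs
out on contention always drops its use count, so the side holding the
lock can finish waiting.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not kernel APIs. */
static atomic_flag big_lock = ATOMIC_FLAG_INIT; /* models the rtnl lock */
static atomic_int use_count;                    /* models the sysctl entry's use count */
static int forwarding = 1;                      /* models idev->cnf.forwarding */

enum { DONE = 0, RESTART = 1 };                 /* RESTART models restart_syscall() */

static bool big_trylock(void) { return !atomic_flag_test_and_set(&big_lock); }
static void big_unlock(void)  { atomic_flag_clear(&big_lock); }

/* Models the sysctl handler: entered with the use count raised. */
static int handler_write(int newval)
{
    int ret = RESTART;

    atomic_fetch_add(&use_count, 1);    /* handler is now "in use" */
    if (big_trylock()) {                /* never sleep waiting for the lock */
        forwarding = newval;
        big_unlock();
        ret = DONE;
    }
    atomic_fetch_sub(&use_count, 1);    /* handler returns either way */
    return ret;
}

/* Models the wait in the unregister path, which runs with big_lock held. */
static bool unregister_can_finish(void)
{
    return atomic_load(&use_count) == 0;
}
```

If handler_write() slept on the lock instead of returning RESTART, the
use count would stay at 1 for as long as the other side held the lock,
and the other side would hold the lock until the use count reached 0.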
     

I smacked lockdep around so it would warn about the sysfs ones.
For the proc and sysctl ones I never did manage to get lockdep warnings,
but an ABBA deadlock is most definitely possible.

Any solutions better than simply restarting the system call are welcome.
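
Short of getting rid of the restart, the ordering change proposed in the
quoted report (touch the value only once the lock is held) at least
closes the window in which the lock holder can read a value that is about
to be rolled back. A minimal single-threaded sketch of the two orderings
follows. All names are hypothetical stand-ins, and holder_reads_flag() is
an assumption of the sketch: it is called at the point where a concurrent
lock holder such as addrconf_ifdown could observe the flag.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not kernel APIs. */
static atomic_flag big_lock = ATOMIC_FLAG_INIT; /* models the rtnl lock */
static int forwarding = 1;                      /* models idev->cnf.forwarding */
static int last_read = -1;                      /* what the lock holder last saw */

enum { DONE = 0, RESTART = 1 };                 /* RESTART models restart_syscall() */

static bool big_trylock(void) { return !atomic_flag_test_and_set(&big_lock); }
static void big_unlock(void)  { atomic_flag_clear(&big_lock); }

/* Models the lock holder reading the flag while the writer is contending. */
static void holder_reads_flag(void) { last_read = forwarding; }

/* Current order: write, try the lock, roll back on failure. */
static int set_forwarding_write_first(int newval)
{
    int old = forwarding;

    forwarding = newval;                /* visible before the lock is held */
    if (!big_trylock()) {
        holder_reads_flag();            /* holder sees the transient value */
        forwarding = old;               /* rollback comes too late */
        return RESTART;
    }
    big_unlock();
    return DONE;
}

/* Proposed order: try the lock first, write only once it is held. */
static int set_forwarding_lock_first(int newval)
{
    if (!big_trylock()) {
        holder_reads_flag();            /* holder still sees the old value */
        return RESTART;
    }
    forwarding = newval;
    big_unlock();
    return DONE;
}
```

With the lock held elsewhere, the write-first variant lets the holder
read 0 while the committed value is still 1. The lock-first variant
never exposes a value it may have to take back.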

Perhaps for these heavyweight methods we should create a work struct
and schedule work to perform the change, instead of trying to do the
work synchronously in the sysctl handler.
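
A rough userspace model of that idea, to show the shape rather than an
implementation: the handler only records the request and queues it, and
a separate worker context applies it, where blocking on the lock is
harmless because no sysctl use count is held. The names (struct work,
queue_deferred, run_pending, handler_write) are invented for the sketch
and are not the kernel workqueue API.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not kernel APIs. */
static atomic_flag big_lock = ATOMIC_FLAG_INIT; /* models the rtnl lock */
static int forwarding = 1;                      /* models idev->cnf.forwarding */

struct work {
    void (*fn)(struct work *);
    struct work *next;
};

struct forwarding_work {
    struct work work;   /* first member, so a struct work * can be cast back */
    int newval;
};

static struct work *pending_head;
static struct work **pending_tail = &pending_head;

/* Append a work item to the pending list. */
static void queue_deferred(struct work *w)
{
    w->next = NULL;
    *pending_tail = w;
    pending_tail = &w->next;
}

/* Runs in the worker: a blocking acquire is fine here. */
static void apply_forwarding(struct work *w)
{
    struct forwarding_work *fw = (struct forwarding_work *)w;

    while (atomic_flag_test_and_set(&big_lock))
        ;                               /* spin stands in for a sleeping lock */
    forwarding = fw->newval;
    atomic_flag_clear(&big_lock);
}

/* Models the sysctl handler: never touches the lock, never restarts. */
static void handler_write(struct forwarding_work *fw, int newval)
{
    fw->newval = newval;
    fw->work.fn = apply_forwarding;
    queue_deferred(&fw->work);
}

/* Models the worker draining the list; returns how many items ran. */
static int run_pending(void)
{
    int ran = 0;

    while (pending_head) {
        struct work *w = pending_head;

        pending_head = w->next;
        if (!pending_head)
            pending_tail = &pending_head;
        w->fn(w);
        ran++;
    }
    return ran;
}
```

The cost is visible in the model: handler_write() returns before the
value changes, so the writer cannot learn from the write itself whether
or when the change was applied.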

Eric