[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <57eab627-8cc5-4834-a865-1970a290821a@gmail.com>
Date: Mon, 30 Sep 2019 20:23:42 -0600
From: David Ahern <dsahern@...il.com>
To: Eric Dumazet <eric.dumazet@...il.com>,
David Ahern <dsahern@...nel.org>, davem@...emloft.net,
jakub.kicinski@...ronome.com
Cc: netdev@...r.kernel.org
Subject: Re: [PATCH net] ipv6: Handle race in addrconf_dad_work
On 9/30/19 8:01 PM, Eric Dumazet wrote:
>
>
> On 9/30/19 6:37 PM, David Ahern wrote:
>> From: David Ahern <dsahern@...il.com>
>>
>> Rajendra reported a kernel panic when a link was taken down:
>>
>> [ 6870.263084] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
>> [ 6870.271856] IP: [<ffffffff8efc5764>] __ipv6_ifa_notify+0x154/0x290
>>
>> <snip>
>>
>> [ 6870.570501] Call Trace:
>> [ 6870.573238] [<ffffffff8efc58c6>] ? ipv6_ifa_notify+0x26/0x40
>> [ 6870.579665] [<ffffffff8efc98ec>] ? addrconf_dad_completed+0x4c/0x2c0
>> [ 6870.586869] [<ffffffff8efe70c6>] ? ipv6_dev_mc_inc+0x196/0x260
>> [ 6870.593491] [<ffffffff8efc9c6a>] ? addrconf_dad_work+0x10a/0x430
>> [ 6870.600305] [<ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
>> [ 6870.606732] [<ffffffff8ea93a7a>] ? process_one_work+0x18a/0x430
>> [ 6870.613449] [<ffffffff8ea93d6d>] ? worker_thread+0x4d/0x490
>> [ 6870.619778] [<ffffffff8ea93d20>] ? process_one_work+0x430/0x430
>> [ 6870.626495] [<ffffffff8ea99dd9>] ? kthread+0xd9/0xf0
>> [ 6870.632145] [<ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
>> [ 6870.638573] [<ffffffff8ea99d00>] ? kthread_park+0x60/0x60
>> [ 6870.644707] [<ffffffff8f01ae77>] ? ret_from_fork+0x57/0x70
>> [ 6870.650936] Code: 31 c0 31 d2 41 b9 20 00 08 02 b9 09 00 00 0
>>
>> addrconf_dad_work is kicked to be scheduled when a device is brought
>> up. There is a race between addrcond_dad_work getting scheduled and
>> taking the rtnl lock and a process taking the link down (under rtnl).
>> The latter removes the host route from the inet6_addr as part of
>> addrconf_ifdown which is run for NETDEV_DOWN. The former attempts
>> to use the host route in ipv6_ifa_notify. If the down event removes
>> the host route due to the race to the rtnl, then the BUG listed above
>> occurs.
>>
>> This scenario does not occur when the ipv6 address is not kept
>> (net.ipv6.conf.all.keep_addr_on_down = 0) as addrconf_ifdown sets the
>> state of the ifp to DEAD. Handle when the addresses are kept by checking
>> IF_READY which is reset by addrconf_ifdown.
>>
>> Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
>> Reported-by: Rajendra Dendukuri <rajendra.dendukuri@...adcom.com>
>> Signed-off-by: David Ahern <dsahern@...il.com>
>> ---
>> net/ipv6/addrconf.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index 6a576ff92c39..e2759ef73b03 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -4032,6 +4032,12 @@ static void addrconf_dad_work(struct work_struct *w)
>>
>> rtnl_lock();
>>
>> + /* check if device was taken down before this delayed work
>> + * function could be canceled
>> + */
>> + if (!(idev->if_flags & IF_READY))
>> + goto out;
>> +
>> spin_lock_bh(&ifp->lock);
>> if (ifp->state == INET6_IFADDR_STATE_PREDAD) {
>> action = DAD_BEGIN;
>>
>
>
> Do we need to keep the test on IF_READY done later in this function ?
>
> If IF_READY can disappear only under RTNL, we might clean this.
>
> (unless addrconf_dad_work() releases rtnl and reacquires it)
Unless I am missing something none of the functions called by dad_work
release the rtnl, but your comment did have me second guessing the locking.
The interesting cases for changing the idev flag are addrconf_notify
(NETDEV_UP and NETDEV_CHANGE) and addrconf_ifdown (reset the flag). The
former does not have the idev lock - only rtnl. The latter has both.
Checking the flag is inconsistent with respect to locks.
As for your suggestion, the 'dead' flag is set only under rtnl in
addrconf_ifdown and it means the device is getting removed (or IPv6 is
disabled). Based on that I think the existing:
if (idev->dead || !(idev->if_flags & IF_READY))
goto out;
can be moved to right after the rtnl_lock in addrconf_dad_work in place
of the above change, so the end result is:
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 6a576ff92c39..dd3be06d5a06 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4032,6 +4032,12 @@ static void addrconf_dad_work(struct work_struct *w)
rtnl_lock();
+ /* check if device was taken down before this delayed work
+ * function could be canceled
+ */
+ if (idev->dead || !(idev->if_flags & IF_READY))
+ goto out;
+
spin_lock_bh(&ifp->lock);
if (ifp->state == INET6_IFADDR_STATE_PREDAD) {
action = DAD_BEGIN;
@@ -4077,11 +4083,6 @@ static void addrconf_dad_work(struct work_struct *w)
goto out;
write_lock_bh(&idev->lock);
- if (idev->dead || !(idev->if_flags & IF_READY)) {
- write_unlock_bh(&idev->lock);
- goto out;
- }
-
spin_lock(&ifp->lock);
if (ifp->state == INET6_IFADDR_STATE_DEAD) {
spin_unlock(&ifp->lock);
agree?
Powered by blists - more mailing lists