[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20120730.174044.968236168485282216.davem@davemloft.net>
Date: Mon, 30 Jul 2012 17:40:44 -0700 (PDT)
From: David Miller <davem@...emloft.net>
To: eric.dumazet@...il.com
Cc: netdev@...r.kernel.org
Subject: Re: [PATCH] net: ipv4: fix RCU races on dst refcounts
From: David Miller <davem@...emloft.net>
Date: Mon, 30 Jul 2012 14:56:37 -0700 (PDT)
> From: Eric Dumazet <eric.dumazet@...il.com>
> Date: Mon, 30 Jul 2012 11:20:37 +0200
>
>> From: Eric Dumazet <edumazet@...gle.com>
>>
>> We need instead regular dst refcounting (dst_release()) and make
>> sure dst_release() is aware of RCU grace periods :
Eric, we really cannot do this.
We absolutely must call dst_free() directly when cached entries
are flushed.
Your delayed scheme using dst_release() doesn't work, it exposes us to
the "stale netdevice references" issue even for cached entries. This
exact issue is why I moved away from a dst_release() based scheme to
a !DST_NOCACHE + dst_free() one.
dst_free() is special. It is special in that if there are existing
references, it adds the dst onto the generic GC list of busy dsts in
net/core/dst.c
This is important, because it means that even if references are still
held on the dst, we will still be able to purge spurious netdevice
references when those netdevices try to go down or unregister
themselves.
You can't defer the dst_free() to the final refcount drop like your
code does now. A socket, or other dst caching entity, can hold onto
the dst forever, do no socket operations at all, and therefore hold
onto a netdevice for an infinite amount of time.
Look at how net/core/dst.c:dst_dev_event() walks dst_busy_list.
That's why we must propagate cached dsts into dst_busy_list no later
than when we trim them from ->nh_rth_{input,output}
I noticed this because I'm trying to work on a fix the lingering dst
netdevice reference problem for non-cached dsts.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists