[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1403512412.16682.23.camel@edumazet-glaptop2.roam.corp.google.com>
Date: Mon, 23 Jun 2014 01:33:32 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: dormando <dormando@...ia.net>
Cc: Alexey Preobrazhensky <preobr@...gle.com>,
Steffen Klassert <steffen.klassert@...unet.com>,
David Miller <davem@...emloft.net>, paulmck@...ux.vnet.ibm.com,
netdev@...r.kernel.org, Kostya Serebryany <kcc@...gle.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Lars Bull <larsbull@...gle.com>,
Eric Dumazet <edumazet@...gle.com>,
Bruce Curtis <brutus@...gle.com>,
Maciej Żenczykowski <maze@...gle.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>
Subject: Re: [PATCH] ipv4: fix a race in ip4_datagram_release_cb()
On Sun, 2014-06-22 at 12:07 -0700, dormando wrote:
> Update on testing:
>
> I only have two machines that crash on their own frequently (more like
> one, even). Unfortunately something happened to the datacenter it's in and
> it was offline for a week. The machine normally crashes after 1.5-4d,
> averaging 2d.
>
> It's done about three days total time without a new crash. I also have the
> kernel running in another datacenter for ~10 days.. but it takes 30-150
> days to crash in that one.
>
> So, inconclusive, but still promising. If the machine survives the week it
> probably means it's fixed, or at least greatly reduced.
>
> I saw that one of your patches got queued for stable, but all three were
> necessary to fix udpkill. What's your plan for cleanup/upstreaming?
>
> Did you folks end up running udpkill under the tester thing?
I did not test udpkill, as the known problem is the DST_NOCACHE flag.
We end up calling sk_dst_set(sk, dst) with a dst having this flag set.
So maybe DST_NOCACHE should be renamed, if we _can _ cache a dst like
this. Its meaning is really that dst_release() has to track when
refcount reaches 0 so that last owner fress dst, but we need to respect
rcu grace period.
Fixing sk_dst_set() as I did is not enough, as it is only reducing race
window.
Something like following :
include/net/sock.h | 4 ++--
net/core/dst.c | 16 +++++++++++-----
net/ipv4/ip_tunnel.c | 7 +------
3 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index 07b7fcd60d80..173cae485de1 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1730,8 +1730,8 @@ sk_dst_get(struct sock *sk)
rcu_read_lock();
dst = rcu_dereference(sk->sk_dst_cache);
- if (dst)
- dst_hold(dst);
+ if (dst && !atomic_inc_not_zero(&dst->__refcnt))
+ dst = NULL;
rcu_read_unlock();
return dst;
}
diff --git a/net/core/dst.c b/net/core/dst.c
index 80d6286c8b62..a028409ee438 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -269,6 +269,15 @@ again:
}
EXPORT_SYMBOL(dst_destroy);
+static void dst_destroy_rcu(struct rcu_head *head)
+{
+ struct dst_entry *dst = container_of(head, struct dst_entry, rcu_head);
+
+ dst = dst_destroy(dst);
+ if (dst)
+ __dst_free(dst);
+}
+
void dst_release(struct dst_entry *dst)
{
if (dst) {
@@ -276,11 +285,8 @@ void dst_release(struct dst_entry *dst)
newrefcnt = atomic_dec_return(&dst->__refcnt);
WARN_ON(newrefcnt < 0);
- if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt) {
- dst = dst_destroy(dst);
- if (dst)
- __dst_free(dst);
- }
+ if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt)
+ call_rcu(&dst->rcu_head, dst_destroy_rcu);
}
}
EXPORT_SYMBOL(dst_release);
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 097b3e7c1e8f..cc3b7fd34555 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -73,12 +73,7 @@ static void __tunnel_dst_set(struct ip_tunnel_dst *idst,
{
struct dst_entry *old_dst;
- if (dst) {
- if (dst->flags & DST_NOCACHE)
- dst = NULL;
- else
- dst_clone(dst);
- }
+ dst_clone(dst);
old_dst = xchg((__force struct dst_entry **)&idst->dst, dst);
dst_release(old_dst);
}
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists