lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Thu, 27 Jan 2011 11:20:12 +0100 From: Eric Dumazet <eric.dumazet@...il.com> To: David Miller <davem@...emloft.net> Cc: netdev@...r.kernel.org Subject: Re: [RFC PATCH] net: Implement read-only protection and COW'ing of metrics. Le jeudi 27 janvier 2011 à 11:01 +0100, Eric Dumazet a écrit : > Le mercredi 26 janvier 2011 à 15:25 -0800, David Miller a écrit : > > Eric, thanks again for your feedback. I've taken a stab at fixing the > > various races, in particular the one you discovered about metrics > > sharing and how this interacts with fib_info releases. > > > > What I've choosen to do is two-fold: > > > > 1) Update ->_metrics atomically with cmpxchg once a route becomes publicly > > visible. > > > > 2) Remember and grab a reference to the fib_info for shared read-only > > metrics in rt->fi, then release it once the metrics regerence goes > > away. > > > > It sounds expensive but hear me out :-) > > > > First of all, at rt_set_nexthop() time, the atomic we use to grab a > > ref to the fib_info is replacing a 60-byte memcpy() into the dst > > metrics. > > > > Next, the ->_metrics atomic to un-COW the metrics at destroy time > > might in fact be overkill. Especially once writable metrics live in > > the inetpeer cache (that's the next set of patches after this one). > > > > Finally, once this change is stabilized we can be a lot smarter about > > what we do at the time an entry is created. For example, when a route > > is looked up for a TCP socket, we essentially know we are going to COW > > the route %99.99999 of the time. So we can pass a hint into TCP's > > route lookups in the flow flags field telling it to pre-COW the route. > > > > TCP pre-COW'ing of metrics will thus save several atomics. > > > > Anyways, here is the patch, it is only build tested at this point, but > > I wanted to get feedback from you about the basic gist of things > > as soon as possible. > > > > Thanks! > > > > Thanks David, I read this (I am a bit busy preparing my travel to > Reunion/Maurice islands). This looks pretty nice. I have one comment : > > > > > +u32 *dst_cow_metrics_generic(struct dst_entry *dst, unsigned long old) > > +{ > > + u32 *p = kmalloc(sizeof(u32) * RTAX_MAX, GFP_ATOMIC); > > + > > + if (p) { > > + u32 *old_p = __DST_METRICS_PTR(old); > > + unsigned long prev, new; > > + > > + memcpy(p, old_p, sizeof(u32) * RTAX_MAX); > > + > > + new = (unsigned long) p; > > + prev = cmpxchg(&dst->_metrics, old, new); > > + > > + if (prev != old) { > > + kfree(p); > > + p = __DST_METRICS_PTR(prev); > > + if (prev & DST_METRICS_READ_ONLY) > > + p = NULL; > > + } > > + } > > + return p; > > +} > > +EXPORT_SYMBOL(dst_cow_metrics_generic); > > + > ... > > > diff --git a/net/ipv4/route.c b/net/ipv4/route.c > > index 3e5b7cc..7fc6301 100644 > > --- a/net/ipv4/route.c > > +++ b/net/ipv4/route.c > > @@ -152,6 +152,36 @@ static void ipv4_dst_ifdown(struct dst_entry *dst, struct net_device *dev, > > { > > } > > > > +static u32 *ipv4_cow_metrics(struct dst_entry *dst, unsigned long old) > > +{ > > + u32 *p = kmalloc(sizeof(u32) * RTAX_MAX, GFP_ATOMIC); > > + > > + if (p) { > > + u32 *old_p = __DST_METRICS_PTR(old); > > + unsigned long prev, new; > > + > > + memcpy(p, old_p, sizeof(u32) * RTAX_MAX); > > + > > + new = (unsigned long) p; > > + prev = cmpxchg(&dst->_metrics, old, new); > > + > > + if (prev != old) { > > + kfree(p); > > + p = __DST_METRICS_PTR(prev); > > + if (prev & DST_METRICS_READ_ONLY) > > + p = NULL; > > + } else { > > Hmm, I first asked myself why you dont use dst_cow_metrics_generic() to > perform the generic allocation, but saw following : > > > + struct rtable *rt = (struct rtable *) dst; > > + > > Since you use cmpxchg() to permut the dst->_metrics, I feel this rt->fi > needs some protection as well. Maybe store fi pointer inside the metrics > instead of dst, or else you need a spinlock to perform the whole > transaction (change dst->_metrics & rt->fi) ? > > > + if (rt->fi) { > > + fib_info_put(rt->fi); > > + rt->fi = NULL; > > + } > > + } > > + } > > + return p; > > +} > > + > Hmm, reading again, I realize the rt->fi was set only when installing the readonly metrics, so ignore my previous mail. Thanks ! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists