lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1296123612.3027.15.camel@edumazet-laptop>
Date:	Thu, 27 Jan 2011 11:20:12 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	David Miller <davem@...emloft.net>
Cc:	netdev@...r.kernel.org
Subject: Re: [RFC PATCH] net: Implement read-only protection and COW'ing of
 metrics.

Le jeudi 27 janvier 2011 à 11:01 +0100, Eric Dumazet a écrit :
> Le mercredi 26 janvier 2011 à 15:25 -0800, David Miller a écrit :
> > Eric, thanks again for your feedback.  I've taken a stab at fixing the
> > various races, in particular the one you discovered about metrics
> > sharing and how this interacts with fib_info releases.
> > 
> > What I've choosen to do is two-fold:
> > 
> > 1) Update ->_metrics atomically with cmpxchg once a route becomes publicly
> >    visible.
> > 
> > 2) Remember and grab a reference to the fib_info for shared read-only
> >    metrics in rt->fi, then release it once the metrics regerence goes
> >    away.
> > 
> > It sounds expensive but hear me out :-)
> > 
> > First of all, at rt_set_nexthop() time, the atomic we use to grab a
> > ref to the fib_info is replacing a 60-byte memcpy() into the dst
> > metrics.
> > 
> > Next, the ->_metrics atomic to un-COW the metrics at destroy time
> > might in fact be overkill.  Especially once writable metrics live in
> > the inetpeer cache (that's the next set of patches after this one).
> > 
> > Finally, once this change is stabilized we can be a lot smarter about
> > what we do at the time an entry is created.  For example, when a route
> > is looked up for a TCP socket, we essentially know we are going to COW
> > the route %99.99999 of the time.  So we can pass a hint into TCP's
> > route lookups in the flow flags field telling it to pre-COW the route.
> > 
> > TCP pre-COW'ing of metrics will thus save several atomics.
> > 
> > Anyways, here is the patch, it is only build tested at this point, but
> > I wanted to get feedback from you about the basic gist of things
> > as soon as possible.
> > 
> > Thanks!
> > 
> 
> Thanks David, I read this (I am a bit busy preparing my travel to
> Reunion/Maurice islands). This looks pretty nice. I have one comment :
> 
> >  
> > +u32 *dst_cow_metrics_generic(struct dst_entry *dst, unsigned long old)
> > +{
> > +	u32 *p = kmalloc(sizeof(u32) * RTAX_MAX, GFP_ATOMIC);
> > +
> > +	if (p) {
> > +		u32 *old_p = __DST_METRICS_PTR(old);
> > +		unsigned long prev, new;
> > +
> > +		memcpy(p, old_p, sizeof(u32) * RTAX_MAX);
> > +
> > +		new = (unsigned long) p;
> > +		prev = cmpxchg(&dst->_metrics, old, new);
> > +
> > +		if (prev != old) {
> > +			kfree(p);
> > +			p = __DST_METRICS_PTR(prev);
> > +			if (prev & DST_METRICS_READ_ONLY)
> > +				p = NULL;
> > +		}
> > +	}
> > +	return p;
> > +}
> > +EXPORT_SYMBOL(dst_cow_metrics_generic);
> > +
> ...
> 
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index 3e5b7cc..7fc6301 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -152,6 +152,36 @@ static void ipv4_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
> >  {
> >  }
> >  
> > +static u32 *ipv4_cow_metrics(struct dst_entry *dst, unsigned long old)
> > +{
> > +	u32 *p = kmalloc(sizeof(u32) * RTAX_MAX, GFP_ATOMIC);
> > +
> > +	if (p) {
> > +		u32 *old_p = __DST_METRICS_PTR(old);
> > +		unsigned long prev, new;
> > +
> > +		memcpy(p, old_p, sizeof(u32) * RTAX_MAX);
> > +
> > +		new = (unsigned long) p;
> > +		prev = cmpxchg(&dst->_metrics, old, new);
> > +
> > +		if (prev != old) {
> > +			kfree(p);
> > +			p = __DST_METRICS_PTR(prev);
> > +			if (prev & DST_METRICS_READ_ONLY)
> > +				p = NULL;
> > +		} else {
> 
> Hmm, I first asked myself why you dont use dst_cow_metrics_generic() to
> perform the generic allocation, but saw following :
> 
> > +			struct rtable *rt = (struct rtable *) dst;
> > +
> 
> Since you use cmpxchg() to permut the dst->_metrics, I feel this rt->fi
> needs some protection as well. Maybe store fi pointer inside the metrics
> instead of dst, or else you need a spinlock to perform the whole
> transaction (change dst->_metrics & rt->fi) ?
> 
> > +			if (rt->fi) {
> > +				fib_info_put(rt->fi);
> > +				rt->fi = NULL;
> > +			}
> > +		}
> > +	}
> > +	return p;
> > +}
> > +
> 


Hmm, reading again, I realize the rt->fi was set only when installing
the readonly metrics, so ignore my previous mail.

Thanks !


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ