lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160708165131.GA23650@obsidianresearch.com>
Date:	Fri, 8 Jul 2016 10:51:31 -0600
From:	Jason Gunthorpe <jgunthorpe@...idianresearch.com>
To:	Roland Dreier <roland@...estorage.com>
Cc:	Alexander Duyck <alexander.duyck@...il.com>,
	Konstantin Khlebnikov <koct9i@...il.com>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
	Florian Westphal <fw@...len.de>,
	Thadeu Lima de Souza Cascardo <cascardo@...hat.com>,
	Cong Wang <xiyou.wangcong@...il.com>,
	Linux Kernel Network Developers <netdev@...r.kernel.org>,
	David Miller <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Resurrecting due to huge ipoib perf regression - [BUG] skb
 corruption and kernel panic at forwarding with fragmentation

On Fri, Jul 08, 2016 at 07:18:11AM -0700, Roland Dreier wrote:
> On Thu, Jul 7, 2016 at 4:14 PM, Jason Gunthorpe
> <jgunthorpe@...idianresearch.com> wrote:
> > We have neighbour_priv, and ndo_neigh_construct/destruct now ..
> >
> > A first blush that would seem to be enough to let ipoib store the AH
> > and other path information in the neigh and avoid the cb? At least the
> > example in clip sure looks like what ipoib needs to do.
> 
> Do you think those new facilities let us go back to using the neigh
> and still avoid the issues that led to commit b63b70d87741 ("IPoIB:
> Use a private hash table for path lookup in xmit path")?

Well, the priv stuff were brought up in the discussion around
b63b70d87741 but never fully analyzed. Maybe it could have been used
to solve that problem, who knows.. I guess it doesn't help this exact
issue because we don't have a dst at hard header time anyhow.

But, DaveM suggested how to handle our current problem in the above thread:

http://marc.info/?l=linux-rdma&m=132813323907877&w=2

Which is the same route CLIP took:

331         struct dst_entry *dst = skb_dst(skb);
347         rt = (struct rtable *) dst;
348         if (rt->rt_gateway)
349                 daddr = &rt->rt_gateway;
350         else
351                 daddr = &ip_hdr(skb)->daddr;
352         n = dst_neigh_lookup(dst, daddr);

(DaveM said it should be &ip/ipv6_hdr(skb)->daddr, not the rtable cast)

Last time this was brought up you were concerned about ARP, ARP
sets skb_dst after calling dev_hard_header:

310         skb = arp_create(type, ptype, dest_ip, dev, src_ip,
311                          dest_hw, src_hw, target_hw);
312         if (!skb)
313                 return;
314
315         skb_dst_set(skb, dst_clone(dst));

However, there is at least one fringe case (arp_send) where the dst is
left NULL. Presumably there are other fringe cases too..

So, it appears, the dst and neigh can be used for all performances cases.

For the non performance dst == null case, can we just burn cycles and
stuff the daddr in front of the packet at hardheader time, even if we
have to copy?

Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ