Date:   Mon, 3 Feb 2020 12:12:03 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Eric Dumazet' <eric.dumazet@...il.com>,
        netdev <netdev@...r.kernel.org>
Subject: RE: Freeing 'temporary' IPv4 route table entries.

From: Eric Dumazet
> Sent: 31 January 2020 15:54
> On 1/31/20 2:26 AM, David Laight wrote:
> > If I call sendmsg() on a raw socket (or probably
> > an unconnected UDP one) rt_dst_alloc() is called
> > in the bowels of ip_route_output_flow() to hold
> > the remote address.
> >
> > Much later __dev_queue_xmit() calls dst_release()
> > to delete the 'dst' referenced from the skb.
> >
> > Prior to f8864972 it did just that.
> > Afterwards the actual delete is 'laundered' through the
> > rcu callbacks.
> > This is probably ok for dst that are actually attached
> > to sockets or tunnels (which aren't freed very often).
> > But it leads to horrid long rcu callback sequences
> > when a lot of messages are sent.
> > (A sample of 1 gave nearly 100 deletes in one go.)
> > There is also the additional cost of deferring the free
> > (and the extra retpoline etc).
> >
> > ISTM that the dst_alloc() done during a send should
> > set a flag so that the 'dst' can be immediately
> > freed since it is known that no one can be picking up
> > a reference as it is being freed.
> >
> > Thoughts?
> >
> 
> I thought these routes were cached in per-cpu caches.
> 
> At least for UDP I do not see rcu callbacks being queued.

I've done a bit more investigation.

For raw IP sockets with inet->hdrincl set (i.e. the application
builds the IPv4 header), flowi4_flags has FLOWI_FLAG_KNOWN_NH set.
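
For anyone wanting to exercise this path from user space, the following
is a minimal sketch of the socket setup, assuming Linux and glibc's
struct iphdr. The helper names (inet_csum, fill_ipv4_header,
open_hdrincl_socket), the TTL and the choice of IPPROTO_UDP are
illustrative assumptions, not taken from the test program discussed in
this thread. Opening the socket needs CAP_NET_RAW.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Internet checksum (RFC 1071) over 'len' bytes, read as big-endian words. */
static uint16_t inet_csum(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t sum = 0;

	while (len > 1) {
		sum += (uint32_t)p[0] << 8 | p[1];
		p += 2;
		len -= 2;
	}
	if (len)
		sum += (uint32_t)p[0] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Build a 20-byte IPv4 header the way an IP_HDRINCL sender would. */
static void fill_ipv4_header(struct iphdr *ip, const char *src,
			     const char *dst, uint16_t payload_len)
{
	memset(ip, 0, sizeof(*ip));
	ip->version = 4;
	ip->ihl = 5;			/* 5 * 4 = 20 bytes, no options */
	ip->ttl = 64;			/* arbitrary value for the sketch */
	ip->protocol = IPPROTO_UDP;
	ip->tot_len = htons((uint16_t)(sizeof(*ip) + payload_len));
	inet_pton(AF_INET, src, &ip->saddr);
	inet_pton(AF_INET, dst, &ip->daddr);
	ip->check = htons(inet_csum(ip, sizeof(*ip)));
}

/* Returns a raw socket with IP_HDRINCL enabled, or -1 on failure. */
static int open_hdrincl_socket(void)
{
	int one = 1;
	int fd = socket(AF_INET, SOCK_RAW, IPPROTO_UDP);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_IP, IP_HDRINCL, &one, sizeof(one)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Each sendmsg() on such a socket then carries the application's own
header in front of the payload, with the destination also given in
msg_name.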

This is detected inside __mkroute_output() (in ipv4/route.c)
and forces rt_dst_alloc() to be called instead of using the
'dst' from (I think) fib_select_path().
(Is this basically the arp table entry?)

rt_set_nexthop() then calls rt_add_uncached_list().

I suspect that dst_release() is called after every
transmit, but normally it just decrements the ref count.
However, for raw sends it frees the uncached route.
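
The difference between the two cases can be illustrated with a toy
user-space model. This is only a sketch and not kernel code: every
toy_* name is hypothetical, and a plain linked list stands in for the
RCU callback queue. A shared cached dst only has its count dropped,
while each uncached dst hits zero and is either freed on the spot or
queued until the simulated grace period ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy user-space model, not kernel code; every name is hypothetical. */
struct toy_dst {
	int refcnt;
	struct toy_dst *next;	/* link in the deferred-free queue */
};

static struct toy_dst *deferred_head;	/* stands in for the RCU callback list */
static size_t deferred_len;

static struct toy_dst *toy_dst_alloc(void)
{
	struct toy_dst *d = calloc(1, sizeof(*d));

	assert(d != NULL);
	d->refcnt = 1;
	return d;
}

static void toy_dst_hold(struct toy_dst *d)
{
	d->refcnt++;
}

/* Drop one reference; on the last one, free now or queue for later. */
static void toy_dst_release(struct toy_dst *d, int free_now)
{
	assert(d->refcnt > 0);
	if (--d->refcnt > 0)
		return;		/* cached route: someone else still holds it */
	if (free_now) {
		free(d);
		return;
	}
	d->next = deferred_head;
	deferred_head = d;
	deferred_len++;
}

/* Models the end of a grace period: all queued frees run in one go. */
static size_t toy_run_deferred(void)
{
	size_t n = 0;

	while (deferred_head) {
		struct toy_dst *d = deferred_head;

		deferred_head = d->next;
		free(d);
		n++;
	}
	deferred_len = 0;
	return n;
}

/* Send 'count' packets, each with a fresh uncached dst; returns queue length. */
static size_t toy_send_uncached(size_t count, int free_now)
{
	for (size_t i = 0; i < count; i++)
		toy_dst_release(toy_dst_alloc(), free_now);
	return deferred_len;
}

/* Send 'count' packets that share one cached dst; returns queue length. */
static size_t toy_send_cached(struct toy_dst *cached, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		toy_dst_hold(cached);		/* skb takes a reference */
		toy_dst_release(cached, 0);	/* dropped after transmit */
	}
	return deferred_len;
}
```

In this model a burst of sends on the uncached path with free_now == 0
leaves one queued free per packet, which is the batching effect
described at the top of the thread; the cached path queues nothing.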

I think the 'fault' is down to c27c9322d which fixed
an issue where the code was using the IP address from the
pre-built packet instead of the one from the destination
address.

I think there are two issues:
1) if __mkroute_output() creates an 'uncached' route
   it can be freed without waiting for an RCU grace period.
2) if a raw packet's destination address matches then
   the cached route can be used.
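
The two points above can be written down as a small decision sketch.
It is a hypothetical user-space model only: the toy_* names do not
exist in the kernel, and it assumes that 'matches' means the
destination in the pre-built header equals the address passed to
sendmsg(). Whether skipping the grace period is actually safe is
exactly the open question, so this states the proposal rather than
settling it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the proposal; none of these names are kernel APIs. */
enum toy_route { TOY_ROUTE_CACHED, TOY_ROUTE_UNCACHED };

/*
 * Point 2: only fall back to a private, uncached route when the
 * application-built header names a different destination than the
 * address given to sendmsg(). Addresses are opaque 32-bit values here.
 */
static enum toy_route toy_pick_route(bool hdrincl, uint32_t msg_daddr,
				     uint32_t hdr_daddr)
{
	if (hdrincl && msg_daddr != hdr_daddr)
		return TOY_ROUTE_UNCACHED;
	return TOY_ROUTE_CACHED;
}

/*
 * Point 1: a route that was never published in a cache is assumed to
 * have no concurrent readers, so its free would not need deferring.
 */
static bool toy_needs_grace_period(enum toy_route r)
{
	return r != TOY_ROUTE_UNCACHED;
}
```

With both changes, a raw sender whose header destination agrees with
the sendmsg() address would stay on the cached path, and the remaining
uncached routes would be freed without queueing a callback.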

Oh - nothing seems to check DST_HOST any more.

	David

