netdev - Re: [RFC PATCH]xfrm: fix perpetual bundles

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Tue, 02 Mar 2010 07:11:45 -0500
From:	jamal <hadi@...erus.ca>
To:	Herbert Xu <herbert@...dor.apana.org.au>
Cc:	davem@...emloft.net, kaber@...sh.net, yoshfuji@...ux-ipv6.org,
	nakam@...ux-ipv6.org, eric.dumazet@...il.com,
	netdev@...r.kernel.org,
	Steffen Klassert <steffen.klassert@...unet.com>
Subject: Re: [RFC PATCH]xfrm: fix perpetual bundles

On Tue, 2010-03-02 at 19:27 +0800, Herbert Xu wrote:
> On Wed, Feb 24, 2010 at 08:19:24AM -0500, jamal wrote:

> > 1)In the connect() stage, in the slow path a route cache is
> > created with the rth->fl.fl4_src of 0.0.0.0...
> > ==> policy->bundles is empty, so we do a lookup, fail, create
> > one.. (remember rth->fl.fl4_src of 0.0.0.0 at this stage and
> > thats what we end storing in the bundle/xdst for later comparison
> > instead of the skb's fl)
> 
> So this is root number 1.  When this stuff was first written this
> case simply wasn't possible.  So I think the question we need to
> ask here is can we get a valid address there at the connect stage?

fl->fl4_src is valid non-zero. But in xfrm4_fill_dst()
we do wholesale copy of xdst->u.rt.fl = rt->fl; and rt->fl.fl4_src is
0.0.0.0.


> After all, for non-IPsec connect(2)s, you do get a valid IP address.
> So I don't see why the IPsec case should be different.
>
> Creating a bundle with a zero source address is just a hack to
> make connect(2) succeed immediately.  AFAICS getting a valid IP
> address can also be done without waiting for the whole IPsec state
> to be created.
> 

I did try to "fix it" above via:
+               if (!xdst->u.rt.fl.fl4_src) {
+                       xdst->u.rt.fl.fl4_src = fl->fl4_src;
+               }

But this breaks again later in sendmsg bundle lookup because of 
XFRM_SUB_POLICY. If i turned off config XFRM_SUB_POLICY, then
all works. I didnt look closely, but SUB_POLICY does do a memcpy
or two off the dst passed in connect() - which has the wrong src.
So i would have to "fix" a few more spots for it to work. This is
where i gave up concluding that i was just plugging with band-aids.

> Of course if anybody is still interested we could also revisit
> the neighbouresque queueing idea.

not plugged into that discussion..

> > 2)ping sends a packet (does a sendmsg) 
> > ==> xfrm_find_bundle() ends up comparing skb's->fl (non-zero
> > fl->fl4_src) with what we stored from #1b. Fails.
> > ==> we create a new bundle at attach the old one at the end of it.
> > ...and now policy->bundles has two xdst entries
> 
> This is the way it's supposed to work.

> > 3) Repeat #2, and now we have 3 xdsts in policy bundles
> 
> This is what I don't understand.  The code is supposed to look
> at every bundle attached to the policy.  So why doesn't it find
> the one we created at step #2?

The issue is that the comparison is between xdst->u.rt.fl.fl4_src and
fl->fl4_src. fl->fl4_src is always non-zero. stored
xdst->u.rt.fl.fl4_src is always zero 

> In conclusion, I think we have two problems, with the second
> one being the most immediate cause of your symptoms.

Remember the route cache (refer to dst copying above) is created at
connect time;->

So Steffen (on CC) tried to "fix it" by fixing at route cache creation
time. His approach:

--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2778,15 +2778,26 @@ int ip_route_output_flow(struct net *net, struct
rtable **rp, struct flowi *flp,
                         struct sock *sk, int flags)
 {
        int err;
+       int update_route = 0;
 
        if ((err = __ip_route_output_key(net, rp, flp)) != 0)
                return err;
 
        if (flp->proto) {
-               if (!flp->fl4_src)
+               if (!flp->fl4_src) {
                        flp->fl4_src = (*rp)->rt_src;
-               if (!flp->fl4_dst)
+                       update_route = 1;
+               }
+               if (!flp->fl4_dst) {
                        flp->fl4_dst = (*rp)->rt_dst;
+                       update_route = 1;
+               }
+               if (update_route) {
+                       dst_release(&(*rp)->u.dst);
+                       if ((err = __ip_route_output_key(net, rp,
flp)) != 0)
+                               return err;
+               }
+
                err = __xfrm_lookup(net, (struct dst_entry **)rp, flp,
sk,
                                    flags ? XFRM_LOOKUP_WAIT : 0);
                if (err == -EREMOTE)
--

I was worried about the impact of this on something else that expects
the behavior.

cheers,
jamal

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html