Message-ID: <20160909084834.7067784c@halley>
Date:   Fri, 9 Sep 2016 08:48:34 +0300
From:   Shmulik Ladkani <shmulik.ladkani@...il.com>
To:     netdev@...r.kernel.org
Cc:     Florian Westphal <fw@...len.de>,
        "David S. Miller" <davem@...emloft.net>,
        Hannes Frederic Sowa <hannes@...essinduktion.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Alexander Duyck <alexander.h.duyck@...el.com>
Subject: Re: [RFC PATCH] net: ip_finish_output_gso: Attempt gso_size
 clamping if segments exceed mtu

On Thu, 25 Aug 2016 12:05:33 +0300 Shmulik Ladkani <shmulik.ladkani@...il.com> wrote:
> The BUG occurs when GRO occurs on the ingress, and only if GRO merges
> skbs into the frag_list (OTOH when segments are only placed into frags[]
> of a single skb, skb_segment succeeds even if gso_size was altered).
> 
> This is due to an assumption that the frag_list members terminate on
> exact MSS boundaries (which no longer holds during gso_size clamping).
> 
> We have a few alternatives for gso_size clamping:
> 
> 1 Fix 'skb_segment' arithmetic to support inputs that do not match
>   the "frag_list members terminate on exact MSS" assumption.
> 
> 2 Perform gso_size clamping in 'ip_finish_output_gso' for non-GROed skbs.
>   Other use cases will still benefit: (a) packets arriving from
>   virtualized interfaces, e.g. tap and friends; (b) packets arriving from
>   veth of other namespaces (packets are locally generated by TCP stack
>   on a different netns).
> 
> 3 Ditch the idea, again ;)
> 
> We can pursue (1), especially if there are other benefits in doing so.
> OTOH due to the current complexity of 'skb_segment' this is a bit risky.
> 
> Going with (2) could be reasonable too, as it brings value for
> the virtualized environments that need gso_size clamping, while
> presenting minimal risk.

Summarizing actions taken, in case someone refers to this thread.

- Re (1): Spent a short while massaging skb_segment().
  The code is not prepared to support arbitrary gso_size inputs.
  The main issue is that if nskb's frags[] get exhausted (while the
  original frag_skb's frags[] are not yet fully traversed), no new skb
  is generated. The code expects the iteration over nskb's frags[] and
  frag_skb's frags[] to terminate together; the next skb allocated is
  always a clone of the next frag_skb in the original head_skb.
  Supporting arbitrary gso_size inputs would have required an intrusive
  rewrite.

- Re (2): There's no easy way for ip_finish_output_gso() to detect that
  the skb is safe for "gso_size clamping" while preserving GSO/GRO
  transparency:
  We can know it is "gso_size clamping safe" PER SKB, but that doesn't
  suffice; to preserve the GRO transparency rule, we must know the skb
  arrived from a code flow that is ALWAYS safe for gso_size clamping.
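
For anyone reading this in the archive, the boundary assumption behind
(1) can be illustrated with a toy model. This is only a sketch, not
kernel code and not what skb_segment() literally does: the function name
members_end_on_gso_boundaries() and the flat array of payload lengths
are hypothetical simplifications of a head skb followed by its frag_list
members. It checks whether every buffer except the last ends exactly on
a multiple of gso_size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (hypothetical, not kernel code).
 *
 * member_len[i] is the payload length of the i-th buffer of a GRO-merged
 * chain: the head skb first, then each frag_list member in order.
 *
 * Returns true if every buffer except the last one ends exactly on a
 * multiple of gso_size, i.e. the "frag_list members terminate on exact
 * MSS boundaries" assumption holds for this gso_size.
 */
static bool members_end_on_gso_boundaries(const unsigned int *member_len,
                                          size_t n, unsigned int gso_size)
{
    unsigned int offset = 0;

    assert(gso_size > 0);

    /* The last buffer may be short, so only check the first n - 1. */
    for (size_t i = 0; i + 1 < n; i++) {
        offset += member_len[i];
        if (offset % gso_size != 0)
            return false;
    }
    return true;
}
```

With buffers of 1448, 1448, 1448 and 900 bytes, the check holds for
gso_size 1448 but fails once gso_size is clamped to, say, 1400. A chain
consisting of a single buffer passes trivially for any gso_size, which
loosely mirrors the frags[]-only case mentioned in the quoted text.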

So I ended up identifying the relevant code flow of the use case I'm
interested in, and verified that it is indeed safe for altering gso_size
(while taking a slight risk that this might not hold true in the future).
I've used that mark as the criterion for safe "gso_size clamping" in
'ip_finish_output_gso'. Yep, not too elegant.

Regards,
Shmulik