netdev - Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns


Message-ID: <be0d17cf-12d5-440c-adee-e943ccb199c9@default>
Date:   Thu, 15 Mar 2018 05:23:41 -0700 (PDT)
From:   Liran Alon <liran.alon@...cle.com>
To:     <daniel@...earbox.net>
Cc:     <netdev@...r.kernel.org>, <shmulik.ladkani@...il.com>,
        <davem@...emloft.net>, <linux-kernel@...r.kernel.org>,
        <yuval.shaia@...cle.com>, <idan.brown@...cle.com>
Subject: Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info
 only when crossing netns


----- daniel@...earbox.net wrote:

> On 03/15/2018 10:21 AM, Shmulik Ladkani wrote:
> > Regarding the premise of this commit, this "reduces" the
> > ipvs/orphan/mark scrubbing in the following *non* xnet situations:
> > 
> >  1. macvlan port xmit to other macvlan ports in Bridge Mode
> >  2. similarly for ipvlan
> >  3. veth xmit
> >  4. l2tp_eth_dev_recv
> >  5. bpf redirect/clone_redirect ingress actions
> > 
> > Regarding l2tp recv, this commit seems to align the scrubbing
> > behavior with ip tunnels (full scrub only if crossing netns, see
> > ip_tunnel_rcv).
> > 
> > Regarding veth xmit, it does make sense to preserve the fields if
> > not crossing netns. This is also the case when one uses tc mirred.
> > 
> > Regarding bpf redirect, well, it depends on the expectations of
> > each bpf program.
> > I'd argue that preserving the fields (at least the mark field) in
> > the *non* xnet makes sense and provides more information and
> > therefore more capabilities; Alas this might change behavior
> > already being relied on.
> > 
> > Maybe Daniel can comment on the matter.
> 
> Overall I think it might be nice to not need scrubbing skb in such
> cases, although my concern would be that this has potential to break
> existing setups when they would expect mark being zero on other veth
> peer in any case since it's the behavior for a long time already. The
> safer option would be to have some sort of explicit opt-in e.g. on
> link creation to let the skb->mark pass through unscrubbed. This would
> definitely be a useful option e.g. when mark is set in the netns
> facing veth via clsact/egress on xmit and when the container is
> unprivileged anyway.
> 
> Thanks,
> Daniel

I see your point regarding backwards compatibility.
However, not scrubbing an skb when it crosses netns via some kernel functions, as compared to
others, is basically a bug that could easily break with a little more refactoring.
Therefore, it seems a bit weird to me that from now on we would force
every user, on link creation, to consider that there was once a bug leading
to this weird behavior on specific netdevs.
Thus, I suggest controlling this via a global /proc/sys/net file instead.
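
To make the global-knob idea concrete, here is a minimal userspace sketch, not
kernel code. It assumes a single boolean switch standing in for the proposed
/proc/sys/net file. Every name prefixed with toy_ is hypothetical and exists
only for illustration. The struct models just the three pieces of per-netns
state named in this thread (ipvs, orphan, mark).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the sk_buff state discussed in this thread. */
struct toy_skb {
    uint32_t mark;          /* models skb->mark */
    bool     ipvs_property; /* models the ipvs flag */
    bool     has_owner;     /* models an attached socket (not yet orphaned) */
};

/*
 * Hypothetical global knob, standing in for a /proc/sys/net entry.
 * true  = legacy behavior: always scrub on forward.
 * false = scrub per-netns state only when the packet crosses netns.
 */
bool toy_scrub_always = true;

/* xnet is true when the source and destination devices are in different netns. */
void toy_scrub_packet(struct toy_skb *skb, bool xnet)
{
    /* Same netns and the knob allows it: preserve the fields. */
    if (!xnet && !toy_scrub_always)
        return;

    /* Crossing netns, or legacy mode: drop the per-netns state. */
    skb->mark = 0;
    skb->ipvs_property = false;
    skb->has_owner = false;
}
```

With the knob left at its default, a same-netns forward still zeroes the mark,
so existing setups see no change. Only after an administrator flips the knob
would the mark survive a same-netns veth xmit.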

-Liran