netdev - Re: [RFC bpf-next 0/8] BPF 'force to MPTCP'

Message-ID: <22c30f70a632afb65b6cb2a7554e919673d48871.camel@redhat.com>
Date: Fri, 14 Jul 2023 09:56:06 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Stanislav Fomichev <sdf@...gle.com>, Alexei Starovoitov
	 <alexei.starovoitov@...il.com>
Cc: Geliang Tang <geliang.tang@...e.com>, Alexei Starovoitov
 <ast@...nel.org>,  Daniel Borkmann <daniel@...earbox.net>, John Fastabend
 <john.fastabend@...il.com>, Andrii Nakryiko <andrii@...nel.org>, Martin
 KaFai Lau <martin.lau@...ux.dev>, Song Liu <song@...nel.org>, Yonghong Song
 <yhs@...com>, KP Singh <kpsingh@...nel.org>,  Hao Luo <haoluo@...gle.com>,
 Jiri Olsa <jolsa@...nel.org>, bpf <bpf@...r.kernel.org>, MPTCP Upstream
 <mptcp@...ts.linux.dev>, netdev@...r.kernel.org
Subject: Re: [RFC bpf-next 0/8] BPF 'force to MPTCP'

On Thu, 2023-07-13 at 16:09 -0700, Stanislav Fomichev wrote:
> On 07/13, Alexei Starovoitov wrote:
> > imo all 3 options including this 4th one are too hacky.
> > I understand ld_preload limitations and desire to have it per-cgroup,
> > but messing this much with user space feels a little bit too much.
> > What side effects will it cause?
> 
> Maybe all that is really needed is some new per-netns sysctl to automatically
> upgrade from IPPROTO_TCP to IPPROTO_MPTCP? Or is it too broad of a
> brush?

I think it would actually be too broad, see below...

> I've also CC'd netdev for visibility...
> 
> > Meaning is this enough to just change the proto?
> > Nothing in user space later on needs to be aware the protocol is so different?
> 
> IIUC, if you use IPPROTO_MPTCP, you just get regular TCP until you start
> adding extra routes (via netlink). That's why their current
> unconditional IPPROTO_TCP->IPPROTO_MPTCP rewrite via ld_preload also somewhat
> works.

FTR, it's the other way around: when using IPPROTO_MPTCP you always get
the MPTCP protocol handshake, which downgrades gracefully to TCP if the
peer does not support it. Multiple paths can then be added/enabled by
different means, but that is another, quite orthogonal, matter.
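
For illustration only, here is a minimal sketch of what "using
IPPROTO_MPTCP" looks like from an application. It assumes a Linux host;
the helper name open_stream_socket() is hypothetical, and the value 262
is the Linux IPPROTO_MPTCP constant, used when the Python build does not
export it. Note that the fallback in this sketch is the local one (the
kernel refusing to create an MPTCP socket), not the on-the-wire
downgrade described above, which the kernel performs on its own during
the handshake.

```python
import socket

# Linux value of IPPROTO_MPTCP; Python exports the constant from 3.10 on.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_stream_socket(family=socket.AF_INET):
    """Hypothetical helper: return (sock, is_mptcp).

    Tries to create an MPTCP socket first. If the kernel refuses
    (no MPTCP support, MPTCP disabled, non-Linux host), it creates
    a plain TCP socket instead.
    """
    try:
        sock = socket.socket(family, socket.SOCK_STREAM, IPPROTO_MPTCP)
        return sock, True
    except OSError:
        # Local failure only: the peer has not been contacted yet.
        pass
    sock = socket.socket(family, socket.SOCK_STREAM, socket.IPPROTO_TCP)
    return sock, False
```

A caller would then connect() as usual; whether the connection ends up
as MPTCP or falls back to TCP depends on the peer.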

The transition to TCP is currently not completely free: active
(client) MPTCP sockets that have fallen back to TCP keep some overhead
compared to plain TCP ones.

Being able to control the IPPROTO_TCP->IPPROTO_MPTCP change on a
per-socket basis does offer some advantages, e.g. constraining the
change to the sockets that are likely to complete the MPTCP handshake
successfully.
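
As a rough sketch of such a per-socket decision, the snippet below picks
the protocol from the destination address. This is an application-side
illustration, not the BPF hook proposed in the RFC; pick_protocol() and
the MPTCP_CAPABLE_NETS list are hypothetical, and 192.0.2.0/24 is a
documentation prefix standing in for "servers known to speak MPTCP".

```python
import ipaddress
import socket

# Linux value of IPPROTO_MPTCP; Python exports the constant from 3.10 on.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)

# Hypothetical policy: networks whose servers are assumed to support MPTCP.
MPTCP_CAPABLE_NETS = [ipaddress.ip_network("192.0.2.0/24")]


def pick_protocol(dest_ip, capable_nets=MPTCP_CAPABLE_NETS):
    """Return the protocol number to use for a socket to dest_ip."""
    addr = ipaddress.ip_address(dest_ip)
    if any(addr in net for net in capable_nets):
        return IPPROTO_MPTCP
    # Unknown peers stay on plain TCP and avoid the fallback overhead.
    return socket.IPPROTO_TCP
```

With this policy, only connections towards the listed networks attempt
the MPTCP handshake; everything else keeps using plain TCP sockets.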

> > I feel the consequences are too drastic to support such thing
> > through an official/stable hook.
> > We can consider an fmod_ret unstable hook somewhere in the kernel
> > that bpf prog can attach to and tweak the ret value and/or args,
> > but the production environment won't be using it.
> > It will be a temporary gap until user space is properly converted to mptcp.
> 
> Asking every app to do s/IPPROTO_TCP/IPPROTO_MPTCP/ might be annoying
> though? (don't have a horse in this race, but have some v4->v6 migration
> vibes from this)

I can only make wild guesses, but I also expect such a "transition" to
be extremely long and/or incomplete.

Cheers,

Paolo