[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b1f63c3f-5a26-4c69-af8f-216eaac19b69@tessares.net>
Date: Mon, 31 Jul 2023 15:56:40 +0200
From: Matthieu Baerts <matthieu.baerts@...sares.net>
To: Stanislav Fomichev <sdf@...gle.com>
Cc: Paul Moore <paul@...l-moore.com>, Geliang Tang <geliang.tang@...e.com>,
Alexei Starovoitov <ast@...nel.org>, bpf@...r.kernel.org,
netdev@...r.kernel.org, mptcp@...ts.linux.dev, apparmor@...ts.ubuntu.com,
linux-security-module@...r.kernel.org, selinux@...r.kernel.org,
linux-kselftest@...r.kernel.org
Subject: Re: [RFC bpf-next v5] bpf: Force to MPTCP
Hi Stanislav,
On 28/07/2023 18:51, Stanislav Fomichev wrote:
> On 07/28, Matthieu Baerts wrote:
>> Hi Stanislav,
>>
>> On 27/07/2023 20:01, Stanislav Fomichev wrote:
>>> On 07/27, Matthieu Baerts wrote:
>>>> Hi Paul, Stanislav,
>>>>
>>>> On 18/07/2023 18:14, Paul Moore wrote:
>>>>> On Tue, Jul 18, 2023 at 11:21 AM Geliang Tang <geliang.tang@...e.com> wrote:
>>>>>>
>>>>>> As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
>>>>>>
>>>>>> "Your app can create sockets with IPPROTO_MPTCP as the proto:
>>>>>> ( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
>>>>>> forced to create and use MPTCP sockets instead of TCP ones via the
>>>>>> mptcpize command bundled with the mptcpd daemon."
>>>>>>
>>>>>> But the mptcpize (LD_PRELOAD technique) command has some limitations
>>>>>> [2]:
>>>>>>
>>>>>> - it doesn't work if the application is not using libc (e.g. GoLang
>>>>>> apps)
>>>>>> - in some envs, it might not be easy to set env vars / change the way
>>>>>> apps are launched, e.g. on Android
>>>>>> - mptcpize needs to be launched with all apps that want MPTCP: we could
>>>>>> have more control from BPF to enable MPTCP only for some apps or all the
>>>>>> ones of a netns or a cgroup, etc.
>>>>>> - it is not in BPF, we cannot talk about it at netdev conf.
>>>>>>
>>>>>> So this patchset attempts to use BPF to implement functions similer to
>>>>>> mptcpize.
>>>>>>
>>>>>> The main idea is add a hook in sys_socket() to change the protocol id
>>>>>> from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/multipath-tcp/mptcp_net-next/wiki
>>>>>> [2]
>>>>>> https://github.com/multipath-tcp/mptcp_net-next/issues/79
>>>>>>
>>>>>> v5:
>>>>>> - add bpf_mptcpify helper.
>>>>>>
>>>>>> v4:
>>>>>> - use lsm_cgroup/socket_create
>>>>>>
>>>>>> v3:
>>>>>> - patch 8: char cmd[128]; -> char cmd[256];
>>>>>>
>>>>>> v2:
>>>>>> - Fix build selftests errors reported by CI
>>>>>>
>>>>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
>>>>>> Signed-off-by: Geliang Tang <geliang.tang@...e.com>
>>>>>> ---
>>>>>> include/linux/bpf.h | 1 +
>>>>>> include/linux/lsm_hook_defs.h | 2 +-
>>>>>> include/linux/security.h | 6 +-
>>>>>> include/uapi/linux/bpf.h | 7 +
>>>>>> kernel/bpf/bpf_lsm.c | 2 +
>>>>>> net/mptcp/bpf.c | 20 +++
>>>>>> net/socket.c | 4 +-
>>>>>> security/apparmor/lsm.c | 8 +-
>>>>>> security/security.c | 2 +-
>>>>>> security/selinux/hooks.c | 6 +-
>>>>>> tools/include/uapi/linux/bpf.h | 7 +
>>>>>> .../testing/selftests/bpf/prog_tests/mptcp.c | 128 ++++++++++++++++--
>>>>>> tools/testing/selftests/bpf/progs/mptcpify.c | 17 +++
>>>>>> 13 files changed, 187 insertions(+), 23 deletions(-)
>>>>>> create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
>>>>>
>>>>> ...
>>>>>
>>>>>> diff --git a/security/security.c b/security/security.c
>>>>>> index b720424ca37d..bbebcddce420 100644
>>>>>> --- a/security/security.c
>>>>>> +++ b/security/security.c
>>>>>> @@ -4078,7 +4078,7 @@ EXPORT_SYMBOL(security_unix_may_send);
>>>>>> *
>>>>>> * Return: Returns 0 if permission is granted.
>>>>>> */
>>>>>> -int security_socket_create(int family, int type, int protocol, int kern)
>>>>>> +int security_socket_create(int *family, int *type, int *protocol, int kern)
>>>>>> {
>>>>>> return call_int_hook(socket_create, 0, family, type, protocol, kern);
>>>>>> }
>>>>>
>>>>> Using the LSM to change the protocol family is not something we want
>>>>> to allow. I'm sorry, but you will need to take a different approach.
>>>>
>>>> @Paul: Thank you for your feedback. It makes sense and I understand.
>>>>
>>>> @Stanislav: Despite the fact the implementation was smaller and reusing
>>>> more code, it looks like we cannot go in the direction you suggested. Do
>>>> you think what Geliang suggested before in his v3 [1] can be accepted?
>>>>
>>>> (Note that the v3 is the same as the v1, only some fixes in the selftests.)
>>>
>>> We have too many hooks in networking, so something that doesn't add
>>> a new one is preferable :-(
>>
>> Thank you for your reply and the explanation, I understand.
>>
>>> Moreover, we already have a 'socket init' hook, but it runs a bit late.
>>
>> Indeed. And we cannot move it before the creation of the socket.
>>
>>> Is existing cgroup/sock completely unworkable? Is it possible to
>>> expose some new bpf_upgrade_socket_to(IPPROTO_MPTCP) kfunc which would
>>> call some new net_proto_family->upgrade_to(IPPROTO_MPTCP) to do the surgery?
>>> Or is it too hacky?
>>
>> I cannot judge if it is too hacky or not but if you think it would be
>> OK, please tell us :)
>
> Maybe try and see how it goes? Doing the surgery to convert from tcp
> to mptcp is probably hard, but it seems that we should be able to
> do something like:
>
> int upgrade_to(sock, sk) {
> if (sk is not a tcp one) return -EINVAL;
>
> sk_common_release(sk);
> return inet6_create(net, sock, IPPROTO_MPTCP, false);
> }
>
> ?
>
> The only thing I'm not sure about is whether you can call inet6_create
> on a socket that has seen sk_common_release'd...
Oh sorry, now I better understand your suggestion and the fact it is
hacky. Good workaround, we can keep this in mind if there is no other
solutions to avoid these create-release-create operations.
>>> Another option Alexei suggested is to add some fentry-like thing:
>>>
>>> noinline int update_socket_protocol(int protocol)
>>> {
>>> return protocol;
>>> }
>>> /* TODO: ^^^ add the above to mod_ret set */
>>>
>>> int __sys_socket(int family, int type, int protocol)
>>> {
>>> ...
>>>
>>> protocol = update_socket_protocol(protocol);
>>>
>>> ...
>>> }
>>>
>>> But it's also too problem specific it seems? And it's not cgroup-aware.
>>
>> It looks like it is what Geliang did in his v6. If it is the only
>> acceptable solution, I guess we can do without cgroup support. We can
>> continue the discussions in his v6 if that's easier.
>
> Ack, that works too, let's see how other people feel about it. I'm
> assuming in the bpf program we can always do bpf_get_current_cgroup_id()
> to filter by cgroup.
Good point, that works too and looks enough!
Cheers,
Matt
--
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net
Powered by blists - more mailing lists