lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d87b06a38f6a8fa61abc36b06f108568b82bfa21.camel@mailbox.tu-berlin.de>
Date:   Tue, 14 Jun 2022 12:51:23 +0200
From:   Jörn-Thorben Hinz <jthinz@...lbox.tu-berlin.de>
To:     Martin KaFai Lau <kafai@...com>
CC:     <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
        "Daniel Borkmann" <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>, <netdev@...r.kernel.org>
Subject: Re: [PATCH bpf-next 2/2] bpf: Require only one of cong_avoid() and
 cong_control() from a TCP CC

On Thu, 2022-06-09 at 11:55 -0700, Martin KaFai Lau wrote:
> On Thu, Jun 09, 2022 at 10:55:25AM +0200, Jörn-Thorben Hinz wrote:
> > Thanks for the feedback, Martin.
> > 
> > On Wed, 2022-06-08 at 11:33 -0700, Martin KaFai Lau wrote:
> > > On Wed, Jun 08, 2022 at 07:48:43PM +0200, Jörn-Thorben Hinz
> > > wrote:
> > > > When a CC implements tcp_congestion_ops.cong_control(), the
> > > > alternate
> > > > cong_avoid() is not in use in the TCP stack. Do not force a BPF
> > > > CC
> > > > to
> > > > implement cong_avoid() as a no-op by always requiring it.
> > > > 
> > > > An incomplete BPF CC implementing neither cong_avoid() nor
> > > > cong_control() will still get rejected by
> > > > tcp_register_congestion_control().
> > > > 
> > > > Signed-off-by: Jörn-Thorben Hinz <jthinz@...lbox.tu-berlin.de>
> > > > ---
> > > >  net/ipv4/bpf_tcp_ca.c | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > > 
> > > > diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
> > > > index 1f5c53ede4e5..37290d0bf134 100644
> > > > --- a/net/ipv4/bpf_tcp_ca.c
> > > > +++ b/net/ipv4/bpf_tcp_ca.c
> > > > @@ -17,6 +17,7 @@ extern struct bpf_struct_ops
> > > > bpf_tcp_congestion_ops;
> > > >  static u32 optional_ops[] = {
> > > >         offsetof(struct tcp_congestion_ops, init),
> > > >         offsetof(struct tcp_congestion_ops, release),
> > > > +       offsetof(struct tcp_congestion_ops, cong_avoid),
> > > At least one of the cong_avoid() or cong_control() is needed.
> > > It is better to remove is_optional(moff) check and its
> > > optional_ops[]
> > > here.  Only depends on the tcp_register_congestion_control()
> > > which
> > > does a similar check at the beginning.
> > You mean completely remove this part of the validation from
> > bpf_tcp_ca.c and just rely on tcp_register_congestion_control()?
> > True,
> Yes.
> 
> > that would be even easier to maintain at this point, make
> > tcp_register_congestion_control() the one-and-only place that has
> > to
> > know about required and optional functions.
> > 
> > Will rework the second patch.
> > 
> > > 
> > > Patch 1 looks good.  tcp_bbr.c also needs the sk_pacing fields.
> > > 
> > > A selftest is needed.  Can you share your bpf tcp-cc and
> > > use it as a selftest to exercise the change in this patch
> > > set ?
> > I cannot do that just now, unfortunately. It’s still earlier work
> > in
> > progress. Also, it will have an additional, external dependency
> > which
> > might make it unfit to be included here/as a selftest. I will keep
> > it
> > in mind for later this year, though.
> What is the external dependency ?  Could you share some high level
> of the CC you are developing ?
> The reason for this question is to see if there is something
> missing from the kernel side to write the tcp-cc in bpf that you
> are developing.
The algorithm is PowerTCP[1], I’m currently implementing it with eBPF.
It requires telemetry from the network.

The mentioned dependency (not developed by us, not yet released) will
provide such telemetry. Its end-host part is implemented with eBPF
itself and does not require any additional kernel patches.

At the moment I’m not aware of any other bits missing from the kernel
side. Will propose another patch if that changes or at least report it.

[1] https://www.usenix.org/system/files/nsdi22-paper-addanki_3.pdf

> 
> > In the meantime, I could look into adding a more naive/trivial
> > test,
> > that implements cong_control() without cong_avoid() and relies on
> > sk_pacing_* being writable, if you would prefer that? Would that be
> > fine as a follow-up patch (might take me a moment) or better be
> > included in this series?
> Yeah, it will do and the test should be submitted together in
> this series.
Please see v3 of the series.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ