lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAADnVQ++0o-YoFR=yUcbJvaLX2rmWjC2LHjc3yyCp+bkgWGn1Q@mail.gmail.com>
Date: Wed, 26 Feb 2025 09:30:18 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Maciej Fijalkowski <maciej.fijalkowski@...el.com>, Jakub Kicinski <kuba@...nel.org>
Cc: bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>, 
	Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, 
	Network Development <netdev@...r.kernel.org>, "Karlsson, Magnus" <magnus.karlsson@...el.com>, 
	Martin KaFai Lau <martin.lau@...ux.dev>, Amery Hung <ameryhung@...il.com>
Subject: Re: [PATCH bpf-next 0/3] bpf: introduce skb refcount kfuncs

On Tue, Feb 25, 2025 at 6:27 AM Maciej Fijalkowski
<maciej.fijalkowski@...el.com> wrote:
>
> On Fri, Feb 21, 2025 at 05:55:57PM -0800, Alexei Starovoitov wrote:
> > On Thu, Feb 20, 2025 at 5:45 AM Maciej Fijalkowski
> > <maciej.fijalkowski@...el.com> wrote:
> > >
> > > Hi!
> > >
> > > This patchset provides what is needed for storing skbs as kptrs in bpf
> > > maps. We start with necessary kernel change as discussed at [0] with
> > > Martin, then next patch adds kfuncs for handling skb refcount and on top
> > > of that a test case is added where one program stores skbs and then next
> > > program is able to retrieve them from map.
> > >
> > > Martin, regarding the kernel change I decided to go with boolean
> > > approach instead of what you initially suggested. Let me know if it
> > > works for you.
> > >
> > > Thanks,
> > > Maciej
> > >
> > > [0]: https://lore.kernel.org/bpf/Z0X%2F9PhIhvQwsgfW@boxer/
> >
> > Before we go further we need a lot more details on "why" part.
> > In the old thread I was able to find:
> >
> > > On TC egress hook skb is stored in a map ...
> > > During TC ingress hook on the same interface, the skb that was previously
> > stored in map is retrieved ...
> >
> > This is too cryptic. I see several concerns with such use case
> > including netns crossing, L2/L3 mismatch, skb_scrub.
> >
> > I doubt we can make such "skb stash in a map" safe without
> > restricting the usage, so please provide detailed
> > description of the use case.
>
> Hi Alexei,
>
> We have a system with two nodes: one is a fully fledged Linux system (big node)
> and the other one a smaller embedded system. The big node runs Linux PTP for
> time synchronization, the smaller node we have no control over (might run Linux
> or something else). The problem is that we would like to use the Tx timestamps
> from the small node in the Linux PTP application on the big node. When a packet
> is sent out from the big node it arrives at the small node that send it out one
> of its interfaces. It then replies with another packet back to the big node
> with the Tx timestamp in it.
>
> Our current PoC for attacking this is to store the skb in a map (using this
> patch set) when it is sent out from the big node then retrieve it from the map
> when the reply from the small node is received. We then take the timestamp from
> the packet and put it in the skb and send it up to the socket error queue so
> that Linux PTP works out of the box.

This sounds like you're actually xmit-ing the skb out of the big node
and storing it in a map via simple refcnt++.
That may work in some setups, but in general is not quite correct
from networking stack pov.
You need to skb_clone() it and keep the clone, so only cloned skb
can go into the socket error queue and up to user space.
xmit-ing the same skb and sending to user space
is going to cause issues.

Cleaner design would probably involve bpf_clone_redirect()
and may be some form of bpf-qdisc where packet is waiting in the queue
until its hwtimestamp is adjusted and its released from the queue
into user space.

What happens when small node doesn't send that timestamped packet?
The map will eventually overflow, right?
Overall it feels that stashing skb-s in a map isn't the right approach.

> Note that for the purpose of setting skb->hwtstamp and sending it up to the
> error queue we are adding a kfunc (bpf_tx_tstamp) responsible for it, which is
> not included in this set, so I understand it is not obvious what we achieved
> with the current form of this patch set being discussed.
>
> We did not consider the restrictions that should be implemented from netns POV,
> so that is a good point. How would you see this being fixed?
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ