Date:   Tue, 8 Sep 2020 10:55:30 +0200
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Xie He <xie.he.0141@...il.com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        John Ogness <john.ogness@...utronix.de>,
        Eric Dumazet <edumazet@...gle.com>,
        Or Cohen <orcohen@...oaltonetworks.com>,
        Arnd Bergmann <arnd@...db.de>,
        Network Development <netdev@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Brian Norris <briannorris@...omium.org>,
        Cong Wang <xiyou.wangcong@...il.com>
Subject: Re: [PATCH net] net/packet: Fix a comment about hard_header_len and
 headroom allocation

On Tue, Sep 8, 2020 at 2:07 AM Xie He <xie.he.0141@...il.com> wrote:
>
> Thank you for your comment!
>
> On Mon, Sep 7, 2020 at 2:41 AM Willem de Bruijn
> <willemdebruijn.kernel@...il.com> wrote:
> >
> > On Sun, Sep 6, 2020 at 5:18 AM Xie He <xie.he.0141@...il.com> wrote:
> > >
> > > This comment is outdated and no longer reflects the actual implementation
> > > of af_packet.c.
> >
> > If it was previously true, can you point to a commit that changes the behavior?
>
> This is my understanding about the history of "af_packet.c":
>
> 1. Pre git history
>
> At first, before "needed_headroom" was introduced, "hard_header_len"
> was the only way for a driver to request headroom. However,
> "hard_header_len" was also used in "af_packet.c" for processing the
> header. There was a confusion / disagreement between "af_packet.c"
> developers and driver developers about the use of "hard_header_len".
> "af_packet.c" developers would assume that all headers were visible to
> them through dev->header_ops (called dev->hard_header at that time?).
> But the developers of some drivers were not able to expose all their
> headers to "af_packet.c" through header_ops (for example, in tunnel
> drivers). These drivers still requested the headroom via
> "hard_header_len" but this created bugs for "af_packet.c" because
> "af_packet.c" would assume "hard_header_len" was the length of the
> header visible to them through header_ops.
>
> Therefore, in Linux version 2.1.43pre1, the FIXME comment was added.
> In this comment, "af_packet.c" developers clearly stated that not
> exposing the header through header_ops was a bug that needed to be
> fixed in the drivers. But I think driver developers were not able to
> agree because some drivers really had a need to add their own header
> without using header_ops (for example in tunnel drivers).
>
> In Linux version 2.1.68, the developer of "af_packet.c" compromised
> and recognized the use of "hard_header_len" even when there is no
> header_ops, by adding the comment I'm trying to change now. But I
> guess some other developers of "af_packet.c" continued to treat
> "hard_header_len" as the length of the header exposed via header_ops,
> which created a lot of problems.
>
> 2. Introduction of "needed_headroom"
>
> Because this issue had troubled developers for a long time, in 2008
> developers introduced "needed_headroom" to solve this problem.
> "needed_headroom" has only one purpose - reserve headroom. It is not
> used in af_packet.c for processing so drivers can safely use it to
> request headroom without exposing the header via header_ops.
>
> The commit was:
> commit f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom")
>
> After "needed_headroom" was introduced, all drivers that needed to
> reserve the headroom but didn't want "af_packet.c" to interfere should
> change to "needed_headroom".
>
> From this point on, "af_packet.c" developers were able to assume
> "hard_header_len" was only used for header processing purposes in
> "af_packet.c".

Very nice archeology!

Thanks for summarizing.

> 3. Not reserving the headroom of hard_header_len for RAW sockets
>
> Another very important point in history is these two commits in 2018:
> commit b84bbaf7a6c8 ("packet: in packet_snd start writing at link
> layer allocation")
> commit 9aad13b087ab ("packet: fix reserve calculation")
>
> These two commits changed packet_snd to its present state and made it
> no longer reserve the headroom of hard_header_len for RAW sockets. This
> made it urgent for drivers to switch from hard_header_len to
> needed_headroom, because otherwise they might cause a kernel panic when
> used with RAW sockets.
>
> > > In this file, the function packet_snd first reserves a headroom of
> > > length (dev->hard_header_len + dev->needed_headroom).
> > > Then if the socket is a SOCK_DGRAM socket, it calls dev_hard_header,
> > > which calls dev->header_ops->create, to create the link layer header.
> > > If the socket is a SOCK_RAW socket, it "un-reserves" a headroom of
> > > length (dev->hard_header_len), and checks if the user has provided a
> > > header of length (dev->hard_header_len) (in dev_validate_header).
> >
> > Not entirely, a header greater than dev->min_header_len that passes
> > dev_validate_header.
>
> Yes, I understand. The function checks both hard_header_len and
> min_header_len. I want to explain the role of hard_header_len in
> dev_validate_header. But I find it a little hard to explain this
> concisely without simplifying a bit.

Ack.
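
To make the headroom accounting discussed above concrete, here is a small
userspace model in plain C. It is only a sketch of the behavior as described
in this thread, not the kernel code: struct model_dev, enum sock_kind and
user_data_offset() are hypothetical names, and the model ignores any
alignment rounding of the reserved space as well as the min_header_len
check in dev_validate_header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two socket types discussed in the thread. */
enum sock_kind { KIND_DGRAM, KIND_RAW };

/* Hypothetical, stripped-down model of the two net_device fields involved. */
struct model_dev {
	unsigned int hard_header_len;  /* length of the header seen via header_ops */
	unsigned int needed_headroom;  /* extra headroom the driver wants reserved */
};

/*
 * Offset from the start of the buffer at which user-supplied bytes land.
 *
 * Both socket types start by reserving hard_header_len + needed_headroom.
 * For DGRAM the link layer header is later built into the reserved space,
 * so user data stays behind the full reservation. For RAW the
 * hard_header_len part is un-reserved, because the user supplies the link
 * layer header as the first bytes of the data.
 */
static unsigned int user_data_offset(const struct model_dev *dev,
				     enum sock_kind kind)
{
	unsigned int reserved;

	assert(dev != NULL);
	reserved = dev->hard_header_len + dev->needed_headroom;
	if (kind == KIND_RAW)
		reserved -= dev->hard_header_len;
	return reserved;
}
```

In this model, a device that sets hard_header_len to 24 and leaves
needed_headroom at 0 gets an offset of 0 for RAW sockets, that is, no
headroom at all in front of the user data. The same device with
needed_headroom set to 24 and hard_header_len set to 0 gets 24 bytes for
both socket types.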

> > >  /*
> > >     Assumptions:
> > > -   - if device has no dev->hard_header routine, it adds and removes ll header
> > > -     inside itself. In this case ll header is invisible outside of device,
> > > -     but higher levels still should reserve dev->hard_header_len.
> > > -     Some devices are enough clever to reallocate skb, when header
> > > -     will not fit to reserved space (tunnel), another ones are silly
> > > -     (PPP).
> > > +   - If the device has no dev->header_ops, there is no LL header visible
> > > +     outside of the device. In this case, its hard_header_len should be 0.
> >
> > Such a constraint is more robustly captured with a compile time
> > BUILD_BUG_ON check. Please do add a comment that summarizes why the
> > invariant holds.
>
> I'm not sure how to do this. I guess both header_ops and
> hard_header_len are assigned at runtime. (Right?) I guess we are not
> able to check this at compile-time.

header_ops should be a compile-time constant, and most devices use
struct initializers for hard_header_len, but of course you're right.

Perhaps a WARN_ON_ONCE, then.
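
As a rough illustration of what such a run-time check would test, here is a
userspace sketch in plain C. It does not use the kernel's WARN_ON_ONCE; the
warn-once behavior is imitated with a static flag, and struct model_dev,
struct model_header_ops, model_warn_count and
header_len_invariant_violated() are all hypothetical names. Where a real
check would live in af_packet.c is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical placeholder for the kernel's struct header_ops. */
struct model_header_ops { int unused; };

/* Hypothetical, stripped-down model of the two net_device fields involved. */
struct model_dev {
	const struct model_header_ops *header_ops;
	unsigned int hard_header_len;
};

/* Number of warnings actually printed, exposed so the behavior can be observed. */
static unsigned int model_warn_count;

/*
 * Returns true when the invariant "no header_ops implies
 * hard_header_len == 0" is violated. A warning is printed only for the
 * first violation, imitating warn-once semantics.
 */
static bool header_len_invariant_violated(const struct model_dev *dev)
{
	static bool warned;
	bool bad;

	assert(dev != NULL);
	bad = dev->header_ops == NULL && dev->hard_header_len != 0;
	if (bad && !warned) {
		warned = true;
		model_warn_count++;
		fprintf(stderr,
			"warning: device without header_ops has hard_header_len=%u\n",
			dev->hard_header_len);
	}
	return bad;
}
```

The function reports every violation through its return value, but prints
only once, so a misbehaving driver would not flood the log on every send.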

> > More about the older comment, but if reusing: it's not entirely clear
> > to me what "outside of the device" means. The upper layers that
> > receive data from the device and send data to it, including
> > packet_snd, I suppose? Not the lower layers, clearly. Maybe that can
> > be more specific.
>
> Yes, right. If a header is visible "outside of the device", it means
> the header is exposed to upper layers via "header_ops". If a header is
> not visible "outside of the device" and is only used "internally", it
> means the header is not exposed to upper layers via "header_ops".
> Maybe we can change it to "outside of the device driver"? We can
> borrow the idea of encapsulation in object-oriented programming - some
> things that happen inside a software component should not be visible
> outside of that software component.

How about "above"? If sketched as a network stack diagram, the code
paths and devices below the (possibly tunnel) device do see packets
with link layer header.
