netdev - Re: SOF_TIMESTAMPING_OPT_ID is unreliable when sendmsg fails


Message-ID: <CALCETrVM6ooAH7eaZc6Ugh3FOon3M-ohWAS_CVQFc_194Vj9GA@mail.gmail.com>
Date: Thu, 8 Feb 2024 12:40:36 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Vadim Fedorenko <vadim.fedorenko@...ux.dev>
Cc: Willem de Bruijn <willemb@...gle.com>, "David S. Miller" <davem@...emloft.net>, 
	Network Development <netdev@...r.kernel.org>, Jakub Kicinski <kuba@...nel.org>
Subject: Re: SOF_TIMESTAMPING_OPT_ID is unreliable when sendmsg fails

On Thu, Feb 8, 2024 at 12:05 PM Andy Lutomirski <luto@...capital.net> wrote:
>
> On Thu, Feb 8, 2024 at 11:55 AM Vadim Fedorenko
> <vadim.fedorenko@...ux.dev> wrote:
> >
> > On 08/02/2024 18:02, Andy Lutomirski wrote:
> > > > I’ve been using OPT_ID-style timestamping for years, but for some
> > > > reason this issue only bit me last week: if sendmsg() fails on a UDP
> > > > or ping socket, sk_tskey is left in an unpredictable state.  It may
> > > > or may not get incremented by the failed sendmsg().
> > >
> > Well, there are several error paths, for sure. For the sockets you
> > mention the increment of tskey happens at __ip{,6}_append_data. There
> > > are two kinds of failure that can happen after the increment: the
> > > first is a failed MTU check, the other is a memory allocation failure.
> > > I believe we can move the increment to a later position, after the MTU
> > > check in both functions, to avoid the first kind of problem.
>
> For reasons that I still haven't deciphered, I'm sporadically getting
> EHOSTUNREACH after the increment.  I can't find anything in the code
> that would cause that, and every time I try to instrument it, it stops
> happening :(  I sendmsg to the same destination several times in rapid
> succession, and at most one of them will get EHOSTUNREACH.

I caught it in strace, finally.  And I also finally grepped the right
part of the kernel tree to (I think) find the offending call chain.

__ip_append_data first increments sk_tskey.  Then it does:

            if (transhdrlen) {
                skb = sock_alloc_send_skb(sk, alloclen,
                        (flags & MSG_DONTWAIT), &err);

(I have no idea why the transhdrlen path is different.)  That does:

static inline struct sk_buff *sock_alloc_send_skb(struct sock *sk,
                          unsigned long size,
                          int noblock, int *errcode)
{
    return sock_alloc_send_pskb(sk, size, 0, noblock, errcode, 0);
}

That does:

struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
                     unsigned long data_len, int noblock,
                     int *errcode, int max_page_order)
{
    struct sk_buff *skb;
    long timeo;
    int err;

    timeo = sock_sndtimeo(sk, noblock);
    for (;;) {
        err = sock_error(sk);

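So any error already pending on the socket is picked up here, after
sk_tskey has been incremented, and it fails the send.  I'm utterly
baffled why that check makes any sense whatsoever.  git blame informs
me that it predates 2002.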
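
To make the ordering concrete, here is a minimal user-space model of the
call chain above.  It is a sketch, not kernel code: struct model_sock,
model_sock_error() and model_send() are hypothetical stand-ins, and the
pending EHOSTUNREACH is assumed to have been queued by some earlier event
(the trace above does not show where it comes from).  The only behavior it
models is "take a key first, then consume the pending socket error".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the two pieces of socket state involved. */
struct model_sock {
    uint32_t tskey;   /* models sk_tskey */
    int pending_err;  /* models an error already queued on the socket */
};

/* Models sock_error(): return the pending error as -errno and clear it. */
int model_sock_error(struct model_sock *sk)
{
    int err = sk->pending_err;

    sk->pending_err = 0;
    return -err;
}

/*
 * Models the ordering described above: the key is taken first, and only
 * then does the allocation path check for a pending socket error.
 */
int model_send(struct model_sock *sk, uint32_t *key_out)
{
    *key_out = sk->tskey++;
    return model_sock_error(sk);
}

/* Returns the key handed to the first send that succeeds after one failure. */
uint32_t key_after_failed_send(void)
{
    struct model_sock sk = { .tskey = 0, .pending_err = EHOSTUNREACH };
    uint32_t key;
    int ret;

    ret = model_send(&sk, &key);   /* fails, but key 0 is already consumed */
    assert(ret == -EHOSTUNREACH);
    ret = model_send(&sk, &key);   /* succeeds and is tagged with key 1 */
    assert(ret == 0);
    return key;
}
```

In this model an application that counts only successful sends would
expect key 0 on its first timestamp and would see key 1 instead.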

I'll contemplate a bit more and send a patch.
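
For context, the OPT_ID-style setup discussed in this thread can be
sketched roughly as below.  This is an assumption about a typical setup,
not the code from my application: opt_id_flags(), enable_opt_id() and
struct key_tracker are hypothetical helper names, and the flag
combination is just one common choice for software TX timestamps.  The
tracker mirrors the kernel counter, which for datagram sockets starts at
zero when the option is enabled and advances once per send call.

```c
#include <assert.h>
#include <linux/net_tstamp.h>
#include <sys/socket.h>

/* Software TX timestamps, each tagged with a per-socket key. */
unsigned int opt_id_flags(void)
{
    return SOF_TIMESTAMPING_TX_SOFTWARE |
           SOF_TIMESTAMPING_SOFTWARE |
           SOF_TIMESTAMPING_OPT_ID;
}

/* Enable the flags on fd.  Returns 0 on success, -1 with errno set. */
int enable_opt_id(int fd)
{
    unsigned int flags = opt_id_flags();

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
                      &flags, sizeof(flags));
}

/* Hypothetical user-space mirror of the key the kernel will assign next. */
struct key_tracker {
    unsigned int next_key;
};

/* Call after a successful sendmsg(); returns the key that send should carry. */
unsigned int tracker_on_send_ok(struct key_tracker *t)
{
    assert(t);
    return t->next_key++;
}
```

The mirror stays correct only if a failed sendmsg() never consumes a key,
which is exactly the property that does not hold today.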