Message-ID: <CALCETrVM6ooAH7eaZc6Ugh3FOon3M-ohWAS_CVQFc_194Vj9GA@mail.gmail.com>
Date: Thu, 8 Feb 2024 12:40:36 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Vadim Fedorenko <vadim.fedorenko@...ux.dev>
Cc: Willem de Bruijn <willemb@...gle.com>, "David S. Miller" <davem@...emloft.net>,
Network Development <netdev@...r.kernel.org>, Jakub Kicinski <kuba@...nel.org>
Subject: Re: SOF_TIMESTAMPING_OPT_ID is unreliable when sendmsg fails
On Thu, Feb 8, 2024 at 12:05 PM Andy Lutomirski <luto@...capital.net> wrote:
>
> On Thu, Feb 8, 2024 at 11:55 AM Vadim Fedorenko
> <vadim.fedorenko@...ux.dev> wrote:
> >
> > On 08/02/2024 18:02, Andy Lutomirski wrote:
> > > I’ve been using OPT_ID-style timestamping for years, but for some
> > > reason this issue only bit me last week: if sendmsg() fails on a UDP
> > > or ping socket, sk_tskey is poorly. It may or may not get incremented
> > > by the failed sendmsg().
> > >
> > Well, there are several error paths, for sure. For the sockets you
> > mention the increment of tskey happens at __ip{,6}_append_data. There
> > are 2 different types of failures which can happen after the increment.
> > The first is MTU check fail, another one is memory allocation failures.
> > I believe we can move increment to a later position, after MTU check in
> > both functions to avoid first type of problem.
>
> For reasons that I still haven't deciphered, I'm sporadically getting
> EHOSTUNREACH after the increment. I can't find anything in the code
> that would cause that, and every time I try to instrument it, it stops
> happening :( I sendmsg to the same destination several times in rapid
> succession, and at most one of them will get EHOSTUNREACH.

I caught it in strace, finally. And I also finally grepped the right
part of the kernel tree to (I think) find the offending call chain.

__ip_append_data first increments sk_tskey. Then it does:

    if (transhdrlen) {
            skb = sock_alloc_send_skb(sk, alloclen,
                            (flags & MSG_DONTWAIT), &err);

(I have no idea why the transhdrlen path is different.) That does:

    static inline struct sk_buff *sock_alloc_send_skb(struct sock *sk,
                                                      unsigned long size,
                                                      int noblock, int *errcode)
    {
            return sock_alloc_send_pskb(sk, size, 0, noblock, errcode, 0);
    }

That does:

    struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
                                         unsigned long data_len, int noblock,
                                         int *errcode, int max_page_order)
    {
            struct sk_buff *skb;
            long timeo;
            int err;

            timeo = sock_sndtimeo(sk, noblock);
            for (;;) {
                    err = sock_error(sk);

I'm utterly baffled why that check makes any sense whatsoever. git
blame informs me that it predates 2002.
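
To make the ordering problem concrete, here is a toy userspace model of the chain above. It is only a sketch, not kernel code: struct toy_sock, toy_sock_error() and toy_sendmsg() are made-up names, and the model assumes sock_error() reports a pending socket error once and clears it, which would also explain why at most one sendmsg() in a burst fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock: only the two fields that matter. */
struct toy_sock {
    int pending_err;     /* models sk->sk_err, set asynchronously (e.g. by ICMP) */
    unsigned int tskey;  /* models sk->sk_tskey */
};

/* Models sock_error(): return the pending error (negated) and clear it. */
static int toy_sock_error(struct toy_sock *sk)
{
    int err = sk->pending_err;

    sk->pending_err = 0;
    return -err;
}

/*
 * Models the order of operations described above: the key is taken and
 * incremented first, and only then does the allocation path look at the
 * pending error. Returns 0 or a negative errno; *key is the ID this call
 * consumed, whether or not the send went out.
 */
static int toy_sendmsg(struct toy_sock *sk, unsigned int *key)
{
    assert(sk != NULL && key != NULL);

    *key = sk->tskey++;          /* increment happens up front */
    return toy_sock_error(sk);   /* failure here does not undo it */
}
```

In this model, a send that fails with -EHOSTUNREACH still consumes a key, so a caller that only counts successful sends ends up one behind the IDs that come back on the error queue. The real function has other failure paths where the increment may not have happened yet, which is what makes the counter unpredictable rather than merely offset.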
I'll contemplate a bit more and send a patch.