Message-ID: <CA+FuTSc3O4XQAmtyY5Fwy96nL17ewdCouvwAJ=6DeMUcQUiz8A@mail.gmail.com>
Date: Wed, 18 Sep 2019 11:35:35 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Josh Hunt <johunt@...mai.com>
Cc: netdev <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
David Miller <davem@...emloft.net>
Subject: Re: udp sendmsg ENOBUFS clarification
On Tue, Sep 17, 2019 at 4:20 PM Josh Hunt <johunt@...mai.com> wrote:
>
> I was running some tests recently with the udpgso_bench_tx benchmark in
> selftests and noticed that in some configurations it reported sending
> more than line rate! Looking into it more I found that I was overflowing
> the qdisc queue and so it was sending back NET_XMIT_DROP however this
> error did not propagate back up to the application and so it assumed
> whatever it sent was done successfully. That's when I learned about
> IP_RECVERR and saw that the benchmark isn't using that socket option.
>
> That's all fairly straightforward, but what I was hoping to get
> clarification on is where is the line drawn on when or when not to send
> ENOBUFS back to the application if IP_RECVERR is *not* set? My guess
> based on going through the code is that as long as the packet leaves the
> stack (in this case sent to the qdisc) that's where we stop reporting
> ENOBUFS back to the application, but can someone confirm?
Once a packet is queued the system call may return, so any subsequent
drops after dequeue are not propagated back. The relevant rc is set in
__dev_xmit_skb on q->enqueue. On setups with multiple devices, such as
a tunnel or bonding path, the result of enqueue on the lower device is
similarly not propagated.
> For example, we sanitize the error in udp_send_skb():
>
> send:
>         err = ip_send_skb(sock_net(sk), skb);
>         if (err) {
>                 if (err == -ENOBUFS && !inet->recverr) {
>                         UDP_INC_STATS(sock_net(sk),
>                                       UDP_MIB_SNDBUFERRORS, is_udplite);
>                         err = 0;
>                 }
>         } else
>
>
> but in udp_sendmsg() we don't:
>
>         if (err == -ENOBUFS || test_bit(SOCK_NOSPACE,
>                                         &sk->sk_socket->flags)) {
>                 UDP_INC_STATS(sock_net(sk),
>                               UDP_MIB_SNDBUFERRORS, is_udplite);
>         }
>         return err;
That's interesting. My --incorrect-- understanding until now had been
that IP_RECVERR does nothing but enable optional extra detailed error
reporting on top of system call error codes.
But indeed it also enables reporting backpressure as a system call
error, which is otherwise suppressed. I don't know why. The behavior
precedes git history.
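
For reference, a minimal user-space sketch of opting in to that behavior
on Linux. The helper name udp_socket_with_recverr is hypothetical and
used only for illustration; the socket(), setsockopt() and IP_RECVERR
calls are the standard ones.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): open an IPv4 UDP socket
 * and enable IP_RECVERR on it. Returns the fd, or -1 on failure.
 */
static int udp_socket_with_recverr(void)
{
	int one = 1;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;

	/* Without this option, ENOBUFS from the transmit path is hidden. */
	if (setsockopt(fd, IPPROTO_IP, IP_RECVERR, &one, sizeof(one)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With the option set, a sender such as the benchmark would presumably
need to treat a sendmsg() failure with errno == ENOBUFS as "not sent"
rather than counting the bytes as transmitted.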
> In the case above it looks like we may only get ENOBUFS for allocation
> failures inside of the stack in udp_sendmsg() and so that's why we
> propagate the error back up to the application?
Both the udp lockless fast path and the slow corked path go through
udp_send_skb, so the backpressure is suppressed consistently across
both cases.
Indeed the error handling in udp_sendmsg then is not related to
backpressure, but to other causes of ENOBUFS, i.e., allocation failure.
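
The check quoted from udp_send_skb can be modeled in user space as a
rough sketch. model_udp_send_skb_err is a hypothetical name, not a
kernel function; it mirrors only the quoted condition and leaves out
the statistics counter.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the quoted udp_send_skb() check:
 * -ENOBUFS from the transmit path is turned into success unless the
 * socket has IP_RECVERR set. All other values pass through unchanged.
 */
static int model_udp_send_skb_err(int err, bool recverr)
{
	if (err == -ENOBUFS && !recverr)
		return 0;	/* backpressure hidden from the caller */
	return err;
}
```

An ENOBUFS raised in udp_sendmsg itself, for example on allocation
failure, does not pass through this check, which is why it still
reaches the application regardless of IP_RECVERR.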