[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20161129.192233.392359849404330043.davem@davemloft.net>
Date: Tue, 29 Nov 2016 19:22:33 -0500 (EST)
From: David Miller <davem@...emloft.net>
To: hannes@...essinduktion.org
Cc: pabeni@...hat.com, netdev@...r.kernel.org, edumazet@...gle.com,
brouer@...hat.com, sd@...asysnail.net
Subject: Re: [PATCH net-next 5/5] udp: add recvmmsg implementation
From: Hannes Frederic Sowa <hannes@...essinduktion.org>
Date: Fri, 25 Nov 2016 18:09:00 +0100
> During review we discussed on how to handle major errors in the kernel:
>
> The old code and the new code still can report back success even though
> the kernel got back an EFAULT while copying from kernel space to user
> space (due to bad pointers).
>
> I favor that we drop all packets (also the already received batches) in
> this case and let the code report -EFAULT and increase sk_drops for all
> dropped packets from the queue.
>
> Currently sk_err is set so the next syscall would get an -EFAULT, which
> seems very bad and can also be overwritten by incoming icmp packets, so
> we never get a notification that we actually had a bad pointer somewhere
> in the mmsghdr. Also delivering -EFAULT on the follow-up syscalls really
> will make people confused that use strace.
>
> If people would like to know the amount of packets dropped we can make
> sk_drops readable by an getsockopt.
>
> Thoughts?
>
> Unfortunately the interface doesn't allow for better error handling.
I think this is a major problem.
If, as a side effect of batch dequeueing the SKBs from the socket,
you cannot stop properly mid-transfer if an error occurs, well then
you simply cannot batch like that.
You have to stop the exact byte where an error occurs mid-stream,
return the successful amount of bytes transferred, and then return
the error on the next recvmmsg call.
There is no other sane error reporting strategy.
If I get 4 frames, and the kernel can successfully copy the first
three and get an -EFAULT on the 4th. Dammit you better tell the
application this so it can properly process the first 3 packets and
then determine how it is going to error out and recover for the 4th
one.
If we need to add prioritized sk_err stuff, or another value like
"sk_app_err" to handle the ICMP vs. -EFAULT issue, so be it.
I know what you guys are thinking, in that you can't figure out a
way to avoid the transactional overhead if it is necessary to
"put back" some SKBs if one of them in the batch gets a fault.
That's too bad, we need a proper implementation and proper error
reporting. Those performance numbers are useless if we effectively
lose error notifications.
Powered by blists - more mailing lists