Message-ID: <d25962b44b2d4204a7251d97c331fcf8@AcuMS.aculab.com>
Date: Wed, 12 Apr 2023 07:39:26 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'David Ahern' <dsahern@...nel.org>, Jens Axboe <axboe@...nel.dk>,
"Breno Leitao" <leitao@...ian.org>
CC: Willem de Bruijn <willemb@...gle.com>,
"io-uring@...r.kernel.org" <io-uring@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"kuba@...nel.org" <kuba@...nel.org>,
"asml.silence@...il.com" <asml.silence@...il.com>,
"leit@...com" <leit@...com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"dccp@...r.kernel.org" <dccp@...r.kernel.org>,
"mptcp@...ts.linux.dev" <mptcp@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"willemdebruijn.kernel@...il.com" <willemdebruijn.kernel@...il.com>,
"matthieu.baerts@...sares.net" <matthieu.baerts@...sares.net>,
"marcelo.leitner@...il.com" <marcelo.leitner@...il.com>
Subject: RE: [PATCH 0/5] add initial io_uring_cmd support for sockets
From: David Ahern
> Sent: 11 April 2023 16:28
....
> Christoph's patch set a few years back that removed set_fs broke the
> ability to do in-kernel ioctl and {s,g}etsockopt calls. I did not
> follow that change; was it a deliberate intent to not allow these
> in-kernel calls vs wanting to remove the set_fs? e.g., can we add a
> kioctl variant for in-kernel use of the APIs?
I think that was a side effect, and with no in-tree in-kernel
users (apart from limited calls in bpf) it was deemed acceptable.
(It is a PITA for any code trying to use SCTP in kernel.)
One problem is that not all sockopt calls pass the correct length.
And some of them can have very long buffers.
Not to mention the ones that are read-modify-write.
A plausible solution is to pass a 'fat pointer' that contains
some, or all, of:
- A userspace buffer pointer.
- A kernel buffer pointer.
- The length supplied by the user.
- The length of the kernel buffer.
- The number of bytes to copy on completion.
For simple user requests the syscall entry/exit code
would copy the data to a short on-stack buffer.
Kernel users just pass the kernel address.
Odd requests can just use the user pointer.
Probably needs accessors that add in an offset.
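As a rough illustration of the 'fat pointer' idea, here is a userspace model of what the structure and its offset-taking accessors might look like. All names (struct sockopt_buf, sockopt_buf_read, sockopt_buf_write) are hypothetical and not from any existing kernel API, and memcpy() stands in for copy_from_user()/copy_to_user() on the user-pointer path.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 'fat pointer' carrying the fields listed above. */
struct sockopt_buf {
	void *user;       /* userspace buffer pointer, NULL for kernel callers */
	void *kern;       /* kernel buffer pointer, NULL for "odd" requests */
	size_t user_len;  /* length supplied by the user */
	size_t kern_len;  /* length of the kernel buffer */
	size_t copy_out;  /* number of bytes to copy on completion */
};

/*
 * Read len bytes starting at offset off.
 * Prefers the kernel buffer; falls back to the user pointer.
 * In real kernel code the fallback would be copy_from_user().
 */
static int sockopt_buf_read(const struct sockopt_buf *b, size_t off,
			    void *dst, size_t len)
{
	const void *base = b->kern ? b->kern : b->user;
	size_t cap = b->kern ? b->kern_len : b->user_len;

	if (!base)
		return -EFAULT;
	if (off > cap || len > cap - off)
		return -EINVAL;
	memcpy(dst, (const char *)base + off, len);
	return 0;
}

/*
 * Write len bytes at offset off and extend the copy-on-completion
 * count so the syscall exit path knows how much to copy back.
 * In real kernel code the user path would be copy_to_user().
 */
static int sockopt_buf_write(struct sockopt_buf *b, size_t off,
			     const void *src, size_t len)
{
	void *base = b->kern ? b->kern : b->user;
	size_t cap = b->kern ? b->kern_len : b->user_len;

	if (!base)
		return -EFAULT;
	if (off > cap || len > cap - off)
		return -EINVAL;
	memcpy((char *)base + off, src, len);
	if (off + len > b->copy_out)
		b->copy_out = off + len;
	return 0;
}
```

With something of this shape, a kernel caller would fill in only kern/kern_len, while the syscall path would point kern at its short on-stack copy and use copy_out to decide how much to write back to the user buffer.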
It might also be that some of the problematic sockopt
calls were in decnet, which has since been removed.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)