[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20141106032533.GU7996@ZenIV.linux.org.uk>
Date: Thu, 6 Nov 2014 03:25:34 +0000
From: Al Viro <viro@...IV.linux.org.uk>
To: David Miller <davem@...emloft.net>
Cc: herbert@...dor.apana.org.au, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, bcrl@...ck.org
Subject: Re: [PATCH 1/4] inet: Add skb_copy_datagram_iter
On Wed, Nov 05, 2014 at 04:57:19PM -0500, David Miller wrote:
> From: Al Viro <viro@...IV.linux.org.uk>
> Date: Wed, 5 Nov 2014 21:07:45 +0000
>
> > Ping me when you put it there, OK? I'll rebase the rest of old stuff on
> > top of it (similar helpers, mostly).
>
> I just pushed it into net-next, thanks Al.
OK, I've taken the beginning of the old queue on top of net-next; it's
in git://git.kernel.org//pub/scm/linux/kernel/git/viro/vfs.git iov_iter-net.
>From the quick look at the remaining ->msg_iov users:
* I'll need to add several iov_iter primitives - counterparts of
checksum.h stuff (copy_and_csum_{from,to}_iter(), maybe some more). Not
a big deal, I'll do that tomorrow. That will give us a clean iov_iter-based
counterpart of skb_copy_and_csum_datagram_iovec().
* a new helper: zerocopy_sg_from_iter(). I have it, actually,
but I'd rather not step on Herbert's toes - it's too close to the areas
his series will touch, so that's probably for when his series goes in.
It will be needed for complete macvtap conversion...
* why doesn't verify_iovec() use rw_copy_check_uvector()? The only
real differences I see is that (a) you do allocation in callers (same as
rw_copy_check_uvector() would've done), (b) you return EMSGSIZE in case of
too long vector, while rw_copy_check_uvector() returns EINVAL in that case
and (c) you don't do access_ok(). The last one is described as optimization,
but for iov_iter primitives it's a serious PITA - for iovec-backed instances
they are using __copy_from_user()/__copy_to_user(), etc.
It certainly would be nice to have the same code doing all copying
of iovecs from userland - readv/writev/aio/sendmsg/recvmsg/etc. Am I
missing something subtle semantical difference in there? EMSGSIZE vs EINVAL
is trivial (we can lift that check into the callers, if nothing else), but
I could miss something more interesting...
* various getfrag will need to grow iov_iter-based counterparts,
but ip_append_output() needs no changes, AFAICS.
* crypto stuff will be easy to convert - iov_iter_get_pages()
would suffice for a primitive
* there's some really weird stuff in there. Just what is this
static int raw_probe_proto_opt(struct flowi4 *fl4, struct msghdr *msg)
{
struct iovec *iov;
u8 __user *type = NULL;
u8 __user *code = NULL;
int probed = 0;
unsigned int i;
if (!msg->msg_iov)
return 0;
for (i = 0; i < msg->msg_iovlen; i++) {
iov = &msg->msg_iov[i];
if (!iov)
continue;
trying to do? "If non-NULL pointer + i somehow happened to be NULL, skip it
and try to use the same pointer + i + 1"? Huh? Had been that way since
the function first went in back in 2004 ("[IPV4] XFRM: probe icmp type/code
when sending packets via raw socket.", according to historical tree)...
* rds, bluetooth and vsock are doing something odd; need to RTFS some
more.
* not sure I understand what TIPC is doing - does it prohibit too
short first segment of ->msg_iov? net/tipc/socket.c:dest_name_check() looks
odd _and_ potentially racy - we read the same data twice and hope our checks
still apply. I asked TIPC folks about that race back in April, but it
looks like that fell through the cracks...
Overall, so far it looks more or less feasible - other than the missing csum
primitives, current mm/iov_iter.c should suffice. I have _not_ seriously
looked into sendpage yet; that might very well require some more.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists