[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1329287109.2555.44.camel@edumazet-laptop>
Date: Wed, 15 Feb 2012 07:25:09 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Piergiorgio Beruto <piergiorgio.beruto@...il.com>
Cc: davem@...emloft.net, netdev@...r.kernel.org
Subject: Re: Possible bugfix for AF_UNIX, SOCK_SEQPACKET sockets
Le mardi 14 février 2012 à 23:05 +0100, Piergiorgio Beruto a écrit :
> Hello,
> I would like to have your opinion regarding what I think could be a
> misbehaviour of the SIOCINQ ioctl on AF_UNIX domain sockets.
>
> I was implementing an IPC user-space library which makes use of a
> variety of sockets for sending data all around.
> When dealing with datagram sockets (i.e. SOCK_DGRAM, SOCK_RDM and
> SOCK_SEQPACKET) I used to implement the following paradigm:
>
> 1) issue the SIOCINQ (aka FIONREAD) ioctl and get the "size of the
> first queued datagram"
> 2) allocate memory of that size and fill it reading out data from the socket
>
> This works very fine for UDP and TIPC (including SOCK_SEQPACKET)
> sockets but when using AF_UNIX (SEQPACKET) sockets the same ioctl
> returns instead the number of bytes queued (sum of all pending
> datagrams) .
>
> In short, there is no way (as far as I know) to work out how much
> memory to allocate in userspace before reading out the datagram.
> The workaround for this is to use a fixed swap buffer of the max
> possible size of your packets (assuming you know / can retrieve it
> somehow) and make a copy of your data (which is bad for performance
> anyway).
>
I dont understand.... SIOCINQ returns an upper limit to you.
So your app can malloc() a big enough buffer.
Yes, SEQPACKET preserves message boundaries so your recvmsg() can fill
part of the application buffer.
> Looking at the kernel code I've found this in the
> linux/net/unix/af_unix.c file (my comment with ////)
>
> case SIOCINQ:
> {
> struct sk_buff *skb;
>
> if (sk->sk_state == TCP_LISTEN) {
> err = -EINVAL;
> break;
> }
>
> spin_lock(&sk->sk_receive_queue.lock);
> if (sk->sk_type == SOCK_STREAM ||
> sk->sk_type == SOCK_SEQPACKET) { //// <--------
> treats SEQPACKET as SOCK_STREAM
> skb_queue_walk(&sk->sk_receive_queue, skb)
> amount += skb->len;
> } else {
> skb = skb_peek(&sk->sk_receive_queue);
> if (skb)
> amount = skb->len;
> }
> spin_unlock(&sk->sk_receive_queue.lock);
> err = put_user(amount, (int __user *)arg);
> break;
> }
>
> In my opinion SOCK_SEQPACKETS are similar to SOCK_STREAM because they
> are "connection oriented" sockets but from the I/O point of view are
> more similar to datagram ones.
>
> My proposal is to remove the above condition from the test and let the
> function process SOCK_SEQPACKETS as if they were SOCK_DGRAM for this
> particular ioctl.
>
This behavior is undocumented in "man unix", so it is "implementation
dependant"
This change could break some applications.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists