Message-ID: <1318520635.2393.22.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
Date: Thu, 13 Oct 2011 17:43:55 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Марк Коренберг
<socketpair@...il.com>
Cc: netdev@...r.kernel.org, linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: (splice socket -> pipe) + EPOLLET -> epoll_wait does not not
wake up !
On Thursday, 13 October 2011 at 20:48 +0600, Марк Коренберг wrote:
> The problem:
>
> man 7 epoll says:
>     For stream-oriented files (e.g., pipe, FIFO, stream socket), the
>     condition that the read/write I/O space is exhausted can also be
>     detected by checking the amount of data read from / written to the
>     target file descriptor. For example, if you call read(2) by asking to
>     read a certain amount of data and read(2) returns a lower number of
>     bytes, you can be sure of having exhausted the read I/O space for
>     the file descriptor.
>
> I decided to use splice() from the socket to a pipe instead of recv(),
> so I registered the socket's fd in epoll with EPOLLIN|EPOLLET.
>
> When data arrives on the socket faster than I splice() it out, the
> following sometimes happens:
>
> 1. My code is sure that the pipe is empty.
> 2. My code calls splice(socket, pipe, 65536).
> 3. splice() returns, say, 53248.
> 4. Following the man page, my code decides not to call splice() again,
> because it assumes the next call would return EWOULDBLOCK (= EAGAIN).
> 5. So my code goes to epoll_wait() to wait for EPOLLIN on the socket.
> 6. epoll_wait() hangs.
>
> This does not happen if I just use recv(), but that may be because the
> throughput is lower and some race condition is at play.
>
> The hacked version of strace output is attached.
Your assumptions about splice() are false.
splice() can transfer partial pages.
So you can hit the 16-page pipe limit without splice() necessarily
returning 16*PAGE_SIZE.
With TCP frames, usually only 1460 bytes per page are used.
You must call splice() again and again until it fails with EAGAIN
(a return value of 0 means end of input, not "try again later").
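
To illustrate the advice above, here is a minimal sketch in C of draining a
socket after an edge-triggered EPOLLIN event. The helper names drain_socket()
and flush_pipe(), the 65536-byte request size, and the blocking out_fd are
assumptions made for illustration; none of them come from the code discussed
in this thread. The sketch also assumes that the socket is O_NONBLOCK and that
the pipe is empty on entry, so an EAGAIN from splice() can only mean that the
socket has no more data.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/socket.h>   /* socket types; the caller creates sock_fd */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: move exactly `len` bytes from the pipe to out_fd.
 * Assumes out_fd is a blocking descriptor. Returns 0 on success, -1 on error. */
static int flush_pipe(int pipe_rd, int out_fd, size_t len)
{
    while (len > 0) {
        ssize_t n = splice(pipe_rd, NULL, out_fd, NULL, len, SPLICE_F_MOVE);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;            /* pipe ended before len bytes were moved */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: call after epoll_wait() reports EPOLLIN (with EPOLLET)
 * for sock_fd. Keeps splicing until the socket reports EAGAIN or end of input.
 * Returns the number of bytes moved, or -1 on error; *eof is set to true when
 * the peer has closed the connection. */
static ssize_t drain_socket(int sock_fd, int pipe_rd, int pipe_wr,
                            int out_fd, bool *eof)
{
    ssize_t total = 0;

    assert(sock_fd >= 0 && pipe_rd >= 0 && pipe_wr >= 0 && out_fd >= 0);
    *eof = false;

    for (;;) {
        ssize_t n = splice(sock_fd, NULL, pipe_wr, NULL, 65536,
                           SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
        if (n > 0) {
            /* A short count proves nothing: the pipe may simply be out of
             * slots. Empty the pipe, then try the socket again. */
            if (flush_pipe(pipe_rd, out_fd, (size_t)n) < 0)
                return -1;
            total += n;
            continue;
        }
        if (n == 0) {             /* end of input: peer closed */
            *eof = true;
            return total;
        }
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN)      /* pipe is empty here, so the socket is drained */
            return total;
        return -1;
    }
}
```

Only once drain_socket() has returned is it safe to go back to epoll_wait():
with EPOLLET no new event is delivered for data that was already queued when
the previous event fired.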