Message-ID: <1285105131.6378.19.camel@edumazet-laptop>
Date: Tue, 21 Sep 2010 23:38:51 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Tom Herbert <therbert@...gle.com>
Cc: netdev@...r.kernel.org, davem@...emloft.net, sridharr@...gle.com
Subject: Re: [PATCH v2] xmit_compl_seq: information to reclaim vmsplice buffers

On Tuesday, 21 September 2010 at 11:57 -0700, Tom Herbert wrote:
> In this patch we propose to add a socket API to retrieve the
> "transmit completion sequence number", essentially a byte counter
> for the number of bytes that have been transmitted and will not be
> retransmitted. In the case of TCP, this should correspond to snd_una.
>
> The purpose of this API is to provide information to userspace about
> which buffers can be reclaimed when sending with vmsplice() on a
> socket.
>
> There are two methods for retrieving the completed sequence number:
> through a simple getsockopt (implemented here for TCP), as well as
> returning the value in the ancillary data of a recvmsg.
>
> The expected flow would be something like:
> - Connection is created
> - Initial completion seq # is retrieved through the sockopt, and is
> stored in userspace "compl_seq" variable for the connection.
> - Whenever a send is done, compl_seq += # bytes sent.
> - When doing a vmsplice the completion sequence number is saved
> for each user space buffer, buffer_compl_seq = compl_seq.
> - When recvmsg returns with a completion sequence number in
> ancillary data, any buffers covered by that sequence number
> (where buffer_compl_seq < recvmsg_compl_seq) are reclaimed
> and can be written to again.
> - If no data is received on a connection (recvmsg does not
> return), a timeout can be used to call the getsockopt and
> reclaim buffers as a fallback.
>
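
The userspace bookkeeping in the flow above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the names
conn_state, tracked_buf, track_send() and reclaim_bufs() are hypothetical,
the saved per-buffer value is assumed to be compl_seq taken after the
buffer's bytes were added, and the comparison is made wrap-safe because the
counter is 32 bits wide like snd_una. How the reported value is obtained
(getsockopt or recvmsg ancillary data) is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wrap-safe "a is before b" for 32-bit sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical per-connection state: running completion sequence. */
struct conn_state {
	uint32_t compl_seq;
};

/* Hypothetical per-buffer state for a vmspliced user buffer. */
struct tracked_buf {
	uint32_t compl_seq;	/* value of compl_seq saved for this buffer */
	bool in_flight;		/* true until the buffer may be reused */
};

/* Account for a send of 'bytes' and remember the sequence for the buffer. */
static void track_send(struct conn_state *c, struct tracked_buf *b,
		       size_t bytes)
{
	c->compl_seq += (uint32_t)bytes;	/* compl_seq += # bytes sent */
	b->compl_seq = c->compl_seq;		/* buffer_compl_seq = compl_seq */
	b->in_flight = true;
}

/*
 * Reclaim every in-flight buffer whose saved sequence is before the
 * reported completion sequence (buffer_compl_seq < recvmsg_compl_seq).
 * Returns the number of buffers reclaimed by this call.
 */
static size_t reclaim_bufs(struct tracked_buf *bufs, size_t n,
			   uint32_t reported)
{
	size_t reclaimed = 0;

	for (size_t i = 0; i < n; i++) {
		if (bufs[i].in_flight &&
		    seq_before(bufs[i].compl_seq, reported)) {
			bufs[i].in_flight = false;
			reclaimed++;
		}
	}
	return reclaimed;
}
```

The strict comparison follows the description above. With the saved value
taken after the send, it is conservative by one byte.
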
> Using recvmsg data in this manner is sort of a cheap way to get a
> "callback" for when a vmspliced buffer is consumed. It will work
> well for a client where the response causes recvmsg to return.
> On the server side it works well if there are a sufficient
> number of requests coming on the connection (resorting to the
> timeout if necessary as described above).
>
> Signed-off-by: Tom Herbert <therbert@...gle.com>
> + * Copy the first unacked seq into the receive msg control part.
> + */
> +static inline void tcp_sock_xmit_compl_seq(struct msghdr *msg,
> +					   struct sock *sk)
> +{
> +	if (sock_flag(sk, SOCK_XMIT_COMPL_SEQ)) {
> +		struct tcp_sock *tp = tcp_sk(sk);
> +		if (msg->msg_controllen >= sizeof(tp->snd_una)) {
> +			put_cmsg(msg, SOL_SOCKET, SCM_XMIT_COMPL_SEQ,
> +				 sizeof(tp->snd_una), &tp->snd_una);
> +		}
> +	}
> +}

I am wondering if this part could be done outside of the socket lock,
provided you latch the tp->snd_una value right before release_sock():

	u32 snd_una;
	...
	tcp_cleanup_rbuf(sk, copied);
	TCP_CHECK_TIMER(sk);
	snd_una = tp->snd_una;
	release_sock(sk);
	tcp_sock_xmit_compl_seq(msg, sk, snd_una);
	return copied;
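
To make the latching idea concrete outside the kernel, here is a small
userspace analogy. It is not the kernel code: a pthread mutex stands in for
the socket lock, and fake_sock, report_seq() and recv_path() are
hypothetical names. The point is only that the value is copied into a local
while the lock is held, and the copy-out to the caller happens after the
unlock using that local.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a socket: a lock plus the protected counter. */
struct fake_sock {
	pthread_mutex_t lock;
	uint32_t snd_una;
};

/* Copy the latched value to the caller; needs no lock, it uses a local. */
static void report_seq(void *out, size_t outlen, uint32_t snd_una)
{
	if (outlen >= sizeof(snd_una))
		memcpy(out, &snd_una, sizeof(snd_una));
}

static void recv_path(struct fake_sock *sk, void *out, size_t outlen)
{
	uint32_t snd_una;

	pthread_mutex_lock(&sk->lock);
	/* ... receive work that needs the lock would go here ... */
	snd_una = sk->snd_una;		/* latch right before unlocking */
	pthread_mutex_unlock(&sk->lock);

	report_seq(out, outlen, snd_una);	/* done outside the lock */
}
```

The reported value may be slightly stale by the time it reaches the caller,
which should be harmless here since the counter only moves forward.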