[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130219142629.GA31871@hmsreliant.think-freely.org>
Date: Tue, 19 Feb 2013 09:26:29 -0500
From: Neil Horman <nhorman@...driver.com>
To: Jon Maloy <jon.maloy@...csson.com>
Cc: Paul Gortmaker <paul.gortmaker@...driver.com>,
David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
Ying Xue <ying.xue@...driver.com>
Subject: Re: [PATCH net-next 2/3] tipc: byte-based overload control on socket
receive queue
On Tue, Feb 19, 2013 at 09:07:54AM +0100, Jon Maloy wrote:
> On 02/18/2013 09:47 AM, Neil Horman wrote:
> > On Fri, Feb 15, 2013 at 05:57:46PM -0500, Paul Gortmaker wrote:
> >> From: Ying Xue <ying.xue@...driver.com>
> >>
> >> Change overload control to be purely byte-based, using
> >> sk->sk_rmem_alloc as byte counter, and compare it to a calculated
> >> upper limit for the socket receive queue.
>
> [...]
>
> >> + *
> >> + * For all connectionless messages, by default new queue limits are
> >> + * as belows:
> >> + *
> >> + * TIPC_LOW_IMPORTANCE (5MB)
> >> + * TIPC_MEDIUM_IMPORTANCE (10MB)
> >> + * TIPC_HIGH_IMPORTANCE (20MB)
> >> + * TIPC_CRITICAL_IMPORTANCE (40MB)
> >> + *
> >> + * Returns overload limit according to corresponding message importance
> >> + */
> >> +static unsigned int rcvbuf_limit(struct sock *sk, struct sk_buff *buf)
> >> +{
> >> + struct tipc_msg *msg = buf_msg(buf);
> >> + unsigned int limit;
> >> +
> >> + if (msg_connected(msg))
> >> + limit = CONN_OVERLOAD_LIMIT;
> > This still strikes me as a bit wierd. If you really can't tolerate the default
> > rmem settings in proc, have you considered separating the rmem and wmem values
> > out into their own sysctls?
>
> Initially we tried to set this value as default for sk_rcvbuf, and then use
> fractions of it as limits, as you suggest below. The problem we found was that
> if we want to change this via setsockopt(SOL_SOCKET, SO_RCVBUF) the value range
> we can use is very limited, and doesn't fit our purposes.
>
Can you elaborate on this please? The above doesn't really explain why you
can't do what I suggested. Not asserting that what you say is untrue, mind you,
I'm just trying to understand what it is about TIPC that requires such a
specific reception buffer envelope, and how enforcing queue limits here is so
important when packets could just as easily be dropped at the ip layer (with
ostensibly no fatal failure).
> We did consider to introduce a separate setsockopt at TIPC level for this,
> but thought it had a value in itself to use the mechanism that is already there.
> Hence the "re-interpretation" of sk_rcvbuf as we do below.
> Considering the weird doubling of this parameter that is done elsewhere in the
> code we thought that having our own interpretation might be acceptable.
Thats quite different IMHO. The comments in sock_setsockopt make it pretty
clear that the doubling of the rcvbuf value is done to account for the sk_buff
overhead of packet reception, and thats documented in the socket(7) man page.
What you have here is the ability to set sk_rcvbuf, and then have that setting
be ignored, but only to within several different limits, depending on various
conditions, all of which are not visible to user space.
> We did of course see the potential issue with this, that is why we cc-ed
> you for comments.
I appreciate that.
> Now I see that David already pulled the series, so I am a little uncertain
> about how to proceed.
I saw that too, and asked him about this. A follow-on patch (if we wind up
deciding one is warranted) is the way to go here.
Regards
Neil
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists