netdev - Re: [PATCH net-next 2/3] tipc: byte-based overload control on socket receive queue

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20130219142629.GA31871@hmsreliant.think-freely.org>
Date:	Tue, 19 Feb 2013 09:26:29 -0500
From:	Neil Horman <nhorman@...driver.com>
To:	Jon Maloy <jon.maloy@...csson.com>
Cc:	Paul Gortmaker <paul.gortmaker@...driver.com>,
	David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
	Ying Xue <ying.xue@...driver.com>
Subject: Re: [PATCH net-next 2/3] tipc: byte-based overload control on socket
 receive queue

On Tue, Feb 19, 2013 at 09:07:54AM +0100, Jon Maloy wrote:
> On 02/18/2013 09:47 AM, Neil Horman wrote:
> > On Fri, Feb 15, 2013 at 05:57:46PM -0500, Paul Gortmaker wrote:
> >> From: Ying Xue <ying.xue@...driver.com>
> >>
> >> Change overload control to be purely byte-based, using
> >> sk->sk_rmem_alloc as byte counter, and compare it to a calculated
> >> upper limit for the socket receive queue.
> 
> [...]
> 
> >> + *
> >> + * For all connectionless messages, by default new queue limits are
> >> + * as belows:
> >> + *
> >> + * TIPC_LOW_IMPORTANCE       (5MB)
> >> + * TIPC_MEDIUM_IMPORTANCE    (10MB)
> >> + * TIPC_HIGH_IMPORTANCE      (20MB)
> >> + * TIPC_CRITICAL_IMPORTANCE  (40MB)
> >> + *
> >> + * Returns overload limit according to corresponding message importance
> >> + */
> >> +static unsigned int rcvbuf_limit(struct sock *sk, struct sk_buff *buf)
> >> +{
> >> +	struct tipc_msg *msg = buf_msg(buf);
> >> +	unsigned int limit;
> >> +
> >> +	if (msg_connected(msg))
> >> +		limit = CONN_OVERLOAD_LIMIT;
> > This still strikes me as a bit wierd.  If you really can't tolerate the default
> > rmem settings in proc, have you considered separating the rmem and wmem values
> > out into their own sysctls?  
> 
> Initially we tried to set this value as default for sk_rcvbuf, and then use
> fractions of it as limits, as you suggest below. The problem we found was that
> if we want to change this via setsockopt(SOL_SOCKET, SO_RCVBUF) the value range
> we can use is very limited, and doesn't fit our purposes.
> 
Can you elaborate on this please?  The above doesn't really explain why you
can't do what I suggested.  Not asserting that what you say is untrue, mind you,
I'm just trying to understand what it is about TIPC that requires such a
specific reception buffer envelope, and how enforcing queue limits here is so
important when packets could just as easily be dropped at the ip layer (with
ostensibly no fatal failure).

> We did consider to introduce a separate setsockopt at TIPC level for this,
> but thought it had a value in itself to use the mechanism that is already there. 
> Hence the "re-interpretation" of sk_rcvbuf as we do below.
> Considering the weird doubling of this parameter that is done elsewhere in the
> code we thought that having our own interpretation might be acceptable.
Thats quite different IMHO.  The comments in sock_setsockopt make it pretty
clear that the doubling of the rcvbuf value is done to account for the sk_buff
overhead of packet reception, and thats documented in the socket(7) man page.
What you have here is the ability to set sk_rcvbuf, and then have that setting
be ignored, but only to within several different limits, depending on various
conditions, all of which are not visible to user space.

> We did of course see the potential issue with this, that is why we cc-ed
> you for comments.
I appreciate that.

> Now I see that David already pulled the series, so I am a little uncertain 
> about how to proceed. 
I saw that too, and asked him about this.  A follow-on patch (if we wind up
deciding one is warranted) is the way to go here.

Regards
Neil

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html