Date:   Fri, 26 Jul 2019 14:15:29 +0300
From:   Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
To:     Brodie Greenfield <brodie.greenfield@...iedtelesis.co.nz>,
        davem@...emloft.net, stephen@...workplumber.org,
        kuznet@....inr.ac.ru, yoshfuji@...ux-ipv6.org,
        netdev@...r.kernel.org
Cc:     linux-kernel@...r.kernel.org, chris.packham@...iedtelesis.co.nz,
        luuk.paulussen@...iedtelesis.co.nz
Subject: Re: [PATCH 1/2] ipmr: Make cache queue length configurable

On 26/07/2019 14:05, Nikolay Aleksandrov wrote:
> On 25/07/2019 23:42, Brodie Greenfield wrote:
>> We want to be able to keep more spaces available in our queue for
>> processing incoming multicast traffic (adding (S,G) entries) - this lets
>> us learn more groups faster, rather than dropping them at this stage.
>>
>> Signed-off-by: Brodie Greenfield <brodie.greenfield@...iedtelesis.co.nz>
>> ---
>>  Documentation/networking/ip-sysctl.txt | 8 ++++++++
>>  include/net/netns/ipv4.h               | 1 +
>>  net/ipv4/af_inet.c                     | 1 +
>>  net/ipv4/ipmr.c                        | 4 +++-
>>  net/ipv4/sysctl_net_ipv4.c             | 7 +++++++
>>  5 files changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
>> index acdfb5d2bcaa..02f77e932adf 100644
>> --- a/Documentation/networking/ip-sysctl.txt
>> +++ b/Documentation/networking/ip-sysctl.txt
>> @@ -887,6 +887,14 @@ ip_local_reserved_ports - list of comma separated ranges
>>  
>>  	Default: Empty
>>  
>> +ip_mr_cache_queue_length - INTEGER
>> +	Limit the number of multicast packets we can have in the queue to be
>> +	resolved.
>> +	Bear in mind that when an unresolved multicast packet is received,
>> +	there is an O(n) traversal of the queue. This should be considered
>> +	if increasing.
>> +	Default: 10
>> +
> 
> Hi,
> You've said it yourself - it has linear traversal time, but doesn't this patch allow any netns on the
> system to increase its limit to any value, thus possibly affecting others ?
> Though the socket limit will kick in at some point. I think that's where David
> was going with his suggestion back in 2018:
> https://www.spinics.net/lists/netdev/msg514543.html
> 
> If we add this sysctl now, we'll be stuck with it. I'd prefer David's suggestion
> so we can rely only on the receive queue limit which is already configurable.
> We still need to be careful with the defaults though, the NOCACHE entry is 128 bytes
> and with the skb overhead currently on my setup we end up at about 277 entries default limit.

I mean that people might be surprised if that limit were increased by default; that's the
only problem I'm not sure how to handle. Maybe we need some hard limit anyway.
Have you done any tests to see what value works for your setup?
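
To put rough numbers on that: a byte-based receive queue limit turns into an
entry count by dividing it by what each queued skb is charged. A minimal
user-space sketch, where the function name is hypothetical and both inputs are
assumptions rather than measurements from any setup in this thread:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative only, not kernel code.
 * rcvbuf_bytes:   assumed receive buffer limit of the mroute socket
 * entry_truesize: assumed bytes charged per unresolved entry
 *                 (payload plus skb overhead)
 * Returns how many whole entries fit under the byte limit.
 */
size_t unres_entries_for_rcvbuf(size_t rcvbuf_bytes, size_t entry_truesize)
{
	assert(entry_truesize > 0);	/* avoid dividing by zero */
	return rcvbuf_bytes / entry_truesize;
}
```

For example, assuming a 212992-byte receive buffer and a 768-byte charge per
entry, unres_entries_for_rcvbuf(212992, 768) returns 277. Actual values depend
on the configured buffer size and the skb overhead of the running kernel.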

In the end we might have to go with this patch, but perhaps cap the per-netns sysctl
at the init_ns value (similar to what we did for frags), or not make it per-netns
at all.
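
The "init_ns value as maximum" idea can be modelled in user space roughly as
follows. This is only a sketch of the policy, not kernel code: struct mr_ns and
mr_ns_set_limit() are hypothetical names, and rejecting an oversized write with
-EINVAL is an assumption about how a bounded sysctl handler would behave.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the per-namespace state. */
struct mr_ns {
	unsigned int queue_len_limit;	/* unresolved queue length limit */
};

/*
 * Model of a write to the per-netns limit.
 * The initial namespace may set any value; every other namespace is
 * bounded by the initial namespace's current value.
 * Returns 0 on success, -EINVAL if the value exceeds the allowed maximum.
 */
int mr_ns_set_limit(struct mr_ns *ns, const struct mr_ns *init_ns,
		    unsigned int val)
{
	assert(ns && init_ns);

	if (ns != init_ns && val > init_ns->queue_len_limit)
		return -EINVAL;

	ns->queue_len_limit = val;
	return 0;
}
```

With both namespaces starting at 10, a write of 64 to a child namespace fails
until the initial namespace itself has been raised to at least 64.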

> 
> Cheers,
>  Nik
> 
>>  ip_unprivileged_port_start - INTEGER
>>  	This is a per-namespace sysctl.  It defines the first
>>  	unprivileged port in the network namespace.  Privileged ports
>> diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
>> index 104a6669e344..3411d3f18d51 100644
>> --- a/include/net/netns/ipv4.h
>> +++ b/include/net/netns/ipv4.h
>> @@ -187,6 +187,7 @@ struct netns_ipv4 {
>>  	int sysctl_igmp_max_msf;
>>  	int sysctl_igmp_llm_reports;
>>  	int sysctl_igmp_qrv;
>> +	unsigned int sysctl_ip_mr_cache_queue_length;
>>  
>>  	struct ping_group_range ping_group_range;
>>  
>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
>> index 0dfb72c46671..8e25538bdb1e 100644
>> --- a/net/ipv4/af_inet.c
>> +++ b/net/ipv4/af_inet.c
>> @@ -1827,6 +1827,7 @@ static __net_init int inet_init_net(struct net *net)
>>  	net->ipv4.sysctl_igmp_llm_reports = 1;
>>  	net->ipv4.sysctl_igmp_qrv = 2;
>>  
>> +	net->ipv4.sysctl_ip_mr_cache_queue_length = 10;
>>  	return 0;
>>  }
>>  
>> diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
>> index ddbf8c9a1abb..c6a6c3e453a9 100644
>> --- a/net/ipv4/ipmr.c
>> +++ b/net/ipv4/ipmr.c
>> @@ -1127,6 +1127,7 @@ static int ipmr_cache_unresolved(struct mr_table *mrt, vifi_t vifi,
>>  				 struct sk_buff *skb, struct net_device *dev)
>>  {
>>  	const struct iphdr *iph = ip_hdr(skb);
>> +	struct net *net = dev_net(dev);
>>  	struct mfc_cache *c;
>>  	bool found = false;
>>  	int err;
>> @@ -1142,7 +1143,8 @@ static int ipmr_cache_unresolved(struct mr_table *mrt, vifi_t vifi,
>>  
>>  	if (!found) {
>>  		/* Create a new entry if allowable */
>> -		if (atomic_read(&mrt->cache_resolve_queue_len) >= 10 ||
>> +		if (atomic_read(&mrt->cache_resolve_queue_len) >=
>> +		    net->ipv4.sysctl_ip_mr_cache_queue_length ||
>>  		    (c = ipmr_cache_alloc_unres()) == NULL) {
>>  			spin_unlock_bh(&mfc_unres_lock);
>>  
>> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
>> index ba0fc4b18465..78ae86e8c6cb 100644
>> --- a/net/ipv4/sysctl_net_ipv4.c
>> +++ b/net/ipv4/sysctl_net_ipv4.c
>> @@ -784,6 +784,13 @@ static struct ctl_table ipv4_net_table[] = {
>>  		.proc_handler	= proc_dointvec
>>  	},
>>  #ifdef CONFIG_IP_MULTICAST
>> +	{
>> +		.procname	= "ip_mr_cache_queue_length",
>> +		.data		= &init_net.ipv4.sysctl_ip_mr_cache_queue_length,
>> +		.maxlen		= sizeof(int),
>> +		.mode		= 0644,
>> +		.proc_handler	= proc_dointvec
>> +	},
>>  	{
>>  		.procname	= "igmp_qrv",
>>  		.data		= &init_net.ipv4.sysctl_igmp_qrv,
>>
> 
