Date:	Fri, 28 Aug 2015 12:09:52 +0530
From:	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
To:	David Miller <davem@...emloft.net>, edumazet@...gle.com
CC:	kuznet@....inr.ac.ru, jmorris@...ei.org, yoshfuji@...ux-ipv6.org,
	kaber@...sh.net, jiri@...nulli.us, hannes@...essinduktion.org,
	tom@...bertland.com, azhou@...ira.com, ebiederm@...ssion.com,
	ipm@...rality.org.uk, nicolas.dichtel@...nd.com,
	serge.hallyn@...onical.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, anton@....ibm.com,
	nacc@...ux.vnet.ibm.com, srikar@...ux.vnet.ibm.com
Subject: Re: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking
 all the percpu data at once

On 08/28/2015 12:08 AM, David Miller wrote:
> From: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
> Date: Wed, 26 Aug 2015 23:07:33 +0530
>
>> @@ -4641,10 +4647,12 @@ static inline void __snmp6_fill_stats64(u64 *stats, void __percpu *mib,
>>   static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int attrtype,
>>   			     int bytes)
>>   {
>> +	u64 buff[IPSTATS_MIB_MAX] = {0,};
>> +
>>   	switch (attrtype) {
>>   	case IFLA_INET6_STATS:
>> -		__snmp6_fill_stats64(stats, idev->stats.ipv6,
>
> I would suggest using an explicit memset() here, it makes the overhead incurred
> by this scheme clearer.
>
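
For reference, I read the suggestion as replacing the "= {0,}" initializer
with an explicit memset, so the hunk would look roughly like the below (the
call is the same one that is commented out in the measurement code further
down):

	u64 buff[IPSTATS_MIB_MAX];

	switch (attrtype) {
	case IFLA_INET6_STATS:
		memset(buff, 0, sizeof(buff));
		__snmp6_fill_stats64(stats, idev->stats.ipv6,
				     IPSTATS_MIB_MAX, bytes,
				     offsetof(struct ipstats_mib, syncp),
				     buff);
		break;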

I changed the code as below to measure the fill_stat overhead:

container creation now took: 3.012s
it was:
without patch     : 6.86s
with current patch: 3.34s

and perf no longer showed snmp6_fill_stats() in the parent traces.

changed code:
snmp6_fill_stats(...)
{
         switch (attrtype) {
         case IFLA_INET6_STATS:
                 put_unaligned(IPSTATS_MIB_MAX, &stats[0]);
                 memset(&stats[1], 0, IPSTATS_MIB_MAX-1);

                 // __snmp6_fill_stats64(stats, idev->stats.ipv6,
                 //                      IPSTATS_MIB_MAX, bytes,
                 //                      offsetof(struct ipstats_mib, syncp),
                 //                      buff);
.....
}
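
For contrast, the pre-patch __snmp6_fill_stats64() folds one counter at a
time, and each snmp_fold_field64() call does its own walk over all possible
cpus; roughly (inside __snmp6_fill_stats64(stats, mib, items, bytes,
syncpoff)):

	put_unaligned(items, &stats[0]);
	for (i = 1; i < items; i++)
		put_unaligned(snmp_fold_field64(mib, i, syncpoff), &stats[i]);

That is, the old path does IPSTATS_MIB_MAX separate percpu walks per
request, while the patch does a single walk accumulating into buff.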

So in summary:
The current patch removes the major part of the fill_stat overhead,
though the percpu walk overhead remains (the 0.33s difference).

[ The percpu walk overhead grows when creating more containers, e.g. 3k. ]

Cache misses: there was no major difference (around 1.4%) w.r.t. the patch.
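
The walk that remains with the patch is the single percpu pass; roughly the
following, shown with the concrete mib type spelled out for clarity (the
exact syncp handling in the posted V2 may differ slightly):

	struct ipstats_mib __percpu *mib = idev->stats.ipv6;
	int c, i;

	buff[0] = IPSTATS_MIB_MAX;

	for_each_possible_cpu(c) {
		struct ipstats_mib *pcpu = per_cpu_ptr(mib, c);
		unsigned int start;
		u64 v;

		for (i = 1; i < IPSTATS_MIB_MAX; i++) {
			/* read one counter consistently (matters on 32-bit) */
			do {
				start = u64_stats_fetch_begin_irq(&pcpu->syncp);
				v = pcpu->mibs[i];
			} while (u64_stats_fetch_retry_irq(&pcpu->syncp, start));
			buff[i] += v;
		}
	}

	memcpy(stats, buff, IPSTATS_MIB_MAX * sizeof(u64));

That one pass over all possible cpus is the remaining 0.33s, and it is what
grows with the number of containers.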

Hi David,
I hope you wanted to know the overhead rather than have the current patch
changed; please let me know.

Eric, does the V2 patch look good now? Please add your ack/review.

Details:
time
=========================
time docker run -itd  ubuntu:15.04  /bin/bash
b6670c321b5957f004e281cbb14512deafd0c0be6a39707c2f3dc95649bbc394

real	0m3.012s
user	0m0.093s
sys	0m0.009s

perf:
==========
# Samples: 18K of event 'cycles'
# Event count (approx.): 12838752009
# Overhead  Command          Shared Object          Symbol
# ........  ...............  .....................  ............
#
     15.29%  swapper          [kernel.kallsyms]      [k] snooze_loop
      9.37%  docker           docker                 [.] scanblock
      6.47%  docker           [kernel.kallsyms]      [k] veth_stats_one
      3.87%  swapper          [kernel.kallsyms]      [k] _raw_spin_lock
      2.71%  docker           docker                 [.]

