lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3367B80B08154D42A3B2BC708B5D41F642D8EDB9C6@EXMAIL.ad.emulex.com>
Date:	Tue, 23 Aug 2011 00:06:48 -0700
From:	<Sathya.Perla@...lex.Com>
To:	<eric.dumazet@...il.com>
CC:	<netdev@...r.kernel.org>
Subject: RE: [PATCH net-next 3/5] be2net: fix erx->rx_drops_no_frags wrap
 around

>-----Original Message-----
>From: Eric Dumazet [mailto:eric.dumazet@...il.com]
>Sent: Tuesday, August 23, 2011 12:11 PM
>
>Le mardi 23 août 2011 à 11:11 +0530, Sathya Perla a écrit :
>> The rx_drops_no_frags HW counter for RSS rings is 16bits in HW and can
>> wraparound often. Maintain a 32-bit accumulator in the driver to prevent
>> frequent wraparound.
>>
>> Also, incorporated Eric's feedback to use ACCESS_ONCE() for the
>accumulator
>> write.
<...>
>>
>> +static void accumulate_16bit_val(u32 *acc, u16 val)
>> +{
>> +#define lo(x)			(x & 0xFFFF)
>> +#define hi(x)			(x & 0xFFFF0000)
>> +	bool wrapped = val < lo(*acc);
>> +	u32 newacc = hi(*acc) + val;
>> +
>> +	if (wrapped)
>> +		newacc += 65536;
>> +	ACCESS_ONCE(*acc) = newacc;
>> +}
>> +
>
>I still feel something is wrong here :
>
>>  void be_parse_stats(struct be_adapter *adapter)
>>  {
>>  	struct be_erx_stats_v1 *erx = be_erx_stats_from_cmd(adapter);
>> @@ -394,9 +406,13 @@ void be_parse_stats(struct be_adapter *adapter)
>>  	}
>>
>>  	/* as erx_v1 is longer than v0, ok to use v1 defn for v0 access */
>> -	for_all_rx_queues(adapter, rxo, i)
>> -		rx_stats(rxo)->rx_drops_no_frags =
>> -			erx->rx_drops_no_fragments[rxo->q.id];
>
>previous code was not doing a sum_of_all_queues.
Yes, the new code still *does not* do a sum of all queues. The code just 
parses per-queue stats. For each queue, the 16 bit
HW stats value is taken and stored in a 32-bit *per-queue* variable.
The comment that "driver accumulates a 32-bit val" may be misleading. The
code here is not doing a sum of the per-queue stat.

+       for_all_rx_queues(adapter, rxo, i) {
+               /* below erx HW counter can actually wrap around after
+                * 65535. Driver accumulates a 32-bit value
+                */
+               accumulate_16bit_val(&rx_stats(rxo)->rx_drops_no_frags,
+                               (u16)erx->rx_drops_no_fragments[rxo->q.id]);
+       }

>
>It only gave the final erx->rx_drops_no_fragments[rxo->q.id], not taking
>into account previous rx_stats(rxo)->rx_drops_no_frags value.
>
>Your changelog is about wrap around, while the bug might have be
>different (No real sum)
>
>Now you say : previous value is meaningfull, and we add to it 16bits
>values.
>
>> +	for_all_rx_queues(adapter, rxo, i) {
>> +		/* below erx HW counter can actually wrap around after
>> +		 * 65535. Driver accumulates a 32-bit value
>> +		 */
>> +		accumulate_16bit_val(&rx_stats(rxo)->rx_drops_no_frags,
>> +				(u16)erx->rx_drops_no_fragments[rxo->q.id]);
>> +	}
>>  }
>>
>
>Arent multiple calls to be_parse_stats() will have wrong final
>rx_drops_no_frags value ?
>
>Or are the rx_drops_no_fragments[rxo->q.id] cleared when read ?

The following logic should take care of that:
	u32 newacc = hi(*acc) + val;
Only the upper 16 bits of the accumulator are retained and the new-val
of the 16-bit stat is used for the lower 16 bits. So, even if I call this
routine 10 times with the same value for val, the accumulator value will not change.

Say: accumulator is 0x00010001
Say: new 16-bit hw counter value is now 2 (it was previously 1 and had wraped-aroud before that)
accumulate(acc, 2) ==> newacc = 0x00010000 + 2 = 0x00010002
any number of calls to accumulate(acc, 2) will still produce acc = 0x00010002

>
>I am afraid that if HW maintains 16bit values, then the only way is to
>also have a 16bit accumulator. You cannot detect wraparounds without
>also keeping a copy of previous 16bit samples.

I don't agree:
Say: accumulator is 0x0000FFFF
Say: new 16-bit hw counter value is now 0 (after a wrap-aroud)
accumulate(acc, 0) ==> newacc = 0x00000000 + 0 = 0x00000000
After the wraparound check newacc = 0x00010000

>
>u16 accum = 0;
>or_all_rx_queues(adapter, rxo, i) {
>	accum += erx->rx_drops_no_fragments[rxo->q.id];
>}
>rx_stats(rxo)->rx_drops_no_frags = accum;
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ