Date:	Mon, 24 Mar 2014 10:22:53 +0000
From:	David Laight <David.Laight@...LAB.COM>
To:	'Eric Dumazet' <eric.dumazet@...il.com>,
	David Miller <davem@...emloft.net>
CC:	"herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>,
	"hkchu@...gle.com" <hkchu@...gle.com>,
	"mwdalton@...gle.com" <mwdalton@...gle.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH net-next] net: optimize csum_replace2()

From: Eric Dumazet <edumazet@...gle.com>
> 
> When changing one 16bit value by another in IP header, we can adjust the
> IP checksum by doing a simple operation described in RFC 1624,
> as reminded by David.
> 
> csum_partial() is a complex function on x86_64, not really suited
> for small number of checksummed bytes.
> 
> I spotted csum_partial() being in the top 20 most consuming
> functions (more than 1 %) in a GRO workload, which was rather
> unexpected.
> 
> The caller was inet_gro_complete() doing a csum_replace2() when
> building the new IP header for the GRO packet.
> 
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> ---
>  include/net/checksum.h |   23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/include/net/checksum.h b/include/net/checksum.h
> index 37a0e24adbe7..a28f4e0f6251 100644
> --- a/include/net/checksum.h
> +++ b/include/net/checksum.h
> @@ -69,6 +69,19 @@ static inline __wsum csum_sub(__wsum csum, __wsum addend)
>  	return csum_add(csum, ~addend);
>  }
> 
> +static inline __sum16 csum16_add(__sum16 csum, __be16 addend)
> +{
> +	u16 res = (__force u16)csum;

Shouldn't that be u32?

> +	res += (__force u16)addend;
> +	return (__force __sum16)(res + (res < (__force u16)addend));
> +}
> +
> +static inline __sum16 csum16_sub(__sum16 csum, __be16 addend)
> +{
> +	return csum16_add(csum, ~addend);
> +}
> +
>  static inline __wsum
>  csum_block_add(__wsum csum, __wsum csum2, int offset)
>  {
> @@ -112,9 +125,15 @@ static inline void csum_replace4(__sum16 *sum, __be32 from, __be32 to)
>  	*sum = csum_fold(csum_partial(diff, sizeof(diff), ~csum_unfold(*sum)));
>  }
> 
> -static inline void csum_replace2(__sum16 *sum, __be16 from, __be16 to)
> +/* Implements RFC 1624 (Incremental Internet Checksum)
> + * 3. Discussion states :
> + *     HC' = ~(~HC + ~m + m')
> + *  m : old value of a 16bit field
> + *  m' : new value of a 16bit field
> + */
> +static inline void csum_replace2(__sum16 *sum, __be16 old, __be16 new)
>  {
> -	csum_replace4(sum, (__force __be32)from, (__force __be32)to);
> +	*sum = ~csum16_add(csum16_sub(~(*sum), old), new);
>  }

It might be clearer to just say:
	*sum = ~csum16_add(csum16_add(~*sum, ~old), new);
or even:
	*sum = ~csum16_add(csum16_add(*sum ^ 0xffff, old ^ 0xffff), new);
which might remove some mask instructions - especially if all the
intermediate values are left larger than 16 bits.
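
As a rough illustration of what I mean, here is a minimal userspace
sketch, not the kernel code. It assumes plain uint16_t values in place
of __sum16/__be16, and the names csum16_add_wide() and
csum_replace2_sketch() are made up for this example. The add keeps its
intermediate in 32 bits and folds the carry back in once:

```c
#include <stdint.h>

/* Hypothetical helper: one's-complement 16-bit add that keeps the
 * intermediate in 32 bits and folds the carry out of bit 16 back in.
 */
static inline uint16_t csum16_add_wide(uint16_t csum, uint16_t addend)
{
	uint32_t res = (uint32_t)csum + addend;

	/* res is at most 0x1fffe, so one end-around carry is enough */
	return (uint16_t)(res + (res >> 16));
}

/* Hypothetical helper: RFC 1624 update, HC' = ~(~HC + ~m + m'),
 * written with xor instead of ~ so no value is widened and re-masked.
 */
static inline uint16_t csum_replace2_sketch(uint16_t sum, uint16_t old,
					    uint16_t new_val)
{
	uint16_t acc = csum16_add_wide(sum ^ 0xffff, old ^ 0xffff);

	return csum16_add_wide(acc, new_val) ^ 0xffff;
}
```

With HC = 0xDD2F, m = 0x5555 and m' = 0x3285 this gives
~(0x22D0 + 0xAAAA + 0x3285) = ~0xFFFF = 0x0000, which I believe are the
figures used in the RFC's own example.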

	David


