linux-kernel - Re: [PATCH v9 2/2] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <30e4f267-86c2-4df6-9f33-d6f5fc77c4db@csgroup.eu>
Date: Fri, 23 Feb 2024 10:06:54 +0000
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: Charlie Jenkins <charlie@...osinc.com>, Guenter Roeck
	<linux@...ck-us.net>, David Laight <David.Laight@...lab.com>, Palmer Dabbelt
	<palmer@...belt.com>, Andrew Morton <akpm@...ux-foundation.org>, Helge Deller
	<deller@....de>, "James E.J. Bottomley"
	<James.Bottomley@...senpartnership.com>, Parisc List
	<linux-parisc@...r.kernel.org>, Al Viro <viro@...iv.linux.org.uk>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: [PATCH v9 2/2] lib: checksum: Use aligned accesses for
 ip_fast_csum and csum_ipv6_magic tests



Le 22/02/2024 à 03:55, Charlie Jenkins a écrit :
> The test cases for ip_fast_csum and csum_ipv6_magic were failing on a
> variety of architectures that are big endian or do not support
> misalgined accesses. Both of these test cases are changed to support big
> and little endian architectures.

It is unclear. The endianess issue and the alignment issue are two 
independant subjects that should be handled in separate patches.

According to the subject of this patch, only misaligned accesses should 
be handled here. Endianness should have been fixed by patch 1.

Also, would be nice to give exemple of architecture that has such 
problem, and explain what is the problem exactly.

> 
> The test for ip_fast_csum is changed to align the data along (14 +
> NET_IP_ALIGN) bytes which is the alignment of an IP header. The test for
> csum_ipv6_magic aligns the data using a struct. An extra padding field
> is added to the struct to ensure that the size of the struct is the same
> on all architectures (44 bytes).

What is the purpose of that padding ? You take fields one by one and 
never do anything with the full struct.

> 
> Fixes: 6f4c45cbcb00 ("kunit: Add tests for csum_ipv6_magic and ip_fast_csum")
> Signed-off-by: Charlie Jenkins <charlie@...osinc.com>
> Reviewed-by: Guenter Roeck <linux@...ck-us.net>
> Tested-by: Guenter Roeck <linux@...ck-us.net>
> ---
>   lib/checksum_kunit.c | 393 ++++++++++++++++++---------------------------------
>   1 file changed, 134 insertions(+), 259 deletions(-)
> 
> diff --git a/lib/checksum_kunit.c b/lib/checksum_kunit.c
> index 776ad3d6d5a1..f1b18e3628dd 100644
> --- a/lib/checksum_kunit.c
> +++ b/lib/checksum_kunit.c
> @@ -13,8 +13,9 @@
>   
>   #define IPv4_MIN_WORDS 5
>   #define IPv4_MAX_WORDS 15
> -#define NUM_IPv6_TESTS 200
> -#define NUM_IP_FAST_CSUM_TESTS 181
> +#define WORD_ALIGNMENT 4

Is that macro really needed ? Can't you just use sizeof(u32) or 
something similar ?


> +/* Ethernet headers are 14 bytes and NET_IP_ALIGN is used to align them */
> +#define IP_ALIGNMENT (14 + NET_IP_ALIGN)

Only if no VLAN.

When using VLANs it is 4 bytes more. But why do you mind that at all ?

>   
>   /* Values for a little endian CPU. Byte swap each half on big endian CPU. */
>   static const u32 random_init_sum = 0x2847aab;

...

> @@ -578,15 +451,19 @@ static void test_csum_no_carry_inputs(struct kunit *test)
>   static void test_ip_fast_csum(struct kunit *test)
>   {
>   	__sum16 csum_result, expected;
> -
> -	for (int len = IPv4_MIN_WORDS; len < IPv4_MAX_WORDS; len++) {
> -		for (int index = 0; index < NUM_IP_FAST_CSUM_TESTS; index++) {
> -			csum_result = ip_fast_csum(random_buf + index, len);
> -			expected = (__force __sum16)
> -				expected_fast_csum[(len - IPv4_MIN_WORDS) *
> -						   NUM_IP_FAST_CSUM_TESTS +
> -						   index];
> -			CHECK_EQ(expected, csum_result);
> +	int num_tests = (MAX_LEN / WORD_ALIGNMENT - IPv4_MAX_WORDS * WORD_ALIGNMENT);
> +
> +	for (int i = 0; i < num_tests; i++) {
> +		memcpy(&tmp_buf[IP_ALIGNMENT],
> +		       random_buf + (i * WORD_ALIGNMENT),
> +		       IPv4_MAX_WORDS * WORD_ALIGNMENT);

That looks weird.

> +
> +		for (int len = IPv4_MIN_WORDS; len <= IPv4_MAX_WORDS; len++) {
> +			int index = (len - IPv4_MIN_WORDS) +
> +				 i * ((IPv4_MAX_WORDS - IPv4_MIN_WORDS) + 1);

Missing blank line after declaration.

> +			csum_result = ip_fast_csum(tmp_buf + IP_ALIGNMENT, len);
> +			expected = (__force __sum16)htons(expected_fast_csum[index]);

You must do proper type conversion using to_sum16(), not a forced cast.

> +			CHECK_EQ(csum_result, expected);
>   		}
>   	}
>   }
> @@ -594,29 +471,27 @@ static void test_ip_fast_csum(struct kunit *test)
>   static void test_csum_ipv6_magic(struct kunit *test)
>   {
>   #if defined(CONFIG_NET)
> -	const struct in6_addr *saddr;
> -	const struct in6_addr *daddr;
> -	unsigned int len;
> -	unsigned char proto;
> -	unsigned int csum;
> -
> -	const int daddr_offset = sizeof(struct in6_addr);
> -	const int len_offset = sizeof(struct in6_addr) + sizeof(struct in6_addr);
> -	const int proto_offset = sizeof(struct in6_addr) + sizeof(struct in6_addr) +
> -			     sizeof(int);
> -	const int csum_offset = sizeof(struct in6_addr) + sizeof(struct in6_addr) +
> -			    sizeof(int) + sizeof(char);
> -
> -	for (int i = 0; i < NUM_IPv6_TESTS; i++) {
> -		saddr = (const struct in6_addr *)(random_buf + i);
> -		daddr = (const struct in6_addr *)(random_buf + i +
> -						  daddr_offset);
> -		len = *(unsigned int *)(random_buf + i + len_offset);
> -		proto = *(random_buf + i + proto_offset);
> -		csum = *(unsigned int *)(random_buf + i + csum_offset);
> -		CHECK_EQ((__force __sum16)expected_csum_ipv6_magic[i],
> -			 csum_ipv6_magic(saddr, daddr, len, proto,
> -					 (__force __wsum)csum));
> +	struct csum_ipv6_magic_data {
> +		const struct in6_addr saddr;
> +		const struct in6_addr daddr;
> +		__be32 len;
> +		__wsum csum;
> +		unsigned char proto;
> +		unsigned char pad[3];
> +	} *data;
> +	__sum16 csum_result, expected;
> +	int ipv6_num_tests = ((MAX_LEN - sizeof(struct csum_ipv6_magic_data)) / WORD_ALIGNMENT);
> +
> +	for (int i = 0; i < ipv6_num_tests; i++) {
> +		int index = i * WORD_ALIGNMENT;
> +
> +		data = (struct csum_ipv6_magic_data *)(random_buf + index);
> +
> +		csum_result = csum_ipv6_magic(&data->saddr, &data->daddr,
> +					      ntohl(data->len), data->proto,
> +					      data->csum);
> +		expected = (__force __sum16)htons(expected_csum_ipv6_magic[i]);

Same, use to_sum16() instead htons() and a forced cast.

> +		CHECK_EQ(csum_result, expected);
>   	}
>   #endif /* !CONFIG_NET */
>   }
> 


Christophe