[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <dd72b14d-3025-452a-889b-f2456a7ae431@canonical.com>
Date: Wed, 26 Nov 2025 14:18:47 -0800
From: John Johansen <john.johansen@...onical.com>
To: david laight <david.laight@...box.com>, Helge Deller <deller@...nel.org>
Cc: Helge Deller <deller@....de>,
John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
linux-kernel@...r.kernel.org, apparmor@...ts.ubuntu.com,
linux-security-module@...r.kernel.org, linux-parisc@...r.kernel.org
Subject: Re: [PATCH 0/2] apparmor unaligned memory fixes
On 11/26/25 13:23, david laight wrote:
> On Wed, 26 Nov 2025 16:12:23 +0100
> Helge Deller <deller@...nel.org> wrote:
>
>> * david laight <david.laight@...box.com>:
>>> On Wed, 26 Nov 2025 12:03:03 +0100
>>> Helge Deller <deller@....de> wrote:
>>>
>>>> On 11/26/25 11:44, david laight wrote:
>>> ...
>>>>>> diff --git a/security/apparmor/match.c b/security/apparmor/match.c
>>>>>> index 26e82ba879d44..3dcc342337aca 100644
>>>>>> --- a/security/apparmor/match.c
>>>>>> +++ b/security/apparmor/match.c
>>>>>> @@ -71,10 +71,10 @@ static struct table_header *unpack_table(char *blob, size_t bsize)
>>>>>> u8, u8, byte_to_byte);
>>>>>
>>>>> Is that that just memcpy() ?
>>>>
>>>> No, it's memcpy() only on big-endian machines.
>>>
>>> You've misread the quoting...
>>> The 'data8' case that was only half there is a memcpy().
>>>
>>>> On little-endian machines it converts from big-endian
>>>> 16/32-bit ints to little-endian 16/32-bit ints.
>>>>
>>>> But I see some potential for optimization here:
>>>> a) on big-endian machines just use memcpy()
>>>
>>> true
>>>
>>>> b) on little-endian machines use memcpy() to copy from possibly-unaligned
>>>> memory to then known-to-be-aligned destination. Then use a loop with
>>>> be32_to_cpu() instead of get_unaligned_xx() as it's faster.
>>>
>>> There is a function that does a loop byteswap of a buffer - no reason
>>> to re-invent it.
>>
>> I assumed there must be something, but I did not see it. Which one?
>
> I can't find it either - just some functions to do an in-place swap.
> (Which aren't usually a good idea)
>
>>
>>> But I doubt it is always (if ever) faster to do a copy and then byteswap.
>>> The loop control and extra memory accesses kill performance.
>>
>> Yes, you are probably right.
>>
>>> Not that I've seen a fast get_unaligned() - I don't think gcc or clang
>>> generate optimal code - For LE I think it is something like:
>>> low = *(addr & ~3);
>>> high = *((addr + 3) & ~3);
>>> shift = (addr & 3) * 8;
>>> value = low << shift | high >> (32 - shift);
>>> Note that it is only 2 aligned memory reads - even for 64bit.
>>
>> Ok, then maybe we should keep it simple like this patch:
>>
>> [PATCH v2] apparmor: Optimize table creation from possibly unaligned memory
>>
>> Source blob may come from userspace and might be unaligned.
>> Try to optize the copying process by avoiding unaligned memory accesses.
>
> Not sure that reads right.
> 'Allow for misaligned data from userspace when byteswapping source blob' ?
>
>>
>> Signed-off-by: Helge Deller <deller@....de>
>>
>> diff --git a/security/apparmor/include/match.h b/security/apparmor/include/match.h
>> index 1fbe82f5021b..386da2023d50 100644
>> --- a/security/apparmor/include/match.h
>> +++ b/security/apparmor/include/match.h
>> @@ -104,16 +104,20 @@ struct aa_dfa {
>> struct table_header *tables[YYTD_ID_TSIZE];
>> };
>>
>> -#define byte_to_byte(X) (X)
>> +#define byte_to_byte(X) (*(X))
>
> It's a bit of a shame you need something for the above...
>
We got rid of that in the last patch by just replacing the call to
UNPACK_ARRAY for bytes with just a call to memcpy.
> David
>
>
>>
>> #define UNPACK_ARRAY(TABLE, BLOB, LEN, TTYPE, BTYPE, NTOHX) \
>> do { \
>> typeof(LEN) __i; \
>> TTYPE *__t = (TTYPE *) TABLE; \
>> BTYPE *__b = (BTYPE *) BLOB; \
>> - for (__i = 0; __i < LEN; __i++) { \
>> - __t[__i] = NTOHX(__b[__i]); \
>> - } \
>> + BUILD_BUG_ON(sizeof(TTYPE) != sizeof(BTYPE)); \
>> + if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN) || sizeof(BTYPE) == 1) \
>> + memcpy(__t, __b, (LEN) * sizeof(BTYPE)); \
>> + else /* copy & convert convert from big-endian */ \
>> + for (__i = 0; __i < LEN; __i++) { \
>> + __t[__i] = NTOHX(&__b[__i]); \
>> + } \
>> } while (0)
>>
>> static inline size_t table_size(size_t len, size_t el_size)
>> diff --git a/security/apparmor/match.c b/security/apparmor/match.c
>> index c5a91600842a..13e2f6873329 100644
>> --- a/security/apparmor/match.c
>> +++ b/security/apparmor/match.c
>> @@ -15,6 +15,7 @@
>> #include <linux/vmalloc.h>
>> #include <linux/err.h>
>> #include <linux/kref.h>
>> +#include <linux/unaligned.h>
>>
>> #include "include/lib.h"
>> #include "include/match.h"
>> @@ -70,10 +71,10 @@ static struct table_header *unpack_table(char *blob, size_t bsize)
>> u8, u8, byte_to_byte);
>> else if (th.td_flags == YYTD_DATA16)
>> UNPACK_ARRAY(table->td_data, blob, th.td_lolen,
>> - u16, __be16, be16_to_cpu);
>> + u16, __be16, get_unaligned_be16);
>> else if (th.td_flags == YYTD_DATA32)
>> UNPACK_ARRAY(table->td_data, blob, th.td_lolen,
>> - u32, __be32, be32_to_cpu);
>> + u32, __be32, get_unaligned_be32);
>> else
>> goto fail;
>> /* if table was vmalloced make sure the page tables are synced
>
Powered by blists - more mailing lists