[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6ec63a78-b0cc-452e-9946-0acef346cac2@6wind.com>
Date: Wed, 11 Oct 2023 16:03:26 +0200
From: Nicolas Dichtel <nicolas.dichtel@...nd.com>
To: Johannes Berg <johannes@...solutions.net>,
Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org
Cc: fw@...len.de, pablo@...filter.org, jiri@...nulli.us, mkubecek@...e.cz,
aleksander.lobakin@...el.com, Thomas Haller <thaller@...hat.com>
Subject: Re: [RFC] netlink: add variable-length / auto integers
Le 11/10/2023 à 15:11, Johannes Berg a écrit :
> +Thomas Haller, if you didn't see it yet.
>
>
> On Tue, 2023-10-10 at 17:33 -0700, Jakub Kicinski wrote:
>> We currently push everyone to use padding to align 64b values in netlink.
>> I'm not sure what the story behind this is. I found this:
>> https://lore.kernel.org/all/1461339084-3849-1-git-send-email-nicolas.dichtel@6wind.com/#t
There was some attempts before:
https://lore.kernel.org/netdev/20121205.125453.1457654258131828976.davem@davemloft.net/
https://lore.kernel.org/netdev/1355500160.2626.9.camel@bwh-desktop.uk.solarflarecom.com/
https://lore.kernel.org/netdev/1461142655-5067-1-git-send-email-nicolas.dichtel@6wind.com/
>> but it doesn't go into details WRT the motivation.
>> Even for arches which don't have good unaligned access - I'd think
>> that access aligned to 4B *is* pretty efficient, and that's all
>> we need. Plus kernel deals with unaligned input. Why can't user space?
>
> Hmm. I have a vague recollection that it was related to just not doing
> it - the kernel will do get_unaligned() or similar, but userspace if it
> just accesses it might take a trap on some architectures?
>
> But I can't find any record of this in public discussions, so ...
If I remember well, at this time, we had some (old) architectures that triggered
traps (in kernel) when a 64-bit field was accessed and unaligned. Maybe a mix
between 64-bit kernel / 32-bit userspace, I don't remember exactly. The goal was
to align u64 fields on 8 bytes.
Regards,
Nicolas
>
>
> In any case, I think for _new_ attributes it would be perfectly
> acceptable to do it without padding, as long as userspace is prepared to
> do unaligned accesses for them, so we might need something in libnl (or
> similar) to do it correctly.
>
>> Padded 64b is quite space-inefficient (64b + pad means at worst 16B
>> per attr vs 32b which takes 8B). It is also more typing:
>>
>> if (nla_put_u64_pad(rsp, NETDEV_A_SOMETHING_SOMETHING,
>> value, NETDEV_A_SOMETHING_PAD))
>>
>> Create a new attribute type which will use 32 bits at netlink
>> level if value is small enough (probably most of the time?),
>> and (4B-aligned) 64 bits otherwise. Kernel API is just:
>>
>> if (nla_put_uint(rsp, NETDEV_A_SOMETHING_SOMETHING, value))
>>
>> Calling this new type "just" sint / uint with no specific size
>> will hopefully also make people more comfortable with using it.
>> Currently telling people "don't use u8, you may need the space,
>> and netlink will round up to 4B, anyway" is the #1 comment
>> we give to newcomers.
>
> Yeah :)
>
>> Thoughts?
>
> Seems reasonable. I thought about endian variants, but with the variable
> size that doesn't make much sense.
>
> I do think the documentation in the patch could be clearer about the
> alignment, see below.
>
>> +++ b/include/net/netlink.h
>> @@ -183,6 +183,8 @@ enum {
>> NLA_REJECT,
>> NLA_BE16,
>> NLA_BE32,
>> + NLA_SINT,
>> + NLA_UINT,
>
> You should also update the policy-related documentation in this file.
>
>> +++ b/include/uapi/linux/netlink.h
>> @@ -298,6 +298,8 @@ struct nla_bitfield32 {
>> * entry has attributes again, the policy for those inner ones
>> * and the corresponding maxtype may be specified.
>> * @NL_ATTR_TYPE_BITFIELD32: &struct nla_bitfield32 attribute
>> + * @NL_ATTR_TYPE_SINT: 32-bit or 64-bit signed attribute, aligned to 4B
>> + * @NL_ATTR_TYPE_UINT: 32-bit or 64-bit unsigned attribute, aligned to 4B
>
> This is only for exposing the policy (policy description), not sure the
> alignment thing matters here?
>
> OTOH, there's nothing in this file that ever describes *any* of the
> attributes, yet in pracice all the uapi headers do refer to NLA_U8 and
> similar - so we should probably have a new comment section here in the
> UAPI that describes the various types as used by the documentation of
> other families?
>
> Anyway, I think some kind of bigger "careful with alignment" here would
> be good, so people do the correct thing and not just "if (big)
> nla_get_u64()" which would get the alignment thing problematic again.
>
> johannes
Powered by blists - more mailing lists