[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4jU+FhX0e4EXXVzisD5hzsdxK+cyVD3=QuqGOSpE4j-SQ@mail.gmail.com>
Date: Wed, 18 Aug 2021 15:35:13 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Kees Cook <keescook@...omium.org>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Keith Packard <keithp@...thp.com>,
"Gustavo A . R . Silva" <gustavoars@...nel.org>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
Daniel Vetter <daniel.vetter@...ll.ch>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Wireless List <linux-wireless@...r.kernel.org>,
Netdev <netdev@...r.kernel.org>,
Maling list - DRI developers
<dri-devel@...ts.freedesktop.org>, linux-staging@...ts.linux.dev,
linux-block@...r.kernel.org,
Linux Kbuild mailing list <linux-kbuild@...r.kernel.org>,
clang-built-linux <clang-built-linux@...glegroups.com>,
linux-hardening@...r.kernel.org
Subject: Re: [PATCH v2 05/63] stddef: Introduce struct_group() helper macro
On Tue, Aug 17, 2021 at 11:06 PM Kees Cook <keescook@...omium.org> wrote:
>
> Kernel code has a regular need to describe groups of members within a
> structure usually when they need to be copied or initialized separately
> from the rest of the surrounding structure. The generally accepted design
> pattern in C is to use a named sub-struct:
>
> struct foo {
> int one;
> struct {
> int two;
> int three, four;
> } thing;
> int five;
> };
>
> This would allow for traditional references and sizing:
>
> memcpy(&dst.thing, &src.thing, sizeof(dst.thing));
>
> However, doing this would mean that referencing struct members enclosed
> by such named structs would always require including the sub-struct name
> in identifiers:
>
> do_something(dst.thing.three);
>
> This has tended to be quite inflexible, especially when such groupings
> need to be added to established code which causes huge naming churn.
> Three workarounds exist in the kernel for this problem, and each have
> other negative properties.
>
> To avoid the naming churn, there is a design pattern of adding macro
> aliases for the named struct:
>
> #define f_three thing.three
>
> This ends up polluting the global namespace, and makes it difficult to
> search for identifiers.
>
> Another common work-around in kernel code avoids the pollution by avoiding
> the named struct entirely, instead identifying the group's boundaries using
> either a pair of empty anonymous structs of a pair of zero-element arrays:
>
> struct foo {
> int one;
> struct { } start;
> int two;
> int three, four;
> struct { } finish;
> int five;
> };
>
> struct foo {
> int one;
> int start[0];
> int two;
> int three, four;
> int finish[0];
> int five;
> };
>
> This allows code to avoid needing to use a sub-struct named for member
> references within the surrounding structure, but loses the benefits of
> being able to actually use such a struct, making it rather fragile. Using
> these requires open-coded calculation of sizes and offsets. The efforts
> made to avoid common mistakes include lots of comments, or adding various
> BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason
> about the boundaries (e.g. the "start" object looks like it's 0 bytes
> in length), making bounds checking depend on open-coded calculations:
>
> if (length > offsetof(struct foo, finish) -
> offsetof(struct foo, start))
> return -EINVAL;
> memcpy(&dst.start, &src.start, offsetof(struct foo, finish) -
> offsetof(struct foo, start));
>
> However, the vast majority of places in the kernel that operate on
> groups of members do so without any identification of the grouping,
> relying either on comments or implicit knowledge of the struct contents,
> which is even harder for the compiler to reason about, and results in
> even more fragile manual sizing, usually depending on member locations
> outside of the region (e.g. to copy "two" and "three", use the start of
> "four" to find the size):
>
> BUILD_BUG_ON((offsetof(struct foo, four) <
> offsetof(struct foo, two)) ||
> (offsetof(struct foo, four) <
> offsetof(struct foo, three));
> if (length > offsetof(struct foo, four) -
> offsetof(struct foo, two))
> return -EINVAL;
> memcpy(&dst.two, &src.two, length);
>
> In order to have a regular programmatic way to describe a struct
> region that can be used for references and sizing, can be examined for
> bounds checking, avoids forcing the use of intermediate identifiers,
> and avoids polluting the global namespace, introduce the struct_group()
> macro. This macro wraps the member declarations to create an anonymous
> union of an anonymous struct (no intermediate name) and a named struct
> (for references and sizing):
>
> struct foo {
> int one;
> struct_group(thing,
> int two;
> int three, four;
> );
> int five;
> };
>
> if (length > sizeof(src.thing))
> return -EINVAL;
> memcpy(&dst.thing, &src.thing, length);
> do_something(dst.three);
>
> There are some rare cases where the resulting struct_group() needs
> attributes added, so struct_group_attr() is also introduced to allow
> for specifying struct attributes (e.g. __align(x) or __packed).
> Additionally, there are places where such declarations would like to
> have the struct be typed, so struct_group_typed() is added.
>
> Given there is a need for a handful of UAPI uses too, the underlying
> __struct_group() macro has been defined in UAPI so it can be used there
> too.
>
> Co-developed-by: Keith Packard <keithp@...thp.com>
> Signed-off-by: Keith Packard <keithp@...thp.com>
> Signed-off-by: Kees Cook <keescook@...omium.org>
> Acked-by: Gustavo A. R. Silva <gustavoars@...nel.org>
> Link: https://lore.kernel.org/lkml/20210728023217.GC35706@embeddedor
> Enhanced-by: Rasmus Villemoes <linux@...musvillemoes.dk>
> Link: https://lore.kernel.org/lkml/41183a98-bdb9-4ad6-7eab-5a7292a6df84@rasmusvillemoes.dk
> Enhanced-by: Dan Williams <dan.j.williams@...el.com>
Acked-by: Dan Williams <dan.j.williams@...el.com>
Powered by blists - more mailing lists