[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALJ9ZPN-PdYOymzxrWve2X9=dwN=SQemd=wnhv=pee6K7p0QXQ@mail.gmail.com>
Date: Fri, 21 Jun 2024 13:49:44 -0700
From: Yabin Cui <yabinc@...gle.com>
To: "Ballman, Aaron" <aaron.ballman@...el.com>
Cc: Nick Desaulniers <ndesaulniers@...gle.com>,
Steffen Klassert <steffen.klassert@...unet.com>, Herbert Xu <herbert@...dor.apana.org.au>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Nathan Chancellor <nathan@...nel.org>, Bill Wendling <morbo@...gle.com>,
Justin Stitt <justinstitt@...gle.com>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"llvm@...ts.linux.dev" <llvm@...ts.linux.dev>
Subject: Re: [PATCH] Fix initializing a static union variable
> You have to add a Fixes tag, perhaps:
> Fixes: 08ec9af1c062 ("xfrm: Fix xfrm_state_find() wrt. wildcard source
address.")
Thanks! I will add it in the v2 patch.
> And the other thing is that I would change the order in the union, to
> have largest element as the first. Would be best to also add such check
> into static analysis tool/s.
xfrm_address_t is defined in include/uapi. So I am a little hesitant
to change it now.
Currently clang doesn't even have a check/warning to detect this.
I am not sure how clang will fix it in the future, in
https://github.com/llvm/llvm-project/issues/78034.
> This idiom is used throughout the kernel. If we decide that it
> isn't safe to use then we should change the kernel as a whole rather
> than the one spot that happens to have been identified.
> Alternatively the buggy compiler should be banned until it's fixed.
The problem probably happens in all clang versions, depending on the
optimization flags and inline attributes
used. I found this problem when trying to add
-fdebug-info-for-profiling flag when compiling the code.
This patch is to fix kernel code where the problem happens, before
clang can give a better
solution in https://github.com/llvm/llvm-project/issues/78034.
On Fri, Jun 21, 2024 at 4:25 AM Ballman, Aaron <aaron.ballman@...el.com> wrote:
>
> Sadly, this is not a bug in Clang but is the terrible way C still works in C23. ☹ See C23 6.7.11p11, the last bullet:
>
> if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits.
>
> So the padding gets initialized as does the first member, but if the first member is not the largest member in the union, that leaves other bits uninitialized. This happened late during NB comment stages, which is why N2900 and N3011 give the impression that the largest union member should be initialized.
>
> That said, maybe there's appetite within the community to initialize the largest member as a conforming extension. The downside is that people not playing tricks with their unions may end up paying additional initialization cost that's useless, in pathological cases. e.g.,
>
> int main() {
> union U {
> int i;
> struct {
> long double huge[1000];
> } s;
> } u = {}; // This is not a brilliant use of unions
> return u.i;
> }
>
> But how often does this matter for performance -- I have to imagine that most unions make use of most of the memory needed for their members instead of being lopsided like that. If we find some important use case that has significantly worse performance, we could always add a driver flag to control the behavior to let people opt into/out of the extension.
I guess if people don't want the cost of zero initializing a union
variable, they will not use "= {}". And the situation that using "=
{}" only zero initializes the first member of a union variable isn't
very intuitive.
>
> ~Aaron
>
> -----Original Message-----
> From: Nick Desaulniers <ndesaulniers@...gle.com>
> Sent: Thursday, June 20, 2024 3:54 PM
> To: Yabin Cui <yabinc@...gle.com>
> Cc: Steffen Klassert <steffen.klassert@...unet.com>; Herbert Xu <herbert@...dor.apana.org.au>; David S. Miller <davem@...emloft.net>; Eric Dumazet <edumazet@...gle.com>; Jakub Kicinski <kuba@...nel.org>; Paolo Abeni <pabeni@...hat.com>; Nathan Chancellor <nathan@...nel.org>; Bill Wendling <morbo@...gle.com>; Justin Stitt <justinstitt@...gle.com>; netdev@...r.kernel.org; linux-kernel@...r.kernel.org; llvm@...ts.linux.dev; Ballman, Aaron <aaron.ballman@...el.com>
> Subject: Re: [PATCH] Fix initializing a static union variable
>
> On Thu, Jun 20, 2024 at 12:47 PM Nick Desaulniers <ndesaulniers@...gle.com> wrote:
> >
> > On Thu, Jun 20, 2024 at 12:31 PM Nick Desaulniers
> > <ndesaulniers@...gle.com> wrote:
> > >
> > > On Thu, Jun 20, 2024 at 11:17 AM Yabin Cui <yabinc@...gle.com> wrote:
> > > >
> > > > saddr_wildcard is a static union variable initialized with {}.
> > > > But c11 standard doesn't guarantee initializing all fields as zero
> > > > for this case. As in https://godbolt.org/z/rWvdv6aEx,
> > >
> > > Specifically, it sounds like C99+ is just the first member of the
> > > union, which is dumb since that may not necessarily be the largest
> > > variant. Can you find the specific relevant wording from a pre-c23
> > > spec?
> > >
I feel it's unspecified in pre-c23 spec.
>From https://en.cppreference.com/w/c/language/struct_initialization,
When initializing an object of struct or union type, the initializer
must be a non-empty,(until C23).
> > > > clang only initializes the first field as zero, but the bits
> > > > corresponding to other (larger) members are undefined.
> > >
> > > Oh, that sucks!
> > >
> > > Reading through the internal report on this is fascinating! Nice
> > > job tracking down the issue! It sounds like if we can aggressively
> > > inline the users of this partially initialized value, then the UB
> > > from control flow on the partially initialized value can result in
> > > Android's kernel network tests failing. It might be good to include
> > > more info on "why this is a problem" in the commit message.
> > >
> > > https://godbolt.org/z/hxnT1PTWo more clearly demonstrates the issue, IMO.
Great, I will use it in the v2 patch.
> > >
> > > TIL that C23 clarifies this, but clang still doesn't have the
> > > correct codegen then for -std=c23. Can you please find or file a
> > > bug about this, then add a link to it in the commit message?
> > >
> > > It might be interesting to link to the specific section of n3096
> > > that clarifies this, or if there was a corresponding defect report
> > > sent to ISO about this. Maybe something from
> > > https://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log.htm
> > > discusses this?
> >
> > https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3011.htm
> >
> > https://clang.llvm.org/c_status.html mentions that n3011 was addressed
> > by clang-17, but based on my godbolt link above, it seems perhaps not?
>
> Sorry, n3011 was a minor revision to
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2900.htm
> which is a better citation for this bug, IMO. I still think the clang status page is wrong (for n2900) and that is a bug against clang that should be fixed (for c23), but this kernel patch still has merit (since the issue I'm referring to in clang is not what's leading to the test case failures).
>
I totally agree with you. And if clang fixes the support, I guess it
will not be limited to c23.
> >
> > 6.7.10.2 of n3096 (c23) defines "empty initialization" (which wasn't
> > defined in older standards).
> >
> > Ah, reading
> >
> > n2310 (c17) 6.7.9.10:
> >
> > ```
> > If an object that has static or thread storage duration is not
> > initialized explicitly, then:
> > ...
> > — if it is a union, the first named member is initialized
> > (recursively) according to these rules, and any padding is initialized
> > to zero bits; ```
> >
> > Specifically, "the first named member" was a terrible mistake in the language.
> >
> > Yikes! Might want to quote that in the commit message.
> >
> > >
> > > Can you also please (find or) file a bug against clang about this? A
> > > compiler diagnostic would be very very helpful here, since `= {};`
> > > is such a common idiom.
> > >
Reported to clang upstream in
https://github.com/llvm/llvm-project/issues/78034#issuecomment-2183233517.
It's a problem that still happens in the c23 standard.
> > > Patch LGTM, but I think more context can be provided in the commit
> > > message in a v2 that helps reviewers follow along with what's going
> > > on here.
> > >
> > > >
> > > > Signed-off-by: Yabin Cui <yabinc@...gle.com>
> > > > ---
> > > > net/xfrm/xfrm_state.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index
> > > > 649bb739df0d..9bc69d703e5c 100644
> > > > --- a/net/xfrm/xfrm_state.c
> > > > +++ b/net/xfrm/xfrm_state.c
> > > > @@ -1139,7 +1139,7 @@ xfrm_state_find(const xfrm_address_t *daddr, const xfrm_address_t *saddr,
> > > > struct xfrm_policy *pol, int *err,
> > > > unsigned short family, u32 if_id) {
> > > > - static xfrm_address_t saddr_wildcard = { };
> > > > + static const xfrm_address_t saddr_wildcard;
> > > > struct net *net = xp_net(pol);
> > > > unsigned int h, h_wildcard;
> > > > struct xfrm_state *x, *x0, *to_put;
> > > > --
> > > > 2.45.2.741.gdbec12cfda-goog
> > > >
> > >
> > >
> > > --
> > > Thanks,
> > > ~Nick Desaulniers
> >
> >
> >
> > --
> > Thanks,
> > ~Nick Desaulniers
>
>
>
> --
> Thanks,
> ~Nick Desaulniers
Powered by blists - more mailing lists