Message-ID: <2833d0db-f122-eccd-7393-1f0169dc0741@collabora.com>
Date: Tue, 11 Jul 2023 16:46:14 +0530
From: Shreeya Patel <shreeya.patel@...labora.com>
To: Linux regressions mailing list <regressions@...ts.linux.dev>,
Masahiro Yamada <masahiroy@...nel.org>
Cc: Greg KH <gregkh@...uxfoundation.org>,
Maksim Panchenko <maks@...a.com>,
Ricardo Cañuelo <ricardo.canuelo@...labora.com>,
Michal Marek <michal.lkml@...kovi.net>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
clang-built-linux <llvm@...ts.linux.dev>,
Bill Wendling <morbo@...gle.com>,
Nathan Chancellor <nathan@...nel.org>,
"gustavo.padovan@...labora.com" <gustavo.padovan@...labora.com>,
Guillaume Charles Tucker <guillaume.tucker@...labora.com>,
denys.f@...labora.com, Nick Desaulniers <ndesaulniers@...gle.com>,
kernelci@...ts.linux.dev,
Collabora Kernel ML <kernel@...labora.com>
Subject: Re: [PATCH v4] Makefile.compiler: replace cc-ifversion with compiler-specific macros
On 10/07/23 17:39, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> Shreeya Patel, Masahiro Yamada: what's the status of this? Was any
> progress made to address this? Or is this maybe (accidentally?) fixed
> with 6.5-rc1?
Hi Thorsten,
I still see the regression happening, so it doesn't seem to be fixed:
https://linux.kernelci.org/test/case/id/64ac675a8aebf63753bb2a8c/
Masahiro submitted a fix for this issue here:
https://lore.kernel.org/lkml/ZJEni98knMMkU%2Fcl@buildd.core.avm.de/T/#t
But I don't see any movement there. Masahiro, are you planning to send a
v2 for it?
Thanks,
Shreeya Patel
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot poke
>
> On 20.06.23 06:19, Masahiro Yamada wrote:
>> On Mon, Jun 12, 2023 at 7:10 PM Shreeya Patel
>> <shreeya.patel@...labora.com> wrote:
>>> On 24/05/23 02:57, Nick Desaulniers wrote:
>>>> On Tue, May 23, 2023 at 3:27 AM Shreeya Patel
>>>> <shreeya.patel@...labora.com> wrote:
>>>>> Hi Nick and Masahiro,
>>>>>
>>>>> On 23/05/23 01:22, Nick Desaulniers wrote:
>>>>>> On Mon, May 22, 2023 at 9:52 AM Greg KH <gregkh@...uxfoundation.org> wrote:
>>>>>>> On Mon, May 22, 2023 at 12:09:34PM +0200, Ricardo Cañuelo wrote:
>>>>>>>> On Fri, May 19 2023 at 08:57:24, Nick Desaulniers <ndesaulniers@...gle.com> wrote:
>>>>>>>>> It could be; if the link order was changed, it's possible that this
>>>>>>>>> target may be hitting something along the lines of:
>>>>>>>>> https://isocpp.org/wiki/faq/ctors#static-init-order i.e. the "static
>>>>>>>>> initialization order fiasco"
>>>>>>>>>
>>>>>>>>> I'm struggling to think of how this appears in C codebases, but I
>>>>>>>>> swear years ago I had a discussion with GKH (maybe?) about this. I
>>>>>>>>> think I was playing with converting Kbuild to use Ninja rather than
>>>>>>>>> Make; the resulting kernel image wouldn't boot because I had modified
>>>>>>>>> the order the object files were linked in. If you were to randomly
>>>>>>>>> shuffle the object files in the kernel, I recall some hazard that may
>>>>>>>>> prevent boot.
>>>>>>>> I thought that was specifically a C++ problem? But then again, the
>>>>>>>> kernel docs explicitly say that the ordering of obj-y goals in kbuild is
>>>>>>>> significant in some instances [1]:
>>>>>>> Yes, it matters, you can not change it. If you do, systems will break.
>>>>>>> It is the only way we have of properly ordering our init calls within
>>>>>>> the same "level".
>>>>>> Ah, right it was the initcall ordering. Thanks for the reminder.
>>>>>>
>>>>>> (There's a joke in there similar to the use of regexes to solve a
>>>>>> problem resulting in two new problems; initcalls have levels for
>>>>>> ordering, but we still have (unexpressed) dependencies between calls
>>>>>> of the same level; brittle!).
>>>>>>
>>>>>> +Maksim, since that might be relevant info for the BOLT+Kernel work.
>>>>>>
>>>>>> Ricardo,
>>>>>> https://elinux.org/images/e/e8/2020_ELCE_initcalls_myjosserand.pdf
>>>>>> mentions that there's a kernel command line param `initcall_debug`.
>>>>>> Perhaps that can be used to see if
>>>>>> 5750121ae7382ebac8d47ce6d68012d6cd1d7926 somehow changed initcall
>>>>>> ordering, resulting in a config that cannot boot?
>>>>> Here are the links to LAVA jobs run with initcall_debug added to the
>>>>> kernel command line.
>>>>>
>>>>> 1. Where regression happens (5750121ae7382ebac8d47ce6d68012d6cd1d7926)
>>>>> https://lava.collabora.dev/scheduler/job/10417706
>>>>>
>>>>> 2. With a revert of the commit 5750121ae7382ebac8d47ce6d68012d6cd1d7926
>>>>> https://lava.collabora.dev/scheduler/job/10418012
>>>> Thanks!
>>>>
>>>> Yeah, I can see a diff in the initcall ordering as a result of
>>>> commit 5750121ae738 ("kbuild: list sub-directories in ./Kbuild")
>>>>
>>>> https://gist.github.com/nickdesaulniers/c09db256e42ad06b90842a4bb85cc0f4
>>>>
>>>> Not just different orderings, but some initcalls seem unique to the
>>>> before vs. after, which is troubling. (example init_events and
>>>> init_fs_sysctls respectively)
>>>>
>>>> That isn't conclusive evidence that changes to initcall ordering are
>>>> to blame, but I suspect confirming that precisely to be very very time
>>>> consuming.
>>>>
>>>> Masahiro, what are your thoughts on reverting 5750121ae738? There are
>>>> conflicts in Kbuild and Makefile when reverting 5750121ae738 on
>>>> mainline.
>>> I'm not sure if you followed the conversation but we are still seeing
>>> this regression with the latest kernel builds and would like to know if
>>> you plan to revert 5750121ae738?
>>
>> Reverting 5750121ae738 does not solve the issue
>> because the issue happens even before 5750121ae738.
>> multi_v7_defconfig + debug.config + CONFIG_MODULES=n
>> fails to boot in the same way.
>>
>> The revert would hide the issue on a particular build setup.
>>
>>
>> I submitted a patch to pinpoint the issue more precisely.
>> Let's see how it goes.
>> https://lore.kernel.org/lkml/ZJEni98knMMkU%2Fcl@buildd.core.avm.de/T/#t
>>
>>
>> (BTW, the initcall order is unrelated)
>>
>>>
>>> Thanks,
>>> Shreeya Patel
>>>
>>>>> Thanks,
>>>>> Shreeya Patel
>>>>>
>> --
>> Best Regards
>> Masahiro Yamada
>>
>>
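
For anyone who wants to repeat the initcall comparison from the quoted thread, here is a minimal sketch in Python. It assumes the two boot logs were captured with `initcall_debug` on the kernel command line and contain the usual "calling  <symbol>+0x<off>/0x<size> @ <pid>" lines. The function names `initcall_order` and `compare` are made up for illustration and are not part of any kernel tooling. As noted in the thread, an ordering difference found this way is not proof of the cause of the boot failure.

```python
import re

# Matches lines such as "calling  init_events+0x0/0x60 @ 1" that the kernel
# prints for every initcall when booted with initcall_debug.
CALL_RE = re.compile(r"calling\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def initcall_order(log_text):
    """Return the initcall symbol names in the order they were invoked."""
    return [m.group(1) for m in CALL_RE.finditer(log_text)]


def compare(before_log, after_log):
    """Compare two boot logs (hypothetical helper).

    Returns (only_before, only_after, reordered):
      only_before: initcalls seen only in the first log
      only_after:  initcalls seen only in the second log
      reordered:   True if the initcalls common to both ran in a different order
    """
    before = initcall_order(before_log)
    after = initcall_order(after_log)
    before_set, after_set = set(before), set(after)

    only_before = [f for f in before if f not in after_set]
    only_after = [f for f in after if f not in before_set]

    common_before = [f for f in before if f in after_set]
    common_after = [f for f in after if f in before_set]
    return only_before, only_after, common_before != common_after
```

Usage would be along the lines of `compare(open("good.log").read(), open("bad.log").read())`, where the two file names stand for the serial console logs saved from the LAVA jobs.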