[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20241213092603.13399-1-laura.nao@collabora.com>
Date: Fri, 13 Dec 2024 10:26:03 +0100
From: Laura Nao <laura.nao@...labora.com>
To: stephen.s.brennan@...cle.com
Cc: alan.maguire@...cle.com,
bpf@...r.kernel.org,
chrome-platform@...ts.linux.dev,
kernel@...labora.com,
laura.nao@...labora.com,
linux-kernel@...r.kernel.org,
olsajiri@...il.com,
regressions@...ts.linux.dev
Subject: Re: [REGRESSION] module BTF validation failure (Error -22) on
On 12/12/24 22:49, Stephen Brennan wrote:
> Jiri Olsa <olsajiri@...il.com> writes:
>> On Wed, Dec 11, 2024 at 10:10:24PM +0100, Jiri Olsa wrote:
>>> On Tue, Dec 10, 2024 at 02:55:01PM +0100, Laura Nao wrote:
>>>> Hi Jiri,
>>>>
>>>> Thanks for the feedback!
>>>>
>>>> On 12/6/24 13:35, Jiri Olsa wrote:
>>>>> On Fri, Nov 15, 2024 at 06:17:12PM +0100, Laura Nao wrote:
>>>>>> On 11/13/24 10:37, Laura Nao wrote:
>>>>>>>
>>>>>>> Currently, KernelCI only retains the bzImage, not the vmlinux
>>>>>>> binary. The
>>>>>>> bzImage can be downloaded from the same link mentioned above by
>>>>>>> selecting
>>>>>>> 'kernel' from the dropdown menu (modules can also be downloaded
>>>>>>> the
>>>>>>> same
>>>>>>> way). I’ll try to replicate the build on my end and share the
>>>>>>> vmlinux
>>>>>>> with DWARF data stripped for convenience.
>>>>>>>
>>>>>>
>>>>>> I managed to reproduce the issue locally and I've uploaded the
>>>>>> vmlinux[1]
>>>>>> (stripped of DWARF data) and vmlinux.raw[2] files, as well as one
>>>>>> of
>>>>>> the
>>>>>> modules[3] and its btf data[4] extracted with:
>>>>>>
>>>>>> bpftool -B vmlinux btf dump file cros_kbd_led_backlight.ko >
>>>>>> cros_kbd_led_backlight.ko.raw
>>>>>>
>>>>>> Looking again at the logs[5], I've noticed the following is
>>>>>> reported:
>>>>>>
>>>>>> [ 0.415885] BPF: type_id=115803 offset=177920 size=1152
>>>>>> [ 0.416029] BPF:
>>>>>> [ 0.416083] BPF: Invalid offset
>>>>>> [ 0.416165] BPF:
>>>>>>
>>>>>> There are two different definitions of rcu_data in
>>>>>> '.data..percpu',
>>>>>> one
>>>>>> is a struct and the other is an integer:
>>>>>>
>>>>>> type_id=115801 offset=177920 size=1152 (VAR 'rcu_data')
>>>>>> type_id=115803 offset=177920 size=1152 (VAR 'rcu_data')
>>>>>>
>>>>>> [115801] VAR 'rcu_data' type_id=115572, linkage=static
>>>>>> [115803] VAR 'rcu_data' type_id=1, linkage=static
>>>>>>
>>>>>> [115572] STRUCT 'rcu_data' size=1152 vlen=69
>>>>>> [1] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64
>>>>>> encoding=(none)
>>>>>>
>>>>>> I assume that's not expected, correct?
>>>>>
>>>>> yes, that seems wrong.. but I can't reproduce with your config
>>>>> together with pahole 1.24 .. could you try with latest one?
>>>>
>>>> I just tested next-20241210 with the latest pahole version (1.28
>>>> from
>>>> the master branch[1]), and the issue does not occur with this
>>>> version
>>>> (I can see only one instance of rcu_data in the BTF data, as
>>>> expected).
>>>>
>>>> I can confirm that the same kernel revision still exhibits the
>>>> issue
>>>> with pahole 1.24.
>>>>
>>>> If helpful, I can also test versions between 1.24 and 1.28 to
>>>> identify
>>>> which ones work.
>>>
>>> I managed to reproduce finally with gcc-12, but had to use pahole
>>> 1.25,
>>> 1.24 failed with unknown attribute
>>>
>>> [95096] VAR 'rcu_data' type_id=94868, linkage=static
>>> [95098] VAR 'rcu_data' type_id=4, linkage=static
>>> type_id=95096 offset=177088 size=1152 (VAR 'rcu_data')
>>> type_id=95098 offset=177088 size=1152 (VAR 'rcu_data')
>>
>> so for me the difference seems to be using gcc-12 and this commit in
>> linux tree:
>> dabddd687c9e percpu: cast percpu pointer in PERCPU_PTR() via
>> unsigned long
>>
>> which adds extra __pcpu_ptr variable into dwarf, and it has the same
>> address as the per cpu variable and that confuses pahole
>>
>> it ends up with adding per cpu variable twice.. one with real type
>> (type_id=94868) and the other with unsigned long type (type_id=4)
>>
>> however this got fixed in pahole 1.28 commit:
>> 47dcb534e253 btf_encoder: Stop indexing symbols for VARs
>>
>> which filters out __pcpu_ptr variable completely, adding Stephen to
>> the loop
>
> Thanks for sharing this. Your analysis is spot-on, but I can fill in
> the
> details a bit. I just grabbed 6.13-rc2 and built it with gcc 11 and
> pahole 1.27, and observed the same issue:
>
> $ bpftool btf dump file vmlinux | grep "VAR 'rcu_data"
> [4045] VAR 'rcu_data' type_id=3962, linkage=static
> [4047] VAR 'rcu_data' type_id=1, linkage=static
> type_id=4045 offset=196608 size=520 (VAR 'rcu_data')
> type_id=4047 offset=196608 size=520 (VAR 'rcu_data')
>
> In pahole 1.27, the (simplified) process for generating variables for
> BTF is:
>
> 1. Look through the ELF symbol table, and find all symbols whose
> addresses are within the percpu section, and add them to a list.
>
> 2. Look through the DWARF: for each tag of type DW_TAG_variable,
> determine if the variable is "global". If so, and if the address
> matches
> one of the symbols found in Step 1, continue.
>
> 3. Except for one special case, pahole doesn't check whether the DWARF
> variable's name matches the symbol name. It simply emits a variable
> using the name of the symbol from Step 1, and the type information
> from
> Step 2.
>
> The result of this process, in this case, is:
>
> 1. kernel/rcu/tree.c contains a declaration of "rcu_data". This
> results
> in an ELF symbol in vmlinux of the same name. Great!
>
> $ eu-readelf -s vmlinux | grep '\brcu_data\b'
> 12319: 0000000000030000 520 OBJECT LOCAL DEFAULT 21
> rcu_data
>
>
> 2. A DWARF entry is emitted for "rcu_data" which has a matching
> location
> (DW_AT_location has value DW_OP_addr 0x30000, matching the ELF
> symbol).
> So far so good - pahole emits a BTF variable with the expected type.
>
> $ llvm-dwarfdump --name=rcu_data
> ...
> 0x01af03f1: DW_TAG_variable
> DW_AT_name ("rcu_data")
> DW_AT_decl_file
> ("/home/stepbren/repos/linux-upstream/kernel/rcu/tree.c")
> DW_AT_decl_line (80)
> DW_AT_decl_column (8)
> DW_AT_type (0x01aefb38 "rcu_data")
> DW_AT_alignment (0x40)
> DW_AT_location (DW_OP_addr 0x30000)
>
> 3. In kernel/rcu/tree.c, we also have the following declaration at
> line
> 5227 which uses per_cpu_ptr() on &rcu_data:
>
> 5222 void rcutree_migrate_callbacks(int cpu)
> 5223 {
> 5224 unsigned long flags;
> 5225 struct rcu_data *my_rdp;
> 5226 struct rcu_node *my_rnp;
> 5227 struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> ^^^^^^^^^^^
>
> With the new changes in dabddd687c9e ("percpu: cast percpu pointer in
> PERCPU_PTR() via unsigned long"), this expands to a lexical block
> which
> contains a variable named "__pcpu_ptr", of type unsigned long. The
> compiler emits the following DW_TAG_variable in the DWARF:
>
> 0x01b05d20: DW_TAG_variable
> DW_AT_name ("__pcpu_ptr")
> DW_AT_decl_file
> ("/home/stepbren/repos/linux-upstream/kernel/rcu/tree.c")
> DW_AT_decl_line (5227)
> DW_AT_decl_column (25)
> DW_AT_type (0x01adb52e "long unsigned
> int")
> DW_AT_location (DW_OP_addr 0x30000,
> DW_OP_stack_value)
>
> Since the DW_AT_location has a DW_OP_addr - pahole understands this to
> mean that the variable is located in global memory, and thus has
> VSCOPE_GLOBAL. But of course, the actual "scope" of this variable is
> not
> global, it is limited to the lexical block, which is completely hidden
> away by the macro. But pahole 1.27 does not consider this, and since
> the
> address matches the "rcu_data" symbol, it emits a variable of type
> "long
> unsigned int" under the name "rcu_data" -- despite the fact that the
> DWARF info has a name of "__pcpu_ptr".
>
> The changes I made in 1.28 address this (unintentionally) by:
>
> 1. Requiring global variables be both "in the global scope" (i.e. in
> the
> CU-level, rather than any function or other lexical block.
> 2. Requiring global variables have global memory (some of them could
> be
> register variables, despite having global scope -- e.g.
> current_stack_pointer).
> 3. No longer using the ELF symbol table, and instead using the DWARF
> names for variables.
>
> With #1, we would filter this variable. And with #3, even if the
> variable were not filtered, we would output (a bunch of) variables
> with
> the correct __pcpu_ptr variable name, which is unhelpful but at least
> helps us understand where these things come from.
>
> Rebuilding with GCC 14, we can see that the "__pcpu_ptr" variable no
> longer has a DW_AT_location:
>
> 0x01afa82f: DW_TAG_variable
> DW_AT_name ("__pcpu_ptr")
> DW_AT_decl_file
> ("/home/stepbren/repos/linux-upstream/kernel/rcu/tree.c")
> DW_AT_decl_line (5227)
> DW_AT_decl_column (25)
> DW_AT_type (0x01ad0267 "long unsigned
> int")
>
> This is the reason that pahole 1.27 now recognizes it as
> VSCOPE_OPTIMIZED. Without a memory location pahole can't do anything
> to
> match it against the "rcu_data" variable so nothing is emitted, and we
> don't get the issue.
>
> I'm not sure if this adds at all to the discussion, since the overall
> answer is the same, an upgrade of pahole and/or gcc. (Pahole would be
> recommended; GCC just changed the generated DWARF and I could imagine
> other situations popping up elsewhere).
>
Thank you for the help with debugging and for the detailed explanation!
We'll proceed with updating pahole to v1.28 in the KernelCI build
environment.
Best,
Laura
#regzbot resolve: fixed by changes in pahole 1.28
>
> Thanks,
> Stephen
>
>> with gcc-14 the __pcpu_ptr variable has VSCOPE_OPTIMIZED scope, so it
>> won't
>> get into btf even without above pahole fix
>>
>> I suggest gcc/pahole upgrade ;-)
>>
>> thanks,
>> jirka
Powered by blists - more mailing lists