Message-ID: <3ca286c7.5fe6.197f8320b0a.Coremail.chenyuan_fl@163.com>
Date: Fri, 11 Jul 2025 14:35:18 +0800 (CST)
From: chenyuan <chenyuan_fl@....com>
To: "Quentin Monnet" <qmo@...nel.org>
Cc: ast@...nel.org, bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
"Yuan Chen" <chenyuan@...inos.cn>, "Jiri Olsa" <jolsa@...nel.org>
Subject: Re:Re: [PATCH v3] bpftool: Add CET-aware symbol matching for x86_64
architectures
Thank you for reviewing the patch and providing valuable feedback! I appreciate your insights on CET
compatibility and code structure. Here are my responses to your points:
1. Maintainer List
I confirm that in future submissions, I will run:
./scripts/get_maintainer.pl -f tools/bpf/bpftool/link.c
to ensure all relevant maintainers are included in the recipient list. Omitting them was an oversight in the initial submission.
2. False Positives on Older CPUs
Your concern about older CPUs is valid. To address this:
Current Approach: The patch relies on address offset matching (symbol_addr == target_addr - 4), which I believe is safe because:
  - Non-CET functions won't have a valid symbol at target_addr - 4.
  - Symbol addresses are fixed for a given kernel, so an accidental match at target_addr - 4 is very unlikely.
Instruction Verification: While checking for endbr32/endbr64 would be ideal, user-space cannot directly inspect kernel instruction memory for security and portability reasons.
Could you advise if there are any safe methods to verify the presence of endbr32/endbr64 instructions at kernel symbol addresses from user space?
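To make the discussion concrete, here is a minimal sketch of how the duplicated check could be folded into a single helper, as suggested in the review, together with a byte-pattern test for the endbr instructions. The names symbol_matches_target(), insn_is_endbr() and ENDBR_INSN_SIZE are hypothetical and not part of the v3 patch. The sketch also assumes the instruction bytes have already been obtained somehow; how to read them safely from user space is exactly the open question above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Both endbr32 and endbr64 are 4-byte instructions. */
#define ENDBR_INSN_SIZE 4

/*
 * Hypothetical helper (not in the patch): return true if the symbol at
 * sym_addr corresponds to the kprobe target address, either exactly or,
 * on x86_64, shifted by the size of a leading endbr instruction.
 */
static bool symbol_matches_target(uint64_t sym_addr, uint64_t target_addr)
{
	if (sym_addr == target_addr)
		return true;
#if defined(__x86_64__) || defined(__amd64__)
	/* Avoid unsigned wrap-around for very small target addresses. */
	if (target_addr >= ENDBR_INSN_SIZE &&
	    sym_addr == target_addr - ENDBR_INSN_SIZE)
		return true;
#endif
	return false;
}

/*
 * Hypothetical helper: check whether a 4-byte buffer holds
 * endbr64 (f3 0f 1e fa) or endbr32 (f3 0f 1e fb).
 * Where the bytes come from is deliberately left open.
 */
static bool insn_is_endbr(const unsigned char insn[ENDBR_INSN_SIZE])
{
	static const unsigned char endbr64[ENDBR_INSN_SIZE] = { 0xf3, 0x0f, 0x1e, 0xfa };
	static const unsigned char endbr32[ENDBR_INSN_SIZE] = { 0xf3, 0x0f, 0x1e, 0xfb };

	return memcmp(insn, endbr64, ENDBR_INSN_SIZE) == 0 ||
	       memcmp(insn, endbr32, ENDBR_INSN_SIZE) == 0;
}
```

With a helper like this, both show_kprobe_multi_json() and show_kprobe_multi_plain() could skip an entry when symbol_matches_target(dd.sym_mapping[i].address, data[j].addr) is false, which would also remove the need for the goto label.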
At 2025-06-27 19:08:48, "Quentin Monnet" <qmo@...nel.org> wrote:
>Thanks! Next time, please try to add all relevant maintainers as
>recipients or in copy of your message when submitting patches. You can
>get the list with get_maintainer.pl, try running it on your patch or with
>"./scripts/get_maintainer.pl -f tools/bpf/bpftool/link.c"
>
>2025-06-26 15:49 UTC+0800 ~ Yuan Chen <chenyuan_fl@....com>
>> From: Yuan Chen <chenyuan@...inos.cn>
>>
>> Adjust symbol matching logic to account for Control-flow Enforcement
>> Technology (CET) on x86_64 systems. CET prefixes functions with a 4-byte
>> 'endbr' instruction, shifting the actual entry point to symbol + 4.
>>
>> Signed-off-by: Yuan Chen <chenyuan@...inos.cn>
>> ---
>> tools/bpf/bpftool/link.c | 30 ++++++++++++++++++++++++++++--
>> 1 file changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
>> index 03513ffffb79..dfd192b4c5ad 100644
>> --- a/tools/bpf/bpftool/link.c
>> +++ b/tools/bpf/bpftool/link.c
>> @@ -307,8 +307,21 @@ show_kprobe_multi_json(struct bpf_link_info *info, json_writer_t *wtr)
>> goto error;
>>
>> for (i = 0; i < dd.sym_count; i++) {
>> - if (dd.sym_mapping[i].address != data[j].addr)
>> + if (dd.sym_mapping[i].address != data[j].addr) {
>> +#if defined(__x86_64__) || defined(__amd64__)
>
>
>I'm not familiar with CET, but from what I read, it's been around since
>Tiger Lake processors (2020). Do we have a risk of false positive with
>older CPUs? Maybe check that the instruction at
>dd.sym_mapping[i].address is endbr32 or endbr64?
>
>
>> + /*
>> + * On x86_64 architectures with CET (Control-flow Enforcement Technology),
>> + * function entry points have a 4-byte 'endbr' instruction prefix.
>> + * This causes the actual function address = symbol address + 4.
>> + * Here we check if this symbol matches the target address minus 4,
>> + * indicating we've found a CET-enabled function entry point.
>> + */
>> + if (dd.sym_mapping[i].address == data[j].addr - 4)
>> + goto found;
>> +#endif
>> continue;
>> + }
>> +found:
>> jsonw_start_object(json_wtr);
>> jsonw_uint_field(json_wtr, "addr", dd.sym_mapping[i].address);
>
>
>I suppose we still want to print dd.sym_mapping[i].address (and not
>data[j].addr) when we found it with the CET offset here - just
>double-checking.
>
>
>> jsonw_string_field(json_wtr, "func", dd.sym_mapping[i].name);
>> @@ -744,8 +757,21 @@ static void show_kprobe_multi_plain(struct bpf_link_info *info)
>>
>> printf("\n\t%-16s %-16s %s", "addr", "cookie", "func [module]");
>> for (i = 0; i < dd.sym_count; i++) {
>> - if (dd.sym_mapping[i].address != data[j].addr)
>> + if (dd.sym_mapping[i].address != data[j].addr) {
>> +#if defined(__x86_64__) || defined(__amd64__)
>> + /*
>> + * On x86_64 architectures with CET (Control-flow Enforcement Technology),
>> + * function entry points have a 4-byte 'endbr' instruction prefix.
>> + * This causes the actual function address = symbol address + 4.
>> + * Here we check if this symbol matches the target address minus 4,
>> + * indicating we've found a CET-enabled function entry point.
>> + */
>> + if (dd.sym_mapping[i].address == data[j].addr - 4)
>> + goto found;
>> +#endif
>
>
>Given that we have twice the same check, I'd move this to a dedicated
>wrapper function that we could call from both show_kprobe_multi_json()
>and show_kprobe_multi_plain().
>
>
>> continue;
>> + }
>> +found:
>> printf("\n\t%016lx %-16llx %s",
>> dd.sym_mapping[i].address, data[j].cookie, dd.sym_mapping[i].name);
>> if (dd.sym_mapping[i].module[0] != '\0')