[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <55EBD70D.9080301@huawei.com>
Date: Sun, 6 Sep 2015 14:02:53 +0800
From: "Wangnan (F)" <wangnan0@...wei.com>
To: 平松雅巳 / HIRAMATU,MASAMI
<masami.hiramatsu.pt@...achi.com>,
"'Arnaldo Carvalho de Melo'" <acme@...hat.com>
CC: "acme@...nel.org" <acme@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
He Kuang <hekuang@...wei.com>,
"Alexei Starovoitov" <ast@...mgrid.com>,
Brendan Gregg <brendan.d.gregg@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
David Ahern <dsahern@...il.com>, Jiri Olsa <jolsa@...nel.org>,
Kaixu Xia <xiakaixu@...wei.com>,
Namhyung Kim <namhyung@...nel.org>,
Paul Mackerras <paulus@...ba.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Zefan Li <lizefan@...wei.com>,
"pi3orama@....com" <pi3orama@....com>
Subject: Re: [PATCH 23/31] perf tools: Introduce regs_query_register_offset()
for x86
On 2015/9/1 23:54, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> From: Arnaldo Carvalho de Melo [mailto:acme@...hat.com]
>>
>> Em Tue, Sep 01, 2015 at 11:47:41AM +0000, 平松雅巳 / HIRAMATU,MASAMI escreveu:
>>>> From: Wang Nan [mailto:wangnan0@...wei.com]
>>>>
>>>> regs_query_register_offset() is a helper function which converts
>>>> register name like "%rax" to offset of a register in 'struct pt_regs',
>>>> which is required by BPF prologue generator. Since the function is
>>>> identical, try to reuse the code in arch/x86/kernel/ptrace.c.
>>>>
>>>> Comment inside dwarf-regs.c list the differences between this
>>>> implementation and kernel code.
>>> Hmm, this also introduce a duplication of the code...
>>> It might be a good time to move them into arch/x86/lib/ and
>>> reuse it directly from perf code.
>> It is strange to, having tried sharing stuff directly from the kernel,
>> to be now in a position where I advocate against it...
>>
>> Copy'n'pasting what I said in another message:
>>
>> -----
>> We would go back to sharing stuff with the kernel, but this time around
>> we would be using something that everybody knows is being shared, which
>> doesn't elliminates the possibility that at some point changes made with
>> the kernel in mind would break the tools/ using code.
>>
>> Perhaps it is better to keep copying what we want and introduce
>> infrastructure to check for differences and warn us as soon as possible
>> so that we would do the copy, test if it doesn't break what we use, etc.
>>
>> I.e. we wouldn't be putting any new burden on the "kernel people", i.e.
>> the burden of having to check that changes they make don't break tools/
>> living code, nor any out of the blue breakage on tools/ for tools/
>> developers to fix when changes are made on the kernel "side" -----
>> ---
>>
>> The "stop sharing directly stuff with the kernel" stance was taken after
>> a report from Linus about breakage due to tools/ using kernel files
>> directly and then a change made in some RCU files broke the tools/perf/
>> build, even with tools/perf/ not using anything RCU related so far.
>>
>> Looking at tools/perf/MANIFEST, the file used to create a detached
>> tarball so that perf can be built outside the kernel sources there are
>> still some kernel source files listed, but those probably need to be
>> copied too...
> OK, so let this apply.
>
> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>
>
> And also we'll need a testcase for this.
I created a testcase for the whole BPF prologue, so I think this can be
covered?
I'll post them by replying this mail.
Thank you.
> Thank you,
>
>> - Arnaldo
>>
>>> Thank you,
>>>
>>>> get_arch_regstr() switches to regoffset_table and the old string table
>>>> is dropped.
>>>>
>>>> Signed-off-by: Wang Nan <wangnan0@...wei.com>
>>>> Signed-off-by: He Kuang <hekuang@...wei.com>
>>>> Cc: Alexei Starovoitov <ast@...mgrid.com>
>>>> Cc: Brendan Gregg <brendan.d.gregg@...il.com>
>>>> Cc: Daniel Borkmann <daniel@...earbox.net>
>>>> Cc: David Ahern <dsahern@...il.com>
>>>> Cc: He Kuang <hekuang@...wei.com>
>>>> Cc: Jiri Olsa <jolsa@...nel.org>
>>>> Cc: Kaixu Xia <xiakaixu@...wei.com>
>>>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>
>>>> Cc: Namhyung Kim <namhyung@...nel.org>
>>>> Cc: Paul Mackerras <paulus@...ba.org>
>>>> Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
>>>> Cc: Zefan Li <lizefan@...wei.com>
>>>> Cc: pi3orama@....com
>>>> Cc: Arnaldo Carvalho de Melo <acme@...hat.com>
>>>> ---
>>>> tools/perf/arch/x86/Makefile | 1 +
>>>> tools/perf/arch/x86/util/Build | 1 +
>>>> tools/perf/arch/x86/util/dwarf-regs.c | 122 ++++++++++++++++++++++++----------
>>>> 3 files changed, 90 insertions(+), 34 deletions(-)
>>>>
>>>> diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
>>>> index 21322e0..09ba923 100644
>>>> --- a/tools/perf/arch/x86/Makefile
>>>> +++ b/tools/perf/arch/x86/Makefile
>>>> @@ -2,3 +2,4 @@ ifndef NO_DWARF
>>>> PERF_HAVE_DWARF_REGS := 1
>>>> endif
>>>> HAVE_KVM_STAT_SUPPORT := 1
>>>> +PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
>>>> diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
>>>> index 2c55e1b..d4d1f23 100644
>>>> --- a/tools/perf/arch/x86/util/Build
>>>> +++ b/tools/perf/arch/x86/util/Build
>>>> @@ -4,6 +4,7 @@ libperf-y += pmu.o
>>>> libperf-y += kvm-stat.o
>>>>
>>>> libperf-$(CONFIG_DWARF) += dwarf-regs.o
>>>> +libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
>>>>
>>>> libperf-$(CONFIG_LIBUNWIND) += unwind-libunwind.o
>>>> libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
>>>> diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
>>>> index a08de0a..de5b936 100644
>>>> --- a/tools/perf/arch/x86/util/dwarf-regs.c
>>>> +++ b/tools/perf/arch/x86/util/dwarf-regs.c
>>>> @@ -21,55 +21,109 @@
>>>> */
>>>>
>>>> #include <stddef.h>
>>>> +#include <errno.h> /* for EINVAL */
>>>> +#include <string.h> /* for strcmp */
>>>> +#include <linux/ptrace.h> /* for struct pt_regs */
>>>> +#include <linux/kernel.h> /* for offsetof */
>>>> #include <dwarf-regs.h>
>>>>
>>>> /*
>>>> - * Generic dwarf analysis helpers
>>>> + * See arch/x86/kernel/ptrace.c.
>>>> + * Different from it:
>>>> + *
>>>> + * - Since struct pt_regs is defined differently for user and kernel,
>>>> + * but we want to use 'ax, bx' instead of 'rax, rbx' (which is struct
>>>> + * field name of user's pt_regs), we make REG_OFFSET_NAME to accept
>>>> + * both string name and reg field name.
>>>> + *
>>>> + * - Since accessing x86_32's pt_regs from x86_64 building is difficult
>>>> + * and vise versa, we simply fill offset with -1, so
>>>> + * get_arch_regstr() still works but regs_query_register_offset()
>>>> + * returns error.
>>>> + * The only inconvenience caused by it now is that we are not allowed
>>>> + * to generate BPF prologue for a x86_64 kernel if perf is built for
>>>> + * x86_32. This is really a rare usecase.
>>>> + *
>>>> + * - Order is different from kernel's ptrace.c for get_arch_regstr(), which
>>>> + * is defined by dwarf.
>>>> */
>>>>
>>>> -#define X86_32_MAX_REGS 8
>>>> -const char *x86_32_regs_table[X86_32_MAX_REGS] = {
>>>> - "%ax",
>>>> - "%cx",
>>>> - "%dx",
>>>> - "%bx",
>>>> - "$stack", /* Stack address instead of %sp */
>>>> - "%bp",
>>>> - "%si",
>>>> - "%di",
>>>> +struct pt_regs_offset {
>>>> + const char *name;
>>>> + int offset;
>>>> +};
>>>> +
>>>> +#define REG_OFFSET_END {.name = NULL, .offset = 0}
>>>> +
>>>> +#ifdef __x86_64__
>>>> +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
>>>> +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = -1}
>>>> +#else
>>>> +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = -1}
>>>> +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
>>>> +#endif
>>>> +
>>>> +static const struct pt_regs_offset x86_32_regoffset_table[] = {
>>>> + REG_OFFSET_NAME_32("%ax", eax),
>>>> + REG_OFFSET_NAME_32("%cx", ecx),
>>>> + REG_OFFSET_NAME_32("%dx", edx),
>>>> + REG_OFFSET_NAME_32("%bx", ebx),
>>>> + REG_OFFSET_NAME_32("$stack", esp), /* Stack address instead of %sp */
>>>> + REG_OFFSET_NAME_32("%bp", ebp),
>>>> + REG_OFFSET_NAME_32("%si", esi),
>>>> + REG_OFFSET_NAME_32("%di", edi),
>>>> + REG_OFFSET_END,
>>>> };
>>>>
>>>> -#define X86_64_MAX_REGS 16
>>>> -const char *x86_64_regs_table[X86_64_MAX_REGS] = {
>>>> - "%ax",
>>>> - "%dx",
>>>> - "%cx",
>>>> - "%bx",
>>>> - "%si",
>>>> - "%di",
>>>> - "%bp",
>>>> - "%sp",
>>>> - "%r8",
>>>> - "%r9",
>>>> - "%r10",
>>>> - "%r11",
>>>> - "%r12",
>>>> - "%r13",
>>>> - "%r14",
>>>> - "%r15",
>>>> +static const struct pt_regs_offset x86_64_regoffset_table[] = {
>>>> + REG_OFFSET_NAME_64("%ax", rax),
>>>> + REG_OFFSET_NAME_64("%dx", rdx),
>>>> + REG_OFFSET_NAME_64("%cx", rcx),
>>>> + REG_OFFSET_NAME_64("%bx", rbx),
>>>> + REG_OFFSET_NAME_64("%si", rsi),
>>>> + REG_OFFSET_NAME_64("%di", rdi),
>>>> + REG_OFFSET_NAME_64("%bp", rbp),
>>>> + REG_OFFSET_NAME_64("%sp", rsp),
>>>> + REG_OFFSET_NAME_64("%r8", r8),
>>>> + REG_OFFSET_NAME_64("%r9", r9),
>>>> + REG_OFFSET_NAME_64("%r10", r10),
>>>> + REG_OFFSET_NAME_64("%r11", r11),
>>>> + REG_OFFSET_NAME_64("%r12", r12),
>>>> + REG_OFFSET_NAME_64("%r13", r13),
>>>> + REG_OFFSET_NAME_64("%r14", r14),
>>>> + REG_OFFSET_NAME_64("%r15", r15),
>>>> + REG_OFFSET_END,
>>>> };
>>>>
>>>> /* TODO: switching by dwarf address size */
>>>> #ifdef __x86_64__
>>>> -#define ARCH_MAX_REGS X86_64_MAX_REGS
>>>> -#define arch_regs_table x86_64_regs_table
>>>> +#define regoffset_table x86_64_regoffset_table
>>>> #else
>>>> -#define ARCH_MAX_REGS X86_32_MAX_REGS
>>>> -#define arch_regs_table x86_32_regs_table
>>>> +#define regoffset_table x86_32_regoffset_table
>>>> #endif
>>>>
>>>> +/* Minus 1 for the ending REG_OFFSET_END */
>>>> +#define ARCH_MAX_REGS ((sizeof(regoffset_table) / sizeof(regoffset_table[0])) - 1)
>>>> +
>>>> /* Return architecture dependent register string (for kprobe-tracer) */
>>>> const char *get_arch_regstr(unsigned int n)
>>>> {
>>>> - return (n < ARCH_MAX_REGS) ? arch_regs_table[n] : NULL;
>>>> + return (n < ARCH_MAX_REGS) ? regoffset_table[n].name : NULL;
>>>> +}
>>>> +
>>>> +/* Reuse code from arch/x86/kernel/ptrace.c */
>>>> +/**
>>>> + * regs_query_register_offset() - query register offset from its name
>>>> + * @name: the name of a register
>>>> + *
>>>> + * regs_query_register_offset() returns the offset of a register in struct
>>>> + * pt_regs from its name. If the name is invalid, this returns -EINVAL;
>>>> + */
>>>> +int regs_query_register_offset(const char *name)
>>>> +{
>>>> + const struct pt_regs_offset *roff;
>>>> + for (roff = regoffset_table; roff->name != NULL; roff++)
>>>> + if (!strcmp(roff->name, name))
>>>> + return roff->offset;
>>>> + return -EINVAL;
>>>> }
>>>> --
>>>> 1.8.3.4
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists