lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e0919e39-7607-815b-3a12-96f098e45a5f@intel.com>
Date:   Thu, 15 Aug 2019 14:45:57 +0300
From:   Adrian Hunter <adrian.hunter@...el.com>
To:     Leo Yan <leo.yan@...aro.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        bpf@...r.kernel.org, clang-built-linux@...glegroups.com,
        Mathieu Poirier <mathieu.poirier@...aro.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Suzuki Poulouse <suzuki.poulose@....com>,
        coresight@...ts.linaro.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v5] perf machine: arm/arm64: Improve completeness for
 kernel address space

On 15/08/19 2:32 PM, Leo Yan wrote:
> Hi Adrian,
> 
> On Thu, Aug 15, 2019 at 11:54:54AM +0300, Adrian Hunter wrote:
> 
> [...]
> 
>>> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
>>> index e4988f49ea79..d7ff839d8b20 100644
>>> --- a/tools/perf/Makefile.config
>>> +++ b/tools/perf/Makefile.config
>>> @@ -48,9 +48,20 @@ ifeq ($(SRCARCH),x86)
>>>    NO_PERF_REGS := 0
>>>  endif
>>>  
>>> +ARM_PRE_START_SIZE := 0
>>> +
>>>  ifeq ($(SRCARCH),arm)
>>>    NO_PERF_REGS := 0
>>>    LIBUNWIND_LIBS = -lunwind -lunwind-arm
>>> +  ifneq ($(wildcard $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds),)
>>> +    # Extract info from lds:
>>> +    #   . = ((0xC0000000)) + 0x00208000;
>>> +    # ARM_PRE_START_SIZE := 0x00208000
>>> +    ARM_PRE_START_SIZE := $(shell egrep ' \. \= \({2}0x[0-9a-fA-F]+\){2}' \
>>> +      $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds | \
>>> +      sed -e 's/[(|)|.|=|+|<|;|-]//g' -e 's/ \+/ /g' -e 's/^[ \t]*//' | \
>>> +      awk -F' ' '{printf "0x%x", $$2}' 2>/dev/null)
>>> +  endif
>>>  endif
>>>  
>>>  ifeq ($(SRCARCH),arm64)
>>> @@ -58,8 +69,19 @@ ifeq ($(SRCARCH),arm64)
>>>    NO_SYSCALL_TABLE := 0
>>>    CFLAGS += -I$(OUTPUT)arch/arm64/include/generated
>>>    LIBUNWIND_LIBS = -lunwind -lunwind-aarch64
>>> +  ifneq ($(wildcard $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds),)
>>> +    # Extract info from lds:
>>> +    #  . = ((((((((0xffffffffffffffff)) - (((1)) << (48)) + 1) + (0)) + (0x08000000))) + (0x08000000))) + 0x00080000;
>>> +    # ARM_PRE_START_SIZE := (0x08000000 + 0x08000000 + 0x00080000) = 0x10080000
>>> +    ARM_PRE_START_SIZE := $(shell egrep ' \. \= \({8}0x[0-9a-fA-F]+\){2}' \
>>> +      $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds | \
>>> +      sed -e 's/[(|)|.|=|+|<|;|-]//g' -e 's/ \+/ /g' -e 's/^[ \t]*//' | \
>>> +      awk -F' ' '{printf "0x%x", $$6+$$7+$$8}' 2>/dev/null)
>>> +  endif
>>
>> So, that is not going to work if you take a perf.data file to a non-arm machine?
> 
> Yeah, this patch will only allow perf to work correctly when perf
> run natively on arm/arm64, so it can resolve partial of the issue.
> 
>> How come you cannot use kallsyms to get the information?
> 
> Thanks for pointing out this.  Sorry I skipped your comment "I don't
> know how you intend to calculate ARM_PRE_START_SIZE" when you reviewed
> the patch v3, I should use that chance to elaborate the detailed idea
> and so can get more feedback/guidance before procceed.
> 
> Actually, I have considered to use kallsyms when worked on the previous
> patch set.
> 
> As mentioned in patch set v4's cover letter, I tried to implement
> machine__create_extra_kernel_maps() for arm/arm64, the purpose is to
> parse kallsyms so can find more kernel maps and thus also can fixup
> the kernel start address.  But I found the 'perf script' tool directly
> calls machine__get_kernel_start() instead of running into the flow for
> machine__create_extra_kernel_maps();

Doesn't it just need to loop through each kernel map to find the lowest
start address?

>                                      so I finally gave up to use
> machine__create_extra_kernel_maps() for tweaking kernel start address
> and went back to use this patch's approach by parsing lds files.
> 
> So for next step, I want to get some guidances:
> 
> - One method is to add a new weak function, e.g.
>   arch__fix_kernel_text_start(), then every arch can implement its own
>   function to fixup the kernel start address;
> 
>   For arm/arm64, can use kallsyms to find the symbols with least
>   address and fixup for kernel start address.
> 
> - Another method is to directly parse kallsyms in the function
>   machine__get_kernel_start(), thus the change can be used for all
>   archs;
> 
> Seems to me the second method is to address this issue as a common
> issue crossing all archs.  But not sure if this is the requirement for
> all archs or just this is only required for arm/arm64.  Please let me
> know what's your preference or other thoughts.  Thanks a lot!
> 
> Leo.
> 
>>>  endif
>>>  
>>> +CFLAGS += -DARM_PRE_START_SIZE=$(ARM_PRE_START_SIZE)
>>> +
>>>  ifeq ($(SRCARCH),csky)
>>>    NO_PERF_REGS := 0
>>>  endif
>>> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
>>> index f6ee7fbad3e4..e993f891bb82 100644
>>> --- a/tools/perf/util/machine.c
>>> +++ b/tools/perf/util/machine.c
>>> @@ -2687,13 +2687,26 @@ int machine__get_kernel_start(struct machine *machine)
>>>  	machine->kernel_start = 1ULL << 63;
>>>  	if (map) {
>>>  		err = map__load(map);
>>> +		if (err)
>>> +			return err;
>>> +
>>>  		/*
>>>  		 * On x86_64, PTI entry trampolines are less than the
>>>  		 * start of kernel text, but still above 2^63. So leave
>>>  		 * kernel_start = 1ULL << 63 for x86_64.
>>>  		 */
>>> -		if (!err && !machine__is(machine, "x86_64"))
>>> +		if (!machine__is(machine, "x86_64"))
>>>  			machine->kernel_start = map->start;
>>> +
>>> +		/*
>>> +		 * On arm/arm64, the kernel uses some memory regions which are
>>> +		 * prior to '_stext' symbol; to reflect the complete kernel
>>> +		 * address space, compensate these pre-defined regions for
>>> +		 * kernel start address.
>>> +		 */
>>> +		if (!strcmp(perf_env__arch(machine->env), "arm") ||
>>> +		    !strcmp(perf_env__arch(machine->env), "arm64"))
>>> +			machine->kernel_start -= ARM_PRE_START_SIZE;
>>>  	}
>>>  	return err;
>>>  }
>>>
>>
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ