lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 18 Jun 2020 14:03:32 +0100
From:   Mark Rutland <mark.rutland@....com>
To:     Will Deacon <will@...nel.org>
Cc:     Jiping Ma <Jiping.Ma2@...driver.com>, zhe.he@...driver.com,
        bruce.ashfield@...il.com, yue.tao@...driver.com,
        will.deacon@....com, linux-kernel@...r.kernel.org,
        paul.gortmaker@...driver.com, catalin.marinas@....com,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH][V3] arm64: perf: Get the wrong PC value in REGS_ABI_32
 mode

On Thu, May 28, 2020 at 08:54:19AM +0100, Will Deacon wrote:
> On Thu, May 28, 2020 at 09:06:07AM +0800, Jiping Ma wrote:
> > On 05/27/2020 11:19 PM, Mark Rutland wrote:
> > > On Wed, May 27, 2020 at 09:33:00AM +0800, Jiping Ma wrote:
> > > > On 05/26/2020 06:26 PM, Mark Rutland wrote:
> > > > > On Mon, May 11, 2020 at 10:52:07AM +0800, Jiping Ma wrote:
> > > > This modification can not fix our issue,� we need
> > > > perf_reg_abi(current) == PERF_SAMPLE_REGS_ABI_32 to judge if it is 32-bit
> > > > task or not,
> > > > then return the correct PC value.
> > > I must be missing something here.
> > > 
> > > The core code perf_reg_abi(task) is called with the task being sampled,
> > > and the regs are from the task being sampled. For a userspace sample for
> > > a compat task, compat_user_mode(regs) should be equivalent to the
> > > is_compat_thread(task_thread_info(task)) check.
> > > 
> > > What am I missing?
> > This issue caused by PC value is not correct. regs are sampled in function
> > perf_output_sample_regs, that call perf_reg_value(regs, bit) to get PC
> > value.
> > PC value is regs[15] in perf_reg_value() function. it should be regs[32].
> > 
> > perf_output_sample_regs(struct perf_output_handle *handle,
> > ����������������������� struct pt_regs *regs, u64 mask)
> > {
> > ������� int bit;
> > ������� DECLARE_BITMAP(_mask, 64);
> > 
> > ������� bitmap_from_u64(_mask, mask);
> > ������� for_each_set_bit(bit, _mask, sizeof(mask) * BITS_PER_BYTE) {
> > ��������������� u64 val;
> > 
> > ��������������� val = perf_reg_value(regs, bit);
> > ��������������� perf_output_put(handle, val);
> > ������� }
> > }
> 
> Yes, but Mark's point is that checking 'compat_user_mode(regs)' should be
> exactly the same as checking 'perf_reg_abi(current) == PERF_SAMPLE_REGS_ABI_32'.
> Are you saying that's not the case? If so, please can you provide an example
> of when they are different?
> 
> Leaving that aside for a second, I also think it's reasonable to question
> whether this whole interface is busted or not. I looked at it last night but
> struggled to work out what it's supposed to do. Consider these three
> scenarios, all under an arm64 kernel:
> 
>   1. 64-bit perf + 64-bit application being profiled
>   2. 64-bit perf + 32-bit application being profiled
>   3. 32-bit perf + 32-bit application being profiled
> 
> It looks like the current code is a bodge to try to handle both (2) and
> (3) at the same time:
> 
>   - In case (3), userspace only asks about registers 0-15
>   - In case (2), we fudge the higher registers so that 64-bit SP and LR
>     hold the 32-bit values as a bodge to allow a 64-bit dwarf unwinder
>     to unwind the stack

I think the fudging is nonsensical to begin with, as I would have
expected that PERF_REGS_ABI_32 should be the same layout regardless of
consumer (and therefore should be identical to the 32-bit arm native
format). I realise that doesn't change that we might be stuck with it...

> So the idea behind the patch looks fine because case (3) is expecting the PC
> in register 15 and instead gets 0, but the temptation is to clean this up so
> that cases (2) and (3) report the same data to userspace (along the lines of
> Mark's patch), namely only the first 16 registers with the PC moved down. We
> can only do that if the unwinder is happy, which it might be if it only ever
> looks up dwarf register numbers based on the unwind tables in the binary.
> Somebody would need to dig into that.

Agreed; I will try to figure out what the perf tool does in the three
cases above. I would be grateful if others could take a look too.

Another slightly scary thought: what happens for a 32-bit perf with a
64-bit application being profiled? I don't see how that'd be forbidden,
but I also don't see how it'd work.

> Otherwise, if it generates unconditional references to things like
> register 30 to grab the link register, then we're stuck with the bodge
> and need to special-case the PC.

I agree that in that case we'd have to keep the existing bodge, and we'd
have to special-case the PC, but I'd prefer to split the logic for case
1 into a separate function for cases 2 and 3 so that we can more easily
avoid getting this more confused.

Let's figure out what userspace does first...

Thanks,
Mark.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ