lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220205153940.GB391033@leoy-ThinkPad-X240s>
Date:   Sat, 5 Feb 2022 23:39:40 +0800
From:   Leo Yan <leo.yan@...aro.org>
To:     German Gomez <german.gomez@....com>
Cc:     linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        will@...nel.org, mark.rutland@....com, james.clark@....com
Subject: Re: [RFC PATCH 1/2] perf: arm_spe: Fix consistency of PMSCR register
 bit CX

Hi German,

On Mon, Jan 17, 2022 at 12:44:31PM +0000, German Gomez wrote:
> The arm_spe_pmu driver will enable the CX bit of the PMSCR register in
> order to add CONTEXT packets into the traces if the owner of the perf
> event runs with capabilities CAP_PERFMON or CAP_SYS_ADMIN.
> 
> The value of the bit is computed in the arm_spe_event_to_pmscr function
> [1], and the check for capabilities happens in [2] in the pmu init
> callback. This suggests that the value of the CX bit should remain
> consistent for the duration of the perf session.
> 
> However, the function arm_spe_event_to_pmscr may be called later during
> the start callback [3] when the "current" process is not the owner of
> the perf session, so the CX bit is currently not consistent. Consider
> the following example:
> 
>   1. Run a process in the background with capability CAP_SYS_ADMIN in CPU0.
> 
>     $ taskset --cpu-list 0 sudo dd if=/dev/random of=/dev/null &
>     [3] 3806
> 
>   2. Begin a perf session _without_ capabilities (we shouldn't see CONTEXT packets).
> 
>     $ perf record -e arm_spe_0// -C0 -- sleep 1
>     $ perf report -D | grep CONTEXT
>     .  0000000e:  65 df 0e 00 00                                  CONTEXT 0xedf el2
>     .  0000004e:  65 df 0e 00 00                                  CONTEXT 0xedf el2
>     .  0000008e:  65 df 0e 00 00                                  CONTEXT 0xedf el2
>     [...]

Could you confirm if you still can reproduce this issue on the latest
mainline kernel?  I cannot reproduce this issue on the latest mainline
kernel, I suspect this is impacted by recent refactoring evlist
patches from Ian Rogers (though I am not for this).

> As can be seen, the traces begin showing CONTEXT packets when the pid is
> 0xedf (3807). This happens because the pmu start callback is run when
> the current process is not the owner of the perf session, so the CX
> register bit is set.

I can image a potential issue is: the "dd" program running in background
with capability CAP_SYS_ADMIN on CPU0, and then perf session sends an
IPI remotely from any other CPU to CPU0, the dd process (on CPU0) is
interrupted to handle ioctl PERF_EVENT_IOC_ENABLE, thus perfmon_capable()
returns the capability of dd process, finally it leads to the wrong
setting for PMSCR.

I reviewed the code and also traced the backtrace for the function
arm_spe_pmu_start(), I can confirm that every time perf session will
execute below flow:

  evlist__enable()
    __evlist__enable()
      evlist__for_each_cpu() {  -> call affinity__set()
        evsel__enable_cpu()
      }

We can see the macro evlist__for_each_cpu() will extend to invoke
evlist__cpu_begin() and affinity__set(); affinity__set() will set CPU
affinity to the target CPU, thus perf process will firstly migrate to
the target CPU and enable event on the target CPU.  This means perf
will not send remote IPI and it directly runs on target CPU, and the
dd program will not interfere capabilities for perf session.

Thanks,
Leo

> One way to fix this is by caching the value of the CX bit during the
> initialization of the PMU event, so that it remains consistent for the
> duration of the session.
> 
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/perf/arm_spe_pmu.c?h=v5.16#n275
> [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/perf/arm_spe_pmu.c?h=v5.16#n713
> [3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/perf/arm_spe_pmu.c?h=v5.16#n751
> 
> Signed-off-by: German Gomez <german.gomez@....com>
> ---
>  drivers/perf/arm_spe_pmu.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index d44bcc29d..8515bf85c 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -57,6 +57,7 @@ struct arm_spe_pmu {
>  	u16					pmsver;
>  	u16					min_period;
>  	u16					counter_sz;
> +	bool					pmscr_cx;
>  
>  #define SPE_PMU_FEAT_FILT_EVT			(1UL << 0)
>  #define SPE_PMU_FEAT_FILT_TYP			(1UL << 1)
> @@ -260,6 +261,7 @@ static const struct attribute_group *arm_spe_pmu_attr_groups[] = {
>  static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>  {
>  	struct perf_event_attr *attr = &event->attr;
> +	struct arm_spe_pmu *spe_pmu = to_spe_pmu(event->pmu);
>  	u64 reg = 0;
>  
>  	reg |= ATTR_CFG_GET_FLD(attr, ts_enable) << SYS_PMSCR_EL1_TS_SHIFT;
> @@ -272,7 +274,7 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>  	if (!attr->exclude_kernel)
>  		reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
>  
> -	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
> +	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && spe_pmu->pmscr_cx)
>  		reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
>  
>  	return reg;
> @@ -709,10 +711,10 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
>  	    !(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT))
>  		return -EOPNOTSUPP;
>  
> +	spe_pmu->pmscr_cx = perfmon_capable();
>  	reg = arm_spe_event_to_pmscr(event);
>  	if (!perfmon_capable() &&
>  	    (reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) |
> -		    BIT(SYS_PMSCR_EL1_CX_SHIFT) |
>  		    BIT(SYS_PMSCR_EL1_PCT_SHIFT))))
>  		return -EACCES;
>  
> -- 
> 2.25.1
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ