lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87o7ks16gh.fsf@intel.com>
Date:   Tue, 04 Jul 2023 10:28:14 +0300
From:   Jani Nikula <jani.nikula@...ux.intel.com>
To:     Uros Bizjak <ubizjak@...il.com>, intel-gfx@...ts.freedesktop.org,
        dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Cc:     Uros Bizjak <ubizjak@...il.com>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        Rodrigo Vivi <rodrigo.vivi@...el.com>,
        Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>,
        David Airlie <airlied@...il.com>,
        Daniel Vetter <daniel@...ll.ch>
Subject: Re: [PATCH] drm/i915/pmu: Use local64_try_cmpxchg in
 i915_pmu_event_read

On Mon, 03 Jul 2023, Uros Bizjak <ubizjak@...il.com> wrote:
> Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old
> in i915_pmu_event_read.  x86 CMPXCHG instruction returns success in ZF flag,
> so this change saves a compare after cmpxchg (and related move instruction
> in front of cmpxchg).
>
> Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
> fails. There is no need to re-read the value in the loop.
>
> No functional change intended.
>
> Cc: Jani Nikula <jani.nikula@...ux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@...el.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>
> Cc: David Airlie <airlied@...il.com>
> Cc: Daniel Vetter <daniel@...ll.ch>
> Signed-off-by: Uros Bizjak <ubizjak@...il.com>
> ---
>  drivers/gpu/drm/i915/i915_pmu.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index d35973b41186..108b675088ba 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -696,12 +696,11 @@ static void i915_pmu_event_read(struct perf_event *event)
>  		event->hw.state = PERF_HES_STOPPED;
>  		return;
>  	}
> -again:
> -	prev = local64_read(&hwc->prev_count);
> -	new = __i915_pmu_event_read(event);
>  
> -	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> -		goto again;
> +	prev = local64_read(&hwc->prev_count);
> +	do {
> +		new = __i915_pmu_event_read(event);
> +	} while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new));

You could save everyone a lot of time by actually documenting what these
functions do. Assume you don't know what local64_try_cmpxchg() does, and
see how many calls you have to go through to figure it out.

Because the next time I encounter this code or a patch like this, I'm
probably going to have to do that again.

To me, the old one was more readable. The optimization is meaningless to
me if it's not quantified but reduces readability.


BR,
Jani.


>  
>  	local64_add(new - prev, &event->count);
>  }

-- 
Jani Nikula, Intel Open Source Graphics Center

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ