[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM9d7cihqiUULBR7JoizDGySVYdOx3TH_CJV=QDpeck3p8z5wg@mail.gmail.com>
Date: Thu, 1 Jun 2023 19:40:58 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Anshuman Khandual <anshuman.khandual@....com>
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
will@...nel.org, catalin.marinas@....com, mark.rutland@....com,
Mark Brown <broonie@...nel.org>,
James Clark <james.clark@....com>,
Rob Herring <robh@...nel.org>, Marc Zyngier <maz@...nel.org>,
Suzuki Poulose <suzuki.poulose@....com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
linux-perf-users@...r.kernel.org
Subject: Re: [PATCH V11 08/10] arm64/perf: Add struct brbe_regset helper functions
On Tue, May 30, 2023 at 9:15 PM Anshuman Khandual
<anshuman.khandual@....com> wrote:
>
> The primary abstraction level for fetching branch records from BRBE HW has
> been changed as 'struct brbe_regset', which contains storage for all three
> BRBE registers i.e BRBSRC, BRBTGT, BRBINF. Whether branch record processing
> happens in the task sched out path, or in the PMU IRQ handling path, these
> registers need to be extracted from the HW. Afterwards both live and stored
> sets need to be stitched together to create final branch records set. This
> adds required helper functions for such operations.
>
> Cc: Catalin Marinas <catalin.marinas@....com>
> Cc: Will Deacon <will@...nel.org>
> Cc: Mark Rutland <mark.rutland@....com>
> Cc: linux-arm-kernel@...ts.infradead.org
> Cc: linux-kernel@...r.kernel.org
> Tested-by: James Clark <james.clark@....com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@....com>
> ---
[SNIP]
> +
> +static inline void copy_brbe_regset(struct brbe_regset *src, int src_idx,
> + struct brbe_regset *dst, int dst_idx)
> +{
> + dst[dst_idx].brbinf = src[src_idx].brbinf;
> + dst[dst_idx].brbsrc = src[src_idx].brbsrc;
> + dst[dst_idx].brbtgt = src[src_idx].brbtgt;
> +}
> +
> +/*
> + * This function concatenates branch records from stored and live buffer
> + * up to maximum nr_max records and the stored buffer holds the resultant
> + * buffer. The concatenated buffer contains all the branch records from
> + * the live buffer but might contain some from stored buffer considering
> + * the maximum combined length does not exceed 'nr_max'.
> + *
> + * Stored records Live records
> + * ------------------------------------------------^
> + * | S0 | L0 | Newest |
> + * --------------------------------- |
> + * | S1 | L1 | |
> + * --------------------------------- |
> + * | S2 | L2 | |
> + * --------------------------------- |
> + * | S3 | L3 | |
> + * --------------------------------- |
> + * | S4 | L4 | nr_max
> + * --------------------------------- |
> + * | | L5 | |
> + * --------------------------------- |
> + * | | L6 | |
> + * --------------------------------- |
> + * | | L7 | |
> + * --------------------------------- |
> + * | | | |
> + * --------------------------------- |
> + * | | | Oldest |
> + * ------------------------------------------------V
> + *
> + *
> + * S0 is the newest in the stored records, where as L7 is the oldest in
> + * the live reocords. Unless the live buffer is detetcted as being full
> + * thus potentially dropping off some older records, L7 and S0 records
> + * are contiguous in time for a user task context. The stitched buffer
> + * here represents maximum possible branch records, contiguous in time.
> + *
> + * Stored records Live records
> + * ------------------------------------------------^
> + * | L0 | L0 | Newest |
> + * --------------------------------- |
> + * | L0 | L1 | |
> + * --------------------------------- |
> + * | L2 | L2 | |
> + * --------------------------------- |
> + * | L3 | L3 | |
> + * --------------------------------- |
> + * | L4 | L4 | nr_max
> + * --------------------------------- |
> + * | L5 | L5 | |
> + * --------------------------------- |
> + * | L6 | L6 | |
> + * --------------------------------- |
> + * | L7 | L7 | |
> + * --------------------------------- |
> + * | S0 | | |
> + * --------------------------------- |
> + * | S1 | | Oldest |
> + * ------------------------------------------------V
> + * | S2 | <----|
> + * ----------------- |
> + * | S3 | <----| Dropped off after nr_max
> + * ----------------- |
> + * | S4 | <----|
> + * -----------------
> + */
> +static int stitch_stored_live_entries(struct brbe_regset *stored,
> + struct brbe_regset *live,
> + int nr_stored, int nr_live,
> + int nr_max)
> +{
> + int nr_total, nr_excess, nr_last, i;
> +
> + nr_total = nr_stored + nr_live;
> + nr_excess = nr_total - nr_max;
> +
> + /* Stored branch records in stitched buffer */
> + if (nr_live == nr_max)
> + nr_stored = 0;
> + else if (nr_excess > 0)
> + nr_stored -= nr_excess;
> +
> + /* Stitched buffer branch records length */
> + if (nr_total > nr_max)
> + nr_last = nr_max;
> + else
> + nr_last = nr_total;
> +
> + /* Move stored branch records */
> + for (i = 0; i < nr_stored; i++)
> + copy_brbe_regset(stored, i, stored, nr_last - nr_stored - 1 + i);
I'm afraid it can overwrite some entries if nr_live is small
and nr_stored is big. Why not use memmove()?
Also I think it'd be simpler if you copy store to live.
It'll save copying live in the IRQ but it will copy the
whole content to store again for the sched switch.
Thanks,
Namhyung
> +
> + /* Copy live branch records */
> + for (i = 0; i < nr_live; i++)
> + copy_brbe_regset(live, i, stored, i);
> +
> + return nr_last;
> +}
> +
> /*
> * Generic perf branch filters supported on BRBE
> *
> --
> 2.25.1
>
Powered by blists - more mailing lists