lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 28 Jun 2019 10:08:46 +0100
From:   Will Deacon <will@...nel.org>
To:     Ganapatrao Kulkarni <gklkml16@...il.com>
Cc:     Ganapatrao Kulkarni <gkulkarni@...vell.com>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "Will.Deacon@....com" <Will.Deacon@....com>,
        "mark.rutland@....com" <mark.rutland@....com>,
        "corbet@....net" <corbet@....net>, jnair@...vell.com,
        rrichter@...vell.com, jglauber@...vell.com
Subject: Re: [PATCH 2/2] drivers/perf: Add CCPI2 PMU support in ThunderX2
 UNCORE driver.

Hi again, Ganapat,

Thanks for the quick reply.

On Fri, Jun 28, 2019 at 11:09:33AM +0530, Ganapatrao Kulkarni wrote:
> On Thu, Jun 27, 2019 at 3:27 PM Will Deacon <will@...nel.org> wrote:
> > On Fri, Jun 14, 2019 at 05:42:46PM +0000, Ganapatrao Kulkarni wrote:
> > > CCPI2 is a low-latency high-bandwidth serial interface for connecting
> > > ThunderX2 processors. This patch adds support to capture CCPI2 perf events.
> > >
> > > Signed-off-by: Ganapatrao Kulkarni <gkulkarni@...vell.com>
> > > ---
> > >  drivers/perf/thunderx2_pmu.c | 179 ++++++++++++++++++++++++++++++-----
> > >  1 file changed, 157 insertions(+), 22 deletions(-)
> > >
> > > diff --git a/drivers/perf/thunderx2_pmu.c b/drivers/perf/thunderx2_pmu.c
> > > index 43d76c85da56..3791ac9b897d 100644
> > > --- a/drivers/perf/thunderx2_pmu.c
> > > +++ b/drivers/perf/thunderx2_pmu.c
> > > @@ -16,7 +16,9 @@
> > >   * they need to be sampled before overflow(i.e, at every 2 seconds).
> > >   */
> > >
> > > -#define TX2_PMU_MAX_COUNTERS         4
> > > +#define TX2_PMU_DMC_L3C_MAX_COUNTERS 4
> >
> > I find it odd that you rename this...
> 
> i am not sure, how to avoid this since dmc/l3c have 4 counters and ccpi2 has 8.
> i will try to make this better in v2.
> >
> > > +#define TX2_PMU_CCPI2_MAX_COUNTERS   8
> > > +
> > >  #define TX2_PMU_DMC_CHANNELS         8
> > >  #define TX2_PMU_L3_TILES             16
> > >
> > > @@ -28,11 +30,22 @@
> > >    */
> > >  #define DMC_EVENT_CFG(idx, val)              ((val) << (((idx) * 8) + 1))
> > >
> > > +#define GET_EVENTID_CCPI2(ev)                ((ev->hw.config) & 0x1ff)
> > > +/* bits[3:0] to select counters, starts from 8, bit[3] set always. */
> > > +#define GET_COUNTERID_CCPI2(ev)              ((ev->hw.idx) & 0x7)
> > > +#define CCPI2_COUNTER_OFFSET         8
> >
> >
> > ... but leave GET_EVENTID alone, even though it only applies to DMC/L3C
> > events. Saying that, if you have the event you can figure out its type,
> > so could you avoid the need for additional macros entirely and just use
> > the correct masks based on the corresponding PMU type? That might also
> > avoid some of the conditional control flow you're introducing elsewhere.
> 
> sure, i will add mask as argument to macro.
> >
> > >  #define L3C_COUNTER_CTL                      0xA8
> > >  #define L3C_COUNTER_DATA             0xAC
> > >  #define DMC_COUNTER_CTL                      0x234
> > >  #define DMC_COUNTER_DATA             0x240
> > >
> > > +#define CCPI2_PERF_CTL                       0x108
> > > +#define CCPI2_COUNTER_CTL            0x10C
> > > +#define CCPI2_COUNTER_SEL            0x12c
> > > +#define CCPI2_COUNTER_DATA_L         0x130
> > > +#define CCPI2_COUNTER_DATA_H         0x134
> > > +
> > >  /* L3C event IDs */
> > >  #define L3_EVENT_READ_REQ            0xD
> > >  #define L3_EVENT_WRITEBACK_REQ               0xE
> > > @@ -51,9 +64,16 @@
> > >  #define DMC_EVENT_READ_TXNS          0xF
> > >  #define DMC_EVENT_MAX                        0x10
> > >
> > > +#define CCPI2_EVENT_REQ_PKT_SENT     0x3D
> > > +#define CCPI2_EVENT_SNOOP_PKT_SENT   0x65
> > > +#define CCPI2_EVENT_DATA_PKT_SENT    0x105
> > > +#define CCPI2_EVENT_GIC_PKT_SENT     0x12D
> > > +#define CCPI2_EVENT_MAX                      0x200
> > > +
> > >  enum tx2_uncore_type {
> > >       PMU_TYPE_L3C,
> > >       PMU_TYPE_DMC,
> > > +     PMU_TYPE_CCPI2,
> > >       PMU_TYPE_INVALID,
> > >  };
> > >
> > > @@ -73,8 +93,8 @@ struct tx2_uncore_pmu {
> > >       u32 max_events;
> > >       u64 hrtimer_interval;
> > >       void __iomem *base;
> > > -     DECLARE_BITMAP(active_counters, TX2_PMU_MAX_COUNTERS);
> > > -     struct perf_event *events[TX2_PMU_MAX_COUNTERS];
> > > +     DECLARE_BITMAP(active_counters, TX2_PMU_CCPI2_MAX_COUNTERS);
> > > +     struct perf_event *events[TX2_PMU_DMC_L3C_MAX_COUNTERS];
> >
> > Hmm, that looks very odd. Why are they now different sizes?
> 
> events[ ] is used to hold reference to active events to use in timer
> callback, which is not applicable to ccpi2, hence 4.
> active_counters is set to max of both. i.e, 8. i will try to make it
> better readable in v2.

Thanks. I suspect renaming the field would help a lot, or perhaps reworking
your data structures so that you have a union of ccpi2 and dmc/l2c
structures where necessary.

> > >       struct device *dev;
> > >       struct hrtimer hrtimer;
> > >       const struct attribute_group **attr_groups;
> > > @@ -92,7 +112,21 @@ static inline struct tx2_uncore_pmu *pmu_to_tx2_pmu(struct pmu *pmu)
> > >       return container_of(pmu, struct tx2_uncore_pmu, pmu);
> > >  }
> > >
> > > -PMU_FORMAT_ATTR(event,       "config:0-4");
> > > +#define TX2_PMU_FORMAT_ATTR(_var, _name, _format)                    \
> > > +static ssize_t                                                               \
> > > +__tx2_pmu_##_var##_show(struct device *dev,                          \
> > > +                            struct device_attribute *attr,           \
> > > +                            char *page)                              \
> > > +{                                                                    \
> > > +     BUILD_BUG_ON(sizeof(_format) >= PAGE_SIZE);                     \
> > > +     return sprintf(page, _format "\n");                             \
> > > +}                                                                    \
> > > +                                                                     \
> > > +static struct device_attribute format_attr_##_var =                  \
> > > +     __ATTR(_name, 0444, __tx2_pmu_##_var##_show, NULL)
> > > +
> > > +TX2_PMU_FORMAT_ATTR(event, event, "config:0-4");
> > > +TX2_PMU_FORMAT_ATTR(event_ccpi2, event, "config:0-9");
> >
> > This doesn't look right. Can perf deal with overlapping fields? It seems
> > wrong that we'd allow the user to specify both, for example.
> 
> I am not sure what is the issue here? both are different PMUs
> root@...-26> cat /sys/bus/event_source/devices/uncore_dmc_0/format/event
> config:0-4
> root@...-26> cat /sys/bus/event_source/devices/uncore_ccpi2_0/format/event
> config:0-9

Ah, sorry about that. I got _var and _name the wrong way around and thought
you were introducing a file called event_ccpi2! What you have looks fine.

Will

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ