Date:   Wed, 14 Dec 2022 15:21:39 +0100
From:   Peter Newman <peternewman@...gle.com>
To:     Reinette Chatre <reinette.chatre@...el.com>
Cc:     Fenghua Yu <fenghua.yu@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>,
        James Morse <james.morse@....com>,
        Shaopeng Tan <tan.shaopeng@...itsu.com>,
        Jamie Iles <quic_jiles@...cinc.com>,
        linux-kernel@...r.kernel.org, eranian@...gle.com,
        Babu Moger <Babu.Moger@....com>
Subject: Re: [PATCH] x86/resctrl: Fix event counts regression in reused RMIDs

Hi Reinette,

On Thu, Dec 8, 2022 at 7:31 PM Reinette Chatre
<reinette.chatre@...el.com> wrote:
>
> I think this can be cleaned up to make the code clearer. Notice the
> duplication of the following snippet in __mon_event_count():
>
> 	rr->val += tval;
> 	return 0;
>
> I do not see any need to check the event id before doing the above. That
> leaves the bulk of the switch needed only for the rr->first handling,
> which can be moved to resctrl_arch_reset_rmid().
>
> Something like:
>
> void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d, ...
> {
> 	...
> 	struct arch_mbm_state *am;
> 	struct mbm_state *m;
> 	u64 val = 0;
> 	int ret;
>
> 	m = get_mbm_state(d, rmid, eventid); /* get_mbm_state() to be created */

Good call. When prototyping another change, I quickly found myself
needing to create this helper.
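
For reference, a minimal sketch of such a helper, assuming the existing
mbm_total/mbm_local arrays of struct mbm_state in struct rdt_domain;
treat it as illustrative rather than the final patch:

static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 rmid,
				       enum resctrl_event_id evtid)
{
	/* Map the event id to the matching fs-layer state array. */
	switch (evtid) {
	case QOS_L3_MBM_TOTAL_EVENT_ID:
		return &d->mbm_total[rmid];
	case QOS_L3_MBM_LOCAL_EVENT_ID:
		return &d->mbm_local[rmid];
	default:
		return NULL;
	}
}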

> 	if (m)
> 		memset(m, 0, sizeof(*m));

struct mbm_state is arch-independent, so I think resetting it here would
require the MPAM version to duplicate this code and get_mbm_state() to
be exported.

>
> 	am = get_arch_mbm_state(hw_dom, rmid, eventid);
> 	if (am) {
> 		memset(am, 0, sizeof(*am));
> 		/* Record any initial, non-zero count value. */
> 		ret = __rmid_read(rmid, eventid, &val);
> 		if (!ret)
> 			am->prev_msr = val;
> 	}
>
> }
>
> Having this would be helpful as reference to Babu's usage.

His usage looks a little different.

According to the comment in Babu's patch:

https://lore.kernel.org/lkml/166990903030.17806.5106229901730558377.stgit@bmoger-ubuntu/

+	/*
+	 * When an Event Configuration is changed, the bandwidth counters
+	 * for all RMIDs and Events will be cleared by the hardware. The
+	 * hardware also sets MSR_IA32_QM_CTR.Unavailable (bit 62) for
+	 * every RMID on the next read to any event for every RMID.
+	 * Subsequent reads will have MSR_IA32_QM_CTR.Unavailable (bit 62)
+	 * cleared while it is tracked by the hardware. Clear the
+	 * mbm_local and mbm_total counts for all the RMIDs.
+	 */
+	resctrl_arch_reset_rmid_all(r, d);

If all the hardware counters are zeroed as the comment suggests, then
leaving am->prev_msr zero seems correct. __rmid_read() would likely
return an error anyway. The bug I was addressing was reusing an RMID
that had not been reset.
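
(To spell out why a stale value matters: the arch code reports each read
as the width-masked delta from the last sampled MSR value, roughly as in
the existing mbm_overflow_count() helper sketched below, so a leftover
prev_msr from the RMID's previous owner corrupts the first delta.)

static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
{
	u64 shift = 64 - width, chunks;

	/*
	 * Truncate both samples to the hardware counter width before
	 * subtracting, so the subtraction wraps correctly at 2^width.
	 */
	chunks = (cur_msr << shift) - (prev_msr << shift);
	return chunks >> shift;
}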

>
> Also please note that I changed __rmid_read(). There is no need to
> require each __rmid_read() caller to test the MSR bits for validity;
> that check can be contained within __rmid_read().
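
For reference, containing the checks would make __rmid_read() look
roughly like this; the MSR names and the RMID_VAL_ERROR/RMID_VAL_UNAVAIL
bits are the existing ones in monitor.c, while the error codes are
illustrative:

static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
{
	u64 msr_val;

	/*
	 * IA32_QM_EVTSEL takes the event id in its low bits and the RMID
	 * in its high bits; IA32_QM_CTR then returns the count together
	 * with the Error (bit 63) and Unavailable (bit 62) status bits.
	 */
	wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
	rdmsrl(MSR_IA32_QM_CTR, msr_val);

	if (msr_val & RMID_VAL_ERROR)
		return -EIO;
	if (msr_val & RMID_VAL_UNAVAIL)
		return -EINVAL;

	*val = msr_val;
	return 0;
}
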
>
> Something like below remains:
>
> static int __mon_event_count(u32 rmid, struct rmid_read *rr)
> {
>
> 	...
>
> 	if (rr->first) {
> 		resctrl_arch_reset_rmid(rr->r, rr->d, rmid, rr->evtid);
> 		return 0;
> 	}
>
> 	rr->err = resctrl_arch_rmid_read(rr->r, rr->d, rmid, rr->evtid, &tval);
> 	if (rr->err)
> 		return rr->err;
>
> 	rr->val += tval;
> 	return 0;
>
> }
>
> What do you think?

Looks much better. This function has been bothering me since the
refactor. I'll see how close I can get to this in the next patch.

Thanks!
-Peter
