Date:   Wed, 14 Dec 2022 11:17:20 -0800
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Peter Newman <peternewman@...gle.com>
CC:     Fenghua Yu <fenghua.yu@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, <x86@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        James Morse <james.morse@....com>,
        Shaopeng Tan <tan.shaopeng@...itsu.com>,
        Jamie Iles <quic_jiles@...cinc.com>,
        <linux-kernel@...r.kernel.org>, <eranian@...gle.com>,
        Babu Moger <Babu.Moger@....com>
Subject: Re: [PATCH] x86/resctrl: Fix event counts regression in reused RMIDs

Hi Peter,

On 12/14/2022 6:21 AM, Peter Newman wrote:
> On Thu, Dec 8, 2022 at 7:31 PM Reinette Chatre
> <reinette.chatre@...el.com> wrote:
>>
>> I think this can be cleaned up to make the code more clear. Notice the
>> duplication of following snippet in __mon_event_count():
>> 	rr->val += tval;
>> 	return 0;
>>
>> I do not see any need to check the event id before doing the above. That
>> leaves the bulk of the switch just needed for the rr->first handling that
>> can be moved to resctrl_arch_reset_rmid().
>>
>> Something like:
>>
>> void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d, ...
>> {
>> 	...
>> 	struct arch_mbm_state *am;
>> 	struct mbm_state *m;
>> 	u64 val = 0;
>> 	int ret;
>>
>> 	m = get_mbm_state(d, rmid, eventid); /* get_mbm_state() to be created */
> 
> Good call. When prototyping another change, I quickly found the need to
> create this myself.
> 
>> 	if (m)
>> 		memset(m, 0, sizeof(*m));
> 
> mbm_state is arch-independent, so I think putting it here would require
> the MPAM version to copy this and for get_mbm_state() to be exported.

You are correct, it is arch-independent, so every arch is expected to
have it.
I peeked at your series and that approach looks good too: having the
cleanup done in a central place helps to avoid future mistakes.

>> 	am = get_arch_mbm_state(hw_dom, rmid, eventid);
>> 	if (am) {
>> 		memset(am, 0, sizeof(*am));
>> 		/* Record any initial, non-zero count value. */
>> 		ret = __rmid_read(rmid, eventid, &val);
>> 		if (!ret)
>> 			am->prev_msr = val;
>> 	}
>>
>> }
>>
>> Having this would be helpful as reference to Babu's usage.
> 
> His usage looks a little different.
> 
> According to the comment in Babu's patch:
> 
> https://lore.kernel.org/lkml/166990903030.17806.5106229901730558377.stgit@bmoger-ubuntu/
> 
> +	/*
> +	 * When an Event Configuration is changed, the bandwidth counters
> +	 * for all RMIDs and Events will be cleared by the hardware. The
> +	 * hardware also sets MSR_IA32_QM_CTR.Unavailable (bit 62) for
> +	 * every RMID on the next read to any event for every RMID.
> +	 * Subsequent reads will have MSR_IA32_QM_CTR.Unavailable (bit 62)
> +	 * cleared while it is tracked by the hardware. Clear the
> +	 * mbm_local and mbm_total counts for all the RMIDs.
> +	 */
> +	resctrl_arch_reset_rmid_all(r, d);
> 
> If all the hardware counters are zeroed as the comment suggests, then
> leaving am->prev_msr zero seems correct. __rmid_read() would likely
> return an error anyways. The bug I was addressing was one of reusing
> an RMID which had not been reset.

You are correct, but there are two things to keep in mind:
* The change from which you copied the above snippet introduces a new
  _generic_ utility far away from this call site. It is thus reasonable to
  assume that this utility should work for all use cases, not just the one
  for which it was created. Since there are no other use cases at this time,
  this may be ok, but I think at minimum the utility would benefit from
  a comment indicating the caveats of its use as a heads-up to any future users.
* The utility does not clear the struct mbm_state contents. Again, this is ok
  for this usage since AMD does not support the software controller, but
  as far as a generic utility goes, the usage should be clear to avoid
  traps for future changes.
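
To illustrate why the reset path matters for a reused RMID, here is a
minimal user-space sketch, not kernel code. It assumes simplified stand-ins
for struct mbm_state and struct arch_mbm_state, and the names
fake_hw_counter, fake_rmid_read(), reset_rmid() and read_count() are all
hypothetical. Counter-width overflow handling is deliberately left out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the structures in this thread. */
struct mbm_state {		/* arch-independent software state */
	uint64_t prev_bw_bytes;
	uint64_t prev_bw;
};

struct arch_mbm_state {		/* arch-specific counter state */
	uint64_t chunks;
	uint64_t prev_msr;
};

/* Simulated hardware counter for a single RMID/event pair (assumption). */
static uint64_t fake_hw_counter;

/* Simulated counter read; returns 0 on success, like the quoted __rmid_read(). */
static int fake_rmid_read(uint64_t *val)
{
	*val = fake_hw_counter;
	return 0;
}

/* Clear both kinds of state and record any initial, non-zero count. */
static void reset_rmid(struct mbm_state *m, struct arch_mbm_state *am)
{
	uint64_t val = 0;

	if (m)
		memset(m, 0, sizeof(*m));

	if (am) {
		memset(am, 0, sizeof(*am));
		if (!fake_rmid_read(&val))
			am->prev_msr = val;
	}
}

/* Accumulate the delta since the previous read (no overflow handling). */
static uint64_t read_count(struct arch_mbm_state *am)
{
	uint64_t val = 0;

	if (fake_rmid_read(&val))
		return am->chunks;

	am->chunks += val - am->prev_msr;
	am->prev_msr = val;
	return am->chunks;
}
```

In this sketch, if the simulated counter already holds a stale value when
reset_rmid() runs, that value becomes the baseline, so the next read_count()
reports only what accumulated after the reset. A reset that left prev_msr at
zero would instead report the stale count as new traffic, which is only
safe when the hardware is known to have zeroed the counter.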

Reinette
