Message-ID: <alpine.DEB.2.10.1701171826180.15892@vshiva-Udesk>
Date: Tue, 17 Jan 2017 18:38:55 -0800 (PST)
From: Shivappa Vikas <vikas.shivappa@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
cc: Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
vikas.shivappa@...el.com, davidcc@...gle.com, eranian@...gle.com,
linux-kernel@...r.kernel.org, x86@...nel.org, hpa@...or.com,
mingo@...nel.org, peterz@...radead.org, ravi.v.shankar@...el.com,
tony.luck@...el.com, fenghua.yu@...el.com, andi.kleen@...el.com,
h.peter.anvin@...el.com
Subject: Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes
On Tue, 17 Jan 2017, Thomas Gleixner wrote:
> On Fri, 6 Jan 2017, Vikas Shivappa wrote:
>> CQM (Cache Quality Monitoring) is part of Intel RDT (Resource Director
>> Technology), which enables monitoring and controlling of processor shared
>> resources via an MSR interface.
>
> We know that already. No need for advertising this over and over.
>
>> Below are the issues and the fixes we attempt-
>
> Emphasis on attempt.
>
>> - Issue(1): Inaccurate per-package and systemwide data. Just prints
>> zeros or arbitrary numbers.
>>
>> Fix: Patches fix this by throwing an error if the mode is not supported.
>> The modes supported are task monitoring and cgroup monitoring.
>> Also the per-package
>> data for, say, socket x is returned with the -C <cpu on socketx> -G cgrpy option.
>> The systemwide data can be looked up by monitoring the root cgroup.
>
> Fine. That just lacks any comment in the implementation. Otherwise I would
> not have asked the question about cpu monitoring. Though I fundamentally
> hate the idea of requiring cgroups for this to work.
>
> If I just want to look at CPU X why on earth do I have to set up all that
> cgroup muck? Just because your main focus is cgroups?
The upstream per cpu data is broken because it does not override the other task
event RMIDs on that cpu with the cpu event RMID.
This can be fixed by adding a percpu struct to hold the RMID that is affinitized
to the cpu, however we then miss all the task llc_occupancy data - still
evaluating it.
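Roughly what I have in mind for the override (a standalone user-space sketch, not
the kernel code - the CPU count and the task RMIDs below are made up for
illustration; the real thing would be a DEFINE_PER_CPU variable consulted at
sched-in):

```c
#include <assert.h>

#define NR_CPUS 4 /* illustrative only */

/* Simulated per-cpu state: the RMID affinitized to each CPU.
 * 0 means "no cpu-wide monitoring event active on this CPU". */
static unsigned int cpu_rmid[NR_CPUS];

/*
 * Pick the RMID to program into PQR_ASSOC at sched-in: a cpu-wide
 * event's RMID overrides whatever RMID the incoming task carries.
 */
static unsigned int resolve_rmid(int cpu, unsigned int task_rmid)
{
	if (cpu_rmid[cpu])
		return cpu_rmid[cpu];
	return task_rmid;
}
```

This is also where the problem noted above shows up: while the cpu event's RMID
is programmed, the task's own llc_occupancy is not being accumulated, since only
one RMID can be active in PQR_ASSOC at a time.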
>
>> - Issue(2): RMIDs are global and don't scale with more packages and hence
>> we also run out of RMIDs very soon.
>>
>> Fix: Support per-package RMIDs and hence scale better with more
>> packages, and get more RMIDs and use them only when needed (ie when tasks
>> are actually scheduled on the package).
>
> That's fine, just the implementation is completely broken
>
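The per-package allocation boils down to giving each package its own free pool,
so allocations on one package never consume RMIDs on another. A user-space
sketch of the idea (the package and pool sizes are made up; the kernel side
would additionally handle lazy allocation at sched-in and the limbo/free lists):

```c
#include <assert.h>

#define NR_PACKAGES   2 /* illustrative only */
#define RMIDS_PER_PKG 4 /* illustrative only */

/* Simulated per-package pools: rmid_busy[pkg][rmid] is nonzero when
 * that RMID is in use on that package. RMID 0 is reserved. */
static int rmid_busy[NR_PACKAGES][RMIDS_PER_PKG + 1];

/*
 * Allocate an RMID on one package only; the other packages' pools are
 * untouched, so the usable RMID space scales with the package count.
 * Returns -1 when this package has run out of hardware RMIDs.
 */
static int alloc_rmid(int pkg)
{
	for (int r = 1; r <= RMIDS_PER_PKG; r++) {
		if (!rmid_busy[pkg][r]) {
			rmid_busy[pkg][r] = 1;
			return r;
		}
	}
	return -1;
}

static void free_rmid(int pkg, int rmid)
{
	rmid_busy[pkg][rmid] = 0;
}
```

With global RMIDs the same allocation would burn one RMID per package per event;
here the same numeric RMID can be in use for different events on different
packages.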
>> - Issue(5): CAT and cqm/mbm write the same PQR_ASSOC MSR separately
>> Fix: Integrate the sched-in code and write the PQR MSR only once per switch_to
>
> Brilliant stuff that. I bet that the separate MSR writes which we have now
> are actually faster even in the worst case of two writes.
>
>> [PATCH 02/12] x86/cqm: Remove cqm recycling/conflict handling
>
> That's the only undisputed and probably correct patch in that whole series.
>
> Executive summary: The whole series except 2/12 is a complete trainwreck.
>
> Major issues:
>
> - Patch split is random and leads to non-compilable and non-functional
> steps, aside from creating an unreviewable mess.
Will split as per the comments into divisions that compile without
warnings. The series was tested for compile and build but wasn't checked for
warnings (except for the int if_cgroup_event which you pointed out..)
>
> - The core code is updated with random hooks which claim to be a generic
> framework but are completely hardcoded for that particular CQM case.
>
> The introduced hooks are neither fully documented (semantical and
> functional) nor justified why they are required.
>
> - The code quality is horrible in coding style, technical correctness and
> design.
>
> So how can we proceed from here?
>
> I want to see small patch series which change a certain aspect of the
> implementation and only that. They need to be split in preparatory changes,
> which refactor code or add new interfaces to the core code, and the actual
> implementation in CQM/MBM.
Appreciate your time for the review and all the feedback.
Will plan to send a smaller patch series which includes the following contents
at first and then follow up with other series:
- the current 2/12
- add the per-package RMIDs with the fixes mentioned.
- the reuse patch which reuses the RMIDs that are freed from events.
- a patch to indicate an error if an RMID is still not available.
- any relevant sched and cpu hotplug updates.
That way it's a functional set where tasks would be able to use different RMIDs
on different packages and provide correct data for 'task events'. Either the
data would be correct or we would indicate an error that we were limited by h/w
RMIDs.
Is that a reasonable set to target?
Then follow up with different series to get correct data for 'cgroup events',
support cgroup and task together, and the other issues (systemwide / per cpu
events etc).
Thanks,
Vikas
>
> Each of the patches must have a proper changelog explaining the WHY and if
> required the semantical properties of a new interface.
>
> Each of the patches must compile without warnings/errors and be fully
> functional vs. the changes they do.
>
> Any attempt to resend this disaster as a whole will be NACKed w/o even
> looking at it. I've wasted enough time with this wholesale approach in the
> past and it did not work out. I'm not going to play that wasteful game
> over and over again.
>
> Thanks,
>
> tglx
>
>
>