linux-kernel - Re: [PATCH 08/21] x86/intel_rdt/cqm: Add RMID(Resource monitoring ID) management

Message-ID: <alpine.DEB.2.20.1707122208570.2510@nanos>
Date:   Wed, 12 Jul 2017 22:14:01 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Shivappa Vikas <vikas.shivappa@...el.com>
cc:     Vikas Shivappa <vikas.shivappa@...ux.intel.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, hpa@...or.com, peterz@...radead.org,
        ravi.v.shankar@...el.com, tony.luck@...el.com,
        fenghua.yu@...el.com, andi.kleen@...el.com
Subject: Re: [PATCH 08/21] x86/intel_rdt/cqm: Add RMID(Resource monitoring
 ID) management

On Tue, 11 Jul 2017, Shivappa Vikas wrote:
> On Mon, 3 Jul 2017, Thomas Gleixner wrote:
> > That means, the free list is used as the primary source. One of my boxes
> > has 143 RMIDs. So it only takes 142 mkdir/rmdir invocations to move all
> > RMIDs to the limbo list. On the next mkdir invocation the allocation goes
> > into the limbo path and the SMP function call has to walk the list with 142
> > entries on ALL online domains whether they used the RMID or not!
> 
> Would it be better if we do this in the MBM 1s overflow timer delayed_work?
> That is not in the interrupt context. So we do a periodic flush of the limbo
> list and then mkdir fails with -EBUSY if list_empty(&free_list) &&
> !list_empty(&limbo_list).

Well, the overflow timer only runs when MBM monitoring is active. I'd
rather avoid tying things together which do not technically belong together.
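
Independent of which context ends up doing the flush, the mkdir failure
semantics proposed in the quoted text can be sketched in plain user-space C.
This is only an illustration under assumptions: struct rmid_pool,
rmid_alloc() and rmid_flush_limbo() are hypothetical names, the lists are
reduced to counters, and the -ENOSPC case for "every RMID in use" is an
assumption that is not part of the proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the free and limbo lists. */
struct rmid_pool {
	unsigned int nr_free;	/* RMIDs ready for immediate reuse */
	unsigned int nr_limbo;	/* freed RMIDs that may still hold occupancy */
};

/*
 * Take one RMID from the free pool.
 * Returns 0 on success, -EBUSY if only limbo RMIDs remain (a later retry
 * may succeed), and -ENOSPC (assumed here) if nothing can ever be reclaimed.
 */
static int rmid_alloc(struct rmid_pool *p)
{
	assert(p != NULL);

	if (p->nr_free > 0) {
		p->nr_free--;
		return 0;
	}
	return p->nr_limbo > 0 ? -EBUSY : -ENOSPC;
}

/*
 * Periodic flush: move up to nr_clean limbo RMIDs, i.e. those whose
 * occupancy dropped below the threshold, back to the free pool.
 */
static void rmid_flush_limbo(struct rmid_pool *p, unsigned int nr_clean)
{
	assert(p != NULL);

	if (nr_clean > p->nr_limbo)
		nr_clean = p->nr_limbo;
	p->nr_limbo -= nr_clean;
	p->nr_free += nr_clean;
}
```

With one free and two limbo RMIDs, the first rmid_alloc() succeeds, the
second returns -EBUSY, and an allocation after a flush that reclaims one
RMID succeeds again.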

> To improve that -
> We may also include the optimization Tony suggested to skip the checks for
> RMIDs which are already checked to be < threshold (however that needs a domain
> mask like I mention below but maybe we can just check the list here).

Yes.

> > 
> > 	for_each_domain(d, resource) {
> > 		cpu = cpumask_any_and(d->cpu_mask, tmpmask);
> > 		if (cpu < nr_cpu_ids)
> > 			cpumask_set_cpu(cpu, rmid_entry->mask);
> 
> When this cpu goes offline - the rmid_entry->mask needs an update. Otherwise,
> the work function would return true for
>              if (!cpumask_test_cpu(cpu, rme->mask))

Sure. You need to flush the work from the cpu offline callback and then
reschedule it on another online CPU of the domain or clear the domain from the
mask when the last CPU goes offline.
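
A rough user-space model of that offline handling, not kernel code: struct
domain, first_cpu() and domain_cpu_offline() are hypothetical names, CPU
masks are plain 64-bit words, and the RMID entry is assumed to carry one
bit per domain rather than one bit per CPU. The flush of the pending work
is only marked by a comment.

```c
#include <assert.h>
#include <stdint.h>

#define NO_CPU (-1)

/* Hypothetical model of one monitoring domain. */
struct domain {
	uint64_t cpu_mask;	/* online CPUs of the domain, bit n = CPU n */
	int work_cpu;		/* CPU the limbo work runs on, or NO_CPU */
	unsigned int id;	/* bit index of this domain in an RMID's mask */
};

/* Lowest CPU set in mask, or NO_CPU when the mask is empty. */
static int first_cpu(uint64_t mask)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return NO_CPU;
}

/*
 * Handle cpu going offline in domain d. rmid_busy_domains is the
 * per-RMID mask of domains which still have to check the RMID.
 */
static void domain_cpu_offline(struct domain *d, int cpu,
			       uint64_t *rmid_busy_domains)
{
	assert(cpu >= 0 && cpu < 64 && d->id < 64);

	d->cpu_mask &= ~(UINT64_C(1) << cpu);

	/* A real implementation would flush the pending work here. */

	if (d->cpu_mask == 0) {
		/* Last CPU gone: nobody can check this domain anymore. */
		d->work_cpu = NO_CPU;
		*rmid_busy_domains &= ~(UINT64_C(1) << d->id);
		return;
	}

	/* Otherwise move the work to another online CPU of the domain. */
	if (d->work_cpu == cpu)
		d->work_cpu = first_cpu(d->cpu_mask);
}
```

Taking the work CPU offline while a sibling is still online only moves
work_cpu; taking the last CPU offline drops the domain's bit from the
RMID's mask.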
 
> since the work may have been moved to a different cpu.
> 
> So we really need a package mask? Or rather a per-domain mask, and for that we
> don't know the max domain number (which is why we use a list..)

Well, you can assume a maximum number of domains per package and we have an
upper limit of possible packages. So sizing the mask should be trivial.
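
Something along these lines, sketched in user-space C. The two limits are
made-up example values and struct domain_mask plus its helpers are
hypothetical names; in the kernel a bitmap sized from the real limits
would play this role.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed example limits, not taken from any real configuration. */
#define MAX_PACKAGES		8
#define MAX_DOMAINS_PER_PKG	4
#define MAX_DOMAINS		(MAX_PACKAGES * MAX_DOMAINS_PER_PKG)

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define DOMAIN_WORDS	((MAX_DOMAINS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Fixed-size mask with one bit per possible domain in the system. */
struct domain_mask {
	unsigned long bits[DOMAIN_WORDS];
};

/* Flatten (package, domain within package) into a bit index. */
static unsigned int domain_index(unsigned int pkg, unsigned int dom)
{
	assert(pkg < MAX_PACKAGES && dom < MAX_DOMAINS_PER_PKG);
	return pkg * MAX_DOMAINS_PER_PKG + dom;
}

static void domain_mask_set(struct domain_mask *m, unsigned int idx)
{
	assert(idx < MAX_DOMAINS);
	m->bits[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);
}

static void domain_mask_clear(struct domain_mask *m, unsigned int idx)
{
	assert(idx < MAX_DOMAINS);
	m->bits[idx / BITS_PER_WORD] &= ~(1UL << (idx % BITS_PER_WORD));
}

static bool domain_mask_test(const struct domain_mask *m, unsigned int idx)
{
	assert(idx < MAX_DOMAINS);
	return (m->bits[idx / BITS_PER_WORD] >> (idx % BITS_PER_WORD)) & 1UL;
}

/* True when no domain has to look at the RMID anymore. */
static bool domain_mask_empty(const struct domain_mask *m)
{
	for (size_t i = 0; i < DOMAIN_WORDS; i++)
		if (m->bits[i])
			return false;
	return true;
}
```

An RMID whose mask is empty has been checked on every domain that used it
and can go back to the free list without walking a list of domains.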

Thanks,

	tglx