Message-ID: <20230918114420.000058c3@Huawei.com>
Date:   Mon, 18 Sep 2023 11:44:20 +0100
From:   Jonathan Cameron <Jonathan.Cameron@...wei.com>
To:     Drew Fustini <dfustini@...libre.com>
CC:     Tony Luck <tony.luck@...el.com>, <babu.moger@....com>,
        "Chatre, Reinette" <reinette.chatre@...el.com>,
        "james.morse@....com" <james.morse@....com>,
        Amit Singh Tomar <amitsinght@...vell.com>,
        "Yu, Fenghua" <fenghua.yu@...el.com>,
        George Cherian <gcherian@...vell.com>,
        "robh@...nel.org" <robh@...nel.org>,
        "peternewman@...gle.com" <peternewman@...gle.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>
Subject: Re: resctrl2 - status

On Fri, 15 Sep 2023 10:55:58 -0700
Drew Fustini <dfustini@...libre.com> wrote:

> On Fri, Sep 08, 2023 at 04:13:54PM -0700, Tony Luck wrote:
> > On Fri, Sep 08, 2023 at 04:35:05PM -0500, Moger, Babu wrote:  
> > > Hi Tony,
> > > 
> > > 
> > > On 9/8/2023 1:51 PM, Luck, Tony wrote:  
> > > > > > Can you try this out on an AMD system. I think I covered most of the
> > > > > > existing AMD resctrl features, but I have no machine to test the code
> > > > > > on, so very likely there are bugs in these code paths.
> > > > > > 
> > > > > > I'd like to make any needed changes now, before I start breaking this
> > > > > > into reviewable bite-sized patches to avoid too much churn.  
> > > > > I tried your latest code briefly on my system.  Unfortunately, I could
> > > > > not get it to work on my AMD system.
> > > > > 
> > > > > # git branch -a
> > > > >     next
> > > > > * resctrl2_v65
> > > > > > # uname -r
> > > > > 6.5.0+
> > > > > #lsmod |grep rdt
> > > > > rdt_show_ids           12288  0
> > > > > rdt_mbm_local_bytes    12288  0
> > > > > rdt_mbm_total_bytes    12288  0
> > > > > rdt_llc_occupancy      12288  0
> > > > > rdt_l3_cat             16384  0
> > > > > 
> > > > > # lsmod |grep mbe
> > > > > amd_mbec               16384  0
> > > > > 
> > > > > I could not get  rdt_l3_mba
> > > > > 
> > > > > # modprobe rdt_l3_mba
> > > > > modprobe: ERROR: could not insert 'rdt_l3_mba': No such device
> > > > > 
> > > > > I don't see any data for the default group either.
> > > > > 
> > > > > mount  -t resctrl resctrl /sys/fs/resctrl/
> > > > > 
> > > > > cd /sys/fs/resctrl/mon_data/mon_L3_00
> > > > > 
> > > > > cat mbm_summary
> > > > >        n/a      n/a /  
> > > > Babu,
> > > > 
> > > > > Thanks a bunch for taking this for a quick spin. There are several bits of
> > > > good news there. Several modules automatically loaded as expected.
> > > > Nothing went "OOPS" and crashed the system.
> > > > 
> > > > Here’s the code that the rdt_l3_mba module runs that can cause failure
> > > > to load with "No such device"
> > > > 
> > > >          if (!boot_cpu_has(X86_FEATURE_RDT_A)) {
> > > >                  pr_debug("No RDT allocation support\n");
> > > >                  return -ENODEV;
> > > >          }  
> > > 
> > > > Shouldn't this be (or similar)?
> > > 
> > > if (!rdt_cpu_has(X86_FEATURE_MBA))
> > >                 return false;  
> > 
> > Yes. I should be using X86_FEATURE bits where they are available
> > rather than peeking directly at CPUID register bits.
> >   
> > >   
> > > >          mba_features = cpuid_ebx(0x10);
> > > > 
> > > >          if (!(mba_features & BIT(3))) {
> > > >                  pr_debug("No RDT MBA allocation\n");
> > > >                  return -ENODEV;
> > > >          }
> > > > 
> > > > I assume the first test must have succeeded (same code in rdt_l3_cat, and
> > > > that loaded OK). So must be the second. How does AMD enumerate MBA
> > > > support?
> > > > 
> > > > > Less obvious is the root cause of the mbm_summary file failing to
> > > > > show any data. The rdt_mbm_local_bytes and rdt_mbm_total_bytes modules
> > > > loaded OK. So I'm looking for the right CPUID bits to detect memory bandwidth
> > > > monitoring.  
> > > 
> > > I am still not sure if resctrl2 will address all the current gaps in
> > > resctrl1. We should probably list all issues on the table before we go that
> > > route.  
> > 
> > Indeed yes! I don't want to have to do resctrl3 in a few years to
> > cover gaps that could have been addressed in resctrl2.
> > 
> > However, fixing resctrl gaps is only one of the motivations for
> > the rewrite. The bigger one is making life easier for all the
> > architectures sharing the common code to do what they need to
> > for their own quirks & differences without cluttering the
> > common code base, or worrying "did my change just break something
> > for another CPU architecture".
> >   
> > > > One of the main issues for AMD is coupling of LLC domains.
> > > 
> > > > For example, AMD hardware supports 16 CLOSIDs per LLC domain. But Linux
> > > > design assumes that there are globally 16 total CLOSIDs for the whole
> > > > system. We can only create 16 CLOSIDs now irrespective of how many domains
> > > > there are.
> > > 
> > > > In reality, we should be able to create "16 x number of LLC domains" CLOSIDs
> > > > in the system.  This is more evident on AMD. But the same problem applies to
> > > > Intel with multiple sockets.  
> > 
> > I think this can be somewhat achieved already with a combination of
> > resctrl and cpusets (or some other way to set CPU affinity for tasks
> > > to only run on CPUs within a specific domain (or set of domains)).
> > That's why the schemata file allows setting different CBM masks
> > per domain.
> > 
> > > Can you explain how you would use 64 CLOSIDs on a system with 4 domains
> > > and 16 CLOSIDs per domain?
> >   
> > > > My 2 cents. Hope to discuss more in our upcoming meeting.  
> > Agreed. This will be faster when we can talk instead of type :-)  
> 
> Is it a meeting that other interested developers can join?
> 
> This reminds me that Linux Plumbers Conference [1] is in November and
> I think resctrl2 could be a good topic. The CFP is still open for Birds
> of a Feather (BoF) proposals [2]. These are free-form get-togethers for
> people wishing to discuss a particular topic, and I have had success
> hosting them in the past for topics like pinctrl and gpio.
> 
> Anyone planning to attend Plumbers?
> 
> I'll be going in person but the virtual option works really well in my
> experience. I had developers and maintainers attending virtually
> participate in my BoF sessions and I felt it was very productive.

FWIW I'm keen and should be there in person.  However, I'm not on the must
be available list for this one ;)   Agree that hybrid worked fine for BoF last
year.

Jonathan


> 
> thanks,
> drew
> 
> [1] https://lpc.events/
> [2] https://lpc.events/event/17/abstracts/
> 
