lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b53da670-c2b3-124a-e315-a0a9d5936eb1@arm.com>
Date:   Fri, 28 Jul 2023 17:34:07 +0100
From:   James Morse <james.morse@....com>
To:     Reinette Chatre <reinette.chatre@...el.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Cc:     Fenghua Yu <fenghua.yu@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        H Peter Anvin <hpa@...or.com>,
        Babu Moger <Babu.Moger@....com>,
        shameerali.kolothum.thodi@...wei.com,
        D Scott Phillips OS <scott@...amperecomputing.com>,
        carl@...amperecomputing.com, lcherian@...vell.com,
        bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
        xingxin.hx@...nanolis.org, baolin.wang@...ux.alibaba.com,
        Jamie Iles <quic_jiles@...cinc.com>,
        Xin Hao <xhao@...ux.alibaba.com>, peternewman@...gle.com,
        dfustini@...libre.com
Subject: Re: [PATCH v4 24/24] x86/resctrl: Separate arch and fs resctrl locks

Hi Reinette,

On 15/06/2023 23:26, Reinette Chatre wrote:
> On 5/25/2023 11:02 AM, James Morse wrote:
>> resctrl has one mutex that is taken by the architecture specific code,
>> and the filesystem parts. The two interact via cpuhp, where the
>> architecture code updates the domain list. Filesystem handlers that
>> walk the domains list should not run concurrently with the cpuhp
>> callback modifying the list.
>>
>> Exposing a lock from the filesystem code means the interface is not
>> cleanly defined, and creates the possibility of cross-architecture
>> lock ordering headaches. The interaction only exists so that certain
>> filesystem paths are serialised against cpu hotplug. The cpu hotplug
>> code already has a mechanism to do this using cpus_read_lock().
>>
>> MPAM's monitors have an overflow interrupt, so it needs to be possible
>> to walk the domains list in irq context. RCU is ideal for this,
>> but some paths need to be able to sleep to allocate memory.
>>
>> Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part
>> of a cpuhp callback, cpus_read_lock() must always be taken first.
>> rdtgroup_schemata_write() already does this.
>>
>> Most of the filesystem code's domain list walkers are currently
>> protected by the rdtgroup_mutex taken in rdtgroup_kn_lock_live().
>> The exceptions are rdt_bit_usage_show() and the mon_config helpers
>> which take the lock directly.
>>
>> Make the domain list protected by RCU. An architecture-specific
>> lock prevents concurrent writers. rdt_bit_usage_show() can

> This does not sound right to me. The domain list belongs to resctrl,
> the filesystem,

I'd argue the list belongs to the arch code, as only the arch code writes to it.


> having an architecture specific lock protect it does
> not seem like the right thing to do. Could this instead be a resctrl
> owned lock that is hidden behind helpers for the architecture
> code to add domains?

resctrl should never be writing to the domain list, so it never needs to know how
concurrent-writers are avoided.

The memory is allocated and added to the list by the architecture code.
This new domain-list-lock API would only ever be called by the arch code, and there is no
resctrl call that needs to be made with that lock held. I don't see any reason for this to
be visible to the filesystem code.

I'm trying to keep the number of functions exposed by resctrl to the arch code as small as
possible. After all this they are:
resctrl_online_domain()
resctrl_offline_domain()
resctrl_online_cpu()
resctrl_offline_cpu()
resctrl_init()
resctrl_exit()

The weird one is resctrl_get_domain_from_cpu(), which is used by both architecture and
filesystem code, because I wanted to avoid duplicating it. It ends up inlined from a
header file.

Part of this is to explore making resctrl a loadable module in the future, which I think
fits with Tony Luck's aims.


Thanks,

James

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ