lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230316145045.if3iw5qdtfjyroea@bogus>
Date:   Thu, 16 Mar 2023 14:50:45 +0000
From:   Sudeep Holla <sudeep.holla@....com>
To:     Song Shuai <suagrfillet@...il.com>
Cc:     gregkh@...uxfoundation.org, rafael@...nel.org,
        conor.dooley@...rochip.com, ionela.voinescu@....com,
        Sudeep Holla <sudeep.holla@....com>, Pierre.Gondois@....com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2] arch_topology: Clear LLC sibling when cacheinfo
 teardown

On Thu, Mar 16, 2023 at 10:30:52AM +0000, Song Shuai wrote:
> Sudeep Holla <sudeep.holla@....com> 于2023年3月16日周四 09:29写道:
> >
> > On Tue, Mar 14, 2023 at 03:53:45PM +0800, Song Shuai wrote:
> > > The teardown of CPUHP_AP_BASE_CACHEINFO_ONLINE now only invokes
> > > free_cache_attributes() to clear share_cpu_map of cacheinfo list.
> > > At the same time, clearing cpu_topology[].llc_sibling is
> > > called quite late at the teardown code in hotplug STARTING section.
> > >
> > > To avoid the incorrect LLC sibling masks generated, move its clearing
> > > right after free_cache_attributes().
> > >
> >
> > Technically in terms of flow/timing this is correct. However I would like
> > to know if you are seeing any issues without this change ?
> >
> > Technically, if a cpu is hotplugged out, the cacheinfo is reset first
> > and then the topology. Until the cpu is removes, the LLC info in the
> > topology is still valid. Also I am not sure if anything gets scheduled
> > and this LLC info is utilised once the teardown of CPUHP_AP_BASE_CACHEINFO_ONLINE
> > has started.
>
> There is no visible issue in the entire offline process(eg: echo 0 > online).
>
> However, when I hotplugged out the CPU into the state before CACHEINFO_ONLINE on
> my kernel with the CONFIG_CPU_HOTPLUG_STATE_CONTROL configured,
> the share_cpu_map had been updated but llc_sibling had not, which
> would result in a trivial issue:
>
> At the end of stepped hotplugging out, the cpuset_hotplug_work would
> be flushed and then sched domain would be rebuilt
> where the **cpu_coregroup_mask** in sched_domain_topology got
> incorrect llc_sibling, but the result of rebuilding was correct due
> to the protection of cpu_active_mask.
>

Wait, I would like to disagree there. While I agree there is inconsistency
between cacheinfo cpu_shared_map and the llc_sibling in the tear down path,
it is still correct and terming it as "incorrect" llc_sibling is wrong.
The cpu is not yet completely offline yet and hence the llc_sibling
represents all the cpus it shares LLC. When the cpu is offlined, the
cpu_topology is anyway removed. So I don't see it as an issue at all.
If you follow __cpu_disable()->remove_cpu_topology(), it gets updated there.
If the sched_domain_topology is not rebuilt after that, then we may have
other issues. What am I missing ?

I am not bothered by cacheinfo cpu_shared_map and cpu_topology llc_sibling
mismatch for short window during the teardown as technically until the cpu
is torndown, it is sharing llc with llc_sibling and it is definitely not
wrong to have it in there.

> The stepped hotplugging may not be used in the production environment,
> but the issue existed.

What issue ? If it is just inconsistency, then I am fine to ignore. That
is just artificial and momentary and it impacts nothing.

> Even in the entire offline process, it's possible that a future user
> gets wrong the llc_sibling when accessing it concurrently or right
> after the teardown of CACHEINFO_ONLINE.

As I said, even if someone access it, it is not wrong information.

--
Regards,
Sudeep

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ