[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aBSfqU9ssymM/yC5@e129823.arm.com>
Date: Fri, 2 May 2025 11:34:17 +0100
From: Yeoreum Yun <yeoreum.yun@....com>
To: Leo Yan <leo.yan@....com>
Cc: suzuki.poulose@....com, mike.leach@...aro.org, james.clark@...aro.org,
alexander.shishkin@...ux.intel.com, coresight@...ts.linaro.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] coresight: prevent deactivate active config while
enabling the config
Hi Leo,
>
> [...]
>
> > > > --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > > > +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > > > @@ -1020,6 +1020,9 @@ static void etm4_disable_sysfs(struct coresight_device *csdev)
> > > > smp_call_function_single(drvdata->cpu, etm4_disable_hw, drvdata, 1);
> > > >
> > > > raw_spin_unlock(&drvdata->spinlock);
> > > > +
> > > > + cscfg_csdev_disable_active_config(csdev);
> > > > +
> > >
> > > In general, we need to split changes into several patches if each
> > > addresses a different issue. From my understanding, the change above is
> > > to fix missing to disable config when disable Sysfs mode.
> > >
> > > If so, could we use a seperate patch for this change?
> > >
> >
> > It's not a differnt issue. Without this line, the active count wouldn't
> > decrese and it raise another issue -- unloadable moudle for active_cnt :(
> > So I think it should be included in this patch.
>
> I read the code again and concluded the change above is not related to
> locking and would be a separate issue: when we close a Sysfs session,
> we need to disable a config on a CoreSight device.
> Could you clarify what is meaning "unloadable moudle for active_cnt"?
> I only saw "cscfg_mgr->sys_active_cnt" is used for module unloading,
> but have no clue why "active_cnt" impacts module unloading.
Yes. but it also related "by this patch".
When the load config and "activate" configuration via sysfs,
not only the cscfg_mgr->sys_active_cnt is increase but also
"individual cscfg->active_cnt".
This patch extends the meaning of "cscfg->active_cnt" includes
"enable of configuraiton". so that cscfg_config_desc_put() prevent
decrease "module reference" while holding individual active_cnt.
That's why without this change, the "module reference" couldn't be
decreased. The problem before this change is deactivation doesn't
chekc cscfg->active_cnt but put module reference whenever it calls.
> > > > cpus_read_unlock();
> > > >
> > > > /*
> > > > diff --git a/drivers/hwtracing/coresight/coresight-syscfg.c b/drivers/hwtracing/coresight/coresight-syscfg.c
> > > > index a70c1454b410..6d8c212ad434 100644
> > > > --- a/drivers/hwtracing/coresight/coresight-syscfg.c
> > > > +++ b/drivers/hwtracing/coresight/coresight-syscfg.c
> > > > @@ -391,14 +391,17 @@ static void cscfg_owner_put(struct cscfg_load_owner_info *owner_info)
> > > > static void cscfg_remove_owned_csdev_configs(struct coresight_device *csdev, void *load_owner)
> > > > {
> > > > struct cscfg_config_csdev *config_csdev, *tmp;
> > > > + unsigned long flags;
> > > >
> > > > if (list_empty(&csdev->config_csdev_list))
> > > > return;
> > > >
> > > > + raw_spin_lock_irqsave(&csdev->cscfg_csdev_lock, flags);
> > >
> > > I think we should use spinlock to guard the condition checking
> > > list_empty().
> > >
> > > Here the race condition is the 'config_csdev_list' list and
> > > configurations on the list. For atomicity, we should use lock to
> > > protect any operations on the list (read, add, delete, etc).
> >
> > Interesting... Would you let me know which race it is?
> > here to check list_empty(), it already guarded with "cscfg_mutex".
>
> Thanks for pointing out this. I read the code and understood that in
> some scenarios the list is protected by the mutex "cscfg_mutex".
>
> I would argue for using locking, we need to make clear for two thigns:
>
> - What is the race condition;
> - What locking is used to protect the race condition.
>
> For current case, a CoreSight device has a config list, the race
> condition is the config list will be manipulated by multiple places
> (e.g., for module loading / unloading, opening or closing a perf or
> SysFS session). So a spinlock is used to to protect the config list.
>
> "cscfg_mutex" is a high level lock, my understanding is to protect the
> high level operations from the Sysfs knobs, though sometimes it can
> mitigate the race condition on configuration list mentioned above, but
> the spinlock is the locking mechanism for the low level's config list
> on a CoreSight device.
[...]
> > However list_del() is special case because iterating config_csdev_list
> > can be done without cscfg_mutex -- see
> > cscfg_csdev_enable_active_config().
> > This gurad with spinlock purpose to guard race unloading and
> > get the config in cscfg_csdev_enable_active_config()
> > (Please see my response below...).
> >
> > the emptiness of config_csdev_list is guarded with cscfg_mutex.
> > therefore, It seems enough to guard iterating part with spinlock :)
>
> Mixed using cscfg_mutex and spinlock get code complexity, and I am a bit
> concerned this is not best practice.
>
> At a glance, I would say 'cscfg_mutex' is purposed to protect the global
> 'cscfg_mgr', per CoreSight device's config list should be protected by
> 'cscfg_csdev_lock'.
Agree. BTW this patch is just for bugfix and consider to backport to
lower version.
I think it would be better to improve lock related things with another
patch if it's necessary.
> > > A side topic, as here it adds locks for protecting 'config_csdev_list',
> > > I am wandering why we do not do the same thing for
> > > 'feature_csdev_list' (See cscfg_remove_owned_csdev_features() and
> > > cscfg_get_feat_csdev()).
> >
> > In case of feature, It's okay since it couldn't be accessed when it
> > gets failed to get related config.
>
> I looked at cscfg_load_feat_csdev(), it uses 'cscfg_csdev_lock' to
> protect the feature list. This is why I thought the feature list also
> need to be protected by the lock. Now it is only partially protected.
I don't think the feature don't need.
Since when add the configuration. "feature first and config second"
and reference and remove is done with "config first and feature second".
That menas if enable logic get configuration, we can gurantee the
liveness of each feature.
> > When we see cscfg_csdev_enable_active_config(), the config could be
> > accessed without cscfg_mutex lock. so the config need to be guarded with
> > spin_lock otherwise it could be acquired while unload module
> > (after get active_cnt in search logic cscfg_csdev_enable_active_config()
> > and other running unloading process)
>
> To make things more clear, I have a questions.
>
> 'cscfg_mgr->sys_active_cnt' is used in the cscfg_unload_config_sets()
> function to decide if can unload module, if any configuration is
> active, why this variable cannot prevent unloading module?
because cscfg_unload_config_sets() runs in "unload_modules" where
module reference is 0.
in the diagram. the enable get the configuration but other cpu do
deactivate and unload together.
Here, by other sys_active_cnt becames 0. while enable logic get the
config's pointer.
> > But feature list is depends on config, If config is safe from
> > load/unload, this is not an issue so we don't need it.
> >
> > Thanks for your review!
--
Sincerely,
Yeoreum Yun
Powered by blists - more mailing lists