[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0406ae9f-28f6-44e1-26dd-8f243e7fc9fb@h-partners.com>
Date: Mon, 29 Dec 2025 16:27:23 +0800
From: hejunhao <hejunhao3@...artners.com>
To: Suzuki K Poulose <suzuki.poulose@....com>, <james.clark@...aro.org>,
<mike.leach@...aro.org>, <leo.yan@....com>
CC: <anshuman.khandual@....com>, <yeoreum.yun@....com>,
<coresight@...ts.linaro.org>, <linux-arm-kernel@...ts.infradead.org>,
<linux-kernel@...r.kernel.org>, <jonathan.cameron@...wei.com>, Linuxarm
<linuxarm@...wei.com>, Junhao He <hejunhao3@...artners.com>
Subject: Re: [PATCH v4 2/3] coresight: tmc: refactor the tmc-etr mode setting
to avoid race conditions
On 2025/11/19 22:46, Suzuki K Poulose wrote:
> On 15/11/2025 02:09, hejunhao wrote:
>>
>>
>> On 2025/11/13 23:02, Suzuki K Poulose wrote:
>>> Hi Junhao,
>>>
>>> While your patch fixes the problem it introduces imbalance in the
>>> way perf vs sysfs modes are handled.
>>>
>>
>> Hi Suzuki,
>>
>> Thank you for your comments.
>>
>> What you mean is that, as long as tmc_etr_enable_hw() has not yet been
>> called (i.e., ETR hasn't been fully enabled), both the perf and sysfs
>> sessions
>> have the opportunity to take ownership of ETR device?
>>
>> Based on the following sysfs-based ETR enablement sequence, the
>> spinlock is
>> released at two points (denoted as points a and b).
>> At either of these points, perf may acquire the lock and invoke
>> tmc_etr_enable_hw() to complete the ETR initialization, resulting in
>> ETR_mode == PERF. This would interrupt the ongoing sysfs enablement
>> flow and effectively give precedence to perf over sysfs.
>>
>> mode:
>> [sysfs mode]
>> tmc_enable_etr_sink_sysfs()
>> tmc_etr_get_sysfs_buffer()
>> spin_lock(&drvdata->spinlock)
>> sysfs_buf = READ_ONCE(drvdata->sysfs_buf);
>> if (!sysfs_buf || (sysfs_buf->size != drvdata->size))
>> spin_unlock(&drvdata->spinlock) <-- point a
>> [sysfs buffer allocation]
>> spin_lock(&drvdata->spinlock)
>> drvdata->sysfs_buf = new_buf;
>> spin_unlock(&drvdata->spinlock) <-- point b
>> spin_lock(&drvdata->spinlock)
>> tmc_etr_enable_hw()
>> spin_unlock(&drvdata->spinlock)
>>
>> [perf mode]
>> tmc_enable_etr_sink_perf()
>> spin_lock(&drvdata->spinlock)
>> tmc_etr_enable_hw()
>> spin_unlock(&drvdata->spinlock)
>>
>> After the patch, both the perf and sysfs sessions contend fairly for
>> the lock at
>> the entry point of tmc_enable_etr_sink(). The first to acquire the
>> lock gains
>> exclusive control over the ETR sink for the remainder of the session.
>
> TBH, we don't care about fairness of the different modes of operation.
>
> 1. Perf must be given the priority - Because that provides a full easy
> framework for making any meaningful sense of the trace.
>
> 2. Sysfs mode should be only used for custom use cases, where you have
> full control of the system. (e.g., System bringup, Validation or event
> crash capture).
>
> In other words, if someone wants to use sysfs mode they must know what
> they are doing and not run any perf session in parallel.
>
Hi Suzuki,
I understand now.
This issue has been present for some time. Based on the discussion
above, the
quickest way to resolve it is to revert to patch [1]. I will rebase the
series onto
for-next, submit the next version.
Additionally, I might delay or suspend the planned refactoring of
the mode handling.
Thanks
Junhao
[1]
https://lore.kernel.org/linux-arm-kernel/20241202092419.11777-1-yangyicong@huawei.com/
> Thanks
> Suzuki
>
>
>>
>>> On 11/11/2025 12:21, Junhao He wrote:
>>>> From: Junhao He <hejunhao3@...wei.com>
>>>>
>>>> When trying to run perf and sysfs mode simultaneously, the WARN_ON()
>>>> in tmc_etr_enable_hw() is triggered sometimes:
>>>>
>>>> WARNING: CPU: 42 PID: 3911571 at drivers/hwtracing/coresight/
>>>> coresight-tmc-etr.c:1060 tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc]
>>>> [..snip..]
>>>> Call trace:
>>>> tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc] (P)
>>>> tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc] (L)
>>>> tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc]
>>>> coresight_enable_path+0x1c8/0x218 [coresight]
>>>> coresight_enable_sysfs+0xa4/0x228 [coresight]
>>>> enable_source_store+0x58/0xa8 [coresight]
>>>> dev_attr_store+0x20/0x40
>>>> sysfs_kf_write+0x4c/0x68
>>>> kernfs_fop_write_iter+0x120/0x1b8
>>>> vfs_write+0x2c8/0x388
>>>> ksys_write+0x74/0x108
>>>> __arm64_sys_write+0x24/0x38
>>>> el0_svc_common.constprop.0+0x64/0x148
>>>> do_el0_svc+0x24/0x38
>>>> el0_svc+0x3c/0x130
>>>> el0t_64_sync_handler+0xc8/0xd0
>>>> el0t_64_sync+0x1ac/0x1b0
>>>> ---[ end trace 0000000000000000 ]---
>>>>
>>>> Since the sysfs buffer allocation and the hardware enablement is not
>>>> in the same critical region, it's possible to race with the perf
>>>>
>>>> mode:
>>>> [sysfs mode] [perf mode]
>>>> tmc_etr_get_sysfs_buffer()
>>>> spin_lock(&drvdata->spinlock)
>>>> [sysfs buffer allocation]
>>>> spin_unlock(&drvdata->spinlock)
>>>> spin_lock(&drvdata->spinlock)
>>>> tmc_etr_enable_hw()
>>>> drvdata->etr_buf =
>>>> etr_perf->etr_buf
>>>> spin_unlock(&drvdata->spinlock)
>>>> spin_lock(&drvdata->spinlock)
>>>
>>> ^^ Could we not double check the mode here and break out ?
>>>
>>> Something like this :
>>>
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/
>>> drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> index e0d83ee01b77..2097c57d19d0 100644
>>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> @@ -1314,6 +1314,9 @@ static int tmc_enable_etr_sink_sysfs(struct
>>> coresight_device *csdev)
>>> if (coresight_get_mode(csdev) == CS_MODE_SYSFS) {
>>> csdev->refcnt++;
>>> goto out;
>>> + } else if (coresight_get_mode(csdev) != CS_MODE_DISABLED) {
>>> + ret = -EBUSY;
>>> + goto out;
>>> }
>>>
>>> ret = tmc_etr_enable_hw(drvdata, sysfs_buf);
>>>
>>>
>>> Suzuki
>>
>> Indeed, this approach addresses only the race condition between the
>> perf and sysfs
>> components. It represents the simplest viable solution.
>>
>> This patch also resolves the potential null pointer dereference of
>> drvdata->sysfs_buf,
>> which could occur when the buffer size is modified after ETR has
>> already been
>> enabled, followed by enabling a second source device [1].
>>
>> If the refactored patch introduces additional issues, the
>> implementation may be
>> reworked based on Yicong's original patches [2] ?
>>
>> [1]https://lore.kernel.org/linux-arm-kernel/20241202092419.11777-2-
>> yangyicong@...wei.com/
>> [2]https://lore.kernel.org/linux-arm-kernel/20241202092419.11777-4-
>> yangyicong@...wei.com/
>>
>> Best regards,
>> Junhao.
>>
>>>
>>>
>>>> tmc_etr_enable_hw()
>>>> WARN_ON(drvdata->etr_buf) // WARN sicne etr_buf initialized at
>>>> the perf side
>>>> spin_unlock(&drvdata->spinlock)
>>>>
>>>> A race condition is introduced here, perf always prioritizes
>>>> execution, and
>>>> warnings can lead to concerns about potential hidden bugs, such as
>>>> getting
>>>> out of sync.
>>>>
>>>> To fix this, configure the tmc-etr mode before invoking
>>>> enable_etr_perf()
>>>> or enable_etr_sysfs(), explicitly check if the tmc-etr sink is already
>>>> enabled in a different mode, and simplily the setup and checks for
>>>> "mode".
>>>> To prevent race conditions between mode transitions.
>>>>
>>>> Fixes: 296b01fd106e ("coresight: Refactor out buffer allocation
>>>> function for ETR")
>>>> Reported-by: Yicong Yang <yangyicong@...ilicon.com>
>>>> Closes: https://lore.kernel.org/linux-arm-
>>>> kernel/20241202092419.11777-2-yangyicong@...wei.com/
>>>> Signed-off-by: Junhao He <hejunhao3@...wei.com>
>>>> Tested-by: Yicong Yang <yangyicong@...ilicon.com>
>>>> Signed-off-by: Junhao He <hejunhao3@...artners.com>
>>>> ---
>>>> .../hwtracing/coresight/coresight-tmc-etr.c | 106
>>>> +++++++++---------
>>>> 1 file changed, 55 insertions(+), 51 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/
>>>> drivers/hwtracing/coresight/coresight-tmc-etr.c
>>>> index b07fcdb3fe1a..fe1d5e8a0d2b 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>>> @@ -1263,11 +1263,6 @@ static struct etr_buf
>>>> *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>>>> raw_spin_lock_irqsave(&drvdata->spinlock, flags);
>>>> }
>>>> - if (drvdata->reading || coresight_get_mode(csdev) ==
>>>> CS_MODE_PERF) {
>>>> - ret = -EBUSY;
>>>> - goto out;
>>>> - }
>>>> -
>>>> /*
>>>> * If we don't have a buffer or it doesn't match the
>>>> requested size,
>>>> * use the buffer allocated above. Otherwise reuse the
>>>> existing buffer.
>>>> @@ -1278,7 +1273,6 @@ static struct etr_buf
>>>> *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>>>> drvdata->sysfs_buf = new_buf;
>>>> }
>>>> -out:
>>>> raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
>>>> /* Free memory outside the spinlock if need be */
>>>> @@ -1289,7 +1283,7 @@ static struct etr_buf
>>>> *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>>>> static int tmc_enable_etr_sink_sysfs(struct coresight_device
>>>> *csdev)
>>>> {
>>>> - int ret = 0;
>>>> + int ret;
>>>> unsigned long flags;
>>>> struct tmc_drvdata *drvdata =
>>>> dev_get_drvdata(csdev->dev.parent);
>>>> struct etr_buf *sysfs_buf = tmc_etr_get_sysfs_buffer(csdev);
>>>> @@ -1299,23 +1293,10 @@ static int tmc_enable_etr_sink_sysfs(struct
>>>> coresight_device *csdev)
>>>> raw_spin_lock_irqsave(&drvdata->spinlock, flags);
>>>> - /*
>>>> - * In sysFS mode we can have multiple writers per sink. Since
>>>> this
>>>> - * sink is already enabled no memory is needed and the HW need
>>>> not be
>>>> - * touched, even if the buffer size has changed.
>>>> - */
>>>> - if (coresight_get_mode(csdev) == CS_MODE_SYSFS) {
>>>> - csdev->refcnt++;
>>>> - goto out;
>>>> - }
>>>> -
>>>> ret = tmc_etr_enable_hw(drvdata, sysfs_buf);
>>>> - if (!ret) {
>>>> - coresight_set_mode(csdev, CS_MODE_SYSFS);
>>>> + if (!ret)
>>>> csdev->refcnt++;
>>>> - }
>>>> -out:
>>>> raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
>>>> if (!ret)
>>>> @@ -1729,39 +1710,24 @@ static int tmc_enable_etr_sink_perf(struct
>>>> coresight_device *csdev, void *data)
>>>> {
>>>> int rc = 0;
>>>> pid_t pid;
>>>> - unsigned long flags;
>>>> struct tmc_drvdata *drvdata =
>>>> dev_get_drvdata(csdev->dev.parent);
>>>> struct perf_output_handle *handle = data;
>>>> struct etr_perf_buffer *etr_perf = etm_perf_sink_config(handle);
>>>> - raw_spin_lock_irqsave(&drvdata->spinlock, flags);
>>>> - /* Don't use this sink if it is already claimed by sysFS */
>>>> - if (coresight_get_mode(csdev) == CS_MODE_SYSFS) {
>>>> - rc = -EBUSY;
>>>> - goto unlock_out;
>>>> - }
>>>> -
>>>> - if (WARN_ON(!etr_perf || !etr_perf->etr_buf)) {
>>>> - rc = -EINVAL;
>>>> - goto unlock_out;
>>>> - }
>>>> + if (WARN_ON(!etr_perf || !etr_perf->etr_buf))
>>>> + return -EINVAL;
>>>> /* Get a handle on the pid of the session owner */
>>>> pid = etr_perf->pid;
>>>> /* Do not proceed if this device is associated with another
>>>> session */
>>>> - if (drvdata->pid != -1 && drvdata->pid != pid) {
>>>> - rc = -EBUSY;
>>>> - goto unlock_out;
>>>> - }
>>>> + if (drvdata->pid != -1 && drvdata->pid != pid)
>>>> + return -EBUSY;
>>>> - /*
>>>> - * No HW configuration is needed if the sink is already in
>>>> - * use for this session.
>>>> - */
>>>> + /* The sink is already in use for this session */
>>>> if (drvdata->pid == pid) {
>>>> csdev->refcnt++;
>>>> - goto unlock_out;
>>>> + return rc;
>>>> }
>>>> rc = tmc_etr_enable_hw(drvdata, etr_perf->etr_buf);
>>>> @@ -1773,22 +1739,60 @@ static int tmc_enable_etr_sink_perf(struct
>>>> coresight_device *csdev, void *data)
>>>> csdev->refcnt++;
>>>> }
>>>> -unlock_out:
>>>> - raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
>>>> return rc;
>>>> }
>>>> static int tmc_enable_etr_sink(struct coresight_device *csdev,
>>>> enum cs_mode mode, void *data)
>>>> {
>>>> - switch (mode) {
>>>> - case CS_MODE_SYSFS:
>>>> - return tmc_enable_etr_sink_sysfs(csdev);
>>>> - case CS_MODE_PERF:
>>>> - return tmc_enable_etr_sink_perf(csdev, data);
>>>> - default:
>>>> - return -EINVAL;
>>>> + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>>>> + int rc;
>>>> +
>>>> +retry:
>>>> + scoped_guard(raw_spinlock_irqsave, &drvdata->spinlock) {
>>>> + if (coresight_get_mode(csdev) != CS_MODE_DISABLED &&
>>>> + coresight_get_mode(csdev) != mode)
>>>> + return -EBUSY;
>>>> +
>>>> + switch (mode) {
>>>> + case CS_MODE_SYSFS:
>>>> + if (drvdata->reading)
>>>> + return -EBUSY;
>>>> +
>>>> + if (csdev->refcnt) {
>>>> + /* The sink is already enabled via sysfs */
>>>> + csdev->refcnt++;
>>>> + return 0;
>>>> + }
>>>> +
>>>> + /*
>>>> + * A sysfs session is currently enabling ETR, preventing
>>>> + * a second sysfs process from repeatedly triggering the
>>>> + * enable procedure.
>>>> + */
>>>> + if (coresight_get_mode(csdev) == CS_MODE_SYSFS && !
>>>> csdev->refcnt)
>>>> + goto retry;
>>>> +
>>>> + /*
>>>> + * Set ETR to sysfs mode before it is fully enabled, to
>>>> + * prevent race conditions during mode transitions.
>>>> + */
>>>> + coresight_set_mode(csdev, mode);
>>>> + break;
>>>> + case CS_MODE_PERF:
>>>> + return tmc_enable_etr_sink_perf(csdev, data);
>>>> + default:
>>>> + return -EINVAL;
>>>> + }
>>>> + }
>>>> +
>>>> + rc = tmc_enable_etr_sink_sysfs(csdev);
>>>> + if (rc) {
>>>> + guard(raw_spinlock_irqsave)(&drvdata->spinlock);
>>>> + coresight_set_mode(csdev, CS_MODE_DISABLED);
>>>> }
>>>> +
>>>> + return rc;
>>>> }
>>>> static int tmc_disable_etr_sink(struct coresight_device *csdev)
>>>
>>>
>>> .
>>>
>>
>
>
> .
>
Powered by blists - more mailing lists