[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a0ff9d15-c0e8-4ba2-9187-2abbe5676b01@arm.com>
Date: Wed, 19 Nov 2025 14:46:13 +0000
From: Suzuki K Poulose <suzuki.poulose@....com>
To: hejunhao <hejunhao3@...artners.com>, james.clark@...aro.org,
mike.leach@...aro.org, leo.yan@....com
Cc: anshuman.khandual@....com, yeoreum.yun@....com,
coresight@...ts.linaro.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, jonathan.cameron@...wei.com,
Linuxarm <linuxarm@...wei.com>
Subject: Re: [PATCH v4 2/3] coresight: tmc: refactor the tmc-etr mode setting
to avoid race conditions
On 15/11/2025 02:09, hejunhao wrote:
>
>
> On 2025/11/13 23:02, Suzuki K Poulose wrote:
>> Hi Junhao,
>>
>> While your patch fixes the problem it introduces imbalance in the
>> way perf vs sysfs modes are handled.
>>
>
> Hi Suzuki,
>
> Thank you for your comments.
>
> What you mean is that, as long as tmc_etr_enable_hw() has not yet been
> called (i.e., ETR hasn't been fully enabled), both the perf and sysfs
> sessions
> have the opportunity to take ownership of ETR device?
>
> Based on the following sysfs-based ETR enablement sequence, the spinlock is
> released at two points (denoted as points a and b).
> At either of these points, perf may acquire the lock and invoke
> tmc_etr_enable_hw() to complete the ETR initialization, resulting in
> ETR_mode == PERF. This would interrupt the ongoing sysfs enablement
> flow and effectively give precedence to perf over sysfs.
>
> mode:
> [sysfs mode]
> tmc_enable_etr_sink_sysfs()
> tmc_etr_get_sysfs_buffer()
> spin_lock(&drvdata->spinlock)
> sysfs_buf = READ_ONCE(drvdata->sysfs_buf);
> if (!sysfs_buf || (sysfs_buf->size != drvdata->size))
> spin_unlock(&drvdata->spinlock) <-- point a
> [sysfs buffer allocation]
> spin_lock(&drvdata->spinlock)
> drvdata->sysfs_buf = new_buf;
> spin_unlock(&drvdata->spinlock) <-- point b
> spin_lock(&drvdata->spinlock)
> tmc_etr_enable_hw()
> spin_unlock(&drvdata->spinlock)
>
> [perf mode]
> tmc_enable_etr_sink_perf()
> spin_lock(&drvdata->spinlock)
> tmc_etr_enable_hw()
> spin_unlock(&drvdata->spinlock)
>
> After the patch, both the perf and sysfs sessions contend fairly for the
> lock at
> the entry point of tmc_enable_etr_sink(). The first to acquire the lock
> gains
> exclusive control over the ETR sink for the remainder of the session.
TBH, we don't care about fairness of the different modes of operation.
1. Perf must be given the priority - Because that provides a full easy
framework for making any meaningful sense of the trace.
2. Sysfs mode should be only used for custom use cases, where you have
full control of the system. (e.g., System bringup, Validation or event
crash capture).
In other words, if someone wants to use sysfs mode they must know what
they are doing and not run any perf session in parallel.
Thanks
Suzuki
>
>> On 11/11/2025 12:21, Junhao He wrote:
>>> From: Junhao He <hejunhao3@...wei.com>
>>>
>>> When trying to run perf and sysfs mode simultaneously, the WARN_ON()
>>> in tmc_etr_enable_hw() is triggered sometimes:
>>>
>>> WARNING: CPU: 42 PID: 3911571 at drivers/hwtracing/coresight/
>>> coresight-tmc-etr.c:1060 tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc]
>>> [..snip..]
>>> Call trace:
>>> tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc] (P)
>>> tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc] (L)
>>> tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc]
>>> coresight_enable_path+0x1c8/0x218 [coresight]
>>> coresight_enable_sysfs+0xa4/0x228 [coresight]
>>> enable_source_store+0x58/0xa8 [coresight]
>>> dev_attr_store+0x20/0x40
>>> sysfs_kf_write+0x4c/0x68
>>> kernfs_fop_write_iter+0x120/0x1b8
>>> vfs_write+0x2c8/0x388
>>> ksys_write+0x74/0x108
>>> __arm64_sys_write+0x24/0x38
>>> el0_svc_common.constprop.0+0x64/0x148
>>> do_el0_svc+0x24/0x38
>>> el0_svc+0x3c/0x130
>>> el0t_64_sync_handler+0xc8/0xd0
>>> el0t_64_sync+0x1ac/0x1b0
>>> ---[ end trace 0000000000000000 ]---
>>>
>>> Since the sysfs buffer allocation and the hardware enablement is not
>>> in the same critical region, it's possible to race with the perf
>>>
>>> mode:
>>> [sysfs mode] [perf mode]
>>> tmc_etr_get_sysfs_buffer()
>>> spin_lock(&drvdata->spinlock)
>>> [sysfs buffer allocation]
>>> spin_unlock(&drvdata->spinlock)
>>> spin_lock(&drvdata->spinlock)
>>> tmc_etr_enable_hw()
>>> drvdata->etr_buf = etr_perf->etr_buf
>>> spin_unlock(&drvdata->spinlock)
>>> spin_lock(&drvdata->spinlock)
>>
>> ^^ Could we not double check the mode here and break out ?
>>
>> Something like this :
>>
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/
>> drivers/hwtracing/coresight/coresight-tmc-etr.c
>> index e0d83ee01b77..2097c57d19d0 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> @@ -1314,6 +1314,9 @@ static int tmc_enable_etr_sink_sysfs(struct
>> coresight_device *csdev)
>> if (coresight_get_mode(csdev) == CS_MODE_SYSFS) {
>> csdev->refcnt++;
>> goto out;
>> + } else if (coresight_get_mode(csdev) != CS_MODE_DISABLED) {
>> + ret = -EBUSY;
>> + goto out;
>> }
>>
>> ret = tmc_etr_enable_hw(drvdata, sysfs_buf);
>>
>>
>> Suzuki
>
> Indeed, this approach addresses only the race condition between the perf
> and sysfs
> components. It represents the simplest viable solution.
>
> This patch also resolves the potential null pointer dereference of
> drvdata->sysfs_buf,
> which could occur when the buffer size is modified after ETR has already
> been
> enabled, followed by enabling a second source device [1].
>
> If the refactored patch introduces additional issues, the implementation
> may be
> reworked based on Yicong's original patches [2] ?
>
> [1]https://lore.kernel.org/linux-arm-kernel/20241202092419.11777-2-
> yangyicong@...wei.com/
> [2]https://lore.kernel.org/linux-arm-kernel/20241202092419.11777-4-
> yangyicong@...wei.com/
>
> Best regards,
> Junhao.
>
>>
>>
>>> tmc_etr_enable_hw()
>>> WARN_ON(drvdata->etr_buf) // WARN sicne etr_buf initialized at
>>> the perf side
>>> spin_unlock(&drvdata->spinlock)
>>>
>>> A race condition is introduced here, perf always prioritizes
>>> execution, and
>>> warnings can lead to concerns about potential hidden bugs, such as
>>> getting
>>> out of sync.
>>>
>>> To fix this, configure the tmc-etr mode before invoking
>>> enable_etr_perf()
>>> or enable_etr_sysfs(), explicitly check if the tmc-etr sink is already
>>> enabled in a different mode, and simplily the setup and checks for
>>> "mode".
>>> To prevent race conditions between mode transitions.
>>>
>>> Fixes: 296b01fd106e ("coresight: Refactor out buffer allocation
>>> function for ETR")
>>> Reported-by: Yicong Yang <yangyicong@...ilicon.com>
>>> Closes: https://lore.kernel.org/linux-arm-
>>> kernel/20241202092419.11777-2-yangyicong@...wei.com/
>>> Signed-off-by: Junhao He <hejunhao3@...wei.com>
>>> Tested-by: Yicong Yang <yangyicong@...ilicon.com>
>>> Signed-off-by: Junhao He <hejunhao3@...artners.com>
>>> ---
>>> .../hwtracing/coresight/coresight-tmc-etr.c | 106 +++++++++---------
>>> 1 file changed, 55 insertions(+), 51 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/
>>> drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> index b07fcdb3fe1a..fe1d5e8a0d2b 100644
>>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> @@ -1263,11 +1263,6 @@ static struct etr_buf
>>> *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>>> raw_spin_lock_irqsave(&drvdata->spinlock, flags);
>>> }
>>> - if (drvdata->reading || coresight_get_mode(csdev) ==
>>> CS_MODE_PERF) {
>>> - ret = -EBUSY;
>>> - goto out;
>>> - }
>>> -
>>> /*
>>> * If we don't have a buffer or it doesn't match the requested
>>> size,
>>> * use the buffer allocated above. Otherwise reuse the existing
>>> buffer.
>>> @@ -1278,7 +1273,6 @@ static struct etr_buf
>>> *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>>> drvdata->sysfs_buf = new_buf;
>>> }
>>> -out:
>>> raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
>>> /* Free memory outside the spinlock if need be */
>>> @@ -1289,7 +1283,7 @@ static struct etr_buf
>>> *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>>> static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>>> {
>>> - int ret = 0;
>>> + int ret;
>>> unsigned long flags;
>>> struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>>> struct etr_buf *sysfs_buf = tmc_etr_get_sysfs_buffer(csdev);
>>> @@ -1299,23 +1293,10 @@ static int tmc_enable_etr_sink_sysfs(struct
>>> coresight_device *csdev)
>>> raw_spin_lock_irqsave(&drvdata->spinlock, flags);
>>> - /*
>>> - * In sysFS mode we can have multiple writers per sink. Since this
>>> - * sink is already enabled no memory is needed and the HW need
>>> not be
>>> - * touched, even if the buffer size has changed.
>>> - */
>>> - if (coresight_get_mode(csdev) == CS_MODE_SYSFS) {
>>> - csdev->refcnt++;
>>> - goto out;
>>> - }
>>> -
>>> ret = tmc_etr_enable_hw(drvdata, sysfs_buf);
>>> - if (!ret) {
>>> - coresight_set_mode(csdev, CS_MODE_SYSFS);
>>> + if (!ret)
>>> csdev->refcnt++;
>>> - }
>>> -out:
>>> raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
>>> if (!ret)
>>> @@ -1729,39 +1710,24 @@ static int tmc_enable_etr_sink_perf(struct
>>> coresight_device *csdev, void *data)
>>> {
>>> int rc = 0;
>>> pid_t pid;
>>> - unsigned long flags;
>>> struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>>> struct perf_output_handle *handle = data;
>>> struct etr_perf_buffer *etr_perf = etm_perf_sink_config(handle);
>>> - raw_spin_lock_irqsave(&drvdata->spinlock, flags);
>>> - /* Don't use this sink if it is already claimed by sysFS */
>>> - if (coresight_get_mode(csdev) == CS_MODE_SYSFS) {
>>> - rc = -EBUSY;
>>> - goto unlock_out;
>>> - }
>>> -
>>> - if (WARN_ON(!etr_perf || !etr_perf->etr_buf)) {
>>> - rc = -EINVAL;
>>> - goto unlock_out;
>>> - }
>>> + if (WARN_ON(!etr_perf || !etr_perf->etr_buf))
>>> + return -EINVAL;
>>> /* Get a handle on the pid of the session owner */
>>> pid = etr_perf->pid;
>>> /* Do not proceed if this device is associated with another
>>> session */
>>> - if (drvdata->pid != -1 && drvdata->pid != pid) {
>>> - rc = -EBUSY;
>>> - goto unlock_out;
>>> - }
>>> + if (drvdata->pid != -1 && drvdata->pid != pid)
>>> + return -EBUSY;
>>> - /*
>>> - * No HW configuration is needed if the sink is already in
>>> - * use for this session.
>>> - */
>>> + /* The sink is already in use for this session */
>>> if (drvdata->pid == pid) {
>>> csdev->refcnt++;
>>> - goto unlock_out;
>>> + return rc;
>>> }
>>> rc = tmc_etr_enable_hw(drvdata, etr_perf->etr_buf);
>>> @@ -1773,22 +1739,60 @@ static int tmc_enable_etr_sink_perf(struct
>>> coresight_device *csdev, void *data)
>>> csdev->refcnt++;
>>> }
>>> -unlock_out:
>>> - raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
>>> return rc;
>>> }
>>> static int tmc_enable_etr_sink(struct coresight_device *csdev,
>>> enum cs_mode mode, void *data)
>>> {
>>> - switch (mode) {
>>> - case CS_MODE_SYSFS:
>>> - return tmc_enable_etr_sink_sysfs(csdev);
>>> - case CS_MODE_PERF:
>>> - return tmc_enable_etr_sink_perf(csdev, data);
>>> - default:
>>> - return -EINVAL;
>>> + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>>> + int rc;
>>> +
>>> +retry:
>>> + scoped_guard(raw_spinlock_irqsave, &drvdata->spinlock) {
>>> + if (coresight_get_mode(csdev) != CS_MODE_DISABLED &&
>>> + coresight_get_mode(csdev) != mode)
>>> + return -EBUSY;
>>> +
>>> + switch (mode) {
>>> + case CS_MODE_SYSFS:
>>> + if (drvdata->reading)
>>> + return -EBUSY;
>>> +
>>> + if (csdev->refcnt) {
>>> + /* The sink is already enabled via sysfs */
>>> + csdev->refcnt++;
>>> + return 0;
>>> + }
>>> +
>>> + /*
>>> + * A sysfs session is currently enabling ETR, preventing
>>> + * a second sysfs process from repeatedly triggering the
>>> + * enable procedure.
>>> + */
>>> + if (coresight_get_mode(csdev) == CS_MODE_SYSFS && !
>>> csdev->refcnt)
>>> + goto retry;
>>> +
>>> + /*
>>> + * Set ETR to sysfs mode before it is fully enabled, to
>>> + * prevent race conditions during mode transitions.
>>> + */
>>> + coresight_set_mode(csdev, mode);
>>> + break;
>>> + case CS_MODE_PERF:
>>> + return tmc_enable_etr_sink_perf(csdev, data);
>>> + default:
>>> + return -EINVAL;
>>> + }
>>> + }
>>> +
>>> + rc = tmc_enable_etr_sink_sysfs(csdev);
>>> + if (rc) {
>>> + guard(raw_spinlock_irqsave)(&drvdata->spinlock);
>>> + coresight_set_mode(csdev, CS_MODE_DISABLED);
>>> }
>>> +
>>> + return rc;
>>> }
>>> static int tmc_disable_etr_sink(struct coresight_device *csdev)
>>
>>
>> .
>>
>
Powered by blists - more mailing lists