lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <660cd57a-d445-63ba-b0bf-6bf6168ffee0@samsung.com>
Date:   Fri, 17 Dec 2021 11:11:52 +0100
From:   Marek Szyprowski <m.szyprowski@...sung.com>
To:     Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>,
        alsa-devel@...a-project.org
Cc:     tiwai@...e.de, broonie@...nel.org, vkoul@...nel.org,
        Sameer Pujar <spujar@...dia.com>,
        Gyeongtaek Lee <gt82.lee@...sung.com>,
        Peter Ujfalusi <peter.ujfalusi@...ux.intel.com>,
        Kuninori Morimoto <kuninori.morimoto.gx@...esas.com>,
        linux-kernel@...r.kernel.org,
        Kai Vehmanen <kai.vehmanen@...ux.intel.com>,
        Bard Liao <yung-chuan.liao@...ux.intel.com>,
        Ranjani Sridharan <ranjani.sridharan@...ux.intel.com>,
        Liam Girdwood <lgirdwood@...il.com>,
        Jaroslav Kysela <perex@...ex.cz>,
        Takashi Iwai <tiwai@...e.com>,
        Sylwester Nawrocki <s.nawrocki@...sung.com>,
        'Linux Samsung SOC' <linux-samsung-soc@...r.kernel.org>,
        Krzysztof Kozlowski <krzk@...nel.org>
Subject: Re: [PATCH 4/6] ASoC: soc-pcm: serialize BE triggers

Dear All,

On 07.12.2021 18:37, Pierre-Louis Bossart wrote:
> From: Takashi Iwai <tiwai@...e.de>
>
> When more than one FE is connected to a BE, e.g. in a mixing use case,
> the BE can be triggered multiple times when the FE are opened/started
> concurrently. This race condition is problematic in the case of
> SoundWire BE dailinks, and this is not desirable in a general
> case.
>
> This patch relies on the existing BE PCM lock, which takes atomicity into
> account. The locking model assumes that all interactions start with
> the FE, so that there is no deadlock between FE and BE locks.
>
> Signed-off-by: Takashi Iwai <tiwai@...e.de>
> [test, checkpatch fix and clarification of commit message by plbossart]
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>
> Reviewed-by: Kai Vehmanen <kai.vehmanen@...ux.intel.com>
> Reviewed-by: Bard Liao <yung-chuan.liao@...ux.intel.com>
> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@...ux.intel.com>

This patch recently landed in linux-next (next-20211215) as commit 
b2ae80663008 ("ASoC: soc-pcm: serialize BE triggers"). I found that it 
triggers a warning on my test boards. This is the one from 
Exynos4412-based Odroid U3 board:

# speaker-test -l1

speaker-test 1.1.8

Playback device is default
Stream parameters are 48000Hz, S16_LE, 1 channels
Using 16 octaves of pink noise
Rate set to 48000Hz (requested 48000Hz)
Buffer size range from 128 to 131072
Period size range from 64 to 65536
Using max buffer size 131072
Periods = 4
was set period_size = 32768
was set buffer_size = 131072
  0 - Front Left

============================================
WARNING: possible recursive locking detected
5.16.0-rc1-00270-gb2ae80663008 #11109 Not tainted
--------------------------------------------
speaker-test/1312 is trying to acquire lock:
c1d78ca4 (&group->lock){....}-{2:2}, at: dpcm_be_dai_trigger+0x80/0x300

but task is already holding lock:
c1d788a4 (&group->lock){....}-{2:2}, at: snd_pcm_action_lock_irq+0x68/0x7c

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&group->lock);
   lock(&group->lock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

1 lock held by speaker-test/1312:
  #0: c1d788a4 (&group->lock){....}-{2:2}, at: 
snd_pcm_action_lock_irq+0x68/0x7c

stack backtrace:
CPU: 0 PID: 1312 Comm: speaker-test Not tainted 
5.16.0-rc1-00270-gb2ae80663008 #11109
Hardware name: Samsung Exynos (Flattened Device Tree)
[<c0110b30>] (unwind_backtrace) from [<c010c618>] (show_stack+0x10/0x14)
[<c010c618>] (show_stack) from [<c0b65d24>] (dump_stack_lvl+0x58/0x70)
[<c0b65d24>] (dump_stack_lvl) from [<c0193740>] 
(__lock_acquire+0x15ac/0x319c)
[<c0193740>] (__lock_acquire) from [<c0195dd8>] (lock_acquire+0x14c/0x424)
[<c0195dd8>] (lock_acquire) from [<c0b745b8>] 
(_raw_spin_lock_irqsave+0x44/0x60)
[<c0b745b8>] (_raw_spin_lock_irqsave) from [<c0926b6c>] 
(dpcm_be_dai_trigger+0x80/0x300)
[<c0926b6c>] (dpcm_be_dai_trigger) from [<c0927004>] 
(dpcm_fe_dai_do_trigger+0x124/0x1e4)
[<c0927004>] (dpcm_fe_dai_do_trigger) from [<c090728c>] 
(snd_pcm_action+0x74/0xb0)
[<c090728c>] (snd_pcm_action) from [<c0907eac>] 
(snd_pcm_action_lock_irq+0x3c/0x7c)
[<c0907eac>] (snd_pcm_action_lock_irq) from [<c02f13a0>] 
(sys_ioctl+0x568/0xd44)
[<c02f13a0>] (sys_ioctl) from [<c0100060>] (ret_fast_syscall+0x0/0x1c)
Exception stack(0xc4777fa8 to 0xc4777ff0)
7fa0:                   004f5210 b6e27394 00000004 00004142 004f5398 
004f5398
7fc0: 004f5210 b6e27394 00020000 00000036 00000000 00000000 bee588e8 
00008000
7fe0: b6e277c4 bee58874 b6d8e888 b6c751dc
Time per period = 0.253397
max98090 1-0010: PLL unlocked
BUG: sleeping function called from invalid context at 
kernel/locking/rwsem.c:1526
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1312, name: 
speaker-test
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 8158
hardirqs last  enabled at (8157): [<c0b747d0>] 
_raw_spin_unlock_irqrestore+0x5c/0x60
hardirqs last disabled at (8158): [<c0b74570>] _raw_spin_lock_irq+0x58/0x5c
softirqs last  enabled at (7854): [<c0101578>] __do_softirq+0x348/0x610
softirqs last disabled at (7849): [<c012e7a4>] __irq_exit_rcu+0x144/0x1ec
Preemption disabled at:
[<00000000>] 0x0
CPU: 0 PID: 1312 Comm: speaker-test Not tainted 
5.16.0-rc1-00270-gb2ae80663008 #11109
Hardware name: Samsung Exynos (Flattened Device Tree)
[<c0110b30>] (unwind_backtrace) from [<c010c618>] (show_stack+0x10/0x14)
[<c010c618>] (show_stack) from [<c0b65d24>] (dump_stack_lvl+0x58/0x70)
[<c0b65d24>] (dump_stack_lvl) from [<c0158b04>] 
(__might_resched+0x1c0/0x288)
[<c0158b04>] (__might_resched) from [<c0b71898>] (down_write+0x24/0x8c)
[<c0b71898>] (down_write) from [<c030ed64>] 
(simple_recursive_removal+0x6c/0x370)
[<c030ed64>] (simple_recursive_removal) from [<c04d07a4>] 
(debugfs_remove+0x38/0x4c)
[<c04d07a4>] (debugfs_remove) from [<c0928784>] 
(dpcm_be_disconnect+0x160/0x2c4)
[<c0928784>] (dpcm_be_disconnect) from [<c092895c>] 
(dpcm_fe_dai_cleanup+0x74/0xb0)
[<c092895c>] (dpcm_fe_dai_cleanup) from [<c0928d90>] 
(dpcm_fe_dai_close+0xe8/0x14c)
[<c0928d90>] (dpcm_fe_dai_close) from [<c090977c>] 
(snd_pcm_release_substream.part.0+0x3c/0xcc)
[<c090977c>] (snd_pcm_release_substream.part.0) from [<c0909878>] 
(snd_pcm_release+0x54/0xa4)
[<c0909878>] (snd_pcm_release) from [<c02dc400>] (__fput+0x88/0x258)
[<c02dc400>] (__fput) from [<c014cd44>] (task_work_run+0x8c/0xc8)
[<c014cd44>] (task_work_run) from [<c010c08c>] (do_work_pending+0x4a4/0x598)
[<c010c08c>] (do_work_pending) from [<c0100088>] 
(slow_work_pending+0xc/0x20)
Exception stack(0xc4777fb0 to 0xc4777ff8)
7fa0:                                     00000000 004f5260 004eaa9c 
00000000
7fc0: 004f5260 004f536c 004f5210 00000006 004fb700 004e6e8c 004d6120 
bee58cc4
7fe0: b6e27e64 bee58928 b6d8eda4 b6d09ac0 60000050 00000004

Let me know how I can help debugging this issue.

> ---
>   sound/soc/soc-pcm.c | 46 ++++++++++++++++++++++++++++-----------------
>   1 file changed, 29 insertions(+), 17 deletions(-)
>
> diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
> index 2e282c42bac2..7043857e30b1 100644
> --- a/sound/soc/soc-pcm.c
> +++ b/sound/soc/soc-pcm.c
> @@ -46,12 +46,18 @@ static inline void snd_soc_dpcm_stream_lock_irq(struct snd_soc_pcm_runtime *rtd,
>   	snd_pcm_stream_lock_irq(snd_soc_dpcm_get_substream(rtd, stream));
>   }
>   
> +#define snd_soc_dpcm_stream_lock_irqsave(rtd, stream, flags) \
> +	snd_pcm_stream_lock_irqsave(snd_soc_dpcm_get_substream(rtd, stream), flags)
> +
>   static inline void snd_soc_dpcm_stream_unlock_irq(struct snd_soc_pcm_runtime *rtd,
>   						  int stream)
>   {
>   	snd_pcm_stream_unlock_irq(snd_soc_dpcm_get_substream(rtd, stream));
>   }
>   
> +#define snd_soc_dpcm_stream_unlock_irqrestore(rtd, stream, flags) \
> +	snd_pcm_stream_unlock_irqrestore(snd_soc_dpcm_get_substream(rtd, stream), flags)
> +
>   #define DPCM_MAX_BE_USERS	8
>   
>   static inline const char *soc_cpu_dai_name(struct snd_soc_pcm_runtime *rtd)
> @@ -2079,6 +2085,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
>   {
>   	struct snd_soc_pcm_runtime *be;
>   	struct snd_soc_dpcm *dpcm;
> +	unsigned long flags;
>   	int ret = 0;
>   
>   	for_each_dpcm_be(fe, stream, dpcm) {
> @@ -2087,9 +2094,11 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
>   		be = dpcm->be;
>   		be_substream = snd_soc_dpcm_get_substream(be, stream);
>   
> +		snd_soc_dpcm_stream_lock_irqsave(be, stream, flags);
> +
>   		/* is this op for this BE ? */
>   		if (!snd_soc_dpcm_be_can_update(fe, be, stream))
> -			continue;
> +			goto next;
>   
>   		dev_dbg(be->dev, "ASoC: trigger BE %s cmd %d\n",
>   			be->dai_link->name, cmd);
> @@ -2099,77 +2108,80 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
>   			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) &&
>   			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) &&
>   			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
> -				continue;
> +				goto next;
>   
>   			ret = soc_pcm_trigger(be_substream, cmd);
>   			if (ret)
> -				goto end;
> +				goto next;
>   
>   			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
>   			break;
>   		case SNDRV_PCM_TRIGGER_RESUME:
>   			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_SUSPEND))
> -				continue;
> +				goto next;
>   
>   			ret = soc_pcm_trigger(be_substream, cmd);
>   			if (ret)
> -				goto end;
> +				goto next;
>   
>   			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
>   			break;
>   		case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
>   			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
> -				continue;
> +				goto next;
>   
>   			ret = soc_pcm_trigger(be_substream, cmd);
>   			if (ret)
> -				goto end;
> +				goto next;
>   
>   			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
>   			break;
>   		case SNDRV_PCM_TRIGGER_STOP:
>   			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_START) &&
>   			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
> -				continue;
> +				goto next;
>   
>   			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
> -				continue;
> +				goto next;
>   
>   			ret = soc_pcm_trigger(be_substream, cmd);
>   			if (ret)
> -				goto end;
> +				goto next;
>   
>   			be->dpcm[stream].state = SND_SOC_DPCM_STATE_STOP;
>   			break;
>   		case SNDRV_PCM_TRIGGER_SUSPEND:
>   			if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
> -				continue;
> +				goto next;
>   
>   			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
> -				continue;
> +				goto next;
>   
>   			ret = soc_pcm_trigger(be_substream, cmd);
>   			if (ret)
> -				goto end;
> +				goto next;
>   
>   			be->dpcm[stream].state = SND_SOC_DPCM_STATE_SUSPEND;
>   			break;
>   		case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
>   			if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
> -				continue;
> +				goto next;
>   
>   			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
> -				continue;
> +				goto next;
>   
>   			ret = soc_pcm_trigger(be_substream, cmd);
>   			if (ret)
> -				goto end;
> +				goto next;
>   
>   			be->dpcm[stream].state = SND_SOC_DPCM_STATE_PAUSED;
>   			break;
>   		}
> +next:
> +		snd_soc_dpcm_stream_unlock_irqrestore(be, stream, flags);
> +		if (ret)
> +			break;
>   	}
> -end:
>   	if (ret < 0)
>   		dev_err(fe->dev, "ASoC: %s() failed at %s (%d)\n",
>   			__func__, be->dai_link->name, ret);

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ