lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87le7e6um3.wl-tiwai@suse.de>
Date: Wed, 21 Feb 2024 10:21:08 +0100
From: Takashi Iwai <tiwai@...e.de>
To: Sven van Ashbrook <svenva@...omium.org>
Cc: Vlastimil Babka <vbabka@...e.cz>,
	Takashi Iwai <tiwai@...e.de>,
	Karthikeyan Ramasubramanian <kramasub@...omium.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Brian Geffon <bgeffon@...gle.com>,
	stable@...r.kernel.org,
	Curtis Malainey <cujomalainey@...omium.org>,
	Jaroslav Kysela <perex@...ex.cz>,
	Takashi Iwai <tiwai@...e.com>,
	linux-sound@...r.kernel.org,
	linux-mm@...ck.org
Subject: Re: Stall at page allocations with __GFP_RETRY_MAYFAIL (Re: [PATCH v1] ALSA: memalloc: Fix indefinite hang in non-iommu case)

On Tue, 20 Feb 2024 16:52:14 +0100,
Sven van Ashbrook wrote:
> 
> Takaski, Vlastimil: thanks so much for the engagement! See below.
> 
> > On 2/19/24 12:36, Takashi Iwai wrote:
> > >
> > > Karthikeyan, Sven, and co: could you guys show the stack trace at the
> > > stall?  This may give us more clear light.
> 
> Here are two typical stack traces at the stall. Note that the timer interrupt
> is just a software watchdog that fires to generate the stack trace.
> This is running the v6.1 kernel.
> We should be able to reproduce this on v6.6 as well if need be.
> 
> <4>[310289.546429] <TASK>
> <4>[310289.546431] asm_sysvec_apic_timer_interrupt+0x16/0x20
> <4>[310289.546434] RIP: 0010:super_cache_count+0xc/0xea
> <4>[310289.546438] Code: ff ff e8 48 ac e3 ff 4c 89 e0 48 83 c4 20 5b
> 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc 0f 1f 44 00 00 f6 87 23
> fc ff ff 20 <75> 08 31 c0 c3 cc cc cc cc cc 55 48 89 e5 41 57 41 56 41
> 54 53 49
> <4>[310289.546440] RSP: 0018:ffffa64e8aed35c0 EFLAGS: 00000202
> <4>[310289.546443] RAX: 0000000000000080 RBX: 0000000000000400 RCX:
> 0000000000000000
> <4>[310289.546445] RDX: ffffffffa6d66bc8 RSI: ffffa64e8aed3610 RDI:
> ffff9fd2873dbc30
> <4>[310289.546447] RBP: ffffa64e8aed3660 R08: 0000000000000064 R09:
> 0000000000000000
> <4>[310289.546449] R10: ffffffffa6e3b260 R11: ffffffffa5163a52 R12:
> ffff9fd2873dbc50
> <4>[310289.546451] R13: 0000000000046c00 R14: 0000000000000000 R15:
> 0000000000000000
> <4>[310289.546453] ? super_cache_scan+0x199/0x199
> <4>[310289.546457] shrink_slab+0xb3/0x37e
> <4>[310289.546460] shrink_node+0x377/0x110e
> <4>[310289.546464] ? sysvec_apic_timer_interrupt+0x17/0x80
> <4>[310289.546467] ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> <4>[310289.546471] try_to_free_pages+0x46e/0x857
> <4>[310289.546475] ? psi_task_change+0x7f/0x9c
> <4>[310289.546478] __alloc_pages_slowpath+0x4e2/0xe5c
> <4>[310289.546482] __alloc_pages+0x225/0x2a2
> <4>[310289.546486] __dma_direct_alloc_pages+0xed/0x1cb
> <4>[310289.546489] dma_direct_alloc_pages+0x21/0xa3
> <4>[310289.546493] dma_alloc_noncontiguous+0xd1/0x144
> <4>[310289.546496] snd_dma_noncontig_alloc+0x45/0xe3
> <4>[310289.546499] snd_dma_alloc_dir_pages+0x4f/0x81
> <4>[310289.546502] hda_cl_stream_prepare+0x66/0x15e
> [snd_sof_intel_hda_common (HASH:1255 1)]
> <4>[310289.546510] hda_dsp_cl_boot_firmware+0xc4/0x2ca
> [snd_sof_intel_hda_common (HASH:1255 1)]
> <4>[310289.546518] snd_sof_run_firmware+0xca/0x2d7 [snd_sof (HASH:ecd9 2)]
> <4>[310289.546526] ? hda_dsp_resume+0x97/0x1a7
> [snd_sof_intel_hda_common (HASH:1255 1)]
> <4>[310289.546534] sof_resume+0x155/0x251 [snd_sof (HASH:ecd9 2)]
> <4>[310289.546542] ? pci_pm_suspend+0x1e7/0x1e7
> <4>[310289.546546] dpm_run_callback+0x3c/0x132
> <4>[310289.546549] device_resume+0x1f7/0x282
> <4>[310289.546552] ? dpm_watchdog_set+0x54/0x54
> <4>[310289.546555] async_resume+0x1f/0x5b
> <4>[310289.546558] async_run_entry_fn+0x2b/0xc5
> <4>[310289.546561] process_one_work+0x1be/0x381
> <4>[310289.546564] worker_thread+0x20b/0x35b
> <4>[310289.546568] kthread+0xde/0xf7
> <4>[310289.546571] ? pr_cont_work+0x54/0x54
> <4>[310289.546574] ? kthread_blkcg+0x32/0x32
> <4>[310289.546577] ret_from_fork+0x1f/0x30
> <4>[310289.546580] </TASK>
> 
> <4>[171032.151834] <TASK>
> <4>[171032.151835] asm_sysvec_apic_timer_interrupt+0x16/0x20
> <4>[171032.151839] RIP: 0010:_raw_spin_unlock_irq+0x10/0x28
> <4>[171032.151842] Code: 2c 70 74 06 c3 cc cc cc cc cc 55 48 89 e5 e8
> 7e 30 2b ff 5d c3 cc cc cc cc cc 0f 1f 44 00 00 c6 07 00 fb 65 ff 0d
> af b1 2c 70 <74> 06 c3 cc cc cc cc cc 55 48 89 e5 e8 56 30 2b ff 5d c3
> cc cc cc
> <4>[171032.151844] RSP: 0018:ffff942447b334d8 EFLAGS: 00000286
> <4>[171032.151847] RAX: 0000000000000031 RBX: 0000000000000001 RCX:
> 0000000000000034
> <4>[171032.151849] RDX: 0000000000000031 RSI: 0000000000000002 RDI:
> ffffffff9103b1b0
> <4>[171032.151851] RBP: ffff942447b33660 R08: 0000000000000032 R09:
> 0000000000000010
> <4>[171032.151853] R10: ffffffff9103b370 R11: 00000000ffffffff R12:
> ffffffff9103b160
> <4>[171032.151855] R13: ffffd055000111c8 R14: 0000000000000000 R15:
> 0000000000000031
> <4>[171032.151858] evict_folios+0xf9e/0x1307
> <4>[171032.151861] ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> <4>[171032.151866] shrink_node+0x2e8/0x110e
> <4>[171032.151870] ? common_interrupt+0x1c/0x95
> <4>[171032.151872] ? common_interrupt+0x1c/0x95
> <4>[171032.151875] ? asm_common_interrupt+0x22/0x40
> <4>[171032.151878] ? __compaction_suitable+0x7c/0x9d
> <4>[171032.151882] try_to_free_pages+0x46e/0x857
> <4>[171032.151885] ? psi_task_change+0x7f/0x9c
> <4>[171032.151889] __alloc_pages_slowpath+0x4e2/0xe5c
> <4>[171032.151893] __alloc_pages+0x225/0x2a2
> <4>[171032.151896] __dma_direct_alloc_pages+0xed/0x1cb
> <4>[171032.151900] dma_direct_alloc_pages+0x21/0xa3
> <4>[171032.151903] dma_alloc_noncontiguous+0xd1/0x144
> <4>[171032.151907] snd_dma_noncontig_alloc+0x45/0xe3
> <4>[171032.151910] snd_dma_alloc_dir_pages+0x4f/0x81
> <4>[171032.151913] hda_cl_stream_prepare+0x66/0x15e
> [snd_sof_intel_hda_common (HASH:7df0 1)]
> <4>[171032.151921] hda_dsp_cl_boot_firmware+0xc4/0x2ca
> [snd_sof_intel_hda_common (HASH:7df0 1)]
> <4>[171032.151929] snd_sof_run_firmware+0xca/0x2d7 [snd_sof (HASH:9f20 2)]
> <4>[171032.151937] ? hda_dsp_resume+0x97/0x1a7
> [snd_sof_intel_hda_common (HASH:7df0 1)]
> <4>[171032.151945] sof_resume+0x155/0x251 [snd_sof (HASH:9f20 2)]
> <4>[171032.151953] ? pci_pm_suspend+0x1e7/0x1e7
> <4>[171032.151957] dpm_run_callback+0x3c/0x132
> <4>[171032.151960] device_resume+0x1f7/0x282
> <4>[171032.151962] ? dpm_watchdog_set+0x54/0x54
> <4>[171032.151965] async_resume+0x1f/0x5b
> <4>[171032.151968] async_run_entry_fn+0x2b/0xc5
> <4>[171032.151971] process_one_work+0x1be/0x381
> <4>[171032.151975] worker_thread+0x20b/0x35b
> <4>[171032.151978] kthread+0xde/0xf7
> <4>[171032.151981] ? pr_cont_work+0x54/0x54
> <4>[171032.151984] ? kthread_blkcg+0x32/0x32
> <4>[171032.151987] ret_from_fork+0x1f/0x30
> <4>[171032.151991] </TASK>

Thanks!

Both look like the code path via async PM resume.
Were both from the runtime PM resume?  Or the system resume?


Takashi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ