Date:   Wed,  9 Aug 2023 19:27:37 +0200
From:   Marek Szyprowski <m.szyprowski@...sung.com>
To:     linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Cc:     Marek Szyprowski <m.szyprowski@...sung.com>,
        Russell King <linux@...linux.org.uk>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mike Rapoport <rppt@...nel.org>
Subject: [PATCH v2] arm: dma-mapping: fix potential endless loop in
 __dma_page_dev_to_cpu()

The D-cache cleaning loop must not call folio_next() past the end of the
requested region and then rely on the properties of the folio it returns.
Simply stop looping once the 'left' counter reaches zero.

This fixes the following endless loop, observed as an RCU stall on the
32-bit ARM Exynos5422-based Odroid-XU3lite board:

--->8---
rcu: INFO: rcu_sched self-detected stall on CPU
rcu:     0-....: (27320 ticks this GP) idle=e414/1/0x40000002 softirq=36/36 fqs=13044
rcu:     (t=27385 jiffies g=-1067 q=34 ncpus=8)
CPU: 0 PID: 93 Comm: kworker/0:1H Not tainted 6.5.0-rc5-next-20230807 #6981
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: mmc_complete mmc_blk_mq_complete_work
PC is at _set_bit+0x28/0x44
LR is at __dma_page_dev_to_cpu+0xdc/0x170
..
 _set_bit from __dma_page_dev_to_cpu+0xdc/0x170
 __dma_page_dev_to_cpu from dma_direct_unmap_sg+0x100/0x130
 dma_direct_unmap_sg from dw_mci_post_req+0x68/0x6c
 dw_mci_post_req from mmc_blk_mq_post_req+0x34/0x100
 mmc_blk_mq_post_req from mmc_blk_mq_complete_work+0x50/0x60
 mmc_blk_mq_complete_work from process_one_work+0x20c/0x4d8
 process_one_work from worker_thread+0x58/0x54c
 worker_thread from kthread+0xe0/0xfc
 kthread from ret_from_fork+0x14/0x2c
--->8---

While touching this code, move the set_bit() operation, which deals with
atomics, after the update of the 'left' counter. The new order helps the
compiler produce code that computes folio_size() only once.

Fixes: cc24e9c0895c ("arm: implement the new page table range API")
Signed-off-by: Marek Szyprowski <m.szyprowski@...sung.com>
---
v2:
- changed the code and explanation as suggested by Russell and Matthew

v1:
- https://lore.kernel.org/all/20230807152657.1692414-1-m.szyprowski@samsung.com/
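
Illustration only, not part of the patch: a minimal userspace sketch of the
termination condition described in the commit message. The names
model_folio and mark_clean() are hypothetical; the array index stands in
for folio_next() and the counter for set_bit(PG_dcache_clean, ...). The
entry placed after the region is assumed to have size 0 purely to model
"memory whose contents must not be relied upon"; it says nothing about what
the kernel would actually find there.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct folio: only its size in bytes is modeled. */
struct model_folio {
	size_t size;
};

/*
 * Model of the loop after the fix. Returns the number of folios marked
 * clean and stores in *last the index of the last folio examined.
 */
static size_t mark_clean(const struct model_folio *folios, long left,
			 size_t *last)
{
	size_t i = 0, cleaned = 0;

	while (left >= (long)folios[i].size) {
		left -= (long)folios[i].size;
		cleaned++;	/* stands in for set_bit(PG_dcache_clean, ...) */
		if (!left)
			break;	/* region fully covered: do not advance */
		i++;		/* stands in for folio_next() */
	}
	*last = i;
	return cleaned;
}
```

With two 4096-byte entries followed by the size-0 entry and left = 8192,
the sketch marks two entries and never examines the third. Without the
"if (!left) break;" check, the same input would advance to the third entry,
where "0 >= 0" holds and 'left' never changes, so the model would spin
forever.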
---
 arch/arm/mm/dma-mapping.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 70cb7e63a9a5..0474840224d9 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -719,8 +719,10 @@ static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
 		}
 
 		while (left >= (ssize_t)folio_size(folio)) {
-			set_bit(PG_dcache_clean, &folio->flags);
 			left -= folio_size(folio);
+			set_bit(PG_dcache_clean, &folio->flags);
+			if (!left)
+				break;
 			folio = folio_next(folio);
 		}
 	}
-- 
2.34.1
