Message-ID: <16573630-4cd5-4539-8f02-85cbbeb3866d@amd.com>
Date: Tue, 20 Aug 2024 10:08:39 +0530
From: Bharata B Rao <bharata@....com>
To: Usama Arif <usamaarif642@...il.com>, akpm@...ux-foundation.org
Cc: yuzhao@...gle.com, david@...hat.com, leitao@...ian.org,
 huangzhaoyang@...il.com, willy@...radead.org, vbabka@...e.cz,
 linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH RESEND] mm: drop lruvec->lru_lock if contended when
 skipping folio

On 20-Aug-24 12:16 AM, Usama Arif wrote:
> lruvec->lru_lock is highly contended and is held when calling
> isolate_lru_folios. If the lru has a large number of consecutive CMA
> folios while the allocation type requested is not MIGRATE_MOVABLE,
> isolate_lru_folios can hold the lock for a very long time while it
> skips them. The vmscan_lru_isolate tracepoint showed that the number of
> skipped folios can go above 70k in production, and lockstat shows that
> waittime-max is 1000x higher without this patch.
> This can cause lockups [1] and high memory pressure for extended periods
> of time [2]. Hence release the lock if it's contended when skipping a
> folio, to give other tasks a chance to acquire it and not stall.
> 
> [1] https://lore.kernel.org/all/CAOUHufbkhMZYz20aM_3rHZ3OcK4m2puji2FGpUpn_-DevGk3Kg@mail.gmail.com/
> [2] https://lore.kernel.org/all/ZrssOrcJIDy8hacI@gmail.com/

Though the above link [2] mentions it, can you explicitly include in the
patch description the specific condition that we saw?

"isolate_lru_folios() can end up scanning through a huge number of 
folios with lruvec spinlock held. For FIO workload, ~150million order=0 
folios were skipped to isolate a few ZONE_DMA folios."
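
For anyone following along without the patch in front of them, the shape
of the change is roughly: while isolate_lru_folios() walks the LRU with
lru_lock held and keeps skipping folios it cannot isolate, notice that
another task is waiting on the lock and back off instead of skipping
millions of folios inside one critical section. The sketch below only
illustrates that pattern and is not the actual diff -- can_isolate() is a
made-up stand-in for the real zone/CMA skip test, and the real loop is
considerably more involved:

/*
 * Illustrative sketch only -- not the actual patch. can_isolate() is a
 * placeholder for the real zone/CMA skip test, and the locking and
 * accounting details of isolate_lru_folios() are omitted.
 */
static unsigned long sketch_isolate(struct lruvec *lruvec,
				    struct list_head *src,
				    struct list_head *dst)
{
	LIST_HEAD(folios_skipped);
	struct folio *folio, *next;
	unsigned long nr_taken = 0;

	lockdep_assert_held(&lruvec->lru_lock);

	list_for_each_entry_safe(folio, next, src, lru) {
		if (!can_isolate(folio)) {
			/* park the folio on a private list, splice it back below */
			list_move(&folio->lru, &folios_skipped);

			/*
			 * The interesting part: if another task is waiting
			 * on lru_lock, stop scanning so the lock gets
			 * released soon, instead of skipping an unbounded
			 * number of folios in one go.
			 */
			if (spin_is_contended(&lruvec->lru_lock))
				break;
			continue;
		}
		list_move(&folio->lru, dst);
		nr_taken += folio_nr_pages(folio);
	}

	list_splice(&folios_skipped, src);
	return nr_taken;
}

Whether the lock is dropped inside the helper or simply released sooner
by the caller after an early return is a detail of the real patch; the
point is that the scan no longer holds lru_lock across an unbounded run
of skipped folios.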

Regards,
Bharata.
