lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e5ccf0d5-a757-4d1b-84b9-36a5f02e117c@roeck-us.net>
Date: Sun, 20 Apr 2025 10:42:56 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Carlos Maiolino <cem@...nel.org>
Cc: Hans Holmberg <Hans.Holmberg@....com>, Dave Chinner
 <david@...morbit.com>, "Darrick J . Wong" <djwong@...nel.org>,
 hch <hch@....de>, "linux-xfs@...r.kernel.org" <linux-xfs@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] xfs: add tunable threshold parameter for triggering zone
 GC

On 4/20/25 03:53, Carlos Maiolino wrote:
> On Sun, Apr 20, 2025 at 02:47:02AM -0700, Guenter Roeck wrote:
>> On Tue, Mar 25, 2025 at 09:10:49AM +0000, Hans Holmberg wrote:
>>> Presently we start garbage collection late - when we start running
>>> out of free zones to backfill max_open_zones. This is a reasonable
>>> default as it minimizes write amplification. The longer we wait,
>>> the more blocks are invalidated and reclaim cost less in terms
>>> of blocks to relocate.
>>>
>>> Starting this late however introduces a risk of GC being outcompeted
>>> by user writes. If GC can't keep up, user writes will be forced to
>>> wait for free zones with high tail latencies as a result.
>>>
>>> This is not a problem under normal circumstances, but if fragmentation
>>> is bad and user write pressure is high (multiple full-throttle
>>> writers) we will "bottom out" of free zones.
>>>
>>> To mitigate this, introduce a zonegc_low_space tunable that lets the
>>> user specify a percentage of how much of the unused space that GC
>>> should keep available for writing. A high value will reclaim more of
>>> the space occupied by unused blocks, creating a larger buffer against
>>> write bursts.
>>>
>>> This comes at a cost as write amplification is increased. To
>>> illustrate this using a sample workload, setting zonegc_low_space to
>>> 60% avoids high (500ms) max latencies while increasing write
>>> amplification by 15%.
>>>
>> ...
>>>   bool
>>>   xfs_zoned_need_gc(
>>>   	struct xfs_mount	*mp)
>>>   {
>>> +	s64			available, free;
>>> +
>> ...
>>> +
>>> +	free = xfs_estimate_freecounter(mp, XC_FREE_RTEXTENTS);
>>> +	if (available < mult_frac(free, mp->m_zonegc_low_space, 100))
>>> +		return true;
>>> +
>>
>> With some 32-bit builds (parisc, openrisc so far):
>>
>> Error log:
>> ERROR: modpost: "__divdi3" [fs/xfs/xfs.ko] undefined!
>> ERROR: modpost: "__umoddi3" [fs/xfs/xfs.ko] undefined!
>> ERROR: modpost: "__moddi3" [fs/xfs/xfs.ko] undefined!
>> ERROR: modpost: "__udivdi3" [fs/xfs/xfs.ko] undefined!
>>
> 
> I opened a discussion about this:
> 
> https://lore.kernel.org/lkml/20250419115157.567249-1-cem@kernel.org/

A possible local solution is below. Note the variable type change from s64 to u64.

Guenter
---

diff --git a/fs/xfs/xfs_zone_gc.c b/fs/xfs/xfs_zone_gc.c
index 8c541ca71872..6dde2a680e75 100644
--- a/fs/xfs/xfs_zone_gc.c
+++ b/fs/xfs/xfs_zone_gc.c
@@ -170,7 +170,7 @@ bool
  xfs_zoned_need_gc(
         struct xfs_mount        *mp)
  {
-       s64                     available, free;
+       u64                     available, free, rem;

         if (!xfs_group_marked(mp, XG_TYPE_RTG, XFS_RTG_RECLAIMABLE))
                 return false;
@@ -183,7 +183,12 @@ xfs_zoned_need_gc(
                 return true;

         free = xfs_estimate_freecounter(mp, XC_FREE_RTEXTENTS);
-       if (available < mult_frac(free, mp->m_zonegc_low_space, 100))
+
+       rem = do_div(free, 100);
+       free = free * mp->m_zonegc_low_space +
+               div_u64(rem * mp->m_zonegc_low_space, 100);
+
+       if (available < free)
                 return true;

         return false;


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ