Message-ID: <CAMuHMdVXh+9htop6R=1gn5AmX0YJtqtAB9iFmZm6DweyyG-k3w@mail.gmail.com>
Date: Wed, 23 Apr 2025 08:40:12 +0200
From: Geert Uytterhoeven <geert@...ux-m68k.org>
To: Guenter Roeck <linux@...ck-us.net>
Cc: Carlos Maiolino <cem@...nel.org>, Hans Holmberg <Hans.Holmberg@....com>, 
	Dave Chinner <david@...morbit.com>, "Darrick J . Wong" <djwong@...nel.org>, hch <hch@....de>, 
	"linux-xfs@...r.kernel.org" <linux-xfs@...r.kernel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] xfs: add tunable threshold parameter for triggering zone GC

On Sun, 20 Apr 2025 at 19:43, Guenter Roeck <linux@...ck-us.net> wrote:
> On 4/20/25 03:53, Carlos Maiolino wrote:
> > On Sun, Apr 20, 2025 at 02:47:02AM -0700, Guenter Roeck wrote:
> >> On Tue, Mar 25, 2025 at 09:10:49AM +0000, Hans Holmberg wrote:
> >>> Presently we start garbage collection late - when we start running
> >>> out of free zones to backfill max_open_zones. This is a reasonable
> >>> default as it minimizes write amplification. The longer we wait,
> >>> the more blocks are invalidated and reclaim costs less in terms
> >>> of blocks to relocate.
> >>>
> >>> Starting this late, however, introduces a risk of GC being outcompeted
> >>> by user writes. If GC can't keep up, user writes will be forced to
> >>> wait for free zones with high tail latencies as a result.
> >>>
> >>> This is not a problem under normal circumstances, but if fragmentation
> >>> is bad and user write pressure is high (multiple full-throttle
> >>> writers) we will "bottom out" of free zones.
> >>>
> >>> To mitigate this, introduce a zonegc_low_space tunable that lets the
> >>> user specify how much of the unused space (as a percentage) GC
> >>> should keep available for writing. A high value will reclaim more of
> >>> the space occupied by unused blocks, creating a larger buffer against
> >>> write bursts.
> >>>
> >>> This comes at a cost as write amplification is increased. To
> >>> illustrate this using a sample workload, setting zonegc_low_space to
> >>> 60% avoids high (500ms) max latencies while increasing write
> >>> amplification by 15%.
> >>>
> >> ...
> >>>   bool
> >>>   xfs_zoned_need_gc(
> >>>     struct xfs_mount        *mp)
> >>>   {
> >>> +   s64                     available, free;
> >>> +
> >> ...
> >>> +
> >>> +   free = xfs_estimate_freecounter(mp, XC_FREE_RTEXTENTS);
> >>> +   if (available < mult_frac(free, mp->m_zonegc_low_space, 100))
> >>> +           return true;
> >>> +
> >>
> >> With some 32-bit builds (parisc, openrisc so far):
> >>
> >> Error log:
> >> ERROR: modpost: "__divdi3" [fs/xfs/xfs.ko] undefined!
> >> ERROR: modpost: "__umoddi3" [fs/xfs/xfs.ko] undefined!
> >> ERROR: modpost: "__moddi3" [fs/xfs/xfs.ko] undefined!
> >> ERROR: modpost: "__udivdi3" [fs/xfs/xfs.ko] undefined!
> >>
> >
> > I opened a discussion about this:
> >
> > https://lore.kernel.org/lkml/20250419115157.567249-1-cem@kernel.org/
>
> A possible local solution is below. Note the variable type change from s64 to u64.
>
> Guenter
> ---
>
> diff --git a/fs/xfs/xfs_zone_gc.c b/fs/xfs/xfs_zone_gc.c
> index 8c541ca71872..6dde2a680e75 100644
> --- a/fs/xfs/xfs_zone_gc.c
> +++ b/fs/xfs/xfs_zone_gc.c
> @@ -170,7 +170,7 @@ bool
>   xfs_zoned_need_gc(
>          struct xfs_mount        *mp)
>   {
> -       s64                     available, free;
> +       u64                     available, free, rem;
>
>          if (!xfs_group_marked(mp, XG_TYPE_RTG, XFS_RTG_RECLAIMABLE))
>                  return false;
> @@ -183,7 +183,12 @@ xfs_zoned_need_gc(
>                  return true;
>
>          free = xfs_estimate_freecounter(mp, XC_FREE_RTEXTENTS);
> -       if (available < mult_frac(free, mp->m_zonegc_low_space, 100))
> +
> +       rem = do_div(free, 100);
> +       free = free * mp->m_zonegc_low_space +
> +               div_u64(rem * mp->m_zonegc_low_space, 100);
> +
> +       if (available < free)
>                  return true;
>
>          return false;

There's also mul_u64_u32_div():

    static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)
    ...

Unsigned, too.
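
I.e. something like the below (untested, and assuming the locals become
u64, as in your diff):

    free = xfs_estimate_freecounter(mp, XC_FREE_RTEXTENTS);

    /*
     * free * m_zonegc_low_space / 100, computed without a 64-by-64
     * division, so no __udivdi3/__divdi3 helpers are pulled in on
     * 32-bit builds.
     */
    if (available < mul_u64_u32_div(free, mp->m_zonegc_low_space, 100))
            return true;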

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
