[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7A6FE6BF-55F9-4BDD-99C0-EECB9E7E9064@fb.com>
Date: Thu, 18 Nov 2021 08:22:43 +0000
From: Nick Terrell <terrelln@...com>
To: Geert Uytterhoeven <geert@...ux-m68k.org>
CC: Nick Terrell <nickrterrell@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Parisc List <linux-parisc@...r.kernel.org>,
Helge Deller <deller@....de>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v2 2/3] lib: zstd: Don't inline functions in zstd_opt.c
> On Nov 17, 2021, at 11:52 PM, Geert Uytterhoeven <geert@...ux-m68k.org> wrote:
>
> Hi Nick,
>
> On Wed, Nov 17, 2021 at 9:08 PM Nick Terrell <nickrterrell@...il.com> wrote:
>> `zstd_opt.c` contains the match finder for the highest compression
>> levels. These levels are already very slow, and are unlikely to be used
>> in the kernel. If they are used, they shouldn't be used in latency
>> sensitive workloads, so slowing them down shouldn't be a big deal.
>>
>> This saves 188 KB of the 288 KB regression reported by Geert Uytterhoeven [0].
>> I've also opened an issue upstream [1] so that we can properly tackle
>> the code size issue in `zstd_opt.c` for all users, and can hopefully
>> remove this hack in the next zstd version we import.
>>
>> Bloat-o-meter output on x86-64:
>>
>> ```
>>> ../scripts/bloat-o-meter vmlinux.old vmlinux
>> add/remove: 6/5 grow/shrink: 1/9 up/down: 16673/-209939 (-193266)
>> Function old new delta
>> ZSTD_compressBlock_opt_generic.constprop - 7559 +7559
>> ZSTD_insertBtAndGetAllMatches - 6304 +6304
>> ZSTD_insertBt1 - 1731 +1731
>> ZSTD_storeSeq - 693 +693
>> ZSTD_BtGetAllMatches - 255 +255
>> ZSTD_updateRep - 128 +128
>> ZSTD_updateTree 96 99 +3
>> ZSTD_insertAndFindFirstIndexHash3 81 - -81
>> ZSTD_setBasePrices.constprop 98 - -98
>> ZSTD_litLengthPrice.constprop 138 - -138
>> ZSTD_count 362 181 -181
>> ZSTD_count_2segments 1407 938 -469
>> ZSTD_insertBt1.constprop 2689 - -2689
>> ZSTD_compressBlock_btultra2 19990 423 -19567
>> ZSTD_compressBlock_btultra 19633 15 -19618
>> ZSTD_initStats_ultra 19825 - -19825
>> ZSTD_compressBlock_btopt 20374 12 -20362
>> ZSTD_compressBlock_btopt_extDict 29984 12 -29972
>> ZSTD_compressBlock_btultra_extDict 30718 15 -30703
>> ZSTD_compressBlock_btopt_dictMatchState 32689 12 -32677
>> ZSTD_compressBlock_btultra_dictMatchState 33574 15 -33559
>> Total: Before=6611828, After=6418562, chg -2.92%
>> ```
>>
>> [0] https://lkml.org/lkml/2021/11/14/189
>> [1] https://github.com/facebook/zstd/issues/2862
>>
>> Reported-by: Geert Uytterhoeven <geert@...ux-m68k.org>
>> Signed-off-by: Nick Terrell <terrelln@...com>
>
> Thanks for your patch!
>
> Impact on lib/zstd/zstd_compress.ko for atari_defconfig:
>
> add/remove: 5/4 grow/shrink: 1/9 up/down: 15392/-167214 (-151822)
>
> Nice!
>
> Tested-by: Geert Uytterhoeven <geert@...ux-m68k.org>
>
>> --- a/lib/zstd/compress/zstd_opt.c
>> +++ b/lib/zstd/compress/zstd_opt.c
>> @@ -894,7 +906,7 @@ static void ZSTD_optLdm_processMatchCandidate(ZSTD_optLdm_t* optLdm, ZSTD_match_
>> */
>> U32 posOvershoot = currPosInBlock - optLdm->endPosInBlock;
>> ZSTD_optLdm_skipRawSeqStoreBytes(&optLdm->seqStore, posOvershoot);
>> - }
>> + }
>> ZSTD_opt_getNextMatchAndUpdateSeqStore(optLdm, currPosInBlock, remainingBytes);
>> }
>> ZSTD_optLdm_maybeAddMatch(matches, nbMatches, optLdm, currPosInBlock);
>
> This change is unrelated. With that removed:
> Reviewed-by: Geert Uytterhoeven <geert@...ux-m68k.org>
Thanks for the review! I will remove this change before submitting
the PR. Unless you have a strong preference, I won’t put up another
version of the patch set with just this change.
Best,
Nick Terrell
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
Powered by blists - more mailing lists