Message-ID: <CAHk-=wgrc4zvZg+Sz_aLmMbaJ6ZHYaJBQ7nzByj2pMZBbh6www@mail.gmail.com>
Date: Sat, 14 Dec 2024 08:31:24 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: Geert Uytterhoeven <geert+renesas@...der.be>, Dwaipayan Ray <dwaipayanray1@...il.com>,
Lukas Bulwahn <lukas.bulwahn@...il.com>, Joe Perches <joe@...ches.com>,
Jonathan Corbet <corbet@....net>, Thorsten Leemhuis <linux@...mhuis.info>, Andy Whitcroft <apw@...onical.com>,
Niklas Söderlund <niklas.soderlund@...igine.com>,
Simon Horman <horms@...nel.org>, Conor Dooley <conor@...nel.org>,
Miguel Ojeda <miguel.ojeda.sandonis@...il.com>, Junio C Hamano <gitster@...ox.com>,
workflows@...r.kernel.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/2] Increase minimum git commit ID abbreviation to 16 characters
On Sat, 14 Dec 2024 at 08:03, Matthew Wilcox <willy@...radead.org> wrote:
>
> I have wondered about using a different encoding for the sha1.
> Classic Ascii85 encoding is no good; it uses characters like '"\<
> which interact poorly with every shell. RFC1924 is somewhat better,
> but still uses characters that interact poorly with shell.
I suspect that the pain would much outweigh the gain. You'd need to
teach all tools about the new format, and you'd also need to add some
additional format specifying character just to make it unambiguous
*which* format you use, since if you just extend the character set
you'll have lots of hashes that could be either.
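To illustrate the ambiguity, here is a minimal sketch, assuming a hypothetical extended format that uses the lowercase base36 alphabet (0-9a-z). The helper `possible_formats` is invented for this example and is not part of git. Any abbreviation made only of hex digits is also a valid base36 string, so without a marker character it decodes to two different values.

```python
import string

# Assumed alphabets: plain hex, and a hypothetical lowercase base36 encoding.
HEX_ALPHABET = set(string.hexdigits.lower())                    # 0-9a-f
BASE36_ALPHABET = set(string.digits + string.ascii_lowercase)   # 0-9a-z


def possible_formats(abbrev: str) -> list[str]:
    """Return every assumed format whose alphabet covers the abbreviation."""
    chars = set(abbrev.lower())
    formats = []
    if chars <= HEX_ALPHABET:
        formats.append("hex")
    if chars <= BASE36_ALPHABET:
        formats.append("base36")
    return formats


if __name__ == "__main__":
    for abbrev in ("deadbeef1234", "zebra1"):
        print(abbrev, possible_formats(abbrev))
    # The same text names two different numbers depending on the format.
    print(int("deadbeef1234", 16) == int("deadbeef1234", 36))
```

Since the hex alphabet is a subset of the base36 one, every existing hex abbreviation would be ambiguous in this scheme, which is why an explicit format-specifying character would be needed.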
And you could disambiguate by testing both and seeing which one works
better, but at that point, you're much better off disambiguating the
current regular hex format by being a bit smarter about the objects.
Using base36 doesn't add enough bits to then make up for such a
disambiguation character in practice (ie 11 characters vs 12 - not
really noticeable).
base62 would be better, but christ does *that* really result in an
unreadable jumble. At that point I'd rather see 16-character hex than
the complete line noise that is base62.
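The character counts above can be checked with a short sketch. It assumes the baseline is a 12-character hex abbreviation (48 bits) and that any non-hex format costs exactly one extra marker character; `chars_needed` and `total_length` are hypothetical helpers written for this illustration.

```python
def chars_needed(bits: int, base: int) -> int:
    """Smallest digit count in `base` that can represent any `bits`-bit value."""
    digits = 0
    capacity = 1
    while capacity < 2 ** bits:
        capacity *= base
        digits += 1
    return digits


def total_length(bits: int, base: int) -> int:
    """Digits plus one assumed format-marker character for non-hex encodings."""
    marker = 0 if base == 16 else 1
    return chars_needed(bits, base) + marker


if __name__ == "__main__":
    bits = 12 * 4  # a 12-character hex abbreviation carries 48 bits
    for base in (16, 36, 62):
        print(f"base{base}: {total_length(bits, base)} characters")
```

Under these assumptions, 48 bits take 12 characters in hex, 11 in base36 (10 digits plus the marker), and 10 in base62 (9 digits plus the marker).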
Also, I bet people would start looking for shorthand formats that
spell rude words. You are kind of limited with hex, and sometimes
that's an advantage.
Linus