linux-kernel - Re: [PATCH] setlocalversion: Add workaround for "git describe" performance issue

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87bjz0k17c.fsf@prevas.dk>
Date: Thu, 31 Oct 2024 11:37:27 +0100
From: Rasmus Villemoes <ravi@...vas.dk>
To: Josh Poimboeuf <jpoimboe@...nel.org>
Cc: Masahiro Yamada <masahiroy@...nel.org>,  linux-kernel@...r.kernel.org,
  linux-kbuild@...r.kernel.org, git@...r.kernel.org
Subject: Re: [PATCH] setlocalversion: Add workaround for "git describe"
 performance issue

On Wed, Oct 30 2024, Josh Poimboeuf <jpoimboe@...nel.org> wrote:

> If HEAD isn't associated with an annotated tag, a bug (or feature?) in
> "git describe --match" causes it to search every commit in the entire
> repository looking for additional match candidates.  Instead of it
> taking a fraction of a second, it adds 10-15 seconds to the beginning of
> every kernel build.
>
> Fix it by adding an additional dummy match which is slightly further
> away from the most recent one, along with setting the max candidate
> count to 1 (not 2, apparently another bug).
>

cc += git list

Hm, I tried looking at the git describe source code, and while I can't
claim I understand it very well, I think the main problem is around
this part:

			if (!tags && !all && n->prio < 2) {
				unannotated_cnt++;
			} else if (match_cnt < max_candidates) {
				struct possible_tag *t = &all_matches[match_cnt++];
				t->name = n;
				t->depth = seen_commits - 1;
				t->flag_within = 1u << match_cnt;
				t->found_order = match_cnt;
				c->object.flags |= t->flag_within;
				if (n->prio == 2)
					annotated_cnt++;
			}
			else {
				gave_up_on = c;
				break;
			}

So in the case where one doesn't pass any --match, we get something like

git describe --debug 5f78aec0d7e9
describe 5f78aec0d7e9
No exact match on refs or tags, searching to describe
 annotated        243 v4.19-rc5
 annotated        485 v4.19-rc4
 annotated        814 v4.19-rc3
 annotated       1124 v4.19-rc2
 annotated       1391 v4.19-rc1
 annotated      10546 v4.18
 annotated      10611 v4.18-rc8
 annotated      10819 v4.18-rc7
 annotated      11029 v4.18-rc6
 annotated      11299 v4.18-rc5
traversed 11400 commits
more than 10 tags found; listed 10 most recent
gave up search at 1e4b044d22517cae7047c99038abb444423243ca
v4.19-rc5-243-g5f78aec0d7e9

and that "gave up" commit is v4.18-rc4, the eleventh commit
encountered. That also explains why you have to add a "dummy" second
--match to make --candidates=1 have the expected behaviour.

Perhaps the logic should instead be that as soon as match_cnt hits
max_candidates (i.e. all the tags we're going to consider have actually
been visited), we break out. That is, the last "else" above should
instead be replaced by

  if (match_cnt == max_candidates) {
    ... /* ? , gave_up_on is now a misnomer */
    break;
  }

Then as a further DWIM aid, wherever the initialization logic is could
be updated so that, after expanding all the --match= wildcards, if the
number of tags is less than max_candidates, automatically lower
max_candidates to that number (which in the setlocalversion case will
always be 1 because we're not actually passing a wildcard).

Or, we could explicitly on the kernel side pass that --candidates=1, but
yes, I agree it looks like a bug that the loop requires encountering +1
tag to hit that break;.

Rasmus