lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 25 Mar 2022 13:01:15 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Borislav Petkov <bp@...e.de>
Cc:     x86-ml <x86@...nel.org>, lkml <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PULL] RAS updates for 5.18

[ I've written this kind of reply multiple times before and I
_thought_ we had something in the docs about this, but I can't find
them, so here goes again ]

On Wed, Mar 23, 2022 at 10:29 AM Borislav Petkov <bp@...e.de> wrote:
>
> But before you do, a git question if I may:
> [.. details removed for brevity..]
> However, when creating the diffstat for the pull request, it would
> add additional files to it from tip/locking/core even if all the
> tip/locking/core changes are already in your master branch:

So what you are describing is a very fundamental thing - your branch
has multiple separate starting points, since you had different
branches that you merged into your tree..

Sometimes having multiple branches doesn't actually cause that,
because the different branches may all have the same base starting
point.

Git calls these things "merge bases", because those starting points is
what you have to take into account when merging, they are the "base"
for actually resolving the differences that come in through multiple
branches.

And git handles that perfectly fine when merging by doing all the
appropriate magic. And "git log" has no problem with it either - you
can list all the commits that are in your head but are *not* in some
arbitrary number of merge bases just fine).

But when you do a "git diff", things are different (and "git
request-pull" basically just does a diff to show what the thing was
about).

A "diff" is fundamentally something you do on two end-points. You have
a beginning, you have an end, and you ask "what changed between these
two end-points".

But that fundamentally means that when you have multiple different
merge bases, and you ask "what changed since the beginning and the
current state", your question is fundamentally ambiguous. There is not
a "the beginning". There are *multiple* beginnings.

So what git will do it to pick _one_ beginning, and just use that.

And that means that yes, the diff will show the changes since that
beginning, but since the end result depends on the _other_ beginning
too, it will show the changes that came from that other beginning as
well.

Sometimes those changes end up being empty, because the "first
beginning" might already have had all of that. So sometimes you might
not even notice that what "git diff" gave you was ambiguous.

So "git request-pull" does both a log (for the shortlog of commits)
and a diff (for the diffstat), and the log should always be correct,
but the diffstat will have this ambiguity problem if you have multiple
merge bases.

> After poking at it a bit more, I found a hint as to what it might be
> complaining about:
>
> $ git diff --stat master...ras_core_for_v5.18_rc1
> warning: master...ras_core_for_v5.18_rc1: multiple merge bases, using 754e0b0e35608ed5206d6a67a791563c631cec07

Yeah, so using that "..." format will warn about how there are
multiple possible merge bases, and point out how it just picked one at
random.

> So it looks like the diffstat for the pull request is created by using
> the -rc4 as the merge base:
>
> $ git diff --stat v5.17-rc4..ras_core_for_v5.18_rc1
> ...
>
> while the correct diffstat should be from the merge-commit onwards:

No. There is no such thing as a *correct* merge base. You had two, and
git picked one of them. In the general case there just isn't a correct
answer.

Now, it turns out that you shouldn't have done a merge at all. I'm not
sure why that commit c0f6799de2a0 ("Merge tip:locking/core into
tip:ras/core") even exists, because it could have been done as a
fast-forward. Did you use "--no-ff" to explicitly not do that?

So *because* you have that pointless merge that could just have been a
fast-forward, you think that "hey, if it had just picked the other
merge base it would all have been fine". But in a normal merge
situation, the two merge bases would both have had some work that
wasn't in the other side, so that's just because you did something
odd.

> So, long story short, what am I doing wrong?

So in the general case, you aren't doing anything wrong: if you merge
multiple real branches, it's just that "git diff" cannot find a single
unique point to use as the base, and you'll get some odd random diff.

But if you are a developer who merges multiple real branches, you
obviously know how to merge things, and one way to sort it out is to
basically do a test-merge just for yourself:

    # WWLS? ("What would Linus See?")
    git branch -b test-merge linus
    git merge my-branch
    git diff -C --stat --summary ORIG_HEAD..
    .. save that away ..

    # go back to your real work, and remove that test-merge
    git checkout <normal-branch>
    git branch -D test-merge

will generate a diffstat of what a merge (which fundamentally knows
how to resolve multiple merge bases) would generate.

(Obviously you can just do the above in a completely separate git tree
too, if you don't like doing those temporary branches that might mess
up your working tree).

The other alternative is to just send me the bogus diffstat - I'm
sadly quite used to it, since a number of people just do "git
request-pull", see that it's odd, don't understand why, and just let
me sort it out.

Now the good news is that people who are afraid of merges and the
above kind of complexity will never actually see this situation. You
can't get multiple merge bases if you don't do any merges yourself.

So this kind of git complexity only happens to people who are supposed
to be able to handle it. You clearly figured out what was going on,
you didn't perhaps just realize the full story.



                  Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ