Date:   Sun, 24 May 2020 15:24:56 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Sasha Levin <sashal@...nel.org>
Cc:     Greg KH <gregkh@...uxfoundation.org>,
        Heikki Krogerus <heikki.krogerus@...ux.intel.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Stephen Rothwell <sfr@...b.auug.org.au>
Subject: Re: [GIT PULL] Driver core fixes for 5.7-rc7 - take 2

On Sun, May 24, 2020 at 12:45 PM Sasha Levin <sashal@...nel.org> wrote:
>
> Interesting. My thinking around --follow was that it's like
> --full-history in the sense that it won't prune history, but it would
> also keep listing history beyond file renames.

No. It's only completely accidentally like full-history because it
sets the flag that basically says "give me the whole diff" - so that
if the file goes away, you see where it came from.

And because it wants the whole diff and doesn't limit it to just the
one file that is tracked, it ends up following both sides of the merge
because _other_ files changed in that merge.

> The --follow functionality is quite useful when looking at older
> branches and trying to understand where changes should go into on those
> older branches.

It is useful, but it is ambiguous. What happens if the file came to be
two different ways in two different branches? Or what happens if two
files were combined into one?
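
To illustrate the "two files were combined into one" case, here is a
minimal sketch of a content-similarity origin guess. It is not git's
actual rename detection: the file names, the contents, the 0.5
threshold and the use of Python's difflib are all assumptions made for
the example. The point is only that such a heuristic can come back with
more than one equally good "origin".

```python
from difflib import SequenceMatcher

def similarity(old, new):
    """Similarity ratio in [0, 1] between two file contents."""
    return SequenceMatcher(None, old, new).ratio()

def guess_origins(new_content, old_tree, threshold=0.5):
    """Return every old path whose content resembles new_content.

    old_tree maps path -> content. The threshold is an arbitrary choice
    for this sketch, not a value taken from git.
    """
    return sorted(
        path
        for path, old_content in old_tree.items()
        if similarity(old_content, new_content) >= threshold
    )

# Hypothetical parent snapshot: two small source files and an unrelated one.
old_tree = {
    "a.c": "int a(void) { return 1; }\n",
    "b.c": "int b(void) { return 2; }\n",
    "README": "hello world\n",
}

# In the next commit a.c and b.c are gone and one file holds both bodies.
combined = old_tree["a.c"] + old_tree["b.c"]

origins = guess_origins(combined, old_tree)
print(origins)
```

Both a.c and b.c pass the threshold with the same score, so a tool that
insists on following one file has to pick one of them and drop the
other.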

So "git log --follow" is not _wrong_, but the operation of trying to
follow a file identity is basically broken. In git, it's not a
fundamental operation (because git isn't broken), it's just an
emulation of that broken concept that often works in practice.

It's a "let's give people what they are used to", but it really isn't
very well-defined in the general case. You think it works, because for
the simple cases it gives the "obviously correct" answer.

> We also do have some notion of "file identity" in the kernel;

No, we really really don't.

The CVS/SVN kind of "file identity" is more like an "inode". Nothing
in the kernel sources cares about the inode number of a file. The
inode will be different depending on how something was created, and
when you rename what previously were two different files to one single
path (as a result of a merge), you have to pick one at random, and
lose the other.

So you end up with the crazy random "Attic" model of stale files in
CVS, exactly because the thing is based on a file identity that is
completely fundamentally broken.

Note how you've never seen anything like that in git. Because the
whole concept is garbage, and git isn't garbage.

Yes, I still hate CVS with a passion, almost two decades after I had
to use that horrid horrid thing. Some mental scars will not go away.

> it's prevalent with "quirk files". Look at these for example:
> [ deleted ]
> We know that patches to those files are likely to contain quirks

No, those are not file identities AT ALL.

Those are just pathnames with some meaning. You can throw away the
file, and start a new one, and the meaning doesn't go away - because
it's attached to the path.

And yes, certain paths in the repository can be special, although
that's irrelevant to an SCM, of course. Git won't care. It's just
"contents with a name".

Which is exactly what git tracks, and is *not* what the SVN/CVS kind
of completely broken file identity is all about.
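
A rough sketch of what "contents with a name" means in practice: a
snapshot modeled as a plain mapping from path to a content hash, with
no per-file identity anywhere. The blob hash below follows git's blob
object format (SHA-1 over a "blob <size>" header, a NUL byte, and the
data); the snapshot() helper, the paths and the contents are
hypothetical and chosen only for the example.

```python
import hashlib

def blob_id(data):
    """Hash file contents the way git hashes a blob object."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()

def snapshot(files):
    """Model a tree as nothing more than path -> content hash."""
    return {path: blob_id(data) for path, data in files.items()}

# Hypothetical history of a "quirk file": written, thrown away,
# then started again from scratch at the same path.
v1 = snapshot({"drivers/quirks.c": b"old quirk table\n"})
v2 = snapshot({})
v3 = snapshot({"drivers/quirks.c": b"brand new quirk table\n"})

# The path, and whatever meaning people attach to it, is back in v3.
# Nothing in the model says whether it is "the same file" as in v1.
path_is_back = v1.keys() == v3.keys()
content_differs = v1["drivers/quirks.c"] != v3["drivers/quirks.c"]

# The same contents under two names are one blob, stored once.
twins = snapshot({"a.txt": b"same\n", "b.txt": b"same\n"})
same_blob = twins["a.txt"] == twins["b.txt"]

print(path_is_back, content_differs, same_blob)
```

There is no field in this model where an inode-like identity could
live, which is why a question like "which of the two original files is
this one?" never has to be answered.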

          Linus
