lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.0.999.0710201140550.3794@woody.linux-foundation.org>
Date:	Sat, 20 Oct 2007 11:49:51 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Sam Ravnborg <sam@...nborg.org>
cc:	Yinghai Lu <yhlu.kernel@...il.com>, Andi Kleen <ak@...e.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: git/cscope with x86 merge



On Sat, 20 Oct 2007, Linus Torvalds wrote:
> 
> I could perhaps look at making "git log --follow" also break up files that 
> got totally rewritten (git already has a notion of "-B" to do that), but 
> no, we don't do it right now. 

Ok, if you guys have a current git source, and want to try something out, 
this fairly small patch does this.

As mentioned, git already supports the notion of "try to break files with 
the same name if the contents are too dissimilar". In other words, even if 
a file exists under the same name in both the old revision and in the 
newer one, we'll look at just how big the changes are, and if git decides 
that it looks like the whole file was rewritten, then git will split up 
the diff into a "delete old contents" and "create new contents". That then 
allows it to consider the file for rename detection.

(The rename detection may, of course, decide that the original file was 
the best source after all.. ;)

However, "git log --follow" didn't ever actually enable that for the logic 
that tries to figure out where a file came from, so you would only see 
this when generating the diffs, never in the file history logic. That's 
an easy one-liner to tree-diff.c: try_to_follow_renames().

However, when I actually tried it, it turns out that the break logic was 
broken - nobody has ever really depended on it. So while it was a 
one-liner to make "git log --follow" understand to break files that seem 
to be totally rewritten, it still didn't actually work, because the 
changes that totally rewrote vmlinux.lds.S wouldn't trigger the break 
logic.

So most of this (still fairly small patch) is just fixing the break logic 
in diffcore-break.c:should_break().

It hasn't gotten a lot of testing, but it does actually improve other 
cases too, so I think this is the right thign to do. I'll bring it up on 
the git lists.

Oh, and with this patch, the "break same filename" is still off by 
default. You need to do

	git log --follow -B arch/x86/kernel/vmlinux_64.lds.S

to enable the file-rewritten-so-break-associations detection. I suspect 
it makes sense to enable -B by default when using --follow (--follow 
already obviously implies rename detection), but that's a separate and 
independent issue.

		Linus

----
 diffcore-break.c |   11 +++++++----
 tree-diff.c      |    1 +
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/diffcore-break.c b/diffcore-break.c
index ae8a7d0..c71a226 100644
--- a/diffcore-break.c
+++ b/diffcore-break.c
@@ -45,8 +45,8 @@ static int should_break(struct diff_filespec *src,
 	 * The value we return is 1 if we want the pair to be broken,
 	 * or 0 if we do not.
 	 */
-	unsigned long delta_size, base_size, src_copied, literal_added,
-		src_removed;
+	unsigned long delta_size, base_size, max_size;
+	unsigned long src_copied, literal_added, src_removed;
 
 	*merge_score_p = 0; /* assume no deletion --- "do not break"
 			     * is the default.
@@ -63,7 +63,8 @@ static int should_break(struct diff_filespec *src,
 		return 0; /* error but caught downstream */
 
 	base_size = ((src->size < dst->size) ? src->size : dst->size);
-	if (base_size < MINIMUM_BREAK_SIZE)
+	max_size = ((src->size > dst->size) ? src->size : dst->size);
+	if (max_size < MINIMUM_BREAK_SIZE)
 		return 0; /* we do not break too small filepair */
 
 	if (diffcore_count_changes(src, dst,
@@ -89,12 +90,14 @@ static int should_break(struct diff_filespec *src,
 	 * less than the minimum, after rename/copy runs.
 	 */
 	*merge_score_p = (int)(src_removed * MAX_SCORE / src->size);
+	if (*merge_score_p > break_score)
+		return 1;
 
 	/* Extent of damage, which counts both inserts and
 	 * deletes.
 	 */
 	delta_size = src_removed + literal_added;
-	if (delta_size * MAX_SCORE / base_size < break_score)
+	if (delta_size * MAX_SCORE / max_size < break_score)
 		return 0;
 
 	/* If you removed a lot without adding new material, that is
diff --git a/tree-diff.c b/tree-diff.c
index 26bdbdd..7c261fd 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -319,6 +319,7 @@ static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc *t2, co
 	diff_opts.detect_rename = DIFF_DETECT_RENAME;
 	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
 	diff_opts.single_follow = opt->paths[0];
+	diff_opts.break_opt = opt->break_opt;
 	paths[0] = NULL;
 	diff_tree_setup_paths(paths, &diff_opts);
 	if (diff_setup_done(&diff_opts) < 0)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ