Message-ID: <alpine.LFD.2.01.0907211133010.19335@localhost.localdomain>
Date:	Tue, 21 Jul 2009 12:15:39 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Krzysztof Oledzki <olel@....pl>
cc:	Greg KH <gregkh@...e.de>, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>, stable@...nel.org,
	lwn@....net
Subject: Re: Linux 2.6.27.27



On Tue, 21 Jul 2009, Linus Torvalds wrote:
> 
> Great. This is all about as perfect as could be asked for. Now it's just a 
> question of trying to find the right code generation difference...

Ok, that "just" is turning out to be really painful.

I've tried to do clever things, but the best I've been able to do is to 
get the relevant differences down to about 22 thousand lines of assembler 
diffs that don't match either of the working kernels. Sadly, 22KLOC of 
assembler diffs isn't something anybody can reasonably read to even start 
to make a guess about which lines are causing problems. 

So what I'd love to do is to narrow the failure down a bit, by using 
-fno-strict-overflow only on _parts_ of the tree and then try a couple of 
kernels to see if they hang, to see which part it is that mis-compiles.

With a newer kernel, we could do something like this:

	diff --git a/Makefile b/Makefile
	index 79957b3..b096be2 100644
	--- a/Makefile
	+++ b/Makefile
	@@ -565,9 +565,6 @@ KBUILD_CFLAGS += $(call cc-option,-Wdeclaration-after-statement,)
	 # disable pointer signed / unsigned warnings in gcc 4.0
	 KBUILD_CFLAGS += $(call cc-option,-Wno-pointer-sign,)
	 
	-# disable invalid "can't wrap" optimizations for signed / pointers
	-KBUILD_CFLAGS	+= $(call cc-option,-fno-strict-overflow)
	-
	 # revert to pre-gcc-4.4 behaviour of .eh_frame
	 KBUILD_CFLAGS	+= $(call cc-option,-fno-dwarf2-cfi-asm)
	 
	diff --git a/drivers/Makefile b/drivers/Makefile
	index bc4205d..1250b55 100644
	--- a/drivers/Makefile
	+++ b/drivers/Makefile
	@@ -5,6 +5,8 @@
	 # Rewritten to use lists instead of if-statements.
	 #
	 
	+subdir-ccflags-y += -fno-strict-overflow
	+
	 obj-y				+= gpio/
	 obj-$(CONFIG_PCI)		+= pci/
	 obj-$(CONFIG_PARISC)		+= parisc/

to say "use -fno-strict-overflow only when compiling objects in the
drivers/ subdirectories", but I'm pretty sure that whole clever
'subdir-ccflags-y' thing was added pretty recently, and won't work in
2.6.27.
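
A quick way to check whether a given tree knows about 'subdir-ccflags-y'
is to grep the kbuild scripts for it. This is only a sketch: the helper
name has_subdir_ccflags is made up for illustration, and it assumes that
support shows up as that literal string somewhere under scripts/.

```shell
# Sketch: report whether a kernel tree's kbuild mentions subdir-ccflags-y.
# Assumption: support is visible as the literal string under scripts/.
# has_subdir_ccflags is a hypothetical helper name, not a kernel tool.
has_subdir_ccflags() {
	tree=${1:-.}
	# -r: search recursively, -q: no output, -s: no errors for missing files
	grep -rqs 'subdir-ccflags-y' "$tree/scripts"
}
```

Usage would be something like 'has_subdir_ccflags /path/to/linux && echo
supported'. If it prints nothing, the per-directory approach described
below is the one to use.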

However, since there is _some_ reason to wonder about whether the problem 
could be in radeonfb (because the last printouts before the hang are about 
that), it would be good to test just that part.

So if you have the time and energy, it would be very interesting if you 
could do something like this:

 - remove the "KBUILD_CFLAGS  += $(call cc-option,-fno-strict-overflow)"
   entirely from the main Makefile.

 - one directory at a time, add

	ccflags-y += -fno-strict-overflow

   to the Makefile in just that particular directory, and compile and test 
   the kernel. Now, since your old kernel doesn't have that nifty new
   "subdir-ccflags-y" thing, you can't do it for big parts of the kernel: 
   you can literally do it for just the contents of one subdirectory 
   (non-recursive!) at a time. But while there are two thousand 
   subdirectories in the Linux kernel sources, judicious sprinkling of that 
   into the tree could hopefully make it possible to find.

 - the first Makefile to test would be 'drivers/video/aty/Makefile'. If 
   adding the flag there makes no difference, some scripting might be in 
   order, eg something like

	# prepend the flag to every Makefile under drivers/
	for i in $(find drivers -name Makefile)
	do
		( echo "ccflags-y += -fno-strict-overflow" ; cat "$i" ) > "$i.new"
		mv "$i.new" "$i"
	done

   should add it to all the subdirectories under 'drivers', etc.

and if you can find the subdirectory where '-fno-strict-overflow' makes 
the difference, at that point I'd love to see the kernel image where 
things worked (ie the last kernel you booted successfully _before_ the 
kernel that failed) and the kernel that fails - now hopefully the 
differences should be much smaller (how small will obviously depend on 
whether you caught the difference in just one subdirectory or whether you 
scripted it over lots and lots of subdirectories).
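
Since this kind of narrowing means adding the flag to a directory,
testing, and then taking it out again, it may help to make both steps
repeatable. The following is a sketch and not part of the steps above as
written: add_flag and remove_flag are hypothetical helper names, and it
assumes the flag line is always kept as the very first line of the
Makefile.

```shell
# The exact line from the steps above.
FLAG_LINE='ccflags-y += -fno-strict-overflow'

# Prepend the flag line to one Makefile, unless it is already the first line.
add_flag() {
	mk=$1
	if [ "$(head -n 1 "$mk")" = "$FLAG_LINE" ]; then
		return 0
	fi
	{ printf '%s\n' "$FLAG_LINE"; cat "$mk"; } > "$mk.new" && mv "$mk.new" "$mk"
}

# Drop the flag line again, but only if it is the first line of the Makefile.
remove_flag() {
	mk=$1
	if [ "$(head -n 1 "$mk")" != "$FLAG_LINE" ]; then
		return 0
	fi
	tail -n +2 "$mk" > "$mk.new" && mv "$mk.new" "$mk"
}
```

With that, 'add_flag drivers/video/aty/Makefile' can be run any number of
times without stacking up duplicate lines, and 'remove_flag' undoes it
before moving on to the next directory.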

Of course, the tighter you can do this, the better. If it happens to be in 
'drivers/video/aty/' for example, and you end up being really gung-ho 
about this and want to narrow it down to not just the subdirectory, but a 
few files, you could remove the per-directory "ccflags-y" line, and do a 
few per-file CFLAGS entries instead, like:

	CFLAGS_radeon_base.o += -fno-strict-overflow

etc.
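
To avoid typing those per-file lines by hand, they could be generated
from the sources in the directory. Again this is only a sketch:
per_file_cflags is a made-up helper name, and it assumes every foo.c in
the directory is compiled to a foo.o, which may not hold for every
Makefile.

```shell
# Print one per-file CFLAGS line for every C source in a directory.
# per_file_cflags is a hypothetical helper; it assumes foo.c builds foo.o.
per_file_cflags() {
	dir=$1
	for src in "$dir"/*.c
	do
		# skip the unexpanded pattern when the directory has no .c files
		[ -e "$src" ] || continue
		printf 'CFLAGS_%s.o += -fno-strict-overflow\n' "$(basename "$src" .c)"
	done
}
```

Running 'per_file_cflags drivers/video/aty' prints the candidate lines;
pasting in half of them at a time into that directory's Makefile is one
way to bisect down to a few files.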

And hey, if you think this is too much work, then you're right. It's a lot 
of work. So don't worry if you can't be bothered - it would be wonderful 
to try to get this thing resolved, but I do realize I'm asking a lot here.

			Linus