Date:	Tue, 22 Dec 2009 19:22:20 -0500
From:	Mark Hounschell <dmarkh@....rr.com>
To:	"Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>
CC:	"markh@...pro.net" <markh@...pro.net>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Alain Knaff <alain@...ff.lu>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"fdutils@...tils.linux.lu" <fdutils@...tils.linux.lu>,
	"Li, Shaohua" <shaohua.li@...el.com>, Ingo Molnar <mingo@...e.hu>,
	Alain Knaff <alain@...ff.lu>
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was:
 Re: Cannot format floppies under kernel 2.6.*?)

On 12/22/2009 06:37 PM, Pallipadi, Venkatesh wrote:
> On Tue, 2009-12-22 at 09:57 -0800, Mark Hounschell wrote:
>> On 12/22/2009 12:38 PM, Linus Torvalds wrote:
>>>
>>> [ Ingo, Venki and Shaohua added to cc: see the whole thread on lkml for 
>>>   details, but Mark is basically chasing down a situation where the floppy 
>>>   driver seems to have trouble formatting floppies, and it happened 
>>>   between 2.6.27 and .28. The trouble seems to be that a DMA transfer of a 
>>>   memory block transfers the wrong value for the first byte of the block.
>>>
>>>   Which should be impossible, but whatever. Some part of the system has a 
>>>   cached buffer that isn't flushed.
>>>
>>>   What gets _you_ guys involved is that Mark cannot reproduce the bug if 
>>>   HPET is disabled in the BIOS or by using 'nohpet'. He found that out by 
>>>   pure luck while bisecting, because some time during his bisect, his 
>>>   machine wouldn't even boot with HPET.
>>>
>>>   So the problem is: with HPET enabled, 2.6.27.4 _used_ to work. But 
>>>   2.6.28 (and current -git) does not.  Any ideas? ]
>>>
>>> On Tue, 22 Dec 2009, Mark Hounschell wrote:
>>>>
>>>> Ok, I may have something that might help.
>>>>
>>>> # git bisect bad
>>>> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
>>>> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
>>>> Author: venkatesh.pallipadi@...el.com <venkatesh.pallipadi@...el.com>
>>>> Date:   Fri Sep 5 18:02:18 2008 -0700
>>>>
>>>>     x86: HPET_MSI Initialise per-cpu HPET timers
>>>>
>>>>     Initialize a per CPU HPET MSI timer when possible. We retain the HPET
>>>>     timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
>>>>     setup the remaining HPET timers as per CPU MSI based timers. This per CPU
>>>>     timer will eliminate the need for timer broadcasting with IRQ 0 when there
>>>>     is non-functional LAPIC timer across CPU deep C-states.
>>>>
>>>>     If there are more CPUs than number of available timers, CPUs that do not
>>>>     find any timer to use will continue using LAPIC and IRQ 0 broadcast.
>>>>
>>>>     Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@...el.com>
>>>>     Signed-off-by: Shaohua Li <shaohua.li@...el.com>
>>>>     Signed-off-by: Ingo Molnar <mingo@...e.hu>
>>>>
>>>> And of course this was the first commit that I could not boot if I had hpet
>>>> enabled. To get this one to boot (single user mode only) I had to add the
>>>> quiet cmdline option and apply the following patch to arch/x86/kernel/hpet.c, taken from:
>>>>
>>>> commit  5ceb1a04187553e08c6ab60d30cee7c454ee139a
>>>>
>>>> @@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
>>>>  {
>>>>
>>>>         if (request_irq(dev->irq, hpet_interrupt_handler,
>>>> -                       IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
>>>> +                       IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
>>>>                 return -1;
>>>>
>>>>         disable_irq(dev->irq);
>>>>
>>>> AND add the quiet cmdline option.
>>>
>>> Ok, so we know why HPET didn't boot for you, and that was fixed later (by 
>>> that 5ceb1a04). But is this also when the floppy started mis-behaving?
>>>
>>
>> Commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is when the floppy stops working
>> and also when I could no longer boot with hpet enabled.
> 
> 
> I am missing something here. Is commit 26afe5f2 where the system does not
> boot with HPET, or is it where the floppy stops working when you boot
> with HPET enabled?
> 

Both happen there. Commit 5ceb1a04 is where it starts
booting _again_ with hpet enabled. So I took that patch (5ceb1a04) and
applied it to (26afe5f2f) to be able to boot with hpet enabled.  I had to
use the quiet option to get to a login prompt, but that is where the
floppy format first fails, just as it does in 2.6.28 and up.

> Can you try "idle=halt" with both .27 and .28 and send the /proc/interrupts
> output in each case? With that option, we should be using local APIC
> timer and PIT, HPET or HPET with MSI should not really matter. Does it
> still fail with .28 with that option?
> 

Yes, I will try that for you but will have to wait until the morning. Sorry.

Regards
Mark
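
The comparison requested above (boot 2.6.27 and 2.6.28 with "idle=halt" and
capture /proc/interrupts in each case) could be automated roughly as follows.
This is only a sketch and not something posted in the thread: the function
names are hypothetical, and the two sample snapshots are invented to show the
shape of the data, not real output from the affected machine. It assumes the
usual /proc/interrupts layout of a CPU header row followed by
"label: per-CPU counts, description" rows.

```python
# Hypothetical snapshots; real ones would come from `cat /proc/interrupts`
# on each kernel. The values and the HPET_MSI rows are made up.
SAMPLE_27 = """           CPU0       CPU1
  0:        126          0   IO-APIC-edge      timer
  6:          3          0   IO-APIC-edge      floppy
LOC:      10234       9876   Local timer interrupts
"""

SAMPLE_28 = """           CPU0       CPU1
  0:        130          0   IO-APIC-edge      timer
  6:          3          0   IO-APIC-edge      floppy
 24:       5021          0   HPET_MSI-edge      hpet2
 25:          0       4980   HPET_MSI-edge      hpet3
LOC:         12         10   Local timer interrupts
"""


def parse_interrupts(text):
    """Map each row label ('0', '6', 'LOC', ...) to (total count, description)."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row lists CPU0, CPU1, ...
    rows = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break  # some rows (e.g. ERR) carry fewer counts than CPUs
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        rows[label.strip()] = (sum(counts), description)
    return rows


def timer_related(rows, keywords=("timer", "hpet", "floppy")):
    """Keep only rows whose description mentions one of the keywords."""
    return {
        label: entry
        for label, entry in rows.items()
        if any(word in entry[1].lower() for word in keywords)
    }


def new_timer_sources(before_text, after_text):
    """Descriptions of timer-related rows present only in the second snapshot."""
    before = {d for _, d in timer_related(parse_interrupts(before_text)).values()}
    after = {d for _, d in timer_related(parse_interrupts(after_text)).values()}
    return sorted(after - before)


if __name__ == "__main__":
    for description in new_timer_sources(SAMPLE_27, SAMPLE_28):
        print("only in second snapshot:", description)
```

Comparing by description rather than by IRQ number is a deliberate choice
here, since MSI vector numbers need not match between kernels.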

