Date:	Tue, 22 Dec 2009 09:38:18 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Mark Hounschell <markh@...pro.net>
cc:	Mark Hounschell <dmarkh@....rr.com>, Alain Knaff <alain@...ff.lu>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	fdutils@...tils.linux.lu,
	Venkatesh Pallipadi <venkatesh.pallipadi@...el.com>,
	Shaohua Li <shaohua.li@...el.com>, Ingo Molnar <mingo@...e.hu>
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was:
 Re: Cannot format floppies under kernel 2.6.*?)


[ Ingo, Venki and Shaohua added to cc: see the whole thread on lkml for 
  details, but Mark is basically chasing down a situation where the floppy 
  driver seems to have trouble formatting floppies, and it happened 
  between 2.6.27 and .28. The trouble seems to be that a DMA transfer of a 
  memory block transfers the wrong value for the first byte of the block.

  Which should be impossible, but whatever. Some part of the system has a 
  cached buffer that isn't flushed.

  What gets _you_ guys involved is that Mark cannot reproduce the bug if 
  HPET is disabled in the BIOS or by using 'nohpet'. He found that out by 
  pure luck while bisecting, because some time during his bisect, his 
  machine wouldn't even boot with HPET.

  So the problem is: with HPET enabled, 2.6.27.4 _used_ to work. But 
  2.6.28 (and current -git) does not.  Any ideas? ]

On Tue, 22 Dec 2009, Mark Hounschell wrote:
> 
> Ok, I may have something that might help.
> 
> # git bisect bad
> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
> Author: venkatesh.pallipadi@...el.com <venkatesh.pallipadi@...el.com>
> Date:   Fri Sep 5 18:02:18 2008 -0700
> 
>     x86: HPET_MSI Initialise per-cpu HPET timers
> 
>     Initialize a per CPU HPET MSI timer when possible. We retain the HPET
>     timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
>     setup the remaining HPET timers as per CPU MSI based timers. This per CPU
>     timer will eliminate the need for timer broadcasting with IRQ 0 when there
>     is non-functional LAPIC timer across CPU deep C-states.
> 
>     If there are more CPUs than number of available timers, CPUs that do not
>     find any timer to use will continue using LAPIC and IRQ 0 broadcast.
> 
>     Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@...el.com>
>     Signed-off-by: Shaohua Li <shaohua.li@...el.com>
>     Signed-off-by: Ingo Molnar <mingo@...e.hu>
> 
> And of course this was the first commit that I could not boot if I had hpet
> enabled. To get this one to boot (single user mode only) I had to add the
> quiet cmdline option and the following patch to arch/x86/kernel/hpet.c
> 
> commit  5ceb1a04187553e08c6ab60d30cee7c454ee139a
> 
> @@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
>  {
> 
>         if (request_irq(dev->irq, hpet_interrupt_handler,
> -                       IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
> +                       IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
>                 return -1;
> 
>         disable_irq(dev->irq);
> 
> AND add the quiet cmdline option.

Ok, so we know why HPET didn't boot for you, and that was fixed later (by 
that 5ceb1a04). But is this also when the floppy started mis-behaving?

IOW, _if_ you boot with that fix from commit 5ceb1a04 (and the quiet 
option - I wonder what that is about: do you have any ideas?), is the 
per-CPU HPET timer commit also the commit that causes floppy problems, or 
is this purely a "bisect when HPET became a boot-up problem"?

			Linus

---
> Also, on all the machines where it does work with HPET enabled, I don't see
> the hpet2 entry in /proc/interrupts as below.
> 
> 
> cat /proc/interrupts
>            CPU0       CPU1       CPU2       CPU3
>   0:         82          0          3          0   IO-APIC-edge      timer
>   1:          0          0       1712          6   IO-APIC-edge      i8042
>   3:          0          0          6          0   IO-APIC-edge
>   4:          0          0          6          0   IO-APIC-edge
>   6:          0          0          4          0   IO-APIC-edge      floppy
>   8:          0          0         60          0   IO-APIC-edge      rtc0
>   9:          0          0          0          0   IO-APIC-fasteoi   acpi
>  12:          0          0      37798        179   IO-APIC-edge      i8042
>  14:          0          0      16462         71   IO-APIC-edge      pata_atiixp
>  15:          0          0       5713         17   IO-APIC-edge      pata_atiixp
>  16:          0          0        904          2   IO-APIC-fasteoi   aic79xx, ohci_hcd:usb2, ohci_hcd:usb4, HDA Intel, ni-pci-gpib
>  17:          0          0          2          0   IO-APIC-fasteoi   ehci_hcd:usb1, parport0, ni-pci-gpib
>  18:          0          0      49940         90   IO-APIC-fasteoi   ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, nvidia
>  19:          0          0        703          2   IO-APIC-fasteoi   aic7xxx, ehci_hcd:usb3, ttySLG0, eth1
>  22:          0          0       1303         15   IO-APIC-fasteoi   ahci
> 
>  24:     261763          0          0          0  HPET_MSI-edge      hpet2
> 
>  29:          0          0        220          5   PCI-MSI-edge      sky2@pci:0000:04:00.0
> NMI:          0          0          0          0   Non-maskable interrupts
> LOC:        138     271356     264446     261050   Local timer interrupts
> SPU:          0          0          0          0   Spurious interrupts
> PMI:          0          0          0          0   Performance monitoring interrupts
> PND:          0          0          0          0   Performance pending work
> RES:       4511       9275       8470       8086   Rescheduling interrupts
> CAL:       3624       8666        523       4543   Function call interrupts
> TLB:        981       1111       1065       1058   TLB shootdowns
> ERR:          0
> MIS:          0
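
[ Editorial sketch, not part of the original message: to compare machines
  quickly, the HPET_MSI check described above could be scripted roughly as
  follows. The function name hpet_msi_irqs and the SAMPLE text are
  hypothetical; the parsing assumes the /proc/interrupts layout shown in the
  listing (a CPUn header row, then "IRQ: counts... chip device"), and it
  tolerates the "> " quoting prefix of a pasted mail. ]

```python
def hpet_msi_irqs(text):
    """Return {irq_label: (per_cpu_counts, device)} for HPET_MSI lines.

    Assumes the layout of the listing above; other kernels may differ.
    """
    found = {}
    ncpu = 0
    for raw in text.splitlines():
        # Drop a leading mail-quote prefix ("> ") and surrounding spaces.
        fields = raw.lstrip("> ").split()
        if not fields:
            continue
        if fields[0].startswith("CPU"):
            # Header row: one column per CPU.
            ncpu = len([f for f in fields if f.startswith("CPU")])
            continue
        if ncpu == 0 or not fields[0].endswith(":"):
            continue
        rest = fields[1:]
        if len(rest) < ncpu + 1:
            continue  # e.g. "ERR: 0" has no per-CPU columns
        counts = rest[:ncpu]
        if not all(c.isdigit() for c in counts):
            continue
        chip = rest[ncpu]
        if chip.startswith("HPET_MSI"):
            device = " ".join(rest[ncpu + 1:])
            found[fields[0][:-1]] = ([int(c) for c in counts], device)
    return found


# Hypothetical excerpt, copied from the listing above.
SAMPLE = """\
>            CPU0       CPU1       CPU2       CPU3
>   0:         82          0          3          0   IO-APIC-edge      timer
>   6:          0          0          4          0   IO-APIC-edge      floppy
>
>  24:     261763          0          0          0  HPET_MSI-edge      hpet2
> ERR:          0
"""

print(hpet_msi_irqs(SAMPLE))
```

[ On a live system the same function could be fed the contents of
  /proc/interrupts; an empty result would match the machines where no hpet2
  line shows up. ]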