Date:	Wed, 21 Oct 2009 12:01:33 +0200
From:	Alexander Huemer <alexander.huemer@....ac.at>
To:	Jean Delvare <jdelvare@...e.de>
CC:	Tejun Heo <tj@...nel.org>, Frans Pop <elendil@...net.nl>,
	linux-kernel@...r.kernel.org, linux-ide@...r.kernel.org,
	Jeff Garzik <jgarzik@...ox.com>, alexander.huemer@....ac.at
Subject: Re: 2.6.{30,31} x86_64 ahci problem - irq 23: nobody cared

Jean Delvare wrote:
> Hi Tejun, Alexander,
>
> On Tuesday 13 October 2009, Tejun Heo wrote:
>   
>> Alexander Huemer wrote:
>>     
>>> i compiled gcc in a loop over night, 14 times. no error.
>>> it really seems i2c_i801 was the cause...
>>> unfortunately i still don't know how i can extract the part of the gcc
>>> compilation process that causes the error on an affected kernel.
>>> that would enable me to create a simple test program.
>>>       
>> Given that i2c is used for temperature monitoring, I think it is not
>> triggered by any single step of the compiling but rather by the
>> accumulated heat load during compilation.  Let's wait for Jean to
>> chime in.  :-)
>>     
>
> OK, here I am, sorry for the delay. I've read the discussion thread.
> Here are the few data points I can offer, in the hope it will help:
>
> * While the i2c-i801 driver received some changes in kernel 2.6.30,
>   none of these are related to PCI nor interrupts. So as the problem
>   is new in kernel 2.6.30, the i2c-i801 driver alone is unlikely to
>   cause it. This may, however, be a combination of something i2c-i801
>   does and something the pci subsystem does since kernel 2.6.30. For
>   this reason, I would still recommend a bisection if the problem can
>   be reliably reproduced. I know it takes time, but it is always
>   easier to fix a bug when we know which commit introduced it.
>
> * The i2c-i801 driver does _not_ make use of interrupts. It is
>   poll-based (I am not exactly proud of that, but that's the way it
>   is.)
>
>   #define ENABLE_INT9		0	/* set to 0x01 to enable - untested */
>
>   So I am very surprised to read that this driver would cause an IRQ
>   storm.
>
> * One thing the i2c-i801 driver does on the PCI device is:
>
>   err = pci_enable_device(dev);
>
>   I presume this is what causes the following message in dmesg:
>
>   i801_smbus 0000:00:1f.3: PCI INT B -> GSI 23 (level, low) -> IRQ 23
>
>   Basically, even though the driver doesn't make use of interrupts,
>   the IRQ is still registered because this is how the hardware is
>   setup.
>
> As a conclusion, I suspect that 2 things may be happening: either
> the SMBus is triggering interrupts when told not to. The ICH6 is a
> bit different from all the other supported chips, I'll double check
> if we may have missed something. Or, something else is triggering
> SMBus transactions. SMI and ACPI come to mind. If this is the case
> then you do not want to use i2c-i801 on this motherboard.
>
> Questions to Alexander :
>
> * Can I please see the output of "sensors" on your system?
> * What are the brand and model of your motherboard?
> * Can we get an acpidump for your system?
>
>   
many thanks for your response. i appreciate that.
first, the data you requested:

    sensors:        http://xx.vu/~ahuemer/sensors-ahuemer-20091021.txt
    acpidump:       http://xx.vu/~ahuemer/acpidump-ahuemer-20091021.txt
    motherboard:    tyan tempest i5400pw/s5397 with one intel xeon e5420.

the sensors output was captured _without_ i801_smbus in the kernel.
i noticed that the readings from w83627hf-isa-0290 look quite odd. i do
not have an explanation for that.
if a bisection is what will shed light on this, i am willing to take
the time.
so that would be a bisection between 2.6.29 and 2.6.30?
a quicker test case would help with that, but i don't have one yet,
only the compilation of gcc, which takes time even on this machine with
tmpfs and ccache.
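
one possible quicker indicator, as a rough sketch only: since the symptom
is the kernel reporting "irq 23: nobody cared", sampling the counter for
IRQ 23 in /proc/interrupts while some load runs might show whether
interrupts pile up long before a full gcc build finishes. the names
irq_total and watch are made up for illustration, and whether the counter
rises early enough to be useful on this machine is an assumption, not
something that has been tested.

```python
import time


def irq_total(text, irq):
    """Sum the per-CPU counters of one IRQ line in /proc/interrupts-style text.

    Returns None if the IRQ is not listed.
    """
    label = f"{irq}:"
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == label:
            total = 0
            for field in fields[1:]:
                if not field.isdigit():
                    # First non-numeric field: the controller/device names start here.
                    break
                total += int(field)
            return total
    return None


def watch(irq=23, interval=5.0, samples=12, path="/proc/interrupts"):
    """Print the running total and the per-interval delta for one IRQ.

    The defaults (5 s interval, 12 samples) are arbitrary illustration values.
    """
    previous = None
    for _ in range(samples):
        with open(path) as handle:
            current = irq_total(handle.read(), irq)
        if current is None:
            print(f"irq {irq}: not listed in {path}")
        elif previous is not None:
            print(f"irq {irq}: total={current} delta={current - previous}")
        previous = current
        time.sleep(interval)
```

calling watch(23) in one terminal while the load runs in another would
print one line per interval. a delta that jumps on an affected kernel but
stays flat on 2.6.29 could perhaps serve as the good/bad signal for each
bisection step, if it turns out to correlate with the failure at all.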


