Message-ID: <55D4A85F.1080304@real-time-systems.com>
Date:	Wed, 19 Aug 2015 18:01:35 +0200
From:	Stefan Fausser <kernel_tk@...l-time-systems.com>
To:	intel-linux-scu@...el.com, artur.paszkiewicz@...el.com,
	JBottomley@...n.com, linux-scsi@...r.kernel.org,
	linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: isci, INTx mode, race condition

Dear all,

attached are two patches for the "isci" module (CONFIG_SCSI_ISCI).

Both patches apply to the current Linux kernel (4.2.0-rc7), retrieved 
via git.

The first patch (init.patch) is for reproducing the problem with the 
"Intel(R) C600 SAS Controller" in INTx Mode, see below. The second patch 
(host.patch) is for fixing this problem.

The problem:

With the first patch, "init.patch", applied, the "Intel(R) C600 SAS 
Controller" (abbreviated as SAS below) generates level-triggered INTx 
interrupts instead of (edge-triggered) MSI-X interrupts.

In the ISR (isci_intx_isr), the controller determines whether the 
interrupt is due to normal operation (a normal interrupt) or an error. 
In the case of a normal interrupt, a tasklet is scheduled to handle it. 
However, the ISR leaves the interrupts unmasked, so the SAS device may 
trigger the next interrupt after the ISR has returned and before the 
tasklet has had a chance to run.

Thus, with "init.patch" applied on my system (Intel C600 chipset 
series), the SAS device repeatedly level-triggers the interrupt and the 
tasklet that should handle the interrupt never gets to run. This 
results in a soft lockup on the executing core.
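
To make the failure mode concrete, here is a minimal user-space sketch 
in C. It is not the driver code: the struct and all function names are 
hypothetical, and it assumes a simplified model in which an asserted 
level-triggered line always re-enters the ISR before a scheduled 
tasklet gets to run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of device and CPU state; not taken from isci. */
struct sim {
    bool completion_pending;  /* device has work only the tasklet consumes */
    bool tasklet_scheduled;   /* ISR has requested the tasklet */
    unsigned isr_calls;
    unsigned tasklet_runs;
};

/* A level-triggered line stays asserted while its cause persists.
 * Nothing masks it in this model, mirroring the unpatched INTx ISR. */
static bool line_asserted(const struct sim *s)
{
    return s->completion_pending;
}

/* ISR for a normal interrupt: schedule the tasklet, leave the line alone. */
static void intx_isr(struct sim *s)
{
    s->isr_calls++;
    s->tasklet_scheduled = true;
}

/* Tasklet: consume the completion, which would deassert the line. */
static void completion_tasklet(struct sim *s)
{
    assert(s->tasklet_scheduled);  /* only runs after being scheduled */
    s->tasklet_runs++;
    s->completion_pending = false;
    s->tasklet_scheduled = false;
}

/* One core, 'steps' scheduling decisions: a pending hard interrupt
 * always wins over deferred work (assumption of this model). */
static struct sim run_unmasked(unsigned steps)
{
    struct sim s = { .completion_pending = true };

    for (unsigned i = 0; i < steps; i++) {
        if (line_asserted(&s))
            intx_isr(&s);
        else if (s.tasklet_scheduled)
            completion_tasklet(&s);
    }
    return s;
}
```

In this model, run_unmasked() spends every step in the ISR and the 
tasklet is never reached, which corresponds to the soft lockup 
described above.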

According to my investigations, the problem described above occurs in 
all Linux kernel versions from 3.5 up to the present.

The fix:

With the second patch, "host.patch", applied, the interrupts are masked 
in the INTx ISR in the case of a normal interrupt. Thus, the kernel has 
enough time to run the handling tasklet. In the tasklet (see 
sci_controller_completion_handler), the interrupts are unmasked again.
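
The mask-in-ISR, unmask-in-tasklet pattern can be sketched in user 
space as follows. This is not the content of host.patch: the struct, 
the MASK_ALL value and all function names are hypothetical, and the 
model assumes that an asserted level-triggered line always preempts 
deferred work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK_ALL 0xFFFFFFFFu  /* hypothetical "mask everything" value */

/* Hypothetical model of device and CPU state; not taken from isci. */
struct sim {
    uint32_t interrupt_mask;  /* stand-in for a device mask register */
    bool completion_pending;  /* device has work only the tasklet consumes */
    bool tasklet_scheduled;
    unsigned isr_calls;
    unsigned tasklet_runs;
};

/* The level-triggered line is asserted only while the cause persists
 * and the device-level mask is clear. */
static bool line_asserted(const struct sim *s)
{
    return s->completion_pending && s->interrupt_mask == 0;
}

/* ISR for a normal interrupt: mask first, then schedule the tasklet. */
static void intx_isr(struct sim *s)
{
    s->isr_calls++;
    s->interrupt_mask = MASK_ALL;
    s->tasklet_scheduled = true;
}

/* Tasklet: consume the completion, then unmask again. */
static void completion_tasklet(struct sim *s)
{
    assert(s->tasklet_scheduled);  /* only runs after being scheduled */
    s->tasklet_runs++;
    s->completion_pending = false;
    s->tasklet_scheduled = false;
    s->interrupt_mask = 0;
}

/* One core, 'steps' scheduling decisions: a pending hard interrupt
 * always wins over deferred work (assumption of this model). */
static struct sim run_masked(unsigned steps)
{
    struct sim s = { .completion_pending = true };

    for (unsigned i = 0; i < steps; i++) {
        if (line_asserted(&s))
            intx_isr(&s);
        else if (s.tasklet_scheduled)
            completion_tasklet(&s);
    }
    return s;
}
```

Because the ISR deasserts the line by masking, the next step of the 
loop reaches the tasklet, which handles the completion and leaves the 
mask cleared for the next interrupt.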

Please let me know if you need any other information.

Kind Regards,

Stefan


View attachment "host.patch" of type "text/x-patch" (558 bytes)

View attachment "init.patch" of type "text/x-patch" (449 bytes)
