Message-ID: <500D93F5.4090305@linux.vnet.ibm.com>
Date:	Mon, 23 Jul 2012 15:12:05 -0300
From:	Kleber Sacilotto de Souza <klebers@...ux.vnet.ibm.com>
To:	Or Gerlitz <ogerlitz@...lanox.com>
CC:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
	jackm@....mellanox.co.il, yevgenyp@...lanox.co.il,
	cascardo@...ux.vnet.ibm.com, brking@...ux.vnet.ibm.com,
	shlomop@...lanox.com
Subject: Re: [PATCH] mlx4: Add support for EEH error recovery

On 07/23/2012 10:45 AM, Or Gerlitz wrote:

> On 7/23/2012 4:18 PM, Kleber Sacilotto de Souza wrote:
>> Exactly. The callbacks implemented are from standard PCI error recovery
>> (Documentation/PCI/pci-error-recovery.txt) and the changes don't
>> assume any specific platform. The code was tested only on powerpc
>> systems [...]
> 
> So how did you test that? Using the kernel-provided error injection
> support and a user space tool (which?), or in another way? We quickly
> tried here to inject errors using /sbin/aer-inject from
> ras-utils-6.1-1.el6.x86_64 on a kernel built with
> 
> CONFIG_PCIEAER=y
> CONFIG_PCIEAER_INJECT=m


For powerpc we have an IBM internal user space tool that injects the
error on the bus with the aid of the system firmware. The kernel used
was built with the option:

CONFIG_EEH=y

and without the AER options. I will run some more tests with the AER
options activated.

> 
> and it failed to inject errors; see the details below.
> 
> Or.
>> since I don't have any mlx4 card on other platforms. However,
>> these changes shouldn't make the error recovery any worse than the
>> current state.
> 
>> # lspci | grep 08.00.1
>> 08:00.1 Ethernet controller: Intel Corporation 82575EB Gigabit Network
>> Connection (rev 02)
> 
>> # cat /tmp/intel.aer
>> AER
>> BUS 8 DEV 0 FN 1
>> COR_STATUS BAD_TLP
>> HEADER_LOG 0 1 2 3
> 
>> # /sbin/aer-inject < /tmp/intel.aer
>> Error: Failed to write, Invalid argument
> 
> 
> 
>> # strace -F -f /sbin/aer-inject < /tmp/intel.aer
>> [...]
> 
>> open("/dev/aer_inject", O_WRONLY)       = 3
>> write(3, "\10\0\1\0\0\0\0\0@\0\0\0\0\0\0\0\1\0\0\0\2\0\0\0\3\0\0\0",
>> 28) = -1 EINVAL (Invalid argument)
>> write(2, "Error: ", 7Error: )                  = 7
>> write(2, "Failed to write", 15Failed to write)         = 15
>> write(2, ", Invalid argument\n", 19, Invalid argument
>> )    = 19
>> exit_group(-1)                          = ?
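
For reference, the 28-byte buffer passed to write() in the strace output above can be decoded by hand. The sketch below is in Python and assumes a little-endian layout of three one-byte fields (bus, device, function), one padding byte, and six 32-bit words (uncorrectable status, correctable status, four header-log words). That layout is an inference from the buffer length and the contents of /tmp/intel.aer, not something stated in this thread; the names LAYOUT and decode are made up for illustration.

```python
import struct

# Bytes from the write() call in the strace output. strace prints C-style
# octal escapes, so "\10" is the value 8 and "@" is 0x40.
payload = bytes([
    8, 0, 1, 0,        # bus, dev, fn, one padding byte (assumed)
    0, 0, 0, 0,        # uncorrectable status (assumed field)
    0x40, 0, 0, 0,     # correctable status (assumed field)
    0, 0, 0, 0,        # header log word 0
    1, 0, 0, 0,        # header log word 1
    2, 0, 0, 0,        # header log word 2
    3, 0, 0, 0,        # header log word 3
])

# Assumed layout, little-endian as on x86_64: three u8, one pad byte, six u32.
LAYOUT = "<BBBxIIIIII"

# Bad TLP bit in the AER correctable error status register (pci_regs.h).
PCI_ERR_COR_BAD_TLP = 0x40


def decode(buf):
    """Unpack the buffer into named fields under the assumed layout."""
    bus, dev, fn, uncor, cor, *header_log = struct.unpack(LAYOUT, buf)
    return {
        "bus": bus,
        "dev": dev,
        "fn": fn,
        "uncor_status": uncor,
        "cor_status": cor,
        "header_log": header_log,
    }


record = decode(payload)
for name, value in record.items():
    print(name, value)
print("BAD_TLP set:", bool(record["cor_status"] & PCI_ERR_COR_BAD_TLP))
```

Under that assumption the buffer matches the input file (bus 8, device 0, function 1, BAD_TLP, header log 0 1 2 3). That would suggest the tool encoded the request as intended and the EINVAL was raised on the kernel side, though the thread does not confirm this.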
> 
> 
> 
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center

