Message-ID: <5321D412.1040501@linux.vnet.ibm.com>
Date:	Thu, 13 Mar 2014 10:51:46 -0500
From:	Carol Soto <clsoto@...ux.vnet.ibm.com>
To:	Eli Cohen <eli@....mellanox.co.il>
CC:	Ben Hutchings <ben@...adent.org.uk>, eli@...lanox.com,
	roland@...nel.org, sean.hefty@...el.com, hal.rosenstock@...il.com,
	linux-rdma@...r.kernel.org, netdev@...r.kernel.org,
	brking@...ux.vnet.ibm.com
Subject: Re: [Patch 1/2] IB/mlx5: Implementation of PCI error handler


On 3/13/2014 10:40 AM, Eli Cohen wrote:
> On Thu, Mar 13, 2014 at 10:12:19AM -0500, Carol Soto wrote:
>> In mlx4 code, I do not recall a command timeout this big. So is the
>> 2 hr timeout in mlx5 just for debugging purposes? And if a command
>> hangs for any reason, the user cannot remove this module for the
>> next 2 hrs?
>>
> Hi Carol,
> Well, I haven't seen any such case with the latest firmware releases.
> Anyway, 10 msec is really too short a timeout value, since there are
> commands that can take more than that (e.g. memory registration of
> regions larger than 512 MB - though this will be changed soon). I
> wonder what the original motivation was, and have you been able to
> simulate PCI errors and see this in action?
Hi Eli,

The motivation for reducing that timeout is that if a process is in the
middle of a HW command when the PCI error occurs, I did not want it to
wait 2 hrs, since the command will never complete once the card is dead.
You are right, though: I forgot the case of big memory registrations,
where commands can take longer than that. Do you have an idea of the
longest time a command can take in mlx5?
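
One way to reconcile the two concerns (long commands such as big memory
registrations vs. not waiting on a dead card) might be to keep the long
timeout but have the wait loop bail out as soon as the device is flagged
as being in PCI error. Below is a minimal user-space sketch of that idea
using simulated time. It is not mlx5 code: struct sim_dev, wait_cmd(),
the result codes and all the numbers are hypothetical names and values
made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical result codes; not taken from the mlx5 driver. */
enum cmd_result { CMD_DONE = 0, CMD_TIMED_OUT = -1, CMD_ABORTED = -2 };

/* Marker for a command that never completes (e.g. the card is dead). */
#define CMD_NEVER ((unsigned long)-1)

/* Hypothetical stand-in for per-device state. */
struct sim_dev {
    bool pci_error;             /* would be set by a PCI error callback */
    unsigned long error_at_ms;  /* simulated time of the error, 0 = never */
};

/*
 * Poll for command completion in simulated time.
 * The timeout can stay long for slow commands, because the loop
 * re-checks the error flag on every poll and aborts early if it is set.
 */
enum cmd_result wait_cmd(struct sim_dev *dev, unsigned long done_at_ms,
                         unsigned long timeout_ms, unsigned long poll_ms)
{
    unsigned long now;

    if (poll_ms == 0)
        poll_ms = 1;  /* avoid an endless loop on a zero poll interval */

    for (now = 0; now <= timeout_ms; now += poll_ms) {
        /* Simulate the PCI error handler firing mid-command. */
        if (dev->error_at_ms && now >= dev->error_at_ms)
            dev->pci_error = true;

        if (dev->pci_error)
            return CMD_ABORTED;   /* card is gone, stop waiting */
        if (now >= done_at_ms)
            return CMD_DONE;      /* command completed in time */
    }
    return CMD_TIMED_OUT;
}
```

With a 2 hr timeout (7200000 ms) and a 10 ms poll interval, a command
that completes after 30 s still returns CMD_DONE, while a command on a
device that hits a PCI error 50 ms in returns CMD_ABORTED instead of
waiting out the full timeout.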


Carol


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
