Message-ID: <20140114153342.GC32193@1wt.eu>
Date:	Tue, 14 Jan 2014 16:33:42 +0100
From:	Willy Tarreau <w@....eu>
To:	Ben Hutchings <ben@...adent.org.uk>
Cc:	davem@...emloft.net, netdev@...r.kernel.org,
	Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>,
	Gregory CLEMENT <gregory.clement@...e-electrons.com>
Subject: Re: [PATCH 3/5] net: mvneta: do not schedule in mvneta_tx_timeout

Hi Ben,

On Sun, Jan 12, 2014 at 05:38:53PM +0000, Ben Hutchings wrote:
> I think this will DTRT, but it's compile-tested only.  I have been given
> an OpenBlocks AX3 but haven't set it up yet.

OK, I just managed to test your patch. I triggered a Tx timeout by
forcing the link to 100/half and transferring 1000 concurrent streams.

Unfortunately, for now the driver doesn't manage to recover with the patch
applied, and the system randomly panics one or two seconds after the link
is brought up. Twice the system did not panic, but I lost all communications
until a down/up cycle, after which a panic happened during transfers.

However I could verify that the scheduled function is correctly called. I
suspect that something else might be wrong in the driver's reset sequence
(e.g. unmapping pages still in use by the NIC, or I don't know what), but
your patch does exactly what it's supposed to do.

At least, if the restart function does not do anything, everything works
fine. I see that the function is called (I added a printk there) and the
transfer is not disturbed at all anymore.

So now I'm wondering whether the right thing should not be to just keep
your scheduled function and make it only log that a timeout was caught.

Another point which bothers me is that I suspect we're triggering Tx
timeouts too fast, because I regularly get these on 100 Mbps during
regular traffic (which ended up in immediate panics with previous code).

Thanks,
Willy
