Date:	Wed, 28 Dec 2011 18:13:43 +0100
From:	Ben Hutchings <bhutchings@...arflare.com>
To:	Bjarke Istrup Pedersen <gurligebis@...too.org>
CC:	<linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>,
	Roger Luethi <rl@...lgate.ch>
Subject: Re: [PATCH 1/1] via-rhine: Fix hanging with high CPU load on
 low-end boards.

On Wed, 2011-12-28 at 16:14 +0100, Bjarke Istrup Pedersen wrote:
> 2011/12/28 Ben Hutchings <bhutchings@...arflare.com>:
> > On Wed, 2011-12-28 at 12:28 +0000, Bjarke Istrup Pedersen wrote:
> >> Working around a problem causing high CPU load and a hanging system
> >> when there is a lot of network traffic.
> >>
> >> It is kind of an ugly way to work around it, but it allows the Soekris
> >> net5501 to have traffic between two of its NICs without hanging so much
> >> that the watchdog kicks in and does a hard reboot of the system.
> >>
> >> There is more info on the problem here:
> >> http://lists.soekris.com/pipermail/soekris-tech/2010-October/016889.html
> >>
> >> Tested with positive results on two Soekris net5501-70 boxes.
> >
> > This is completely wrong.  In a UP configuration the extra spinlock
> > calls have no effect (except perhaps a small delay).  In an SMP
> > configuration they will cause rhine_tx() to deadlock when it also tries
> > to acquire the spinlock.
> >
> > Ben.
> 
> Okay, the Soekris net5501-70 boxes are single-core, and I haven't got
> any SMP boxes with that NIC.
> Is there a better solution for the problem then, to avoid it hanging
> the box on a non-SMP machine with a slow (500 MHz) CPU?

If the system actually hangs then I assume there is some bug in the
driver.  I would guess the actual problem is that the interrupt and NAPI
handlers are running constantly so that user processes never run (which
I think counts as soft lockup).
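
As a rough illustration of that guess, the starvation can be sketched
with a back-of-the-envelope model in Python. This is not driver code. It
assumes a single CPU on which interrupt and NAPI work always runs ahead
of user processes; the function name, the packet rates and the 20 us
per-packet cost are hypothetical values chosen for illustration, not
measurements from a net5501.

```python
def user_cpu_share(packets_per_sec, cost_per_packet_us):
    """Fraction of one CPU left over for user processes.

    Assumption: interrupt and NAPI handling always takes priority, so
    user processes only get the time that packet processing leaves idle.
    """
    busy = packets_per_sec * cost_per_packet_us / 1_000_000
    return max(0.0, 1.0 - busy)


# Hypothetical load levels; the cost per packet is an assumed figure.
for pps in (10_000, 40_000, 80_000):
    share = user_cpu_share(pps, cost_per_packet_us=20)
    print(f"{pps} packets/s -> user share {share:.2f}")
```

Once the packet rate times the per-packet cost reaches one full CPU,
the share for user processes drops to zero under this model, which is
the situation described above.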

If the hardware supports it, interrupt moderation may help a little by
slightly reducing the per-packet processing cost, but it isn't a full
solution.  Or you can use a real-time kernel, which schedules interrupt
and NAPI handlers as tasks, and adjust priorities so that user processes
can still run.  But that brings its own problems (including generally
lower throughput).
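
Why interrupt moderation only helps a little can be sketched the same
way. The model below assumes each interrupt carries a fixed overhead
that is shared by all packets handled in that interrupt, plus a fixed
amount of work per packet. The function names and the 5 us / 15 us
figures are hypothetical, picked only to show the shape of the effect.

```python
def cost_per_packet_us(irq_overhead_us, work_per_packet_us, packets_per_irq):
    """Average CPU cost per packet when one interrupt covers a batch."""
    return irq_overhead_us / packets_per_irq + work_per_packet_us


def max_sustainable_pps(irq_overhead_us, work_per_packet_us, packets_per_irq):
    """Packet rate at which network processing consumes the whole CPU."""
    cost = cost_per_packet_us(irq_overhead_us, work_per_packet_us,
                              packets_per_irq)
    return 1_000_000 / cost


# Assumed costs: 5 us per interrupt, 15 us of work per packet.
for batch in (1, 10, 100):
    pps = max_sustainable_pps(5, 15, batch)
    print(f"{batch} packets/irq -> saturates at about {pps:.0f} packets/s")
```

Larger batches shrink only the interrupt overhead term. The per-packet
work remains, so in this model the saturation point moves up but does
not go away.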

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
