Message-ID: <alpine.LFD.2.00.0906170903370.2800@localhost.localdomain>
Date:	Wed, 17 Jun 2009 10:45:38 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	LKML <linux-kernel@...r.kernel.org>
cc:	rt-users <linux-rt-users@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>,
	Steven Rostedt <rostedt@...dmis.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Carsten Emde <ce@...g.ch>,
	Clark Williams <williams@...hat.com>,
	Frank Rowand <frank.rowand@...sony.com>,
	Robin Gareus <robin@...eus.org>,
	Gregory Haskins <ghaskins@...ell.com>,
	Philippe Reynes <philippe.reynes@...smpp.fr>,
	Fernando Lopez-Lezcano <nando@...ma.Stanford.EDU>,
	Will Schmidt <will_schmidt@...t.ibm.com>,
	Darren Hart <dvhltc@...ibm.com>, Jan Blunck <jblunck@...e.de>,
	Sven-Thorsten Dietrich <sdietrich@...ell.com>,
	Jon Masters <jcm@...hat.com>
Subject: [ANNOUNCE] 2.6.29.5-rt21

We are pleased to announce the next update to our new preempt-rt
series.

   - update to 2.6.29.5 (2.6.29.5-rt20, which I uploaded yesterday but
     did not announce due to the findings below)

   - softirq: lower default priority below hardirq default priority

This fixes a long-standing default priority configuration problem of
the -rt series. On UP machines this can result in the net_tx softirq
running in an endless loop and starving the irq threads, the other
softirq threads and of course everything with lower priority. It might
also happen on an SMP machine when the hardirq thread affinities are
tweaked in the right way.

What happens is:

     tx interrupt
        lock(card->tx_lock);
        dev_kfree_skb_any(skb);
          blocks on a contended lock

     net_tx softirq runs
        unlocks contended lock but does not schedule away due to equal prio
        repeat:
          calls xmit
          try_lock(card->tx_lock) fails
            -> reschedule skb which keeps net_tx running
          goto repeat;

The scheduler does not schedule away net_tx, so this goes on forever.
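
The sequence above can be replayed with a small toy model. This is only
a sketch and not kernel code: simulate(), its parameters and the
priority values 50 and 49 are made up for illustration, and the only
scheduling rule modelled is that a woken thread preempts the running
one only when its priority is strictly higher.

```python
def simulate(irq_prio, softirq_prio, max_rounds=1000):
    """Replay the trace with two fixed-priority threads.

    Returns the number of failed xmit attempts before xmit succeeds,
    or None if net_tx is still spinning after max_rounds attempts.
    """
    # The tx interrupt thread holds card->tx_lock and is blocked on a
    # contended lock inside dev_kfree_skb_any().
    tx_lock_held_by_irq = True

    # net_tx softirq runs and unlocks the contended lock, which makes
    # the irq thread runnable again.
    irq_runnable = True

    failed = 0
    for _ in range(max_rounds):
        # The woken irq thread preempts net_tx only if its priority is
        # strictly higher. With equal priority net_tx keeps the CPU.
        if irq_runnable and irq_prio > softirq_prio:
            # irq thread finishes freeing the skb, drops tx_lock, sleeps.
            tx_lock_held_by_irq = False
            irq_runnable = False

        # net_tx calls xmit, which does try_lock(card->tx_lock).
        if not tx_lock_held_by_irq:
            return failed  # try_lock succeeded, the skb is sent
        failed += 1  # try_lock failed: skb is rescheduled, goto repeat

    return None  # never made progress within max_rounds
```

In this model simulate(50, 50) returns None, i.e. net_tx never gets out
of the loop, while simulate(50, 49) returns as soon as the irq thread
has been allowed to run and drop tx_lock.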

This has been there forever, but it seems to be easier to trigger in
the 29 -rt series, which is probably due to the slab cache lock breaks
we did.

The problem is restricted to about a dozen wireless adapters and
network cards, of which e1000e is the most popular one. We could patch
the affected drivers for -rt, but we need to have a closer look at the
general assumptions of drivers vs. hardirq/softirq. Note, this is not
a mainline problem as the semantics are entirely correct there.

Lowering the priorities of the softirq threads below the hardirq
threads priorities is a safe workaround for now. It prevents the
runaway scenario under all circumstances as it resembles the mainline
semantics closely.

For all existing -rt systems the problem can be solved w/o patching
the kernel by adjusting the priority of the softirq threads from the
init scripts with chrt.
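
As a rough illustration of that adjustment, here is a sketch of a
script that does what "chrt -f -p <prio> <pid>" would do for each
softirq thread. Everything specific in it is an assumption and should
be checked against the running system: the "sirq-" thread name prefix,
the SCHED_FIFO policy, and the priority 49 (picked only as an example
of a value below a hardirq thread priority of 50). The function names
are made up for this example. It needs root to change anything, so it
defaults to a dry run.

```python
import os


def parse_comm(stat_line):
    """Return the thread name from a /proc/<pid>/stat line.

    The name is the text between the first '(' and the last ')'.
    """
    return stat_line[stat_line.index("(") + 1:stat_line.rindex(")")]


def softirq_threads(prefix="sirq-"):
    """Yield (pid, name) for every task whose name starts with prefix.

    The prefix is an assumption; check the names with ps first.
    """
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                name = parse_comm(f.read())
        except (OSError, ValueError):
            continue  # task exited or unexpected stat format
        if name.startswith(prefix):
            yield int(entry), name


def lower_softirq_priority(prio=49, prefix="sirq-", dry_run=True):
    """Set SCHED_FIFO priority prio on all matching threads.

    Returns the list of (pid, name) that matched. With dry_run=True
    nothing is changed.
    """
    matched = []
    for pid, name in softirq_threads(prefix):
        if not dry_run:
            # Same effect as: chrt -f -p <prio> <pid>
            os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(prio))
        matched.append((pid, name))
    return matched
```

Calling lower_softirq_priority() first with the default dry_run=True
shows which threads would be touched; dry_run=False then applies the
change. The priority has to stay below whatever the hardirq threads on
the system actually use.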

It's extremely hard to trigger this and we never had a report of it
before. I want to say thanks to Bernd Oelker, who meticulously worked
on reproducing the problem and debugging it with all evil methods and
patches I could come up with. And no, I'm not going to tell you which
nasty hacks made it possible to decode this :)

Download locations:

    http://rt.et.redhat.com/download/
    http://www.kernel.org/pub/linux/kernel/projects/rt/

Information on the RT patch can be found at:

    http://rt.wiki.kernel.org/index.php/Main_Page

To build the 2.6.29.5-rt21 tree, the following patches should be
applied:

    http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.29.5.tar.bz2
    http://www.kernel.org/pub/linux/kernel/projects/rt/patch-2.6.29.5-rt21.bz2

The broken out patches are also available at the same download
locations.

Enjoy !

       tglx
