Message-ID: <20141112221949.2018701d@uryu.home.lan>
Date: Wed, 12 Nov 2014 22:19:49 -0500
From: Stephen Hemminger <stephen@...workplumber.org>
To: netdev@...r.kernel.org
Subject: Fw: [Bug 88111] New: Race condition in net_tx_action?
Begin forwarded message:
Date: Wed, 12 Nov 2014 10:52:10 -0800
From: "bugzilla-daemon@...zilla.kernel.org" <bugzilla-daemon@...zilla.kernel.org>
To: "stephen@...workplumber.org" <stephen@...workplumber.org>
Subject: [Bug 88111] New: Race condition in net_tx_action?
https://bugzilla.kernel.org/show_bug.cgi?id=88111
Bug ID: 88111
Summary: Race condition in net_tx_action?
Product: Networking
Version: 2.5
Kernel Version: all
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Other
Assignee: shemminger@...ux-foundation.org
Reporter: angelo.rizzi@...utomazione.it
Regression: No
Hi all,
I have a question about a strange situation I've faced on my Linux-based
embedded system:
Using 2 network devices (transmitting asynchronously), I found a kind of "leak"
in sk_buff alloc/free that, after some days of continuous transmission, leaves
my test program unable to write to the transmitting socket (poll() with a
POLLOUT request always returns 0).
After a lot of testing, I've traced this behaviour to the
net_tx_action() function (net/core/dev.c).
Let me explain what I've found.
The following code is used to get the current list of sk_buffs to free:
static void net_tx_action(struct softirq_action *h)
{
	struct softnet_data *sd = &__get_cpu_var(softnet_data);

	if (sd->completion_queue) {
		struct sk_buff *clist;

		local_irq_disable();
		clist = sd->completion_queue;
		sd->completion_queue = NULL;
		local_irq_enable();
Transmitting asynchronously on all the available network devices, I've noticed
the following behaviour:
a) The test "if (sd->completion_queue) {" loads the pointer value into a CPU
register (the register contents are used for the comparison).
b) Interrupts are disabled (using "local_irq_disable").
c) When "clist" is assigned, the register is reused instead of re-reading
the "completion_queue" variable.
So, when a low-level tx interrupt arrives after "completion_queue" has been
latched but before "local_irq_disable", the value stored in "clist" reflects
the state before that interrupt. "completion_queue" is then set to NULL, so
the sk_buffs queued by the interrupt are never freed, resulting in an sk_buff
leak.
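
The sequence in a) to c) can be modelled in user space. The following is only a minimal sketch, not kernel code: "struct buf", "fake_tx_irq" and both drain functions are hypothetical stand-ins, and the "interrupt" is an ordinary function call placed exactly in the window between the latched read and the point where interrupts would be disabled. It assumes, as described above, that the compiler reuses the value loaded for the "if" test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: only the list link matters. */
struct buf {
	struct buf *next;
};

/* Hypothetical stand-in for sd->completion_queue. */
static struct buf *completion_queue;

/* Stand-in for the low-level tx interrupt: queues one finished buffer. */
static void fake_tx_irq(struct buf *b)
{
	assert(b != NULL);
	b->next = completion_queue;
	completion_queue = b;
}

/* Number of buffers reachable from the head of a list. */
static int list_len(const struct buf *l)
{
	int n = 0;

	for (; l != NULL; l = l->next)
		n++;
	return n;
}

/*
 * Models steps a) to c): the head is latched once, the interrupt fires,
 * and the latched (stale) value is reused for clist.
 * Returns how many buffers the caller would go on to free.
 */
static int drain_with_latched_value(struct buf *late)
{
	struct buf *latched = completion_queue;	/* a) value kept in a register */
	struct buf *clist;

	if (latched == NULL)
		return 0;
	fake_tx_irq(late);		/* irq lands before local_irq_disable() */
	clist = latched;		/* c) stale value, 'late' is not in it */
	completion_queue = NULL;	/* 'late' is now unreachable: leaked */
	return list_len(clist);
}

/* Same window, but the head is loaded again after the interrupt. */
static int drain_with_reread(struct buf *late)
{
	struct buf *clist;

	if (completion_queue == NULL)
		return 0;
	fake_tx_irq(late);
	clist = completion_queue;	/* fresh load sees 'late' */
	completion_queue = NULL;
	return list_len(clist);
}
```

With one buffer already queued and a second one arriving in the window, the first function hands back a list of one buffer (the late one is lost), while the second hands back both.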
I've changed the declaration of "sd" as follows:
volatile struct softnet_data *sd = &__get_cpu_var(softnet_data);
and everything is now ok.
Is that correct?
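
For comparison, a narrower way to get the same effect is to force a fresh load only where "completion_queue" is read, instead of making every access through "sd" volatile. The sketch below is a user-space illustration under stated assumptions: "struct skb_stub", "struct softnet_stub", "LOAD_ONCE" and "take_completion_list" are hypothetical names, the macro follows the pattern of the kernel's ACCESS_ONCE(), and it relies on the GCC/Clang "__typeof__" extension. Whether this is the appropriate fix for the kernel is exactly the open question here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sk_buff and struct softnet_data. */
struct skb_stub {
	struct skb_stub *next;
};

struct softnet_stub {
	struct skb_stub *completion_queue;
};

/*
 * Force the compiler to emit a real load for this one access by going
 * through a volatile-qualified lvalue (GCC/Clang __typeof__ extension).
 */
#define LOAD_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Detach and return the whole completion list, or NULL if it is empty. */
static struct skb_stub *take_completion_list(struct softnet_stub *sd)
{
	struct skb_stub *clist;

	assert(sd != NULL);
	if (LOAD_ONCE(sd->completion_queue) == NULL)
		return NULL;

	/* In the kernel, local_irq_disable() would go here. */
	clist = LOAD_ONCE(sd->completion_queue);	/* loaded again, not reused */
	sd->completion_queue = NULL;
	/* In the kernel, local_irq_enable() would go here. */

	return clist;
}
```

Because each LOAD_ONCE() is a separate volatile access, the value tested in the "if" cannot be carried over into "clist"; the rest of the structure keeps its normal, optimizable accesses.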
Thanks,
Angelo