Message-ID: <20141112221949.2018701d@uryu.home.lan>
Date:	Wed, 12 Nov 2014 22:19:49 -0500
From:	Stephen Hemminger <stephen@...workplumber.org>
To:	netdev@...r.kernel.org
Subject: Fw: [Bug 88111] New: Race condition in net_tx_action?



Begin forwarded message:

Date: Wed, 12 Nov 2014 10:52:10 -0800
From: "bugzilla-daemon@...zilla.kernel.org" <bugzilla-daemon@...zilla.kernel.org>
To: "stephen@...workplumber.org" <stephen@...workplumber.org>
Subject: [Bug 88111] New: Race condition in net_tx_action?


https://bugzilla.kernel.org/show_bug.cgi?id=88111

            Bug ID: 88111
           Summary: Race condition in net_tx_action?
           Product: Networking
           Version: 2.5
    Kernel Version: all
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: shemminger@...ux-foundation.org
          Reporter: angelo.rizzi@...utomazione.it
        Regression: No

Hi all,

I have a question about a strange situation I've faced on my Linux-based
embedded system:

Using two network devices (transmitting asynchronously), I found a kind of
"leak" in sk_buff alloc/free that, after some days of continuous transmission,
leaves my test program unable to write to the transmitting socket (poll() with
a POLLOUT request always returns 0).
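
For reference, the symptom above can be checked with a few lines of C. This is only a sketch: socket_is_writable() and demo_fresh_socket_writable() are hypothetical helper names, not part of the test program from the report, and the AF_UNIX socket pair merely stands in for the real transmitting socket. A return value of 0 from poll() means the timeout expired without the socket becoming writable, which is the state described above.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if fd becomes writable within timeout_ms,
 * 0 if poll() timed out (the symptom in the report), -1 on error.
 */
static int socket_is_writable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;                 /* poll() itself failed */
    if (rc == 0)
        return 0;                  /* timeout: no send space reported */
    return (pfd.revents & POLLOUT) ? 1 : -1;
}

/*
 * Hypothetical demo: a freshly created datagram socket pair has free send
 * space, so the helper is expected to report it as writable.
 */
static int demo_fresh_socket_writable(void)
{
    int fds[2];
    int result;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) < 0)
        return -1;
    result = socket_is_writable(fds[0], 0);
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

On a healthy socket the helper returns 1; on the affected system the equivalent check kept returning 0, presumably because the leaked sk_buffs were never freed and so never released their share of the socket's send buffer.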

After a lot of testing, I've found the reason for this behaviour in the
net_tx_action() function (net/core/dev.c).

Let me explain what I've found:

The following code is used in order to get the current list of sk_buff to free:

static void net_tx_action(struct softirq_action *h)
{
        struct softnet_data *sd = &__get_cpu_var(softnet_data);

        if (sd->completion_queue) {
                 struct sk_buff *clist;

                 local_irq_disable();
                 clist = sd->completion_queue;
                 sd->completion_queue = NULL;
                 local_irq_enable();

When transmitting asynchronously on all the available network devices, I've
noticed the following behaviour in the generated code:
a) The instruction "if (sd->completion_queue) {" loads the pointer value into
a CPU register (the register contents are used for the comparison).
b) Interrupts are disabled (using "local_irq_disable").
c) When "clist" is assigned, the register is used instead of re-reading the
"completion_queue" variable.

So, when a low-level tx interrupt arrives after the latching of
"completion_queue" but before "local_irq_disable", the value stored in "clist"
reflects the state before that interrupt. The sk_buff queued by the interrupt
is then dropped when "completion_queue" is set to NULL, resulting in an
sk_buff leak.
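
The window described above can be reproduced deterministically in user space. The following is a minimal sketch under stated assumptions, not kernel code: every fake_* name, take_list_stale(), take_list_reread() and run_scenario() is hypothetical, and the "interrupt" is simply called from inside the stand-in for local_irq_disable() so that it always lands in the window. take_list_stale() models the code generation reported above (the latched value is reused); take_list_reread() forces a fresh load through a volatile-qualified access, which is the effect the volatile declaration is meant to have.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
struct fake_skb {
    struct fake_skb *next;
};

struct fake_softnet_data {
    struct fake_skb *completion_queue;
};

static struct fake_softnet_data fake_sd;
static struct fake_skb *pending_irq_skb; /* buffer the simulated IRQ will queue */

/* Simulated tx-completion interrupt: pushes one buffer onto the queue. */
static void fake_tx_irq(void)
{
    if (pending_irq_skb) {
        pending_irq_skb->next = fake_sd.completion_queue;
        fake_sd.completion_queue = pending_irq_skb;
        pending_irq_skb = NULL;
    }
}

/* Stand-in for local_irq_disable(): the IRQ fires just before masking. */
static void fake_local_irq_disable(void) { fake_tx_irq(); }
static void fake_local_irq_enable(void) { }

/* Models the reported behaviour: the value latched by the check is reused. */
static struct fake_skb *take_list_stale(struct fake_softnet_data *sd)
{
    struct fake_skb *latched = sd->completion_queue; /* read for the if () */
    struct fake_skb *clist = NULL;

    if (latched) {
        fake_local_irq_disable();
        clist = latched;               /* stale: misses the new buffer */
        sd->completion_queue = NULL;   /* the new buffer is dropped here */
        fake_local_irq_enable();
    }
    return clist;
}

/* Re-reads the queue head after "interrupts" are off. */
static struct fake_skb *take_list_reread(struct fake_softnet_data *sd)
{
    struct fake_skb *clist = NULL;

    if (sd->completion_queue) {
        fake_local_irq_disable();
        /* volatile-qualified access forces a fresh load from memory */
        clist = *(struct fake_skb * volatile *)&sd->completion_queue;
        sd->completion_queue = NULL;
        fake_local_irq_enable();
    }
    return clist;
}

static int list_len(const struct fake_skb *skb)
{
    int n = 0;

    for (; skb; skb = skb->next)
        n++;
    return n;
}

/*
 * One buffer is already queued and a second arrives in the window.
 * Returns how many buffers the caller gets back to free.
 */
static int run_scenario(int reread)
{
    static struct fake_skb a, b;

    a.next = NULL;
    b.next = NULL;
    fake_sd.completion_queue = &a;
    pending_irq_skb = &b;
    return list_len(reread ? take_list_reread(&fake_sd)
                           : take_list_stale(&fake_sd));
}
```

With two buffers in play, run_scenario(0) hands back only one of them (the other is leaked), while run_scenario(1) hands back both. The sketch only models the sequence of events as described in this report; it says nothing about what a given compiler actually emits for net_tx_action().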

I've changed the declaration of "sd" as follows:

        volatile struct softnet_data *sd = &__get_cpu_var(softnet_data);

and everything is now ok.

Is that correct?

Thanks,
Angelo

-- 
You are receiving this mail because:
You are the assignee for the bug.
