Message-ID: <CAN8Q1Ecx-pDR=NBQm4O=VaynQsxBU1wku9e_kcKkWB=xU0OtMA@mail.gmail.com>
Date:	Fri, 26 Oct 2012 14:25:52 -0700
From:	Peter LaDow <petela@...ougs.wsu.edu>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	linux-kernel@...r.kernel.org, linux-rt-users@...r.kernel.org,
	netfilter@...r.kernel.org
Subject: Re: Process Hang in __read_seqcount_begin

On Fri, Oct 26, 2012 at 2:05 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> Do you know what is per cpu data in linux kernel ?

I sort of did, but after your response I did more reading, and now I
see what you mean.  Still, I don't think this is a per-cpu issue.
More below.

> Because its not needed. Really I dont know why you want that.
>
> Once you are sure a thread cannot be interrupted by a softirq, and
> cannot migrate to another cpu, access to percpu data doesnt need other
> synchronization at all.

Because there are multiple entry points on the same CPU.  In
net/ipv4/netfilter/ip_tables there are two call sites of
xt_write_recseq_begin().  The first is in ipt_do_table() and the other
is in get_counters().  The lockup we are seeing comes from a
getsockopt() syscall leading to do_counters.  The other path is
through ipt_do_table(), which is installed as a netfilter hook.  I'm
not sure from what context the hooks are called, but it is certainly a
different context than the syscall.

> Following sequence is safe :
>
> addend = (__this_cpu_read(xt_recseq.sequence) + 1) & 1;
> /*
>  * This is kind of a write_seqcount_begin(), but addend is 0 or 1
>  * We dont check addend value to avoid a test and conditional jump,
>  * since addend is most likely 1
>  */
> __this_cpu_add(xt_recseq.sequence, addend);

If this were safe, we wouldn't be seeing this lockup and your patch
wouldn't be needed.  So it seems that your patch doesn't really
address the issue: we are not "sure a thread cannot be interrupted
by a softirq, and cannot migrate to another cpu".  We know it cannot
migrate to another CPU, because there is only one CPU.  So apparently
it can be interrupted by a softirq, which means local_bh_disable()
isn't doing anything useful here under the RT patches.

As I mentioned earlier, I think perhaps what your patch did was ensure
an atomic update of the sequence counter.  But it does nothing to
synchronize two writers.  If they were already synchronized (such as
via the call to local_bh_disable), then we wouldn't see sequence
counter corruption, no?

Pete
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/