Date:	Mon, 22 Oct 2012 19:01:39 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Peter LaDow <petela@...ougs.wsu.edu>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: Process Hang in __read_seqcount_begin

On Mon, 2012-10-22 at 09:46 -0700, Peter LaDow wrote:
> I posted this problem some time back on the linux-rt-users and
> netfilter lists.  Since then, we thought we had a workaround to avoid
> this problem, so we dropped the issue.  But now 5 months later, the
> problem has reappeared.  And this time it is much more serious and
> much more difficult to re-create.  After perusing both those lists,
> I'm not sure if those were the proper places to post.  The netfilter
> list seems to be more focused on the user space side of things, and
> the RT page indicates that kernel side RT issues should go to lkml.
> 
> Anyway, here's a repost of that problem from July.  Perhaps somebody
> here can point us in the right direction.
> 
> We are running 3.0.36-rt57 on a powerpc box.  During some testing with
> heavy loads and interfaces coming up/going down (specifically PPP), we
> have run into a case where iptables hangs and cannot be killed.  It
> requires a reboot to fix the problem.
> 
> Connecting the BDI and debugging the kernel, we get:
> 
> #0  get_counters (t=0xdd5145a0, counters=0xe3458000)
>     at include/linux/seqlock.h:66
> #1  0xc026b4ac in do_ipt_get_ctl (sk=<value optimized out>,
>     cmd=<value optimized out>, user=0x10612078, len=<value optimized out>)
>     at net/ipv4/netfilter/ip_tables.c:918
> #2  0xc022226c in nf_sockopt (sk=<value optimized out>, pf=2 '\002',
>     val=<value optimized out>, opt=<value optimized out>, len=0xdd4c7d4c,
>     get=1) at net/netfilter/nf_sockopt.c:109
> #3  0xc0236b1c in ip_getsockopt (sk=0xdf071480, level=<value optimized out>,
>     optname=65, optval=0x10612078 <Address 0x10612078 out of bounds>,
>     optlen=0xbfbe0c2c) at net/ipv4/ip_sockglue.c:1308
> #4  0xc02522a8 in raw_getsockopt (sk=0xdf071480, level=<value optimized out>,
>     optname=<value optimized out>, optval=<value optimized out>,
>     optlen=<value optimized out>) at net/ipv4/raw.c:811
> #5  0xc01f4c38 in sock_common_getsockopt (sock=<value optimized out>,
>     level=<value optimized out>, optname=<value optimized out>,
>     optval=<value optimized out>, optlen=<value optimized out>)
>     at net/core/sock.c:2157
> #6  0xc01f2df8 in sys_getsockopt (fd=<value optimized out>, level=0,
>     optname=65, optval=0x10612078 <Address 0x10612078 out of bounds>,
>     optlen=0xbfbe0c2c) at net/socket.c:1839
> #7  0xc01f45b4 in sys_socketcall (call=15, args=<value optimized out>)
>     at net/socket.c:2421
> 
> It seems to be stuck in __read_seqcount_begin.  From include/linux/seqlock.h:
> 
> static inline unsigned __read_seqcount_begin(const seqcount_t *s)
> {
>         unsigned ret;
> 
> repeat:
>         ret = ACCESS_ONCE(s->sequence);
>         if (unlikely(ret & 1)) {
>                 cpu_relax();
>   <----- It is always here
>                 goto repeat;
>         }
>         return ret;
> }
> 
> I've been scouring the mailing lists and Google searches trying to
> find something, but thus far have come up with nothing.
> 
> Any tips would be appreciated.

This looks like a corruption of s->sequence: its value is odd even
though no writer is alive.

Does local_bh_disable() disable preemption on RT?
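
To illustrate why an odd s->sequence with no live writer hangs the
reader, here is a minimal userspace sketch. It is not the kernel code:
struct seqcount_sketch, read_begin_bounded() and
lost_update_interleaving() are hypothetical names, the reader is bounded
so the demo terminates, and the interleaving shown is only one assumed
way the counter could end up odd if two writers on the same counter are
not serialized against each other. It is not a confirmed diagnosis.

```c
/* Hypothetical userspace stand-in for the kernel's seqcount_t. */
struct seqcount_sketch {
	unsigned sequence;	/* odd while a write is in progress */
};

/*
 * Bounded variant of the reader loop quoted above.
 * Returns 0 and stores the even snapshot in *snap, or -1 if the
 * sequence was still odd after max_spins attempts. The real
 * __read_seqcount_begin() has no bound and would spin forever.
 */
static int read_begin_bounded(const struct seqcount_sketch *s,
			      unsigned max_spins, unsigned *snap)
{
	unsigned i;

	for (i = 0; i < max_spins; i++) {
		unsigned ret = *(const volatile unsigned *)&s->sequence;

		if (!(ret & 1)) {
			*snap = ret;
			return 0;
		}
	}
	return -1;
}

/*
 * Assumed scenario: writer B is preempted in the middle of its
 * non-atomic increment while writer A finishes, so A's second
 * increment is lost. Each line is one step of the interleaving.
 * Returns the final sequence value.
 */
static unsigned lost_update_interleaving(struct seqcount_sketch *s)
{
	unsigned b_tmp;

	s->sequence = 0;
	s->sequence++;			/* A: write begin, sequence = 1 */
	b_tmp = s->sequence;		/* B: write begin, loads 1, preempted */
	s->sequence++;			/* A: write end, sequence = 2 */
	s->sequence = b_tmp + 1;	/* B: resumes, stores 2 (update lost) */
	s->sequence++;			/* B: write end, sequence = 3 */
	return s->sequence;
}
```

After lost_update_interleaving() both writers have finished but the
counter is left odd, so read_begin_bounded() gives up with -1 however
large max_spins is. That matches the symptom in the backtrace, where
the unbounded reader never leaves the cpu_relax() loop.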