lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANZZXhBNSVmFTsteG7AAHuuYK7qU8J2LaHfcO16-2mcc+7kxag@mail.gmail.com>
Date:	Fri, 20 Mar 2015 16:22:11 +0200
From:	Vitaly Chernooky <vitalii.chernookyi@...ballogic.com>
To:	Al Viro <viro@...iv.linux.org.uk>
Cc:	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	David Vrabel <david.vrabel@...rix.com>,
	Marek Marczykowski-Górecki 
	<marmarek@...isiblethingslab.com>,
	Iurii Konovalenko <iurii.konovalenko@...ballogic.com>,
	Ian Campbell <ian.campbell@...rix.com>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	Andrii Anisov <andrii.anisov@...ballogic.com>,
	Artem Mygaiev <artem.mygaiev@...ballogic.com>
Subject: Re: [PATCH] [RFC] Fix deadlock on regular nonseekable files

> AFAICS, any threads blocked on f_pos_lock are not holding anything else and
> cannot impede the rest.  What am I missing here?

As far as I understand it is true only for files on regular filesystem
like ext4. Let's to see how XEN guys run into trouble with that
f_pos_lock:

---------- Forwarded message ----------
From: Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>
Date: Thu, Mar 19, 2015 at 3:19 AM
Subject: [Xen-devel] Deadlock in /proc/xen/xenbus watch+read on 3.17+
(maybe earlier)
To: xen-devel <xen-devel@...ts.xen.org>
Cc: Boris Ostrovsky <boris.ostrovsky@...cle.com>, David Vrabel
<david.vrabel@...rix.com>


Hi,

I've hit some deadlock in kernel xenstore client exposed via
/proc/xen/xenbus. Steps to reproduce are simple:
int main() {
        struct xs_handle *xs;
        xs = xs_open(0);
        xs_watch(xs, "domid", "token");
        xs_read(xs, 0, "name", NULL);
        return 0;
}

xs_watch internally creates new thread, which uses read to wait for the
watch. And in the same time, the program tries to read some value,
but actually it hangs at sending the command (before even sending a path to be
read). Strace gives this (simplified for readability):
[pid  2494] write(3, "\4\0\0\0\0\0\0\0\0\0\0\0\f\0\0\0", 160 = 16
[pid  2494] write(3, "domid\0", 6)      = 6
[pid  2494] write(3, "token\0", 6)      = 6
[pid  2495] read(3,  <unfinished ...>
[pid  2494] futex(0x71c0d4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  2495] <... read resumed>
"\17\0\0\0\377\377\377\377\220~\255\27\f\0\0\0", 16) = 16
[pid  2495] read(3, "domid\0token\0", 12) = 12
[pid  2495] read(3, "\4\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0", 16) = 16
[pid  2495] read(3, "OK\0", 3)          = 3
[pid  2495] futex(0x71c0d4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x71c0d0,
{FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
[pid  2494] <... futex resumed> )       = 0
[pid  2495] <... futex resumed> )       = 1
[pid  2494] futex(0x71c0a8, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  2495] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2494] <... futex resumed> )       = -1 EAGAIN (Resource
temporarily unavailable)
[pid  2495] <... futex resumed> )       = 0
[pid  2494] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2495] read(3,  <unfinished ...>
[pid  2494] <... futex resumed> )       = 0
[pid  2494] rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTORER,
0x7fc78c1488f0}, NULL, 8) = 0
[pid  2494] rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER,
0x7fc78c1488f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc78c1488f0}, 8) = 0
[pid  2494] write(3, "\2\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0", 16

And thats all - 2494 is waiting on write, 2495 is waiting on read.

On 3.12.x it is working. On 3.17.0 and 3.18.7 it is broken. I haven't
checked versions in the middle.

Any ideas?

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab




And Iurii Konovalenko has debugged that the root of their troubles is
that Commit 9c225f2655e36a470c4f58dbbc99244c5fc7f2d4.

With best regards,

-- 
Vitaly Chernooky | Senior Developer - Product Engineering and Development
GlobalLogic
P +380.44.4929695 ext.1136 M +380.63.6011802 S cvv_2k
www.globallogic.com

http://www.globallogic.com/email_disclaimer.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ