Message-ID: <7a961ead-e77d-7334-3c29-399e071670fb@gmail.com>
Date: Thu, 19 Apr 2018 16:15:33 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Eric Dumazet <edumazet@...gle.com>,
"David S . Miller" <davem@...emloft.net>
Cc: netdev <netdev@...r.kernel.org>,
Neal Cardwell <ncardwell@...gle.com>,
Yuchung Cheng <ycheng@...gle.com>,
Soheil Hassas Yeganeh <soheil@...gle.com>
Subject: Re: [PATCH net-next 4/5] tcp: implement mmap() for zero copy receive
On 04/16/2018 10:33 AM, Eric Dumazet wrote:
> Some networks can make sure TCP payload can exactly fit 4KB pages,
> with well chosen MSS/MTU and architectures.
>
> Implement mmap() system call so that applications can avoid
> copying data without complex splice() games.
>
> Note that a successful mmap( X bytes) on TCP socket is consuming
> bytes, as if recvmsg() has been done. (tp->copied += X)
>
Oh well, I should have run this code with LOCKDEP enabled :/
[ 974.320412] ======================================================
[ 974.326631] WARNING: possible circular locking dependency detected
[ 974.332816] 4.16.0-dbx-DEV #40 Not tainted
[ 974.336927] ------------------------------------------------------
[ 974.343107] b78299096/15790 is trying to acquire lock:
[ 974.348246] 000000006074c9cf (sk_lock-AF_INET6){+.+.}, at: tcp_mmap+0x7c/0x550
[ 974.355505]
but task is already holding lock:
[ 974.361366] 000000008dbe063b (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x99/0x100
[ 974.368801]
which lock already depends on the new lock.
[ 974.377010]
the existing dependency chain (in reverse order) is:
[ 974.384501]
-> #1 (&mm->mmap_sem){++++}:
[ 974.389911] __might_fault+0x68/0x90
[ 974.394025] _copy_from_user+0x23/0xa0
[ 974.398311] sock_setsockopt+0x4a2/0xac0
[ 974.402761] __sys_setsockopt+0xd9/0xf0
[ 974.407118] SyS_setsockopt+0xe/0x20
[ 974.411242] do_syscall_64+0x6e/0x1a0
[ 974.415431] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 974.421011]
-> #0 (sk_lock-AF_INET6){+.+.}:
[ 974.426690] lock_acquire+0x95/0x1e0
[ 974.430813] lock_sock_nested+0x71/0xa0
[ 974.435196] tcp_mmap+0x7c/0x550
[ 974.438940] sock_mmap+0x23/0x30
[ 974.442695] mmap_region+0x3a4/0x5d0
[ 974.446808] do_mmap+0x313/0x530
[ 974.450571] vm_mmap_pgoff+0xc7/0x100
[ 974.454769] ksys_mmap_pgoff+0x1d5/0x260
[ 974.459247] SyS_mmap+0x1b/0x30
[ 974.462936] do_syscall_64+0x6e/0x1a0
[ 974.467114] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 974.472678]
other info that might help us debug this:
[ 974.480677] Possible unsafe locking scenario:
[ 974.486600] CPU0 CPU1
[ 974.491152] ---- ----
[ 974.495684] lock(&mm->mmap_sem);
[ 974.499089] lock(sk_lock-AF_INET6);
[ 974.505285] lock(&mm->mmap_sem);
[ 974.511211] lock(sk_lock-AF_INET6);
[ 974.514885]
*** DEADLOCK ***
[ 974.520825] 1 lock held by b78299096/15790:
[ 974.525018] #0: 000000008dbe063b (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x99/0x100
[ 974.532852]
stack backtrace:
[ 974.537224] CPU: 25 PID: 15790 Comm: b78299096 Not tainted 4.16.0-dbx-DEV #40
[ 974.544371] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
[ 974.551333] Call Trace:
[ 974.553792] dump_stack+0x70/0xa5
[ 974.557111] print_circular_bug.isra.39+0x1d8/0x1e6
[ 974.561982] __lock_acquire+0x1284/0x1340
[ 974.565992] ? tcp_mmap+0x7c/0x550
[ 974.569419] lock_acquire+0x95/0x1e0
[ 974.573011] ? lock_acquire+0x95/0x1e0
[ 974.576767] ? tcp_mmap+0x7c/0x550
[ 974.580167] lock_sock_nested+0x71/0xa0
[ 974.584023] ? tcp_mmap+0x7c/0x550
[ 974.587437] tcp_mmap+0x7c/0x550
[ 974.590677] sock_mmap+0x23/0x30
[ 974.593909] mmap_region+0x3a4/0x5d0
[ 974.597506] do_mmap+0x313/0x530
[ 974.600749] vm_mmap_pgoff+0xc7/0x100
[ 974.604414] ksys_mmap_pgoff+0x1d5/0x260
[ 974.608341] ? fd_install+0x25/0x30
[ 974.611849] ? trace_hardirqs_on_caller+0xef/0x180
[ 974.616641] SyS_mmap+0x1b/0x30
[ 974.619804] do_syscall_64+0x6e/0x1a0
[ 974.623462] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 974.628549] RIP: 0033:0x433749
[ 974.631600] RSP: 002b:00007ffd29fdb438 EFLAGS: 00000216 ORIG_RAX: 0000000000000009
[ 974.639197] RAX: ffffffffffffffda RBX: 00000000004002e0 RCX: 0000000000433749
[ 974.646323] RDX: 0000000000000008 RSI: 0000000000004000 RDI: 0000000020ab7000
[ 974.653463] RBP: 00007ffd29fdb460 R08: 0000000000000003 R09: 0000000000000000
[ 974.660603] R10: 0000000000000012 R11: 0000000000000216 R12: 0000000000401670
[ 974.667737] R13: 0000000000401700 R14: 0000000000000000 R15: 0000000000000000
I am not sure we can keep the mmap() API, since we probably need to lock the socket first,
then grab the vm semaphore (mm->mmap_sem).
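
To illustrate the ordering constraint, here is a minimal userspace sketch, not kernel code. It assumes two pthread mutexes standing in for sk_lock and mm->mmap_sem; the names setsockopt_path(), zerocopy_path() and run_both() are hypothetical and exist only for this example. Both paths take the socket lock first and the mm lock second, so the circular dependency reported above cannot form.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-ins for the two kernel locks in the report. */
static pthread_mutex_t sk_lock = PTHREAD_MUTEX_INITIALIZER;  /* plays sk_lock-AF_INET6 */
static pthread_mutex_t mmap_sem = PTHREAD_MUTEX_INITIALIZER; /* plays mm->mmap_sem */

static int setsockopt_calls; /* written only by setsockopt_path() */
static int zerocopy_calls;   /* written only by zerocopy_path() */

/*
 * Mirrors chain #1 in the report: the socket lock is held, then a user
 * copy may fault and need the mm lock. Order: sk_lock -> mmap_sem.
 */
static void *setsockopt_path(void *arg)
{
    int n = *(int *)arg;

    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&sk_lock);
        pthread_mutex_lock(&mmap_sem);
        setsockopt_calls++;
        pthread_mutex_unlock(&mmap_sem);
        pthread_mutex_unlock(&sk_lock);
    }
    return NULL;
}

/*
 * A receive-mapping path using the same order (sk_lock -> mmap_sem),
 * instead of the mmap_sem -> sk_lock order that an mmap() handler is
 * forced into because the mm lock is already held when it is called.
 */
static void *zerocopy_path(void *arg)
{
    int n = *(int *)arg;

    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&sk_lock);
        pthread_mutex_lock(&mmap_sem);
        zerocopy_calls++;
        pthread_mutex_unlock(&mmap_sem);
        pthread_mutex_unlock(&sk_lock);
    }
    return NULL;
}

/*
 * Run both paths concurrently. Returns the total number of completed
 * iterations, or -1 if a thread could not be created.
 */
static int run_both(int iterations)
{
    pthread_t a, b;

    assert(iterations >= 0);
    setsockopt_calls = 0;
    zerocopy_calls = 0;

    if (pthread_create(&a, NULL, setsockopt_path, &iterations))
        return -1;
    if (pthread_create(&b, NULL, zerocopy_path, &iterations)) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return setsockopt_calls + zerocopy_calls;
}
```

Build with -pthread and call run_both() from a main() of your choice. Swapping the two lock calls in zerocopy_path() reproduces the AB/BA pattern from the "Possible unsafe locking scenario" table and can hang under contention.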