Message-ID: <AA27A6E6-2376-437A-9508-FD9C2427A465@vmware.com>
Date:   Wed, 13 Sep 2017 18:58:13 +0000
From:   "Jorgen S. Hansen" <jhansen@...are.com>
To:     Michal Hocko <mhocko@...nel.org>
CC:     Aditya Sarwade <asarwade@...are.com>,
        Thomas Hellstrom <thellstrom@...are.com>,
        LKML <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Masik Petr <Petr.Masik@...z.cz>,
        Ben Hutchings <ben@...adent.org.uk>,
        Sasha Levin <alexander.levin@...izon.com>,
        Stable tree <stable@...r.kernel.org>
Subject: Re: scheduling while atomic from vmci_transport_recv_stream_cb in
 3.16 kernels


> On Sep 13, 2017, at 5:19 PM, Michal Hocko <mhocko@...nel.org> wrote:
> 
> On Wed 13-09-17 15:07:26, Jorgen S. Hansen wrote:
>> 
>>> On Sep 12, 2017, at 11:08 AM, Michal Hocko <mhocko@...nel.org> wrote:
>>> 
>>> Hi,
>>> we are seeing the following splat with Debian 3.16 stable kernel
>>> 
>>> BUG: scheduling while atomic: MATLAB/26771/0x00000100
>>> Modules linked in: veeamsnap(O) hmac cbc cts nfsv4 dns_resolver rpcsec_gss_krb5 nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc vmw_vso$
>>> CPU: 0 PID: 26771 Comm: MATLAB Tainted: G           O  3.16.0-4-amd64 #1 Debian 3.16.7-ckt20-1+deb8u3
>>> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015
>>> ffff88315c1e4c20 ffffffff8150db3f ffff88193f803dc8 ffffffff8150acdf
>>> ffffffff815103a2 0000000000012f00 ffff8819423dbfd8 0000000000012f00
>>> ffff88315c1e4c20 ffff88193f803dc8 ffff88193f803d50 ffff88193f803dc0
>>> Call Trace:
>>> <IRQ>  [<ffffffff8150db3f>] ? dump_stack+0x41/0x51
>>> [<ffffffff8150acdf>] ? __schedule_bug+0x48/0x55
>>> [<ffffffff815103a2>] ? __schedule+0x5d2/0x700
>>> [<ffffffff8150f9b9>] ? schedule_timeout+0x229/0x2a0
>>> [<ffffffff8109ba70>] ? select_task_rq_fair+0x390/0x700
>>> [<ffffffff8109f780>] ? check_preempt_wakeup+0x120/0x1d0
>>> [<ffffffff81510eb8>] ? wait_for_completion+0xa8/0x120
>>> [<ffffffff81096de0>] ? wake_up_state+0x10/0x10
>>> [<ffffffff810c3da0>] ? call_rcu_bh+0x20/0x20
>>> [<ffffffff810c180b>] ? wait_rcu_gp+0x4b/0x60
>>> [<ffffffff810c17b0>] ? ftrace_raw_output_rcu_utilization+0x40/0x40
>>> [<ffffffffa02ca6f5>] ? vmci_event_unsubscribe+0x75/0xb0 [vmw_vmci]
>>> [<ffffffffa031f5cd>] ? vmci_transport_destruct+0x1d/0xe0 [vmw_vsock_vmci_transport]
>>> [<ffffffffa03167e3>] ? vsock_sk_destruct+0x13/0x60 [vsock]
>>> [<ffffffff81409f7a>] ? __sk_free+0x1a/0x130
>>> [<ffffffffa0320218>] ? vmci_transport_recv_stream_cb+0x1e8/0x2d0 [vmw_vsock_vmci_transport]
>>> [<ffffffffa02c9cba>] ? vmci_datagram_invoke_guest_handler+0xaa/0xd0 [vmw_vmci]
>>> [<ffffffffa02cab51>] ? vmci_dispatch_dgs+0xc1/0x200 [vmw_vmci]
>>> [<ffffffff8106c294>] ? tasklet_action+0xf4/0x100
>>> [<ffffffff8106c681>] ? __do_softirq+0xf1/0x290
>>> [<ffffffff8106ca55>] ? irq_exit+0x95/0xa0
>>> [<ffffffff81516b22>] ? do_IRQ+0x52/0xe0
>>> [<ffffffff8151496d>] ? common_interrupt+0x6d/0x6d
>>> 
>>> AFAICS this has been fixed by 4ef7ea9195ea ("VSOCK: sock_put wasn't safe
>>> to call in interrupt context") but this patch hasn't been backported to
>>> stable trees. It applies cleanly on top of 3.16 stable tree but I am not
>>> familiar with the code to send the backport to the stable maintainer
>>> directly.
>>> 
>>> Could you double check that the patch below (just a blind cherry-pick)
>>> is correct and it doesn't need additional patches on top?
>> 
>> Hi,
>> 
>> The patch below has been used to fix the above issue by other distros
>> - among them Redhat for the 3.10 kernel, so it should work for 3.16 as
>> well.
> 
> Thanks for the confirmation. I do not see 4ef7ea9195ea ("VSOCK: sock_put
> wasn't safe to call in interrupt context") in 3.10 stable branch
> though.
> 
>> In addition to the patch above, there are two other patches that
>> need to be applied on top for the fix to be correct:
>> 
>> 8566b86ab9f0f45bc6f7dd422b21de9d0cf5415a "VSOCK: Fix lockdep issue."
>> 
>> and
>> 
>> 8ab18d71de8b07d2c4d6f984b718418c09ea45c5 "VSOCK: Detach QP check should filter out non matching QPs."
> 
> Good to know. I will send all three patches cherry-picked on top of the
> current 3.16 stable branch. Could you have a look please?

The patch series looks good to me.

Thanks for taking care of this,
Jorgen
