netdev - Re: BUG due to "xen-netback: protect resource cleaning on XenBus disconnect"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a0c4462e-7061-d71b-973d-b4ba6bd3eecd@citrix.com>
Date:   Thu, 2 Mar 2017 14:55:57 +0000
From:   Igor Druzhinin <igor.druzhinin@...rix.com>
To:     Paul Durrant <Paul.Durrant@...rix.com>,
        'Juergen Gross' <jgross@...e.com>,
        Wei Liu <wei.liu2@...rix.com>
CC:     xen-devel <xen-devel@...ts.xenproject.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        David Miller <davem@...emloft.net>
Subject: Re: BUG due to "xen-netback: protect resource cleaning on XenBus
 disconnect"

On 02/03/17 12:19, Paul Durrant wrote:
>> -----Original Message-----
>> From: Juergen Gross [mailto:jgross@...e.com]
>> Sent: 02 March 2017 12:13
>> To: Wei Liu <wei.liu2@...rix.com>
>> Cc: Igor Druzhinin <igor.druzhinin@...rix.com>; xen-devel <xen-
>> devel@...ts.xenproject.org>; Linux Kernel Mailing List <linux-
>> kernel@...r.kernel.org>; netdev@...r.kernel.org; Boris Ostrovsky
>> <boris.ostrovsky@...cle.com>; David Miller <davem@...emloft.net>; Paul
>> Durrant <Paul.Durrant@...rix.com>
>> Subject: Re: BUG due to "xen-netback: protect resource cleaning on XenBus
>> disconnect"
>>
>> On 02/03/17 13:06, Wei Liu wrote:
>>> On Thu, Mar 02, 2017 at 12:56:20PM +0100, Juergen Gross wrote:
>>>> With commits f16f1df65 and 9a6cdf52b we get in our Xen testing:
>>>>
>>>> [  174.512861] switch: port 2(vif3.0) entered disabled state
>>>> [  174.522735] BUG: sleeping function called from invalid context at
>>>> /home/build/linux-linus/mm/vmalloc.c:1441
>>>> [  174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch
>>>> [  174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: G        W
>>>> 4.10.0upstream-11073-g4977ab6-dirty #1
>>>> [  174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS
>> V17.0
>>>> 03/14/2011
>>>> [  174.525517] Call Trace:
>>>> [  174.526217]  show_stack+0x23/0x60
>>>> [  174.526899]  dump_stack+0x5b/0x88
>>>> [  174.527562]  ___might_sleep+0xde/0x130
>>>> [  174.528208]  __might_sleep+0x35/0xa0
>>>> [  174.528840]  ? _raw_spin_unlock_irqrestore+0x13/0x20
>>>> [  174.529463]  ? __wake_up+0x40/0x50
>>>> [  174.530089]  remove_vm_area+0x20/0x90
>>>> [  174.530724]  __vunmap+0x1d/0xc0
>>>> [  174.531346]  ? delete_object_full+0x13/0x20
>>>> [  174.531973]  vfree+0x40/0x80
>>>> [  174.532594]  set_backend_state+0x18a/0xa90
>>>> [  174.533221]  ? dwc_scan_descriptors+0x24d/0x430
>>>> [  174.533850]  ? kfree+0x5b/0xc0
>>>> [  174.534476]  ? xenbus_read+0x3d/0x50
>>>> [  174.535101]  ? xenbus_read+0x3d/0x50
>>>> [  174.535718]  ? xenbus_gather+0x31/0x90
>>>> [  174.536332]  ? ___might_sleep+0xf6/0x130
>>>> [  174.536945]  frontend_changed+0x6b/0xd0
>>>> [  174.537565]  xenbus_otherend_changed+0x7d/0x80
>>>> [  174.538185]  frontend_changed+0x12/0x20
>>>> [  174.538803]  xenwatch_thread+0x74/0x110
>>>> [  174.539417]  ? woken_wake_function+0x20/0x20
>>>> [  174.540049]  kthread+0xe5/0x120
>>>> [  174.540663]  ? xenbus_printf+0x50/0x50
>>>> [  174.541278]  ? __kthread_init_worker+0x40/0x40
>>>> [  174.541898]  ret_from_fork+0x21/0x2c
>>>> [  174.548635] switch: port 2(vif3.0) entered disabled state
>>>>
>>>> I believe calling vfree() when holding a spin_lock isn't a good idea.
>>>>
>>>
>>> Use vfree_atomic instead?
>>
>> Hmm, isn't this overkill here?
>>
>> You can just set a local variable with the address and do vfree() after
>> releasing the lock.
>>
> 
> Yep, that's what I was thinking. Patch coming shortly.
> 
>   Paul

We have an internal patch that was just recently tested without using
spinlocks. Calling vfree in the spinlock section is not the worst that
could happen as our testing revealed. I switched to RCU for protecting
the environment from memory release. I'll share it today.

Igor

> 
>>
>> Juergen