[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <afd66c2f-25ad-3bb2-8731-719fde979f0d@godking.net>
Date: Sat, 9 Dec 2017 14:38:44 -0800
From: Alexander Kappner <agk@...king.net>
To: Mathias Nyman <mathias.nyman@...ux.intel.com>,
Mathias Nyman <mathias.nyman@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org,
Lu Baolu <baolu.lu@...ux.intel.com>
Subject: Re: [PATCH] usb-core: Fix potential null pointer dereference in
xhci-debugfs.c
Hi Mathias,
thanks for the patch! The system now resumes cleanly from hibernate even
with usbmuxd doing its thing.
Tested-by: Alexander Kappner <agk@...king.net>
While testing this I hit some other issues with xhci-debugfs.c but I'll
write these up in a separate bug.
On 12/08/2017 09:01 AM, Mathias Nyman wrote:
> On 08.12.2017 13:06, Alexander Kappner wrote:
>> Hi,
>>
>>> I think we need to dig a bit deeper. It's good to check if spriv is
>>> valid
>>> but there are probably other reasons than kzalloc failing.
>>
>> I agree -- this small allocation is unlikely to fail in practice.
>> Also, while my patch prevents the kernel oops, it also prevents the
>> debugfs entries from being created.
>>
>> I've been debugging this more trying to come up with a better
>> solution, but I might need some guidance as I'm not too familiar with
>> the USB subsystem. The immediate cause of the crash was usbmuxd
>> sending a USBDEVFS_SETCONFIGURATION ioctl to a device, which _only if
>> it fails_ calls usb_hcd_alloc_bandwidth to try and reset the device,
>> which in turn calls xhci_debugfs_create_endpoint. The ioctl handler
>> acquires a device-specific lock via usb_lock_device.
>>
>> When the system resumed from hibernate, xhci_resume was called. This
>> in turn called xhci_mem_cleanup to deallocate the device structures,
>> which include setting the debugfs_private pointer to NULL (via
>> xhci_free_virt_devices_depth_first). It thus seems likely that the
>> ioctl is somehow racing with the hibernate. The call to xhci_resume
>> is protected by a host-controller specific lock (xhci->lock) but it
>> doesn't attempt to take the usb_lock_device device-specific lock.
>>
>> Now my suspicion is that xhci_resume freed and zeroed the device
>> structures while racing with the ioctl handler. I can't seem to find
>> any exclusion mechanism that would prevent xhci_resume from racing
>> with the USBDEVFS_SETCONFIGURATION ioctl (or any other ioctl, for
>> that matter). Am I missing something? If not, is there any reason why
>> an ioctl might need to execute in parallel with the xhci_resume? If
>> not, can we just do a busy wait in xhci_resume until all pending
>> ioctls have returned?
>
> I'm not sure, but if I recall correctly then power management is supposed
> to make sure a driver doesn't access usb devices while the host
> controller
> is still resuming.
>
> The odd thing here is that
> xhci_debugfs_remove_slot(xhci, slot_id), and
> xhci_free_virt_device(xhci, slot_id) are called together when
> xhci_mem_cleanup() calls xhci_free_virt_devices_depth_first()
>
> That means both the xhci_virt_device *dev and dev->debugfs_private
> should both be freed and xhci->devs[slot_id] set NULL for that
> virt_device.
>
> so xhci_add_endpoint() should fail a lot earlier because the
> xhci->devs[slot_id]
> should be a null pointer as well.
>
> Allocation is also done together in xhci_alloc_dev()
>
> Looking at it more closely there is actually the .free_dev callback that
> first frees the dev->debugs_private but the virt_dev is only freed
> conditionally later
>
> Attached a patch that frees them together, can you try it out?
>
> If it doesn't help we need to add some elaborate tracing
>
> Thanks
> -Mathias
>
>
Powered by blists - more mailing lists