Date:   Fri, 22 May 2020 10:33:20 -0500
From:   Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>
To:     Jason Gunthorpe <jgg@...pe.ca>
Cc:     Ranjani Sridharan <ranjani.sridharan@...ux.intel.com>,
        Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
        davem@...emloft.net, gregkh@...uxfoundation.org,
        netdev@...r.kernel.org, linux-rdma@...r.kernel.org,
        nhorman@...hat.com, sassmann@...hat.com,
        Fred Oh <fred.oh@...ux.intel.com>
Subject: Re: [net-next v4 10/12] ASoC: SOF: Introduce descriptors for SOF
 client



On 5/22/20 9:55 AM, Jason Gunthorpe wrote:
> On Fri, May 22, 2020 at 09:29:57AM -0500, Pierre-Louis Bossart wrote:
>>
>>>>>> +	ret = virtbus_register_device(vdev);
>>>>>> +	if (ret < 0)
>>>>>> +		return ret;
>>>>>> +
>>>>>> +	/* make sure the probe is complete before updating client list */
>>>>>> +	timeout = msecs_to_jiffies(SOF_CLIENT_PROBE_TIMEOUT_MS);
>>>>>> +	time = wait_for_completion_timeout(&cdev->probe_complete, timeout);
>>>>>
>>>>> This seems bonkers - the whole point of something like virtual bus is
>>>>> to avoid madness like this.
>>>>
>>>> Thanks for your review, Jason. The idea of the timed wait here is to
>>>> make the registration of the virtbus devices synchronous so that the
>>>> SOF core device has knowledge of all the clients that have been able to
>>>> probe successfully. This part is domain-specific and it works very well
>>>> in the audio driver case.
>>>
>>> This needs to be hot plug safe. What if the module for this driver is
>>> not available until later in boot? What if the user unplugs the
>>> driver? What if the kernel runs probing single threaded?
>>>
>>> It is really unlikely you can both have the requirement that things be
>>> synchronous and also be doing all the other lifetime details properly..
>>
>> Can you suggest an alternate solution then?
> 
> I don't even know what problem you are trying to solve.
> 
>> The complete/wait_for_completion is a simple mechanism to tell that the
>> action requested by the parent is done. Absent that, we can end-up in a
>> situation where the probe may fail, or the requested module does not exist,
>> and the parent knows nothing about the failure - so the system is in a
>> zombie state and users are frustrated. It's not great either, is it?
> 
> Maybe not great, but at least it is consistent with all the lifetime
> models and the operation of the driver core.

I agree your comments are valid. I just don't have a solution that is
fully compliant with these models and still reports failures of the
driver probe for a child device due to configuration issues (bad audio
topology, etc).

My understanding is that errors on probe are explicitly not handled in 
the driver core, see e.g. comments such as:

/*
  * Ignore errors returned by ->probe so that the next driver can try
  * its luck.
  */
https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L636

If somehow we could request the error to be reported then probably we 
wouldn't need this complete/wait_for_completion mechanism as a custom 
notification.
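
To make the point concrete, here is a small userspace toy model of that
behavior. It is only a sketch and not the code in drivers/base/dd.c: the
names toy_driver, toy_device and toy_register_device are hypothetical
stand-ins. It shows that registration returns success even when every
probe fails, so the parent never sees the error code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a driver with a ->probe hook. */
struct toy_driver {
	const char *name;
	int (*probe)(void);
};

/* Hypothetical stand-in for a device; bound stays NULL if no probe succeeds. */
struct toy_device {
	const struct toy_driver *bound;
};

/* Simulates a probe that fails on a configuration issue (-22 is -EINVAL). */
static int probe_bad_topology(void) { return -22; }
static int probe_ok(void) { return 0; }

/*
 * Toy registration: try each driver in turn and drop probe errors so
 * that the next driver can try its luck. The return value only says
 * that registration happened, not that any probe succeeded.
 */
static int toy_register_device(struct toy_device *dev,
			       const struct toy_driver *drv, size_t n)
{
	dev->bound = NULL;
	for (size_t i = 0; i < n; i++) {
		if (drv[i].probe() == 0) {
			dev->bound = &drv[i];
			break;
		}
		/* probe error intentionally ignored here */
	}
	return 0;
}
```

With only the failing driver available, toy_register_device() still
returns 0 and dev->bound stays NULL, which is the silent failure we are
trying to report.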

>> This is not a hypothetical case, we've had this recurring problem when a
>> PCI device creates an audio card represented as a platform device. When the
>> card registration fails, typically due to configuration issues, the PCI
>> probe still completes. That's really confusing and the source of lots of
>> support questions. If we use these virtual bus extensions to stop abusing
>> platform devices, it'd be really nice to make those unreported probe
>> failures go away.
> 
> I think you need to address this in some other way that is hot plug
> safe.
> 
> Surely you can make this failure visible to users in some other way?

Not at the moment, no. There are no failures reported in dmesg, and the 
user does not see any card created. This is a silent error.

This is probably domain-specific btw, the use of complete() is only part 
of the SOF core where we extended the virtual bus to support SOF 
clients. This is not a requirement in general for virtual bus users. We 
are not forcing anyone to rely on this complete/wait_for_completion, and 
if someone has a better idea to help us report probe failures we are all 
ears.
