lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 31 Mar 2021 10:36:23 -0400
From:   Tony Krowiak <akrowiak@...ux.ibm.com>
To:     Halil Pasic <pasic@...ux.ibm.com>
Cc:     linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org, freude@...ux.ibm.com, borntraeger@...ibm.com,
        cohuck@...hat.com, mjrosato@...ux.ibm.com,
        alex.williamson@...hat.com, kwankhede@...dia.com,
        fiuczy@...ux.ibm.com, frankja@...ux.ibm.com, david@...hat.com,
        hca@...ux.ibm.com, gor@...ux.ibm.com
Subject: Re: [PATCH v13 06/15] s390/vfio-ap: allow assignment of unavailable
 AP queues to mdev device



On 1/14/21 8:44 PM, Halil Pasic wrote:
> On Thu, 14 Jan 2021 12:54:39 -0500
> Tony Krowiak <akrowiak@...ux.ibm.com> wrote:
>
>>>>    /**
>>>>     * vfio_ap_mdev_verify_no_sharing
>>>>     *
>>>> - * Verifies that the APQNs derived from the cross product of the AP adapter IDs
>>>> - * and AP queue indexes comprising the AP matrix are not configured for another
>>>> - * mediated device. AP queue sharing is not allowed.
>>>> + * Verifies that each APQN derived from the Cartesian product of the AP adapter
>>>> + * IDs and AP queue indexes comprising the AP matrix are not configured for
>>>> + * another mediated device. AP queue sharing is not allowed.
>>>>     *
>>>> - * @matrix_mdev: the mediated matrix device
>>>> + * @matrix_mdev: the mediated matrix device to which the APQNs being verified
>>>> + *		 are assigned.
>>>> + * @mdev_apm: mask indicating the APIDs of the APQNs to be verified
>>>> + * @mdev_aqm: mask indicating the APQIs of the APQNs to be verified
>>>>     *
>>>> - * Returns 0 if the APQNs are not shared, otherwise; returns -EADDRINUSE.
>>>> + * Returns 0 if the APQNs are not shared, otherwise; returns -EBUSY.
>>>>     */
>>>> -static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
>>>> +static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev,
>>>> +					  unsigned long *mdev_apm,
>>>> +					  unsigned long *mdev_aqm)
>>>>    {
>>>>    	struct ap_matrix_mdev *lstdev;
>>>>    	DECLARE_BITMAP(apm, AP_DEVICES);
>>>> @@ -523,20 +426,31 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
>>>>    		 * We work on full longs, as we can only exclude the leftover
>>>>    		 * bits in non-inverse order. The leftover is all zeros.
>>>>    		 */
>>>> -		if (!bitmap_and(apm, matrix_mdev->matrix.apm,
>>>> -				lstdev->matrix.apm, AP_DEVICES))
>>>> +		if (!bitmap_and(apm, mdev_apm, lstdev->matrix.apm, AP_DEVICES))
>>>>    			continue;
>>>>    
>>>> -		if (!bitmap_and(aqm, matrix_mdev->matrix.aqm,
>>>> -				lstdev->matrix.aqm, AP_DOMAINS))
>>>> +		if (!bitmap_and(aqm, mdev_aqm, lstdev->matrix.aqm, AP_DOMAINS))
>>>>    			continue;
>>>>    
>>>> -		return -EADDRINUSE;
>>>> +		vfio_ap_mdev_log_sharing_err(dev_name(mdev_dev(lstdev->mdev)),
>>>> +					     apm, aqm);
>>>> +
>>>> +		return -EBUSY;
>>> Why do we change -EADDRINUSE to -EBUSY? This gets bubbled up to
>>> userspace, or? So a tool that checks for the other mdev has it
>>> condition by checking for -EADDRINUSE, would be confused...
>> Back in v8 of the series, Christian suggested the occurrences
>> of -EADDRINUSE should be replaced by the more appropriate
>> -EBUSY (Message ID <d7954c15-b14f-d6e5-0193-aadca61883a8@...ibm.com>),
>> so I changed it here. It does get bubbled up to userspace, so you make a
>> valid point. I will
>> change it back. I will, however, set the value returned from the
>> __verify_card_reservations() function in ap_bus.c to -EBUSY as
>> suggested by Christian.
> As long as the error code for an ephemeral failure due to can't take a
> lock right now, and the error code for a failure due to a sharing
> conflict are (which most likely requires admin action to be resolved)
> I'm fine.
>
> Choosing EBUSY for sharing conflict, and something else for can't take
> lock for the bus attributes, while choosing EADDRINUSE for sharing
> conflict, and EBUSY for can't take lock in the case of the mdev
> attributes (assign_*; unassign_*) sounds confusing to me, but is still
> better than collating the two conditions. Maybe we can choose EAGAIN
> or EWOULDBLOCK for the can't take the lock right now. I don't know.

I was in the process of creating the change log for v14 of
this patch series and realized I never addressed this.
I think EAGAIN would be a better return code for the
mutex_trylock failures in the mdev assign/unassign
operations.

>
> I'm open to suggestions. And if Christian wants to change this for
> the already released interfaces, I will have to live with that. But it
> has to be a conscious decision at least.
>
> What I consider tricky about EBUSY, is that according to my intuition,
> in pseudocode, object.operation(argument) returns -EBUSY probably tells
> me that object is busy (i.e. is in the middle of something incompatible
> with performing operation). In our case, it is not the object that is
> busy, but the resource denoted by the argument.
>
> Regards,
> Halil

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ