[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <762bee4a-7daa-948a-2054-41b3e172fa8c@linux.ibm.com>
Date: Wed, 31 Mar 2021 10:36:23 -0400
From: Tony Krowiak <akrowiak@...ux.ibm.com>
To: Halil Pasic <pasic@...ux.ibm.com>
Cc: linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, freude@...ux.ibm.com, borntraeger@...ibm.com,
cohuck@...hat.com, mjrosato@...ux.ibm.com,
alex.williamson@...hat.com, kwankhede@...dia.com,
fiuczy@...ux.ibm.com, frankja@...ux.ibm.com, david@...hat.com,
hca@...ux.ibm.com, gor@...ux.ibm.com
Subject: Re: [PATCH v13 06/15] s390/vfio-ap: allow assignment of unavailable
AP queues to mdev device
On 1/14/21 8:44 PM, Halil Pasic wrote:
> On Thu, 14 Jan 2021 12:54:39 -0500
> Tony Krowiak <akrowiak@...ux.ibm.com> wrote:
>
>>>> /**
>>>> * vfio_ap_mdev_verify_no_sharing
>>>> *
>>>> - * Verifies that the APQNs derived from the cross product of the AP adapter IDs
>>>> - * and AP queue indexes comprising the AP matrix are not configured for another
>>>> - * mediated device. AP queue sharing is not allowed.
>>>> + * Verifies that each APQN derived from the Cartesian product of the AP adapter
>>>> + * IDs and AP queue indexes comprising the AP matrix are not configured for
>>>> + * another mediated device. AP queue sharing is not allowed.
>>>> *
>>>> - * @matrix_mdev: the mediated matrix device
>>>> + * @matrix_mdev: the mediated matrix device to which the APQNs being verified
>>>> + * are assigned.
>>>> + * @mdev_apm: mask indicating the APIDs of the APQNs to be verified
>>>> + * @mdev_aqm: mask indicating the APQIs of the APQNs to be verified
>>>> *
>>>> - * Returns 0 if the APQNs are not shared, otherwise; returns -EADDRINUSE.
>>>> + * Returns 0 if the APQNs are not shared, otherwise; returns -EBUSY.
>>>> */
>>>> -static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
>>>> +static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev,
>>>> + unsigned long *mdev_apm,
>>>> + unsigned long *mdev_aqm)
>>>> {
>>>> struct ap_matrix_mdev *lstdev;
>>>> DECLARE_BITMAP(apm, AP_DEVICES);
>>>> @@ -523,20 +426,31 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
>>>> * We work on full longs, as we can only exclude the leftover
>>>> * bits in non-inverse order. The leftover is all zeros.
>>>> */
>>>> - if (!bitmap_and(apm, matrix_mdev->matrix.apm,
>>>> - lstdev->matrix.apm, AP_DEVICES))
>>>> + if (!bitmap_and(apm, mdev_apm, lstdev->matrix.apm, AP_DEVICES))
>>>> continue;
>>>>
>>>> - if (!bitmap_and(aqm, matrix_mdev->matrix.aqm,
>>>> - lstdev->matrix.aqm, AP_DOMAINS))
>>>> + if (!bitmap_and(aqm, mdev_aqm, lstdev->matrix.aqm, AP_DOMAINS))
>>>> continue;
>>>>
>>>> - return -EADDRINUSE;
>>>> + vfio_ap_mdev_log_sharing_err(dev_name(mdev_dev(lstdev->mdev)),
>>>> + apm, aqm);
>>>> +
>>>> + return -EBUSY;
>>> Why do we change -EADDRINUSE to -EBUSY? This gets bubbled up to
>>> userspace, or? So a tool that checks for the other mdev has it
>>> condition by checking for -EADDRINUSE, would be confused...
>> Back in v8 of the series, Christian suggested the occurrences
>> of -EADDRINUSE should be replaced by the more appropriate
>> -EBUSY (Message ID <d7954c15-b14f-d6e5-0193-aadca61883a8@...ibm.com>),
>> so I changed it here. It does get bubbled up to userspace, so you make a
>> valid point. I will
>> change it back. I will, however, set the value returned from the
>> __verify_card_reservations() function in ap_bus.c to -EBUSY as
>> suggested by Christian.
> As long as the error code for an ephemeral failure due to can't take a
> lock right now, and the error code for a failure due to a sharing
> conflict are (which most likely requires admin action to be resolved)
> I'm fine.
>
> Choosing EBUSY for sharing conflict, and something else for can't take
> lock for the bus attributes, while choosing EADDRINUSE for sharing
> conflict, and EBUSY for can't take lock in the case of the mdev
> attributes (assign_*; unassign_*) sounds confusing to me, but is still
> better than collating the two conditions. Maybe we can choose EAGAIN
> or EWOULDBLOCK for the can't take the lock right now. I don't know.
I was in the process of creating the change log for v14 of
this patch series and realized I never addressed this.
I think EAGAIN would be a better return code for the
mutex_trylock failures in the mdev assign/unassign
operations.
>
> I'm open to suggestions. And if Christian wants to change this for
> the already released interfaces, I will have to live with that. But it
> has to be a conscious decision at least.
>
> What I consider tricky about EBUSY, is that according to my intuition,
> in pseudocode, object.operation(argument) returns -EBUSY probably tells
> me that object is busy (i.e. is in the middle of something incompatible
> with performing operation). In our case, it is not the object that is
> busy, but the resource denoted by the argument.
>
> Regards,
> Halil
Powered by blists - more mailing lists