linux-kernel - Re: [PATCH] vfio pci: kernel support of error recovery only for non fatal error

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58B4D2D9.5090900@cn.fujitsu.com>
Date:   Tue, 28 Feb 2017 09:31:05 +0800
From:   Cao jin <caoj.fnst@...fujitsu.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>
CC:     <linux-kernel@...r.kernel.org>, <kvm@...r.kernel.org>,
        <qemu-devel@...gnu.org>, <alex.williamson@...hat.com>,
        <izumi.taku@...fujitsu.com>
Subject: Re: [PATCH] vfio pci: kernel support of error recovery only for non
 fatal error



On 02/28/2017 12:16 AM, Michael S. Tsirkin wrote:
> On Mon, Feb 27, 2017 at 03:28:43PM +0800, Cao jin wrote:
>> Subject: Re: [PATCH] vfio pci: kernel support of error recovery only for non
>> fatal error
> 
> Don't make the subject so long. This is why I had
> 	[PATCH v3] vfio error recovery: kernel support
> you also want to add versioning as you inherited my v3,
> you should make this v4 etc.
> 

Ok. I didn' see [PATCH v3], I guess I am not CCed.

>> 0. What happens now (PCIE AER only)
>>    Fatal errors cause a link reset.
>>    Non fatal errors don't.
>>    All errors stop the VM eventually, but not immediately
>>    because it's detected and reported asynchronously.
>>    Interrupts are forwarded as usual.
>>    Correctable errors are not reported to guest at all.
>>    Note: PPC EEH is different. This focuses on AER.
>>
>> 1. Correctable errors
>>    There is no need to report these to guest. So let's not.
>>
>> 2. Fatal errors
>>    It's not easy to handle them gracefully since link reset
>>    is needed. As a first step, let's use the existing mechanism
>>    in that case.
>>
>> 2. Non-fatal errors
>>    Here we could make progress by reporting them to guest
>>    and have guest handle them.
>>
>>    Issues:
>>    a. this behaviour should only be enabled with new userspace,
>>       old userspace should work without changes.
>>
>>       Suggestion: One way to address this would be to add a new eventfd
>>       non_fatal_err_trigger. If not set, invoke err_trigger.
>>
>>    b. drivers are supposed to stop MMIO when error is reported,
>>       if vm keeps going, we will keep doing MMIO/config.
>>
>>       Suggestion 1: ignore this. vm stop happens much later when
>>       userspace runs anyway, so we are not making things much worse.
>>
>>       Suggestion 2: try to stop MMIO/config, resume on resume call
>>
>>       Patch below implements Suggestion 1.
>>
>>       Note that although this is really against the documentation, which
>>       states error_detected() is the point at which the driver should quiesce
>>       the device and not touch it further (until diagnostic poking at
>>       mmio_enabled or full access at resume callback).
>>
>>       Fixing this won't be easy. However, this is not a regression.
>>
>>       Also note this does nothing about interrupts, documentation
>>       suggests returning IRQ_NONE until reset.
>>       Again, not a regression.
>>
>>    c. PF driver might detect that function is completely broken,
>>       if vm keeps going, we will keep doing MMIO/config.
>>
>>       Suggestion 1: ignore this. vm stop happens much later when
>>       userspace runs anyway, so we are not making things much worse.
>>
>>       Suggestion 2: detect this and invoke err_trigger to stop VM.
>>
>>       Patch below implements Suggestion 2.
>>
>> Suggested-by: Michael S. Tsirkin <mst@...hat.com>
> 
> It's more than this, you are really reusing parts of my patch,
> so you should say so and include my signature.
> 
> If you only added a line or two you can keep
> the original author. To do this you add
> 	From: Michael S. Tsirkin <mst@...hat.com>
> before commit text.

On this topic, I am really bewildered for a while, thanks for clarification.

-- 
Sincerely,
Cao jin

> 
>> Signed-off-by: Cao jin <caoj.fnst@...fujitsu.com>
>> ---
> 
> Changelog from my v3?
> 
>>  drivers/vfio/pci/vfio_pci.c         | 38 +++++++++++++++++++++++++++++++++++--
>>  drivers/vfio/pci/vfio_pci_intrs.c   | 19 +++++++++++++++++++
>>  drivers/vfio/pci/vfio_pci_private.h |  1 +
>>  include/uapi/linux/vfio.h           |  1 +
>>  4 files changed, 57 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
>> index 324c52e..3551cc9 100644
>> --- a/drivers/vfio/pci/vfio_pci.c
>> +++ b/drivers/vfio/pci/vfio_pci.c
>> @@ -441,7 +441,8 @@ static int vfio_pci_get_irq_count(struct vfio_pci_device *vdev, int irq_type)
>>  
>>  			return (flags & PCI_MSIX_FLAGS_QSIZE) + 1;
>>  		}
>> -	} else if (irq_type == VFIO_PCI_ERR_IRQ_INDEX) {
>> +	} else if (irq_type == VFIO_PCI_ERR_IRQ_INDEX ||
>> +		   irq_type == VFIO_PCI_NON_FATAL_ERR_IRQ_INDEX) {
>>  		if (pci_is_pcie(vdev->pdev))
>>  			return 1;
>>  	} else if (irq_type == VFIO_PCI_REQ_IRQ_INDEX) {
>> @@ -796,6 +797,7 @@ static long vfio_pci_ioctl(void *device_data,
>>  		case VFIO_PCI_REQ_IRQ_INDEX:
>>  			break;
>>  		case VFIO_PCI_ERR_IRQ_INDEX:
>> +		case VFIO_PCI_NON_FATAL_ERR_IRQ_INDEX:
>>  			if (pci_is_pcie(vdev->pdev))
>>  				break;
>>  		/* pass thru to return error */
>> @@ -1282,7 +1284,9 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
>>  
>>  	mutex_lock(&vdev->igate);
>>  
>> -	if (vdev->err_trigger)
>> +	if (state == pci_channel_io_normal && vdev->non_fatal_err_trigger)
>> +		eventfd_signal(vdev->non_fatal_err_trigger, 1);
>> +	else if (vdev->err_trigger)
>>  		eventfd_signal(vdev->err_trigger, 1);
>>  
>>  	mutex_unlock(&vdev->igate);
>> @@ -1292,8 +1296,38 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
>>  	return PCI_ERS_RESULT_CAN_RECOVER;
>>  }
>>  
>> +static pci_ers_result_t vfio_pci_aer_slot_reset(struct pci_dev *pdev)
>> +{
>> +	struct vfio_pci_device *vdev;
>> +	struct vfio_device *device;
>> +	static pci_ers_result_t err = PCI_ERS_RESULT_NONE;
>> +
>> +	device = vfio_device_get_from_dev(&pdev->dev);
>> +	if (!device)
>> +		goto err_dev;
>> +
>> +	vdev = vfio_device_data(device);
>> +	if (!vdev)
>> +		goto err_data;
>> +
>> +	mutex_lock(&vdev->igate);
>> +
>> +	if (vdev->err_trigger)
>> +		eventfd_signal(vdev->err_trigger, 1);
>> +
>> +	mutex_unlock(&vdev->igate);
>> +
>> +	err = PCI_ERS_RESULT_RECOVERED;
>> +
>> +err_data:
>> +	vfio_device_put(device);
>> +err_dev:
>> +	return err;
>> +}
>> +
>>  static const struct pci_error_handlers vfio_err_handlers = {
>>  	.error_detected = vfio_pci_aer_err_detected,
>> +	.slot_reset = vfio_pci_aer_slot_reset,
>>  };
>>  
>>  static struct pci_driver vfio_pci_driver = {
>> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
>> index 1c46045..270d568 100644
>> --- a/drivers/vfio/pci/vfio_pci_intrs.c
>> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
>> @@ -611,6 +611,17 @@ static int vfio_pci_set_err_trigger(struct vfio_pci_device *vdev,
>>  					       count, flags, data);
>>  }
>>  
>> +static int vfio_pci_set_non_fatal_err_trigger(struct vfio_pci_device *vdev,
>> +				    unsigned index, unsigned start,
>> +				    unsigned count, uint32_t flags, void *data)
>> +{
>> +	if (index != VFIO_PCI_NON_FATAL_ERR_IRQ_INDEX || start != 0 || count > 1)
>> +		return -EINVAL;
>> +
>> +	return vfio_pci_set_ctx_trigger_single(&vdev->non_fatal_err_trigger,
>> +					       count, flags, data);
>> +}
>> +
>>  static int vfio_pci_set_req_trigger(struct vfio_pci_device *vdev,
>>  				    unsigned index, unsigned start,
>>  				    unsigned count, uint32_t flags, void *data)
>> @@ -664,6 +675,14 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_device *vdev, uint32_t flags,
>>  			break;
>>  		}
>>  		break;
>> +	case VFIO_PCI_NON_FATAL_ERR_IRQ_INDEX:
>> +		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
>> +		case VFIO_IRQ_SET_ACTION_TRIGGER:
>> +			if (pci_is_pcie(vdev->pdev))
>> +				func = vfio_pci_set_non_fatal_err_trigger;
>> +			break;
>> +		}
>> +		break;
>>  	case VFIO_PCI_REQ_IRQ_INDEX:
>>  		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
>>  		case VFIO_IRQ_SET_ACTION_TRIGGER:
>> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
>> index f561ac1..d30f8a3 100644
>> --- a/drivers/vfio/pci/vfio_pci_private.h
>> +++ b/drivers/vfio/pci/vfio_pci_private.h
>> @@ -93,6 +93,7 @@ struct vfio_pci_device {
>>  	struct pci_saved_state	*pci_saved_state;
>>  	int			refcnt;
>>  	struct eventfd_ctx	*err_trigger;
>> +	struct eventfd_ctx	*non_fatal_err_trigger;
>>  	struct eventfd_ctx	*req_trigger;
>>  	struct list_head	dummy_resources_list;
>>  };
>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>> index 519eff3..affa973 100644
>> --- a/include/uapi/linux/vfio.h
>> +++ b/include/uapi/linux/vfio.h
>> @@ -443,6 +443,7 @@ enum {
>>  	VFIO_PCI_MSIX_IRQ_INDEX,
>>  	VFIO_PCI_ERR_IRQ_INDEX,
>>  	VFIO_PCI_REQ_IRQ_INDEX,
>> +	VFIO_PCI_NON_FATAL_ERR_IRQ_INDEX,
>>  	VFIO_PCI_NUM_IRQS
>>  };
>>  
>> -- 
>> 1.8.3.1
>>
>>
> 
> 
> .
>