Date:   Wed, 30 Oct 2019 12:51:11 -0400
From:   Tony Krowiak <akrowiak@...ux.ibm.com>
To:     Pierre Morel <pmorel@...ux.ibm.com>,
        Harald Freudenberger <freude@...ux.ibm.com>,
        linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org
Cc:     heiko.carstens@...ibm.com, gor@...ux.ibm.com,
        borntraeger@...ibm.com, cohuck@...hat.com, mjrosato@...ux.ibm.com,
        pasic@...ux.ibm.com, jjherne@...ux.ibm.com
Subject: Re: [PATCH] s390: vfio-ap: disable IRQ in remove callback results in
 kernel OOPS

On 10/30/19 10:00 AM, Pierre Morel wrote:
> 
> 
> 
> On 10/30/19 8:44 AM, Harald Freudenberger wrote:
>> On 29.10.19 23:09, Tony Krowiak wrote:
>>> From: aekrowia <akrowiak@...ux.ibm.com>
>>>
>>> When an AP adapter card is configured off via the SE or the SCLP
>>> Deconfigure Adjunct Processor command and the AP bus subsequently detects
>>> that the adapter card is no longer in the AP configuration, the card
>>> device representing the adapter card as well as each of its associated
>>> AP queue devices will be removed by the AP bus. If one or more of the
>>> affected queue devices is bound to the VFIO AP device driver, its remove
>>> callback will be invoked for each queue to be removed. The remove callback
>>> resets the queue and disables IRQ processing. If interrupt processing was
>>> never enabled for the queue, disabling IRQ processing will fail resulting
>>> in a kernel OOPS.
>>>
>>> This patch verifies IRQ processing is enabled before attempting to disable
>>> interrupts for the queue.
>>>
>>> Signed-off-by: Tony Krowiak <akrowiak@...ux.ibm.com>
>>> Signed-off-by: aekrowia <akrowiak@...ux.ibm.com>
>>> ---
>>>   drivers/s390/crypto/vfio_ap_drv.c | 3 ++-
>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
>>> index be2520cc010b..42d8308fd3a1 100644
>>> --- a/drivers/s390/crypto/vfio_ap_drv.c
>>> +++ b/drivers/s390/crypto/vfio_ap_drv.c
>>> @@ -79,7 +79,8 @@ static void vfio_ap_queue_dev_remove(struct ap_device *apdev)
>>>       apid = AP_QID_CARD(q->apqn);
>>>       apqi = AP_QID_QUEUE(q->apqn);
>>>       vfio_ap_mdev_reset_queue(apid, apqi, 1);
>>> -    vfio_ap_irq_disable(q);
>>> +    if (q->saved_isc != VFIO_AP_ISC_INVALID)
>>> +        vfio_ap_irq_disable(q);
>>>       kfree(q);
>>>       mutex_unlock(&matrix_dev->lock);
>>>   }
>> Reset of an APQN does also clear IRQ processing. I don't say that the
>> resources associated with IRQ handling for the APQN are also cleared.
>> But when you call PQAP(AQIC) after a PQAP(RAPQ) or PQAP(ZAPQ)
>> it is superfluous. However, there should not appear any kernel OOPS.
>> So can you please give me more details about this kernel oops - maybe
>> I need to add exception handler code to the inline ap_aqic() function.
>>
>> regards, Harald Freudenberger
>>
> 
> Hi Tony,
> 
> wasn't it already solved by the patch 5c4c2126  from Christian ?

No, that patch merely sets the 'matrix_mdev' field of the
'struct vfio_ap_queue' to NULL in the vfio_ap_free_aqic_resources()
function. Also, the failure still occurs on the latest master branch,
which already contains 5c4c2126.
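
For reference, the check proposed in the patch quoted above can be modeled
in user space roughly as follows. This is only a sketch: struct queue_model,
ISC_INVALID_MODEL, model_irq_disable() and model_queue_remove() are
hypothetical stand-ins for struct vfio_ap_queue, VFIO_AP_ISC_INVALID,
vfio_ap_irq_disable() and vfio_ap_queue_dev_remove(), and the value 0xff is
an assumption, not taken from this thread.

```c
#include <stdio.h>

/* Hypothetical stand-in for VFIO_AP_ISC_INVALID; the value is assumed. */
#define ISC_INVALID_MODEL 0xff

/* Hypothetical, reduced model of the per-queue state used by the check. */
struct queue_model {
    unsigned char saved_isc;   /* ISC_INVALID_MODEL: IRQs never enabled */
    int disable_calls;         /* counts how often disable was attempted */
};

/* Stand-in for the disable path: records the call and clears the ISC. */
static void model_irq_disable(struct queue_model *q)
{
    q->disable_calls++;
    q->saved_isc = ISC_INVALID_MODEL;
}

/* Models the patched remove callback: disable only if IRQs were enabled. */
static void model_queue_remove(struct queue_model *q)
{
    if (q->saved_isc != ISC_INVALID_MODEL)
        model_irq_disable(q);
}
```

With saved_isc left at the invalid marker, model_queue_remove() never
reaches the disable path, which is the behavior the patch is after.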

> 
> Can you send the trace to me please?

[  266.989476] crw_info : CRW reports slct=0, oflw=0, chn=0, rsc=B, anc=0, erc=0, rsid=0
[  266.989617] ------------[ cut here ]------------
[  266.989622] vfio_ap_wait_for_irqclear: tapq rc 03: 0504
[  266.989681] WARNING: CPU: 0 PID: 7 at drivers/s390/crypto/vfio_ap_ops.c:101 vfio_ap_irq_disable+0x13c/0x1b0 [vfio_ap]
[  266.989682] Modules linked in: xt_CHECKSUM xt_MASQUERADE tun bridge stp llc ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables sunrpc ghash_s390 prng aes_s390 des_s390 libdes vfio_ccw sha512_s390 sha1_s390 eadm_sch zcrypt_cex4 qeth_l2 crc32_vx_s390 dasd_eckd_mod sha256_s390 qeth sha_common dasd_mod ccwgroup qdio pkey zcrypt vfio_ap kvm
[  266.989704] CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.4.0-rc5 #81
[  266.989705] Hardware name: IBM 2964 NE1 749 (LPAR)
[  266.989710] Workqueue: events_long ap_scan_bus
[  266.989711] Krnl PSW : 0704c00180000000 000003ff8007d89c (vfio_ap_irq_disable+0x13c/0x1b0 [vfio_ap])
[  266.989714]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[  266.989716] Krnl GPRS: 000000000000000a 0000000000000006 000000000000002b 0000000000000007
[  266.989717]            0000000000000007 000000007fe06000 000003ff00000005 0000000000000000
[  266.989718]            0000000100000504 0000000000000003 00000001f9d27e40 000003e00003bb5c
[  266.989719]            00000001fe765d00 0000000000000504 000003ff8007d898 000003e00003ba60
[  266.989724] Krnl Code: 000003ff8007d88c: c02000000ce6    larl    %r2,3ff8007f258
                           000003ff8007d892: c0e5fffff4c7    brasl   %r14,3ff8007c220
                          #000003ff8007d898: a7f40001        brc     15,3ff8007d89a
                          >000003ff8007d89c: a7f4ff9d        brc     15,3ff8007d7d6
                           000003ff8007d8a0: a7100100        tmlh    %r1,256
                           000003ff8007d8a4: a784ff99        brc     8,3ff8007d7d6
                           000003ff8007d8a8: a7290014        lghi    %r2,20
                           000003ff8007d8ac: c0e5fffff4b0    brasl   %r14,3ff8007c20c
[  266.989772] Call Trace:
[  266.989777] ([<000003ff8007d898>] vfio_ap_irq_disable+0x138/0x1b0 [vfio_ap])
[  266.989779]  [<000003ff8007c4d2>] vfio_ap_queue_dev_remove+0x6a/0x90 [vfio_ap]
[  266.989782]  [<00000000bf0f24f0>] ap_device_remove+0x50/0x110
[  266.989784]  [<00000000beffbaac>] device_release_driver_internal+0x114/0x1f0
[  266.989787]  [<00000000beff9c88>] bus_remove_device+0x108/0x190
[  266.989789]  [<00000000beff5418>] device_del+0x178/0x3a0
[  266.989790]  [<00000000beff5670>] device_unregister+0x30/0x90
[  266.989791]  [<00000000bf0f0f04>] __ap_queue_devices_with_id_unregister+0x44/0x50
[  266.989793]  [<00000000beff86ea>] bus_for_each_dev+0x82/0xb0
[  266.989794]  [<00000000bf0f2aba>] ap_scan_bus+0x262/0x878
[  266.989798]  [<00000000beb4785c>] process_one_work+0x1e4/0x410
[  266.989800]  [<00000000beb47ca8>] worker_thread+0x220/0x460
[  266.989802]  [<00000000beb4e99a>] kthread+0x12a/0x160
[  266.989805]  [<00000000bf2d8eb0>] ret_from_fork+0x28/0x2c
[  266.989806]  [<00000000bf2d8eb4>] kernel_thread_starter+0x0/0xc
[  266.989807] Last Breaking-Event-Address:
[  266.989809]  [<000003ff8007d898>] vfio_ap_irq_disable+0x138/0x1b0 [vfio_ap]
[  266.989810] ---[ end trace 59b4020890dbd391 ]---
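
In case it helps when reading the "tapq rc 03: 0504" line in the trace
above, here is a small user-space sketch that decodes it. It rests on
assumptions not stated in this thread: that the second value is the APQN,
that the APQN carries the card (APID) in bits 8-15 and the queue (APQI) in
bits 0-7, and that the response codes have the meanings listed in the
switch (as I recall them from the kernel's AP header). The helper names
apqn_card(), apqn_queue(), tapq_rc_name() and describe_tapq_warning() are
hypothetical.

```c
#include <stdio.h>

/* Assumed APQN layout: card (APID) in bits 8-15, queue (APQI) in bits 0-7. */
static unsigned int apqn_card(unsigned int apqn)
{
    return (apqn >> 8) & 0xff;
}

static unsigned int apqn_queue(unsigned int apqn)
{
    return apqn & 0xff;
}

/* Assumed meanings of the low TAPQ response codes; others are not decoded. */
static const char *tapq_rc_name(unsigned int rc)
{
    switch (rc) {
    case 0x00: return "normal";
    case 0x01: return "queue not available";
    case 0x02: return "reset in progress";
    case 0x03: return "deconfigured";
    case 0x04: return "checkstopped";
    case 0x05: return "busy";
    default:   return "other";
    }
}

/* Formats a one-line summary of a "tapq rc <rc>: <apqn>" warning. */
static void describe_tapq_warning(unsigned int rc, unsigned int apqn,
                                  char *buf, size_t len)
{
    snprintf(buf, len, "card %02x queue %02x: %s",
             apqn_card(apqn), apqn_queue(apqn), tapq_rc_name(rc));
}
```

Under those assumptions, describe_tapq_warning(0x03, 0x0504, ...) reports
card 05, queue 04 as deconfigured, which would match the adapter having
been configured off before the remove callback ran.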


> 
> Thanks,
> 
> Pierre
> 
> 
> 
