linux-kernel - Re: [V9fs-developer] [PATCH] net/9p: Fix a deadlock case in the virtio transport


Message-ID: <5B4BFB29.3080507@huawei.com>
Date:   Mon, 16 Jul 2018 09:55:53 +0800
From:   jiangyiwen <jiangyiwen@...wei.com>
To:     Dominique Martinet <asmadeus@...ewreck.org>
CC:     Andrew Morton <akpm@...ux-foundation.org>,
        Eric Van Hensbergen <ericvh@...il.com>,
        Ron Minnich <rminnich@...dia.gov>,
        Latchesar Ionkov <lucho@...kov.net>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        <v9fs-developer@...ts.sourceforge.net>
Subject: Re: [V9fs-developer] [PATCH] net/9p: Fix a deadlock case in the
 virtio transport

On 2018/7/14 20:47, Dominique Martinet wrote:
> jiangyiwen wrote on Sat, Jul 14, 2018:
>> On 2018/7/14 17:05, Dominique Martinet wrote:
>>> jiangyiwen wrote on Sat, Jul 14, 2018:
>>>> When a client has multiple threads that issue I/O requests all the
>>>> time and the server performs very well, the CPU may run in IRQ
>>>> context for a long time, because the *while* loop keeps finding
>>>> buffers in the virtqueue.
>>>>
>>>> So we should keep chan->lock in the whole loop.
>>>
>>> Hmm, this is generally bad practice to hold a spin lock for long.
>>> In general, spin locks are meant to protect data, not code.
>>>
>>> I'd want some numbers to decide on this one, even if I think this
>>> particular case is safe (i.e. this cannot deadlock)
>>>
>>
>> Actually, the loop will not hold a spin lock for long, because other
>> threads will not issue new requests in this case. In addition,
>> virtio-blk and virtio-scsi also use this solution; I guess they may
>> have encountered this problem before.
> 
> Fair enough. If you do have some numbers to give though (throughput
> and/or iops before/after) I'd still be really curious.
> 
>>>>  		chan->ring_bufs_avail = 1;
>>>> -		spin_unlock_irqrestore(&chan->lock, flags);
>>>>  		/* Wakeup if anyone waiting for VirtIO ring space. */
>>>>  		wake_up(chan->vc_wq);
>>>
>>> In particular, the wake up here echoes to wait events that will
>>> immediately try to grab the lock, and will needlessly spin on it until
>>> this thread is done.
>>> If we do go this way I'd want setting chan->ring_bufs_avail to be done
>>> just before unlocking and the wakeup to be done just after unlocking out
>>> of the loop iff we processed at least one iteration here.
>>
>> I can move the wakeup operation after the unlocking. Like what I said
>> above, I think this loop will not execute for long.
> 
> Please do, you listed virtio_blk as doing this and they have the same
> kind of pattern with a req_done bool and only restarting stopped queues
> if they processed something
> 

You're right, the wakeup should be done after the unlock. I will resend
the patch with that change. Also, should I rebase this patch on your
9p-next branch before resending?
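
For reference, a minimal userspace sketch of the ordering discussed
above. This is not the actual patch: it assumes a pthread mutex in place
of the chan->lock spinlock and a condition variable in place of
chan->vc_wq, and the struct, its "pending" and "wakeups" fields, and the
function body are hypothetical stand-ins for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the channel; NOT the kernel structure. */
struct chan {
	pthread_mutex_t lock;   /* plays the role of chan->lock */
	pthread_cond_t wq;      /* plays the role of chan->vc_wq */
	int pending;            /* completed buffers still in the "ring" */
	bool ring_bufs_avail;
	int wakeups;            /* wakeups issued, for illustration only */
};

/* Drain all completed buffers under a single lock acquisition. */
static int req_done(struct chan *c)
{
	int done = 0;

	pthread_mutex_lock(&c->lock);
	assert(c->pending >= 0);
	while (c->pending > 0) {
		c->pending--;   /* stands in for fetching one buffer */
		done++;
	}
	/* Set the flag just before unlocking, only if work was done. */
	if (done)
		c->ring_bufs_avail = true;
	pthread_mutex_unlock(&c->lock);

	/*
	 * Wake waiters only after the lock is dropped, and only if at
	 * least one buffer was processed, so they do not contend for a
	 * lock that is still held.
	 */
	if (done) {
		pthread_cond_broadcast(&c->wq);
		c->wakeups++;
	}
	return done;
}
```

With this ordering a woken waiter can take the lock right away, and a
call that finds nothing to process issues no wakeup at all.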

Thanks,
Yiwen.