Message-ID: <55708455.2080500@dev.mellanox.co.il>
Date: Thu, 04 Jun 2015 20:01:09 +0300
From: Sagi Grimberg <sagig@....mellanox.co.il>
To: "Nicholas A. Bellinger" <nab@...ux-iscsi.org>,
Christoph Hellwig <hch@....de>
CC: "Nicholas A. Bellinger" <nab@...erainc.com>,
target-devel <target-devel@...r.kernel.org>,
linux-scsi <linux-scsi@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Hannes Reinecke <hare@...e.de>,
Sagi Grimberg <sagig@...lanox.com>
Subject: Re: [RFC 0/2] target: Add TFO->complete_irq queue_work bypass
On 6/4/2015 10:06 AM, Nicholas A. Bellinger wrote:
> On Wed, 2015-06-03 at 14:57 +0200, Christoph Hellwig wrote:
>> This makes lockdep very unhappy, rightly so. If you execute
>> one end_io function inside another you basically nest every possible
>> lock taken in the I/O completion path. Also adding more work
>> to the hardirq path generally isn't a smart idea. Can you explain
>> what issues you were seeing and how much this helps? Note that
>> the workqueue usage in the target core so far is fairly basic, so
>> there should be some low-hanging fruit.
>
> So I've been using tcm_loop + RAMDISK backends for prototyping, but this
> patch is intended for vhost-scsi so it can avoid the unnecessary
> queue_work() context switch within target_complete_cmd() for all backend
> driver types.
>
> This is because vhost_work_queue() is just updating vhost_dev->work_list
> and immediately wake_up_process() into a different vhost_worker()
> process context. For heavy small block workloads into fast IBLOCK
> backends, avoiding this extra context switch should be a nice efficiency
> win.
I can see that, did you get a chance to measure the expected latency
improvement?
>
> Also, AFAIK RDMA fabrics are allowed to do ib_post_send() response
> callbacks directly from IRQ context as well.
This is correct in general, ib_post_send is not allowed to schedule.
isert/srpt might see a latency benefit here, but it would require the
drivers to pre-allocate the sgls (ib_sge's) sized for the worst
case (or to use GFP_ATOMIC allocations - I'm not sure which is
better...)