Message-ID: <55708455.2080500@dev.mellanox.co.il>
Date: Thu, 04 Jun 2015 20:01:09 +0300
From: Sagi Grimberg <sagig@....mellanox.co.il>
To: "Nicholas A. Bellinger" <nab@...ux-iscsi.org>,
Christoph Hellwig <hch@....de>
CC: "Nicholas A. Bellinger" <nab@...erainc.com>,
target-devel <target-devel@...r.kernel.org>,
linux-scsi <linux-scsi@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Hannes Reinecke <hare@...e.de>,
Sagi Grimberg <sagig@...lanox.com>
Subject: Re: [RFC 0/2] target: Add TFO->complete_irq queue_work bypass
On 6/4/2015 10:06 AM, Nicholas A. Bellinger wrote:
> On Wed, 2015-06-03 at 14:57 +0200, Christoph Hellwig wrote:
>> This makes lockdep very unhappy, rightly so. If you execute
>> one end_io function inside another you basically nest every possible
>> lock taken in the I/O completion path. Also adding more work
>> to the hardirq path generally isn't a smart idea. Can you explain
>> what issues you were seeing and how much this helps? Note that
>> the workqueue usage in the target core so far is fairly basic, so
>> there should be some low-hanging fruit.
>
> So I've been using tcm_loop + RAMDISK backends for prototyping, but this
> patch is intended for vhost-scsi so it can avoid the unnecessary
> queue_work() context switch within target_complete_cmd() for all backend
> driver types.
>
> This is because vhost_work_queue() is just updating vhost_dev->work_list
> and immediately wake_up_process() into a different vhost_worker()
> process context. For heavy small block workloads into fast IBLOCK
> backends, avoiding this extra context switch should be a nice efficiency
> win.
I can see that, did you get a chance to measure the expected latency
improvement?
>
> Also, AFAIK RDMA fabrics are allowed to do ib_post_send() response
> callbacks directly from IRQ context as well.
This is correct in general, ib_post_send is not allowed to schedule.
isert/srpt might see a latency benefit here, but it would require the
drivers to pre-allocate the sgls (ib_sge's) sized for the worst
case (or to use GFP_ATOMIC allocations - I'm not sure which is
better...)