Message-ID: <20190918143732.GA19364@ming.t460p>
Date:   Wed, 18 Sep 2019 22:37:33 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Sagi Grimberg <sagi@...mberg.me>
Cc:     Keith Busch <keith.busch@...el.com>,
        Hannes Reinecke <hare@...e.com>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Bart Van Assche <bvanassche@....org>,
        linux-scsi@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
        Long Li <longli@...rosoft.com>,
        John Garry <john.garry@...wei.com>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-nvme@...ts.infradead.org, Jens Axboe <axboe@...com>,
        Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Christoph Hellwig <hch@....de>
Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism

On Mon, Sep 09, 2019 at 08:10:07PM -0700, Sagi Grimberg wrote:
> Hey Ming,
> 
> > > > Ok, so the real problem is per-cpu bounded tasks.
> > > > 
> > > > I share Thomas's opinion about a NAPI-like approach.
> > > 
> > > We already have that, it's irq_poll, but it seems that for this
> > > use-case, we get lower performance for some reason. I'm not
> > > entirely sure why that is, maybe it's because we need to mask interrupts
> > > because we don't have an "arm" register in nvme like network devices
> > > have?
> > 
> > Long observed that IOPS also drops a lot when switching to threaded irq. If
> > softirqd is woken up for handling softirq, the performance shouldn't
> > be better than threaded irq.
> 
> It's true that it shouldn't be any faster, but what irqpoll already has
> and we don't need to reinvent is a proper budgeting mechanism that
> needs to occur when multiple devices map irq vectors to the same cpu
> core.
> 
> irqpoll already maintains a percpu list and dispatches the ->poll with
> a budget that the backend enforces, and irqpoll multiplexes between them.
> Having this mechanism in irq (hard or threaded) context sounds
> a bit unnecessary.
> 
> It seems like we're attempting to stay in irq context for as long as we
> can instead of scheduling to softirq/thread context if we have more than
> a minimal amount of work to do. Without at least understanding why
> softirq/thread degrades us so much this code seems like the wrong
> approach to me. Interrupt context will always be faster, but it is
> not a sufficient reason to spend as much time as possible there, is it?

If extra latency is added to the IO completion path, it shows up in the
submission path as well, because the hw queue depth is fixed and often
small: a tag is not released until its completion is handled. Especially
in the case of multiple submitters vs. a single (shared) completion
path, all of the hw queue's tags can be exhausted easily.
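
To make the tag-exhaustion argument concrete, here is a rough
back-of-the-envelope sketch using Little's law. It is not kernel code;
the function name and all numbers (queue depth 32, 100us device latency,
100us of added completion delay) are made up for illustration. It only
assumes that a tag stays busy from submission until its completion is
processed.

```python
def max_iops(queue_depth, device_latency_us, completion_delay_us=0.0):
    """Upper bound on IOPS for one hw queue (Little's law).

    Assumption: each tag is held for the device latency plus whatever
    delay the completion path adds, so at most queue_depth IOs can be
    in flight during that time.
    """
    tag_hold_time_us = device_latency_us + completion_delay_us
    return queue_depth * 1_000_000 / tag_hold_time_us


# Hypothetical numbers, chosen only to show the trend.
baseline = max_iops(32, 100)       # completion handled right away
delayed = max_iops(32, 100, 100)   # completion deferred by 100us
print(baseline, delayed)
```

With these made-up numbers the bound halves once the completion delay
equals the device latency, no matter how many CPUs are submitting,
since submitters simply wait for a free tag.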

I guess there is no such effect for networking IO.

> 
> We should also keep in mind, that the networking stack has been doing
> this for years, I would try to understand why this cannot work for nvme
> before dismissing.

The above may be one reason.

Thanks,
Ming
