Message-ID: <CY4PR21MB074168DE7729C131CE4394CCCE880@CY4PR21MB0741.namprd21.prod.outlook.com>
Date: Fri, 20 Sep 2019 19:12:04 +0000
From: Long Li <longli@...rosoft.com>
To: Sagi Grimberg <sagi@...mberg.me>, Ming Lei <ming.lei@...hat.com>
CC: Jens Axboe <axboe@...com>, Hannes Reinecke <hare@...e.com>,
John Garry <john.garry@...wei.com>,
Bart Van Assche <bvanassche@....org>,
"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
LKML <linux-kernel@...r.kernel.org>,
"linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
Keith Busch <keith.busch@...el.com>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Christoph Hellwig <hch@....de>
Subject: RE: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
> >> Long, does this patch make any difference?
> >
> > Sagi,
> >
> > Sorry it took a while to bring my system back online.
> >
> > With the patch, the IOPS drop is about the same as with the 1st patch. I think
> > the excessive context switches are causing the drop in IOPS.
> >
> > The following was captured with "perf sched record" for 30 seconds during
> > the tests.
> >
> > "perf sched latency"
> > With patch:
> > fio:(82) | 937632.706 ms | 1782255 | avg: 0.209 ms | max: 63.123 ms | max at: 768.274023 s
> >
> > without patch:
> > fio:(82) |2348323.432 ms | 18848 | avg: 0.295 ms | max: 28.446 ms | max at: 6447.310255 s
>
> Without patch means the proposed hard-irq patch?
It means the current upstream code without any patch, but that code is prone to soft lockups.
Ming's proposed hard-irq patch gets results similar to "without patch"; however, it fixes the soft lockup.
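
To put the quoted "perf sched latency" numbers in perspective, here is a minimal Python sketch that turns the two fio:(82) lines into per-second switch rates. It assumes the third column is the context-switch count and that each capture covers the stated 30 seconds; the names CAPTURE_SECONDS, samples and summarize are illustrative only.

```python
# Values copied from the quoted "perf sched latency" lines for fio:(82).
# Assumption: the third column is the number of context switches and
# each capture covers the stated 30 seconds.
CAPTURE_SECONDS = 30

samples = {
    "with patch": {"runtime_ms": 937632.706, "switches": 1782255},
    "without patch": {"runtime_ms": 2348323.432, "switches": 18848},
}


def summarize(sample, seconds=CAPTURE_SECONDS):
    """Return (switches per second, runtime in ms per switch)."""
    rate = sample["switches"] / seconds
    ms_per_switch = sample["runtime_ms"] / sample["switches"]
    return rate, ms_per_switch


for name, sample in samples.items():
    rate, ms_per_switch = summarize(sample)
    print(f"{name}: {rate:.0f} switches/s, "
          f"{ms_per_switch:.3f} ms of runtime per switch")

# How many times more often fio is switched out with the patch applied.
switch_ratio = (samples["with patch"]["switches"]
                / samples["without patch"]["switches"])
print(f"switch ratio: {switch_ratio:.1f}x")
```

Under those assumptions the patched run switches roughly 95 times as often as the unpatched one, while accumulating well under half the runtime.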
>
> If we are context switching too much, it means the soft-irq operation is not
> efficient, not necessarily that the completion path is running in soft-irq.
>
> Is your kernel compiled with full preemption or voluntary preemption?
The tests are based on the Ubuntu 18.04 kernel configuration. Here are the relevant parameters:
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
>
> > Looking closer at each CPU, we can see ksoftirqd competing for the CPU
> > with fio (and effectively throttling other fio processes) (captured in
> > /sys/kernel/debug/tracing, echo sched:* >set_event)
> >
> > On CPU1 with patch: (note that the prev_state for fio is "R", so it is
> > being preempted)
> > <...>-4077 [001] d... 66456.805062: sched_switch: prev_comm=fio prev_pid=4077 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/1 next_pid=17 next_prio=120
> > <...>-17 [001] d... 66456.805859: sched_switch: prev_comm=ksoftirqd/1 prev_pid=17 prev_prio=120 prev_state=S ==> next_comm=fio next_pid=4077 next_prio=120
> > <...>-4077 [001] d... 66456.844049: sched_switch: prev_comm=fio prev_pid=4077 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/1 next_pid=17 next_prio=120
> > <...>-17 [001] d... 66456.844607: sched_switch: prev_comm=ksoftirqd/1 prev_pid=17 prev_prio=120 prev_state=S ==> next_comm=fio next_pid=4077 next_prio=120
> >
> > On CPU1 without patch: (the prev_state for fio is "S", so it is switched
> > out voluntarily)
> > <idle>-0 [001] d... 6725.392308: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=fio next_pid=14342 next_prio=120
> > fio-14342 [001] d... 6725.392332: sched_switch: prev_comm=fio prev_pid=14342 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120
> > <idle>-0 [001] d... 6725.392356: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=fio next_pid=14342 next_prio=120
> > fio-14342 [001] d... 6725.392425: sched_switch: prev_comm=fio prev_pid=14342 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=12
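
The distinction drawn in the quoted traces (prev_state "R" meaning fio was preempted, "S" meaning it slept on its own) can be counted over a whole trace rather than read off a few lines. The following is a minimal Python sketch, assuming the trace text has one sched_switch event per line (that is, not wrapped by a mail client); the helper name classify_switches is hypothetical and the two sample lines are taken from the traces above.

```python
import re
from collections import Counter

# Matches the prev_comm and prev_state fields of a sched_switch event.
SWITCH_RE = re.compile(
    r"prev_comm=(?P<comm>\S+) .*?prev_state=(?P<state>\S+) ==>"
)


def classify_switches(lines, comm="fio"):
    """Count how often `comm` was preempted vs. switched out voluntarily."""
    counts = Counter()
    for line in lines:
        match = SWITCH_RE.search(line)
        if match is None or match.group("comm") != comm:
            continue
        # prev_state starting with R: the task was still runnable, so it
        # was preempted. Any other state (e.g. S): it blocked or slept.
        if match.group("state").startswith("R"):
            counts["preempted"] += 1
        else:
            counts["voluntary"] += 1
    return counts


sample = [
    "<...>-4077 [001] d... 66456.805062: sched_switch: prev_comm=fio "
    "prev_pid=4077 prev_prio=120 prev_state=R ==> "
    "next_comm=ksoftirqd/1 next_pid=17 next_prio=120",
    "fio-14342 [001] d... 6725.392332: sched_switch: prev_comm=fio "
    "prev_pid=14342 prev_prio=120 prev_state=S ==> "
    "next_comm=swapper/1 next_pid=0 next_prio=120",
]
print(classify_switches(sample))
```

Feeding it the lines of a saved trace file (for example, via open(path) on a copy of the trace output) would give the preempted/voluntary split per run; the file path is left to the reader.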