Message-ID: <CAMzD94QR-+408wf+dindhaw+NMJ1GK9W-4xuiJpY2FkhtMVLig@mail.gmail.com>
Date: Mon, 16 Dec 2024 15:13:15 -0500
From: Brian Vazquez <brianvv@...gle.com>
To: Alexander Lobakin <aleksander.lobakin@...el.com>
Cc: Brian Vazquez <brianvv.kernel@...il.com>, Tony Nguyen <anthony.l.nguyen@...el.com>,
Przemek Kitszel <przemyslaw.kitszel@...el.com>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
intel-wired-lan@...ts.osuosl.org, David Decotigny <decot@...gle.com>,
Vivek Kumar <vivekmr@...gle.com>, Anjali Singhai <anjali.singhai@...el.com>,
Sridhar Samudrala <sridhar.samudrala@...el.com>, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, emil.s.tantilov@...el.com,
Marco Leogrande <leogrande@...gle.com>, Manoj Vishwanathan <manojvishy@...gle.com>,
Jacob Keller <jacob.e.keller@...el.com>, Pavan Kumar Linga <pavan.kumar.linga@...el.com>
Subject: Re: [iwl-next PATCH v4 2/3] idpf: convert workqueues to unbound
On Mon, Dec 16, 2024 at 1:11 PM Alexander Lobakin
<aleksander.lobakin@...el.com> wrote:
>
> From: Brian Vazquez <brianvv@...gle.com>
> Date: Mon, 16 Dec 2024 16:27:34 +0000
>
> > From: Marco Leogrande <leogrande@...gle.com>
> >
> > When a workqueue is created with `WQ_UNBOUND`, its work items are
> > served by special worker-pools, whose host workers are not bound to
> > any specific CPU. In the default configuration (i.e. when
> > `queue_delayed_work` and friends do not specify which CPU to run the
> > work item on), `WQ_UNBOUND` allows the work item to be executed on any
> > CPU in the same node of the CPU it was enqueued on. While this
> > solution potentially sacrifices locality, it avoids contention with
> > other processes that might dominate the CPU time of the processor the
> > work item was scheduled on.
> >
> > This is not just a theoretical problem: in a particular scenario, a
> > misconfigured process was hogging most of the time from CPU0, leaving
> > less than 0.5% of its CPU time to the kworker. The IDPF workqueues
> > that were using the kworker on CPU0 suffered large completion delays
> > as a result, causing performance degradation, timeouts and eventual
> > system crash.
>
> Wasn't this inspired by [0]?
>
> [0]
> https://lore.kernel.org/netdev/20241126035849.6441-11-milena.olech@intel.com
The root cause is exactly the same, so I do see the similarity, and I'm
not surprised that both were addressed with a similar patch. We hit
this problem some time ago, and the first attempt to get this change in
was in August [0].
[0]
https://lore.kernel.org/netdev/20240813182747.1770032-4-manojvishy@google.com/
>
> Thanks,
> Olek