Message-ID: <20250612155226.770f0676@kernel.org>
Date: Thu, 12 Jun 2025 15:52:26 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Mina Almasry <almasrymina@...gle.com>
Cc: Cosmin Ratiu <cratiu@...dia.com>, Mark Bloch <mbloch@...dia.com>, Dragos
 Tatulea <dtatulea@...dia.com>, "andrew+netdev@...n.ch"
 <andrew+netdev@...n.ch>, "hawk@...nel.org" <hawk@...nel.org>,
 "davem@...emloft.net" <davem@...emloft.net>, "john.fastabend@...il.com"
 <john.fastabend@...il.com>, "leon@...nel.org" <leon@...nel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "pabeni@...hat.com" <pabeni@...hat.com>, "ast@...nel.org" <ast@...nel.org>,
 "richardcochran@...il.com" <richardcochran@...il.com>, Leon Romanovsky
 <leonro@...dia.com>, "linux-rdma@...r.kernel.org"
 <linux-rdma@...r.kernel.org>, "edumazet@...gle.com" <edumazet@...gle.com>,
 "horms@...nel.org" <horms@...nel.org>, "daniel@...earbox.net"
 <daniel@...earbox.net>, Tariq Toukan <tariqt@...dia.com>, Saeed Mahameed
 <saeedm@...dia.com>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>, Gal
 Pressman <gal@...dia.com>, "bpf@...r.kernel.org" <bpf@...r.kernel.org>
Subject: Re: [PATCH net-next v3 10/12] net/mlx5e: Implement queue mgmt ops
 and single channel swap

On Thu, 12 Jun 2025 13:44:44 -0700 Mina Almasry wrote:
> > On Wed, 2025-06-11 at 22:33 -0700, Mina Almasry wrote:  
> > > Is this really better than maintaining uniformity of behavior between
> > > the drivers that support the queue mgmt api and just doing the
> > > mlx5e_deactivate_priv_channels and mlx5e_close_channel in the stop
> > > like core sorta expects?
> > >
> > > > We currently use the ndos to restart a queue, but I'm imagining in the
> > > > future we can expand it to create queues on behalf of the queues. The
> > > stop queue API may be reused in other contexts, like maybe to kill a
> > > dynamically created devmem queue or something, and this specific
> > > driver may stop working because stop actually doesn't do anything?
> > >  
> >
> > The .ndo_queue_stop operation doesn't make sense by itself for mlx5,
> > because the current mlx5 architecture is to atomically swap in all of
> > the channels.
> > The scenario you are describing, with a hypothetical ndo_queue_stop for
> > dynamically created devmem queues would leave all of the queues stopped
> > and the old channel deallocated in the channel array. Worse problems
> > would happen in that state than with today's approach, which leaves the
> > driver in functional state.
> >
> > Perhaps Saeed can add more details to this?  
> 
> I see, so essentially mlx5 supports restarting a queue but not
> necessarily stopping and starting a queue as separate actions?
> 
> If so, can maybe the comment on the function be reworded to more
> strongly indicate that this is a limitation? Just asking because
> future driver authors interested in implementing the queue API will
> probably look at one of mlx5/gve/bnxt to see what an existing
> implementation looks like, and I would rather them follow bnxt/gve
> that is more in line with core's expectations if possible. But that's
> a minor concern; I'm fine with this patch.
> 
> FWIW this may break in the future if core decides to add code that
> actually uses the stop operation as a 'stop', not as a stepping stone
> to 'restart', but I'm not sure we can do anything about that if it's a
> driver limitation.

Agreed, would be good to add a TODO and follow up on this.
It will bite us sooner or later. I suppose state_lock may
need to be dropped in favor of the netdev instance lock first?

I'm disappointed that mlx5 once again disrupts all rings to restart 
a single one. But all existing drivers seem to do this, so I guess
it'd be unfair to push back based on just that :|
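
To make the concern in this thread concrete, here is a minimal user-space
sketch of the difference between a driver whose stop callback really quiesces
a queue and one whose stop is a no-op, with all the work done in start. It is
a toy model under stated assumptions, not kernel or mlx5 code: toy_queue,
toy_queue_ops, toy_restart and toy_stop_only are hypothetical names invented
for illustration, and the real queue management callbacks take different
arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every type and function here is hypothetical. */
struct toy_queue {
	bool running;    /* is the ring currently processing packets? */
	int  generation; /* bumped each time fresh resources are swapped in */
};

struct toy_queue_ops {
	void (*stop)(struct toy_queue *q);
	void (*start)(struct toy_queue *q);
};

/* Style A: stop really quiesces the queue, start brings it back. */
static void real_stop(struct toy_queue *q)  { q->running = false; }
static void real_start(struct toy_queue *q) { q->generation++; q->running = true; }

/* Style B: stop does nothing, start swaps in the new resources. */
static void noop_stop(struct toy_queue *q)  { (void)q; }
static void swap_start(struct toy_queue *q) { q->generation++; q->running = true; }

const struct toy_queue_ops separate_ops = { real_stop, real_start };
const struct toy_queue_ops swap_ops     = { noop_stop, swap_start };

/* Restart: stop immediately followed by start, as the thread says is done today. */
void toy_restart(const struct toy_queue_ops *ops, struct toy_queue *q)
{
	assert(ops != NULL && q != NULL);
	ops->stop(q);
	ops->start(q);
}

/* Hypothetical future caller that wants the queue to stay stopped. */
bool toy_stop_only(const struct toy_queue_ops *ops, struct toy_queue *q)
{
	assert(ops != NULL && q != NULL);
	ops->stop(q);
	return !q->running; /* true only if stop actually stopped the queue */
}
```

With either ops table, toy_restart() leaves the queue running on fresh
resources, so the two styles look identical as long as stop is only a
stepping stone to restart. They diverge in toy_stop_only(): the swap-style
driver reports the queue as still running, which is the kind of breakage
a TODO in the driver would flag.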
