lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 14 Dec 2020 11:45:43 -0800
From:   Wei Wang <weiwan@...gle.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     David Miller <davem@...emloft.net>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Hannes Frederic Sowa <hannes@...essinduktion.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Felix Fietkau <nbd@....name>, Hillf Danton <hdanton@...a.com>
Subject: Re: [PATCH net-next v4 2/3] net: implement threaded-able napi poll
 loop support

On Mon, Dec 14, 2020 at 11:02 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Mon, 14 Dec 2020 09:59:21 -0800 Wei Wang wrote:
> > On Sat, Dec 12, 2020 at 2:55 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > >
> > > On Sat, 12 Dec 2020 14:50:22 -0800 Jakub Kicinski wrote:
> > > > > @@ -6731,6 +6790,7 @@ void napi_disable(struct napi_struct *n)
> > > > >             msleep(1);
> > > > >
> > > > >     hrtimer_cancel(&n->timer);
> > > > > +   napi_kthread_stop(n);
> > > >
> > > > I'm surprised that we stop the thread on napi_disable() but there is no
> > > > start/create in napi_enable(). NAPIs can (and do get) disabled and
> > > > enabled again. But that'd make your code crash with many popular
> > > > drivers if you tried to change rings with threaded napi enabled so I
> > > > feel like I must be missing something..
> > >
> > > Ah, not crash, 'cause the flag gets cleared. Is it intentional that any
> > > changes that disable NAPIs cause us to go back to non-threaded NAPI?
> > > I think I had the "threaded" setting stored in struct netdevice in my
> > > patches, is there a reason not to do that?
> >
> > Thanks for the comments!
> >
> > The reason that I did not record it in dev is: there is a slight
> > chance that during creation of the kthreads, failures occur and we
> > flip back all NAPIs to use non-threaded mode. I am not sure the
> > recorded value in dev should be what user desires, or what the actual
> > situation is. Same as after the driver does a
> > napi_disabe()/napi_enable(). It might occur that the dev->threaded =
> > true, but the operation to re-create the kthreads fail and we flip
> > back to non-thread mode. This seems to get things more complicated.
> > What I expect is the user only enables the threaded mode after the
> > device is up and alive, with all NAPIs attached to dev, and enabled.
> > And user has to check the sysfs to make sure that the operation
> > succeeds.
> > And any operation that brings down the device, will flip this back to
> > default, which is non-threaded mode.
>
> It is quite an annoying problem to address, given all relevant NAPI
> helpers seem to return void :/ But we're pushing the problem onto the
> user just because of internal API structure.
>
> This reminds me of PTP / timestamping issues some NICs had once upon
> a time. The timing application enables HW time stamping, then later some
> other application / orchestration changes a seemingly unrelated config,
> and since NIC has to reset itself it looses the timestamping config.
> Now the time app stops getting HW time stamps, but those are best
> effort anyway, so it just assumes the NIC couldn't stamp given frame
> (for every frame), not that config got completely broken. The system
> keeps running with suboptimal time for months.
>
> What does the deployment you're expecting to see looks like? What
> entity controls enabling the threaded mode on a system? Application?
> Orchestration? What's the flow?
>
I see your point. In our deployment, we have a system daemon which is
responsible for setting up all the system tunings after the host boots
up (before application starts to run). If certain operation fails, it
prints out error msg, and will exit with error. For applications that
require threaded mode, I think a check to the sysfs entry to make sure
it is enabled is necessary at the startup phase.


> "Forgetting" config based on driver-dependent events feels very fragile.
I think we could add a recorded value in dev to represent the user
setting, and try to enable threaded mode after napi_disable/enable.
But I think user/application still has to check the sysfs entry value
to make sure if it is enabled successfully.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ