Message-ID: <Y/3onX5XyyYdrgZE@pc636>
Date:   Tue, 28 Feb 2023 12:42:21 +0100
From:   Uladzislau Rezki <urezki@...il.com>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     Uladzislau Rezki <urezki@...il.com>, paulmck@...nel.org,
        "Zhuo, Qiuxu" <qiuxu.zhuo@...el.com>, linux-kernel@...r.kernel.org,
        Frederic Weisbecker <frederic@...nel.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        linux-doc@...r.kernel.org, rcu@...r.kernel.org
Subject: Re: [PATCH RFC v2] rcu: Add a minimum time for marking boot as
 completed

> On Mon, Feb 27, 2023 at 1:57 PM Uladzislau Rezki <urezki@...il.com> wrote:
> >
> > On Mon, Feb 27, 2023 at 01:27:20PM -0500, Joel Fernandes wrote:
> > >
> > >
> > > > On Feb 27, 2023, at 1:20 PM, Uladzislau Rezki <urezki@...il.com> wrote:
> > > >
> > > > On Mon, Feb 27, 2023 at 01:15:47PM -0500, Joel Fernandes wrote:
> > > >>
> > > >>
> > > >>>> On Feb 27, 2023, at 1:06 PM, Uladzislau Rezki <urezki@...il.com> wrote:
> > > >>>
> > > >>> On Mon, Feb 27, 2023 at 10:16:51AM -0500, Joel Fernandes wrote:
> > > >>>>> On Mon, Feb 27, 2023 at 9:55 AM Paul E. McKenney <paulmck@...nel.org> wrote:
> > > >>>>>
> > > >>>>> On Mon, Feb 27, 2023 at 08:22:06AM -0500, Joel Fernandes wrote:
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>> On Feb 27, 2023, at 2:53 AM, Zhuo, Qiuxu <qiuxu.zhuo@...el.com> wrote:
> > > >>>>>>>
> > > >>>>>>> 
> > > >>>>>>>>
> > > >>>>>>>> From: Joel Fernandes (Google) <joel@...lfernandes.org>
> > > >>>>>>>> Sent: Saturday, February 25, 2023 11:34 AM
> > > >>>>>>>> To: linux-kernel@...r.kernel.org
> > > >>>>>>>> Cc: Joel Fernandes (Google) <joel@...lfernandes.org>; Frederic Weisbecker
> > > >>>>>>>> <frederic@...nel.org>; Lai Jiangshan <jiangshanlai@...il.com>; linux-
> > > >>>>>>>> doc@...r.kernel.org; Paul E. McKenney <paulmck@...nel.org>;
> > > >>>>>>>> rcu@...r.kernel.org
> > > >>>>>>>> Subject: [PATCH RFC v2] rcu: Add a minimum time for marking boot as
> > > >>>>>>>> completed
> > > >>>>>>>>
> > > >>>>>>>> On many systems, a great deal of boot happens after the kernel thinks the
> > > >>>>>>>> boot has completed. It is difficult to determine if the system has really
> > > >>>>>>>> booted from the kernel side. Some features like lazy-RCU can risk slowing
> > > >>>>>>>> down boot time if, say, a callback has been added that the boot
> > > >>>>>>>> synchronously depends on.
> > > >>>>>>>>
> > > >>>>>>>> Further, it is better to boot systems which pass 'rcu_normal_after_boot' to
> > > >>>>>>>> stay expedited for as long as the system is still booting.
> > > >>>>>>>>
> > > >>>>>>>> For these reasons, this commit adds a config option
> > > >>>>>>>> 'CONFIG_RCU_BOOT_END_DELAY' and a boot parameter
> > > >>>>>>>> rcupdate.boot_end_delay.
> > > >>>>>>>>
> > > >>>>>>>> By default, this value is 20s. A system designer can choose to specify a value
> > > >>>>>>>> here to keep RCU from marking boot completion.  The boot sequence will not
> > > >>>>>>>> be marked ended until at least boot_end_delay milliseconds have passed.
> > > >>>>>>>
> > > >>>>>>> Hi Joel,
> > > >>>>>>>
> > > >>>>>>> Just some thoughts on the default value of 20s, correct me if I'm wrong :-).
> > > >>>>>>>
> > > >>>>>>> Does the OS with CONFIG_PREEMPT_RT=y kernel concern more about the
> > > >>>>>>> real-time latency than the overall OS boot time?
> > > >>>>>>
> > > >>>>>> But every system has to boot, even an RT system.
> > > >>>>>>
> > > >>>>>>>
> > > >>>>>>> If so, we might make rcupdate.boot_end_delay = 0 as the default value
> > > >>>>>>> (NOT the default 20s) for CONFIG_PREEMPT_RT=y kernels?
> > > >>>>>>
> > > >>>>>> Could you measure how much time your RT system takes to boot before the application runs?
> > > >>>>>>
> > > >>>>>> I can change it to default 0 essentially NOOPing it, but I would rather have a saner default (10 seconds even), than having someone forget to tune this for their system.
> > > >>>>>
> > > >>>>> Provide a /sys location that the userspace code writes to when it
> > > >>>>> is ready?  Different systems with different hardware and software
> > > >>>>> configurations are going to take different amounts of time to boot,
> > > >>>>> correct?
> > > >>>>
> > > >>>> I could add a sysfs node, but I still wanted this patch as well
> > > >>>> because I am wary of systems where yet more userspace changes are
> > > >>>> required. I feel the kernel should itself be able to do this. Yes, it
> > > >>>> is possible the system completes "booting" at a different time than
> > > >>>> what the kernel thinks. But it does that anyway (even without this
> > > >>>> patch), so I am not seeing a good reason to not do this in the kernel.
> > > >>>> It is also only a minimum cap, so if the in-kernel boot takes too
> > > >>>> long, then the patch will have no effect.
> > > >>>>
> > > >>>> Thoughts?
> > > >>>>
> > > >>> Why "rcu_boot_ended" is not enough? As i see right after that an "init"
> > > >>> process or shell or panic is going to be invoked by the kernel. It basically
> > > >>> indicates that a kernel is fully functional.
> > > >>>
> > > >>> Or an idea to wait even further? Until all kernel modules are loaded by
> > > >>> user space.
> > > >>
> > > >> I mentioned in commit message it is daemons, userspace initialization etc. There is a lot of userspace booting up as well and using the kernel while doing so.
> > > >>
> > > >> So, It does not make sense to me to mark kernel as booted too early. And no harm in adding some builtin kernel hysteresis. What am I missing?
> > > >>
> > > > Than it is up to user space to decide when it is ready in terms of "boot completed".
> > >
> > > I dont know if you caught up with the other threads. See replies from Paul and my reply to that.
> > >
> > > Also what you are proposing can be more harmful. If user space has a bug and does not notify the kernel that boot completed, then the boot can stay incomplete forever. The idea with this patch is to make things better, not worse.
> > >
> > I saw that Paul proposed to have a sysfs attribute using which you can
> > send a notification.
> 
> Maybe I am missing something but how will a sysfs node on its own work really?
> 
> 1. delete kernel marking itself boot completed  -- and then sysfs
> marks it completed?
> 
> 2. delete kernel marking itself boot completed  -- and then sysfs
> marks it completed, if sysfs does not come in in N seconds, then
> kernel marks as completed?
> 
> #1 is a no go, that just means a bug waiting to happen if userspace
> forgets to write to sysfs.
> 
> #2 is just an extension of this patch. So I can add a sysfs node on
> top of this. And we can make the minimum time as a long period of
> time, as you noted below:
> 
> > IMHO, to me this patch does not provide a clear correlation between what
> > is a boot complete and when it occurs. A boot complete is a synchronous
> > event whereas the patch thinks that after some interval a "boot" is completed.
> 
> But that is exactly how the kernel code is now without this patch, so
> it is already broken in that sense, I am not really breaking it more
> ;-)
> 
> > We can imply that after, say 100 seconds an initialization of user space
> > is done. Maybe 100 seconds then? :)
> 
> Yes I am Ok with that. So are you suggesting we change the default to
> 100 seconds and then add a sysfs node to mark as boot done whenever
> userspace notifies?
> 
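To make option #2 above concrete, here is a minimal user-space model of
the decision, not kernel code. It assumes boot counts as ended once the
kernel has finished its own init and either user space has notified it
or the minimum delay has passed. The struct, its fields and boot_ended()
are hypothetical names used only for illustration; only the boot_end_delay
idea comes from the patch under discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the "boot ended" decision; not kernel code. */
struct boot_state {
	uint64_t min_delay_ms;   /* models rcupdate.boot_end_delay, in ms */
	bool kernel_init_done;   /* kernel is ready to start init */
	bool user_notified;      /* user space wrote to the assumed sysfs node */
};

/*
 * Return true once boot should be treated as ended at time now_ms
 * (milliseconds since boot).
 */
static bool boot_ended(const struct boot_state *s, uint64_t now_ms)
{
	/* Never end boot before the kernel itself is done. */
	if (!s->kernel_init_done)
		return false;

	/* An explicit notification from user space ends boot immediately. */
	if (s->user_notified)
		return true;

	/* Fallback: user space stayed silent, so rely on the timeout. */
	return now_ms >= s->min_delay_ms;
}
```

With min_delay_ms set to 20000, a silent user space ends boot at 20s,
while a notification at 5s ends it at 5s. A missing notification can
therefore only delay the end of boot, never keep it incomplete forever.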
As I see it, the patch tries to address at least two issues. The first
one is about lazy callbacks. If one gets stuck, boot time is degraded, so
you want laziness to be disabled during boot.

I wonder why not keep the lazy stuff disabled by __default__ and let the
user decide whether it is needed? That works no matter whether user space
hangs or whatever. Or is it hard to add such a runtime switch?

Second, you want to have expedited GPs during boot. I guess it is the same
here: it is about boot performance. Should it not be a separate patch with
its own commit message and some data showing whether you see a speedup?

Thanks!

--
Uladzislau Rezki
