Message-ID: <87pm1kbiou.fsf@email.froward.int.ebiederm.org>
Date:   Wed, 11 Oct 2023 22:53:05 -0500
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Brian Geffon <bgeffon@...gle.com>
Cc:     Kees Cook <keescook@...omium.org>,
        Christian Brauner <brauner@...nel.org>,
        "Rafael J . Wysocki" <rafael@...nel.org>,
        Matthias Kaehlcke <mka@...omium.org>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Frederic Weisbecker <frederic@...nel.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] pid: Allow frozen userspace to reboot from non-init pid ns

Brian Geffon <bgeffon@...gle.com> writes:

> On Fri, Sep 29, 2023 at 4:09 PM Kees Cook <keescook@...omium.org> wrote:
>>
>> On Fri, Sep 29, 2023 at 01:44:42PM -0400, Brian Geffon wrote:
>> > When the system has a frozen userspace, for example, during hibernation
>> > the child reaper task will also be frozen. Attempting to deliver a
>> > signal to it to handle the reboot(2) will ultimately lead to the system
>> > hanging unless userspace is thawed.
>> >
>> > This change checks if the current task is the suspending task and if so
>> > it will allow it to proceed with a reboot from the non-init pid ns.
>>
>> I don't know the code flow too well here, but shouldn't init_pid_ns
>> always be doing the reboot regardless of anything else?
>
> I think the point of this is, normally the reaper is runnable and so
> an appropriate signal will be delivered allowing them to also clean up
> [2]. In our case, they won't be runnable and doing this wouldn't make
> sense.

The entire reboot_pid_ns thing is just a polite way of keeping
applications like /sbin/reboot working inside a pid namespace.

Ordinarily the process calling reboot (inside the container) won't
have the privileges to request an entire system reboot.  So I don't
see how it makes sense to promote that reboot into a system-wide
reboot.

Which leads me to the question.  What is actually happening with
hibernation that we want something inside a pid namespace to somehow
have the permissions to reboot the entire machine?

>> Also how is this syscall running if current is frozen? This feels weird
>> to me... shouldn't the frozen test be against pid_ns->child_reaper
>> instead of current?
>
> The task which froze the system won't be frozen. To make sure this
> happens, it will have the flag PF_SUSPEND_TASK added, so we know that
> if we have this flag we're the only running user space task [1].

Someone has a task inside a container that is successfully suspending
the entire system?

I don't see how that makes sense.

But on the level that it somehow does I would put a test in
kernel/reboot.c something like:

/*
 * If the caller can't perform a normal reboot, call
 * reboot_pid_ns instead.
 */
if ((pid_ns != &init_pid_ns) &&
    !((current->flags & PF_SUSPEND_TASK) && capable(CAP_SYS_BOOT))) {
	return reboot_pid_ns(pid_ns, cmd);
}

Making reboot_pid_ns responsible for the logic that should be bypassing
it is quite confusing.
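
To make the condition above easier to check case by case, here is a
minimal userspace sketch in C. It is not kernel code: struct model_task,
MODEL_PF_SUSPEND_TASK, enum reboot_path and choose_reboot_path() are
hypothetical stand-ins invented for illustration, and the flag value is
not the kernel's. It models only this one branch, so "system" here just
means "falls through to the normal reboot handling", where whatever
other checks the syscall performs would still apply.

```c
#include <stdbool.h>

/* Hypothetical stand-in for PF_SUSPEND_TASK; NOT the kernel's value. */
#define MODEL_PF_SUSPEND_TASK 0x1u

/* Which path the modeled check selects. */
enum reboot_path {
	REBOOT_PATH_SYSTEM,	/* fall through to normal reboot handling */
	REBOOT_PATH_PID_NS,	/* hand the request to reboot_pid_ns() */
};

/* Hypothetical model of the caller state the check looks at. */
struct model_task {
	unsigned int flags;	/* models current->flags */
	bool in_init_pid_ns;	/* models pid_ns == &init_pid_ns */
	bool has_cap_sys_boot;	/* models capable(CAP_SYS_BOOT) */
};

/*
 * Mirrors the suggested test: a caller outside the initial pid
 * namespace is routed to the namespace-local path unless it is both
 * the suspending task and capable of CAP_SYS_BOOT.
 */
static enum reboot_path choose_reboot_path(const struct model_task *t)
{
	bool suspending_and_capable =
		(t->flags & MODEL_PF_SUSPEND_TASK) && t->has_cap_sys_boot;

	if (!t->in_init_pid_ns && !suspending_and_capable)
		return REBOOT_PATH_PID_NS;

	return REBOOT_PATH_SYSTEM;
}
```

In this model a task in a non-init pid namespace reaches the normal
path only when it carries the suspend flag and has the capability;
holding just one of the two still routes it to the namespace-local
path.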

> I hope my understanding is correct and it makes sense. Thanks for
> taking the time to review.
>
> Brian
>
> 1. https://elixir.bootlin.com/linux/latest/source/kernel/power/process.c#L130
> 2. https://elixir.bootlin.com/linux/latest/source/kernel/pid_namespace.c#L327


I really don't know if allowing PF_SUSPEND_TASK so that hibernation and
the like can work from inside a container makes any sense at all.

But the above is roughly how I would make it work.

Eric
