[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250218.aePeer3oole2@digikod.net>
Date: Tue, 18 Feb 2025 20:21:08 +0100
From: Mickaël Salaün <mic@...ikod.net>
To: Paul Moore <paul@...l-moore.com>
Cc: Eric Paris <eparis@...hat.com>,
Günther Noack <gnoack@...gle.com>, "Serge E . Hallyn" <serge@...lyn.com>,
Ben Scarlato <akhna@...gle.com>, Casey Schaufler <casey@...aufler-ca.com>,
Charles Zaffery <czaffery@...lox.com>, Daniel Burgener <dburgener@...ux.microsoft.com>,
Francis Laniel <flaniel@...ux.microsoft.com>, James Morris <jmorris@...ei.org>, Jann Horn <jannh@...gle.com>,
Jeff Xu <jeffxu@...gle.com>, Jorge Lucangeli Obes <jorgelo@...gle.com>,
Kees Cook <kees@...nel.org>, Konstantin Meskhidze <konstantin.meskhidze@...wei.com>,
Matt Bobrowski <mattbobrowski@...gle.com>, Mikhail Ivanov <ivanov.mikhail1@...wei-partners.com>,
Phil Sutter <phil@....cc>, Praveen K Paladugu <prapal@...ux.microsoft.com>,
Robert Salvet <robert.salvet@...lox.com>, Shervin Oloumi <enlightened@...gle.com>,
Song Liu <song@...nel.org>, Tahera Fahimi <fahimitahera@...il.com>,
Tyler Hicks <code@...icks.com>, audit@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-security-module@...r.kernel.org
Subject: Re: [PATCH v5 10/24] landlock: Add AUDIT_LANDLOCK_DOMAIN and log
domain status
On Fri, Feb 14, 2025 at 05:52:49PM -0500, Paul Moore wrote:
> On Jan 31, 2025 =?UTF-8?q?Micka=C3=ABl=20Sala=C3=BCn?= <mic@...ikod.net> wrote:
> >
> > Asynchronously log domain information when it first denies an access.
> > This minimize the amount of generated logs, which makes it possible to
> > always log denials since they should not happen (except with the new
> > LANDLOCK_RESTRICT_SELF_QUIET flag). These records are identified with
> > the new AUDIT_LANDLOCK_DOMAIN type.
> >
> > The AUDIT_LANDLOCK_DOMAIN message contains:
> > - the "domain" ID which is described;
> > - the "status" which can either be "allocated" or "deallocated";
> > - the "mode" which is for now only "enforcing";
> > - for the "allocated" status, a minimal set of properties to easily
> > identify the task that loaded the domain's policy with
> > landlock_restrict_self(2): "pid", "uid", executable path ("exe"), and
> > command line ("comm");
> > - for the "deallocated" state, the number of "denials" accounted to this
> > domain, which is at least 1.
> >
> > This requires each domain to save these task properties at creation
> > time in the new struct landlock_details. A reference to the PID is kept
> > for the lifetime of the domain to avoid race conditions when
> > investigating the related task. The executable path is resolved and
> > stored to not keep a reference to the filesystem and block related
> > actions. All these metadata are stored for the lifetime of the related
> > domain and should then be minimal. The required memory is not accounted
> > to the task calling landlock_restrict_self(2) contrary to most other
> > Landlock allocations (see related comment).
> >
> > The AUDIT_LANDLOCK_DOMAIN record follows the first AUDIT_LANDLOCK_ACCESS
> > record for the same domain, which is always followed by AUDIT_SYSCALL
> > and AUDIT_PROCTITLE. This is in line with the audit logic to first
> > record the cause of an event, and then add context with other types of
> > record.
> >
> > Audit event sample for a first denial:
> >
> > type=LANDLOCK_ACCESS msg=audit(1732186800.349:44): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd"
> > type=LANDLOCK_DOMAIN msg=audit(1732186800.349:44): domain=195ba459b status=allocated mode=enforcing pid=300 uid=0 exe="/root/sandboxer" comm="sandboxer"
> > type=SYSCALL msg=audit(1732186800.349:44): arch=c000003e syscall=101 success=no [...] pid=300 auid=0
> >
> > Audit event sample for a following denial:
> >
> > type=LANDLOCK_ACCESS msg=audit(1732186800.372:45): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd"
> > type=SYSCALL msg=audit(1732186800.372:45): arch=c000003e syscall=101 success=no [...] pid=300 auid=0
> >
> > Log domain deletion with the "deallocated" state when a domain was
> > previously logged. This makes it possible for log parsers to free
> > potential resources when a domain ID will never show again.
> >
> > The number of denied access requests is useful to easily check how many
> > access requests a domain blocked and potentially if some of them are
> > missing in logs because of audit rate limiting or audit rules. Rate
> > limiting could also drop this record though.
> >
> > Audit event sample for a deletion of a domain that denied something:
> >
> > type=LANDLOCK_DOMAIN msg=audit(1732186800.393:46): domain=195ba459b status=deallocated denials=2
> >
> > Cc: Günther Noack <gnoack@...gle.com>
> > Cc: Paul Moore <paul@...l-moore.com>
> > Signed-off-by: Mickaël Salaün <mic@...ikod.net>
> > Link: https://lore.kernel.org/r/20250131163059.1139617-11-mic@digikod.net
> > ---
> > Changes since v4:
> > - Rename AUDIT_LANDLOCK_DOM_{INFO,DROP} to AUDIT_LANDLOCK_DOMAIN and add
> > a "status" field, as requested by Paul.
> > - Add a harcoded "mode=enforcing" to leave room for a potential future
> > permissive mode, as suggested by Paul.
> > - Remove the "creation" timestamp, as suggested by Paul.
> > - Move LANDLOCK_PATH_MAX_SIZE to domain.h, check the size of the
> > greatest landlock_details at build time, and improve comments.
> > - Improve audit check in landlock_log_drop_domain().
> > - Add missing headers.
> > - Fix typo in comment.
> > - Rebase on top of the landlock_log_denial() and subject type changes.
> >
> > Changes since v3:
> > - Log number of denied access requests with AUDIT_LANDLOCK_DOM_DROP
> > records, suggested by Tyler.
> > - Do not store a struct path pointer but the resolved string instead.
> > This enables us to not block unmount of the initially restricted task
> > executable's mount point. See the new get_current_info() and
> > get_current_exe(). A following patch add tests for this case.
> > - Create and allocate a new struct landlock_details for initially
> > restricted task's information.
> > - Remove audit_get_ctime() call, as requested by Paul. We now always
> > have a standalone timestamp per Landlock domain creations.
> > - Fix docstring.
> >
> > Changes since v2:
> > - Fix docstring.
> > - Fix log_status check in log_hierarchy() to also log
> > LANDLOCK_LOG_DISABLED.
> > - Add audit's creation time to domain's properties.
> > - Use hexadecimal notation for domain IDs.
> > - Remove domain's parent records: parent domains are not really useful
> > in the logs. They will be available with the upcoming introspection
> > feature though.
> > - Extend commit message with audit's timestamp explanation.
> >
> > Changes since v1:
> > - Add a ruleset's version for atomic logs.
> > - Rebased on the TCP patch series.
> > - Rename operation using "_" instead of "-".
> > - Rename AUDIT_LANDLOCK to AUDIT_LANDLOCK_RULESET.
> > - Only log when audit is enabled, but always set domain IDs.
> > - Don't log task's PID/TID with log_task() because it would be redundant
> > with the SYSCALL record.
> > - Remove race condition when logging ruleset creation and logging
> > ruleset modification while the related file descriptor was already
> > registered but the ruleset creation not logged yet.
> > - Fix domain drop logs.
> > - Move the domain drop record from the previous patch into this one.
> > - Do not log domain creation but log first domain use instead.
> > - Save task's properties that sandbox themselves.
> > ---
> > include/uapi/linux/audit.h | 1 +
> > security/landlock/audit.c | 90 ++++++++++++++++++++++++++++++--
> > security/landlock/audit.h | 7 +++
> > security/landlock/domain.c | 101 ++++++++++++++++++++++++++++++++++++
> > security/landlock/domain.h | 68 ++++++++++++++++++++++++
> > security/landlock/ruleset.c | 6 +++
> > 6 files changed, 270 insertions(+), 3 deletions(-)
>
> Some minor questions below, but from an audit perspective this is okay.
>
> Acked-by: Paul Moore <paul@...l-moore.com> (Audit)
>
> > diff --git a/security/landlock/audit.c b/security/landlock/audit.c
> > index b0dde6bcfb76..a5b055306757 100644
> > --- a/security/landlock/audit.c
> > +++ b/security/landlock/audit.c
> > @@ -8,6 +8,8 @@
> > #include <kunit/test.h>
> > #include <linux/audit.h>
> > #include <linux/lsm_audit.h>
> > +#include <linux/pid.h>
> > +#include <linux/uidgid.h>
> >
> > #include "audit.h"
> > #include "cred.h"
> > @@ -31,6 +33,40 @@ static void log_blockers(struct audit_buffer *const ab,
> > audit_log_format(ab, "%s", get_blocker(type));
> > }
> >
> > +static void log_node(struct landlock_hierarchy *const node)
> > +{
> > + struct audit_buffer *ab;
> > +
> > + if (WARN_ON_ONCE(!node))
> > + return;
> > +
> > + /* Ignores already logged domains. */
> > + if (READ_ONCE(node->log_status) == LANDLOCK_LOG_RECORDED)
> > + return;
> > +
> > + ab = audit_log_start(audit_context(), GFP_ATOMIC,
> > + AUDIT_LANDLOCK_DOMAIN);
>
> You use __GFP_NOWARN in the other calls to audit_log_start(), did you
> mean to use it here as well?
Good catch, this is at least inconsistent.
My initial though was that we should use __GFP_NOWARN for access
requests that can occure frequently, and common_lsm_audit() already uses
this flag, and the fact that audit events might not be logged at all for
other reasons (e.g. burst of entries). But __GFP_NOWARN is not use for
all audit_log_start() calls.
For domain allocations and deallocations, this should rarely happen
during the life of a process, and it looked reasonable to warn about an
issue at this point.
Are there guidance about __GFP_NOWARN for audit or other subsystems?
>
> > + if (!ab)
> > + return;
> > +
> > + WARN_ON_ONCE(node->id == 0);
> > + audit_log_format(
> > + ab,
> > + "domain=%llx status=allocated mode=enforcing pid=%d uid=%u exe=",
> > + node->id, pid_nr(node->details->pid),
> > + from_kuid(&init_user_ns, node->details->cred->uid));
> > + audit_log_untrustedstring(ab, node->details->exe_path);
> > + audit_log_format(ab, " comm=");
> > + audit_log_untrustedstring(ab, node->details->comm);
> > + audit_log_end(ab);
> > +
> > + /*
> > + * There may be race condition leading to logging of the same domain
> > + * several times but that is OK.
> > + */
> > + WRITE_ONCE(node->log_status, LANDLOCK_LOG_RECORDED);
> > +}
> > +
> > static struct landlock_hierarchy *
> > get_hierarchy(const struct landlock_ruleset *const domain, const size_t layer)
> > {
> > @@ -106,16 +142,24 @@ void landlock_log_denial(const struct landlock_cred_security *const subject,
> > if (!is_valid_request(request))
> > return;
> >
> > - if (!unlikely(audit_context() && audit_enabled))
> > - return;
> > -
> > youngest_layer = request->layer_plus_one - 1;
> > youngest_denied = get_hierarchy(subject->domain, youngest_layer);
> >
> > + /*
> > + * Consistently keeps track of the number of denied access requests
> > + * even if audit is currently disabled, if audit rules currently
> > + * exclude this record type, or if landlock_restrict_self(2)'s flags
> > + * quiet logs.
> > + */
> > + atomic64_inc(&youngest_denied->num_denials);
> > +
> > /* Ignores denials after an execution. */
> > if (!(subject->domain_exec & (1 << youngest_layer)))
> > return;
> >
> > + if (!unlikely(audit_context() && audit_enabled))
> > + return;
> > +
>
> Not a big deal either way, but it seems like the check above should
> probably be in patch 09/24.
It is in 09/24 but it's moved in this patch because we need to count
denials consistently, see a few lines above.
>
> > ab = audit_log_start(audit_context(), GFP_ATOMIC | __GFP_NOWARN,
> > AUDIT_LANDLOCK_ACCESS);
> > if (!ab)
> > @@ -125,6 +169,46 @@ void landlock_log_denial(const struct landlock_cred_security *const subject,
> > log_blockers(ab, request->type);
> > audit_log_lsm_data(ab, &request->audit);
> > audit_log_end(ab);
> > +
> > + /* Logs this domain if it is the first time. */
> > + log_node(youngest_denied);
> > +}
> > +
> > +/**
> > + * landlock_log_drop_domain - Create an audit record when a domain is deleted
> > + *
> > + * @domain: The domain being deleted.
> > + *
> > + * Only domains which previously appeared in the audit logs are logged again.
> > + * This is useful to know when a domain will never show again in the audit log.
> > + *
> > + * This record is not directly tied to a syscall entry.
> > + *
> > + * Called by the cred_free() hook, in an uninterruptible context.
> > + */
> > +void landlock_log_drop_domain(const struct landlock_ruleset *const domain)
> > +{
> > + struct audit_buffer *ab;
> > +
> > + if (WARN_ON_ONCE(!domain->hierarchy))
> > + return;
> > +
> > + if (!unlikely(audit_enabled))
> > + return;
>
> I'm guessing you probably also want to check the audit context given
> that you are doing it elsewhere?
The context is NULL when this function is called, I guess because it is
not directly triggered by a syscall but deferred to the domain
deallocation.
>
> > + /* Ignores domains that were not logged. */
> > + if (READ_ONCE(domain->hierarchy->log_status) != LANDLOCK_LOG_RECORDED)
> > + return;
> > +
> > + ab = audit_log_start(audit_context(), GFP_ATOMIC,
> > + AUDIT_LANDLOCK_DOMAIN);
> > + if (!ab)
> > + return;
> > +
> > + audit_log_format(ab, "domain=%llx status=deallocated denials=%llu",
> > + domain->hierarchy->id,
> > + atomic64_read(&domain->hierarchy->num_denials));
> > + audit_log_end(ab);
> > }
> >
> > #ifdef CONFIG_SECURITY_LANDLOCK_KUNIT_TEST
>
> --
> paul-moore.com
>
Powered by blists - more mailing lists