lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 11 Nov 2022 10:55:38 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Nicolai Stange <nstange@...e.de>
Cc:     Josh Poimboeuf <jpoimboe@...nel.org>,
        Marcos Paulo de Souza <mpdesouza@...e.com>,
        linux-kernel@...r.kernel.org, live-patching@...r.kernel.org,
        jpoimboe@...hat.com, joe.lawrence@...hat.com
Subject: Re: [PATCH v2 4/4] livepatch/shadow: Add garbage collection of
 shadow variables

On Fri 2022-11-11 10:20:44, Nicolai Stange wrote:
> Hi all,
> 
> Petr Mladek <pmladek@...e.com> writes:
> 
> > On Thu 2022-11-03 18:03:27, Josh Poimboeuf wrote:
> >> On Wed, Oct 26, 2022 at 04:41:22PM -0300, Marcos Paulo de Souza wrote:
> >> > The life of shadow variables is not completely trivial to maintain.
> >> > They might be used by more livepatches and more livepatched objects.
> >> > They should stay as long as there is any user.
> >> > 
> >> > In practice, it requires to implement reference counting in callbacks
> >> > of all users. They should register all the user and remove the shadow
> >> > variables only when there is no user left.
> >> > 
> >> > This patch hides the reference counting into the klp_shadow API.
> >> > The counter is connected with the shadow variable @id. It requires
> >> > an API to take and release the reference. The release function also
> >> > calls the related dtor() when defined.
> >> > 
> >> > An easy solution would be to add some get_ref()/put_ref() API.
> >> > But it would need to get called from pre()/post_un() callbacks.
> >> > It might be easy to forget a callback and make it wrong.
> >> > 
> >> > A more safe approach is to associate the klp_shadow_type with
> >> > klp_objects that use the shadow variables. The livepatch core
> >> > code might then handle the reference counters on background.
> >> > 
> >> > The shadow variable type might then be added into a new @shadow_types
> >> > member of struct klp_object. They will get then automatically registered
> >> > and unregistered when the object is being livepatched. The registration
> >> > increments the reference count. Unregistration decreases the reference
> >> > count. All shadow variables of the given type are freed when the reference
> >> > count reaches zero.
> >> > 
> >> > All klp_shadow_alloc/get/free functions also checks whether the requested
> >> > type is registered. It will help to catch missing registration and might
> >> > also help to catch eventual races.
> >> 
> >> Is there a reason the shadow variable lifetime is tied to klp_object
> >> rather than klp_patch?
> >
> > Good question!
> >
> > My thinking was that shadow variables might be tight to variables
> > from a particular livepatched module. It would make sense to free them
> > when the only user (livepatched module) is removed.
> >
> > The question is if it is really important.
> 
> I don't think so. For shadow variables attached to "real" data objects,
> I would expect those objects to be gone by the time the patched ko gets
> removed. For dummy shadows attached to NULL (see below), used for
> handing over shared state between atomic-replace livepatch modules, the
> memory footprint is negligible, I'd say.
> 
> OTOH, callbacks are per klp_object already, so for API consistency, it
> might make sense to stick to this pattern for the shadows as well. But I
> don't have a real opinion on this.

Good point!

> >From my experience, there are basically two relevant usage patterns of
> shadow variables.
> 1.) To hand over global state from one sublivepatch to its pendant in
>     the to-be-applied livepatch module. Example: a new global mutex or
>     alike.
> 2.) The "regular" intended usage, attaching schadow variables to real
>     (data) objects.
> 
> To manage lifetime for 1.), we usually implement some refcount scheme,
> managed from the livepatches' module_init()/_exit(): the next livepatch
> would subscribe to the shared state before the previous one got a chance
> to release it. This works in practice, but the code related to it is
> tedious to write and quite verbose.
> 
> The second usage pattern is much more difficult to implement correctly
> in light of possible livepatch downgrades to a subset of
> sublivepatches. Usually a sublivepatch making use of a shadow variable
> attached to real objects would livepatch the associated object's
> destruction code to free up the associated shadow, if any. If the next
> livepatch to be applied happened to not contain this sublivepatch in
> question as well, the destruction code would effectively become
> unpatched, and any existing shadows leaked. Depending on the object type
> in question, this memory leakage might or might not be an actual
> problem, but it isn't nice either way.
> 
> Often, there's a more subtle issue with the latter usecase though: the
> shadow continues to exist, but becomes unmaintained once the transitions
> has started. If said sublivepatch happens to become reactivated later
> on, it would potentially find stale shadows, and these could even get
> wrongly associated with a completely different object which happened to
> get allocated at the same memory address. Depending on the shadow type,
> this might or might not be Ok. New per-object locks or a "TLB flush
> needed" boolean would probably be Ok, but some kind of refcount would
> certainly not. There's not much which could be done from the pre-unpatch
> callbacks, because these aren't getting invoked for atomic-replace
> downgrades.
> 
> We had quite some discussion internally on how to best approach these
> limitations, the outcome being (IIRC), that a more versatile callbacks
> support would perhaps quickly become too complex or error-prone to use
> correctly. So Petr proposed this garbage collection/refcounting
> mechanism posted here, which would solve the memory leakage issue as a
> first step (and would make shadow variable usage less verbose IMO).
> 
> The consistency problem would still not be fully solved: consider a
> refcount-like shadow, where during the transition some acquirer had been
> unpatched already, while a releaser has not yet. But my hope is that we
> can later build on this work here and somehow resolve this as well.
> 
> > Nicolai, your have the practical experience. Should the reference
> > counting be per-livepatched object or per-livepatch, please?
> 
> See above, I think it won't matter much from a functionality POV.

I would personally keep it tied together with the livepatched object
just to be on the safe side.

Is this acceptable for kPatch users in the end?

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ