Message-ID: <cb703dc8-03a3-a877-2e5f-72a349e0f2d7@redhat.com>
Date: Fri, 26 Jul 2019 09:46:46 -0400
From: Prarit Bhargava <prarit@...hat.com>
To: Sasha Levin <sashal@...nel.org>, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Cc: Barret Rhoden <brho@...gle.com>, David Arcari <darcari@...hat.com>,
Jessica Yu <jeyu@...nel.org>,
Heiko Carstens <heiko.carstens@...ibm.com>
Subject: Re: [PATCH AUTOSEL 5.2 13/85] kernel/module.c: Only return -EEXIST
for modules that have finished loading
On 7/26/19 9:38 AM, Sasha Levin wrote:
> From: Prarit Bhargava <prarit@...hat.com>
>
> [ Upstream commit 6e6de3dee51a439f76eb73c22ae2ffd2c9384712 ]
>
> Microsoft HyperV disables the X86_FEATURE_SMCA bit on AMD systems, and
> Linux guests boot with repeated errors:
>
Hey Sasha, I'd prefer to leave this out of stable branches for now. Linus is a
bit nervous about it and I'd like to see more soak time before the patch is
backported to stable.
https://lkml.org/lkml/2019/7/18/680
FWIW we've been running this in RHEL for some time now without any issues.
P.
> amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
> amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
> amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
> amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
> amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
> amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
>
> The warnings occur because the module code erroneously returns -EEXIST
> for modules that have failed to load and are in the process of being
> removed from the module list.
>
> module amd64_edac_mod has a dependency on module edac_mce_amd. Using
> modules.dep, systemd will load edac_mce_amd for every request of
> amd64_edac_mod. When the edac_mce_amd module loads, the module has
> state MODULE_STATE_UNFORMED, and once the module load fails the state
> becomes MODULE_STATE_GOING. Another request for the edac_mce_amd module
> executes and add_unformed_module() will erroneously return -EEXIST even
> though the previous instance of edac_mce_amd has MODULE_STATE_GOING.
> Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which
> fails because of unknown symbols from edac_mce_amd.
>
> add_unformed_module() must wait to return for any case other than
> MODULE_STATE_LIVE to prevent a race between multiple loads of
> dependent modules.
>
> Signed-off-by: Prarit Bhargava <prarit@...hat.com>
> Signed-off-by: Barret Rhoden <brho@...gle.com>
> Cc: David Arcari <darcari@...hat.com>
> Cc: Jessica Yu <jeyu@...nel.org>
> Cc: Heiko Carstens <heiko.carstens@...ibm.com>
> Signed-off-by: Jessica Yu <jeyu@...nel.org>
> Signed-off-by: Sasha Levin <sashal@...nel.org>
> ---
> kernel/module.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/module.c b/kernel/module.c
> index 80c7c09584cf..8431c3d47c97 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3385,8 +3385,7 @@ static bool finished_loading(const char *name)
>  	sched_annotate_sleep();
>  	mutex_lock(&module_mutex);
>  	mod = find_module_all(name, strlen(name), true);
> -	ret = !mod || mod->state == MODULE_STATE_LIVE
> -		|| mod->state == MODULE_STATE_GOING;
> +	ret = !mod || mod->state == MODULE_STATE_LIVE;
>  	mutex_unlock(&module_mutex);
>
>  	return ret;
> @@ -3576,8 +3575,7 @@ static int add_unformed_module(struct module *mod)
>  	mutex_lock(&module_mutex);
>  	old = find_module_all(mod->name, strlen(mod->name), true);
>  	if (old != NULL) {
> -		if (old->state == MODULE_STATE_COMING
> -		    || old->state == MODULE_STATE_UNFORMED) {
> +		if (old->state != MODULE_STATE_LIVE) {
>  			/* Wait in case it fails to load. */
>  			mutex_unlock(&module_mutex);
>  			err = wait_event_interruptible(module_wq,
>
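
For readers following the diff above, here is a minimal user-space sketch of
the two predicates before and after the change. It is not kernel code: the
struct is a trimmed stand-in for struct module, the enum only mirrors the
state names used in the patch, and the *_old/*_new helper names are
hypothetical, introduced purely for comparison. Locking and the wait queue
are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's module states; only the names mirror the patch. */
enum module_state {
	MODULE_STATE_LIVE,
	MODULE_STATE_COMING,
	MODULE_STATE_GOING,
	MODULE_STATE_UNFORMED,
};

/* Trimmed stand-in for struct module (assumption: state is all we need). */
struct module {
	enum module_state state;
};

/* Pre-patch check: a GOING module counts as "finished loading". */
static bool finished_loading_old(const struct module *mod)
{
	return !mod || mod->state == MODULE_STATE_LIVE
		|| mod->state == MODULE_STATE_GOING;
}

/* Post-patch check: only a missing or LIVE module counts as finished. */
static bool finished_loading_new(const struct module *mod)
{
	return !mod || mod->state == MODULE_STATE_LIVE;
}

/* Pre-patch: a second loader waits only for COMING or UNFORMED modules. */
static bool must_wait_old(const struct module *old)
{
	return old->state == MODULE_STATE_COMING
		|| old->state == MODULE_STATE_UNFORMED;
}

/* Post-patch: a second loader waits for anything that is not LIVE. */
static bool must_wait_new(const struct module *old)
{
	return old->state != MODULE_STATE_LIVE;
}
```

With a module in MODULE_STATE_GOING, must_wait_old() is false, so the second
load request skips the wait and gets -EEXIST, which matches the failure
described in the commit message. must_wait_new() is true for the same input,
so the second request waits until the failed instance is gone.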