lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sat, 27 Apr 2019 17:15:06 +0000 From: Moshe Shemesh <moshe@...lanox.com> To: Jakub Kicinski <jakub.kicinski@...ronome.com> CC: Saeed Mahameed <saeedm@...lanox.com>, "davem@...emloft.net" <davem@...emloft.net>, Jiri Pirko <jiri@...lanox.com>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org> Subject: Re: [PATCH net-next] devlink: Execute devlink health recover as a work On 4/26/2019 7:19 PM, Jakub Kicinski wrote: > On Fri, Apr 26, 2019 at 6:04 AM Moshe Shemesh <moshe@...lanox.com> wrote: >> On 4/26/2019 5:37 AM, Jakub Kicinski wrote: >>> On Fri, 26 Apr 2019 01:42:34 +0000, Saeed Mahameed wrote: >>>>>> @@ -4813,7 +4831,11 @@ static int >>>>>> devlink_nl_cmd_health_reporter_recover_doit(struct sk_buff *skb, >>>>>> if (!reporter) >>>>>> return -EINVAL; >>>>>> >>>>>> - return devlink_health_reporter_recover(reporter, NULL); >>>>>> + if (!reporter->ops->recover) >>>>>> + return -EOPNOTSUPP; >>>>>> + >>>>>> + queue_work(devlink->reporters_wq, &reporter->recover_work); >>>>>> + return 0; >>>>>> } >>>>> >>>>> So the recover user space request will no longer return the status, >>>>> and >>>>> it will not actually wait for the recover to happen. Leaving user >>>>> pondering - did the recover run and fail, or did it nor get run >>>>> yet... >>>>> >>>> >>>> wait_for_completion_interruptible_timeout is missing from the design ? >>> >>> Perhaps, but I think its better to avoid the async execution of >>> the recover all together. Perhaps its better to refcount the >>> reporters on the call to recover_doit? Or some such.. :) >>> >> >> I tried using refcount instead of devlink lock here. But once I get to >> reporter destroy I wait for the refcount and not sure if I should >> release the reporter after some timeout or have endless wait for >> refcount. Both options seem not good. > > Well you should "endlessly" wait. Why would the refcount not drop, > you have to remove it from the list first, so no new operations can > start, right? > In principle there is no difference between waiting for refcount to > drop, flushing the work, or waiting for the devlink lock if reporter > holds it? > Makes sense, I will rewrite this patch, thanks.
Powered by blists - more mailing lists