lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 4 Sep 2016 03:29:39 +0200
From:   Andreas Mohr <andi@...as.de>
To:     Tejun Heo <tj@...nel.org>, Qiao Zhou <qiaozhou@...micro.com>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [Question] about patch: don't use [delayed_]work_pending()

Hi,

[no properly binding reference via In-Reply-To: available thus manually re-creating, sorry]

https://lkml.org/lkml/2016/9/2/335

I came up with the following somewhat random thoughts:


*** this treatment is exclusive to a single use case, i.e.
not covering things consistently (API-wide)

> +++ b/kernel/power/qos.c
> @@ -482,7 +482,16 @@ void pm_qos_update_request(struct pm_qos_request
> *req,
>  		return;
>  	}
>  
> -	cancel_delayed_work_sync(&req->work);
> +	/*
> +	 * This function may be called very early during boot, for
> example,
> +	 * from of_clk_init(), where irq needs to stay disabled.
> +	 * cancel_delayed_work_sync() assumes that irq is enabled on
> +	 * invocation and re-enables it on return.  Avoid calling it
> until
> +	 * workqueue is initialized.
> +	 */
> +	if (keventd_up())
> +		cancel_delayed_work_sync(&req->work);
> +
>  	__pm_qos_update_request(req, new_value);
>  }
>  EXPORT_SYMBOL_GPL(pm_qos_update_request);


Reason: any other [early-boot] invoker of cancel_delayed_work_sync()
would hit the same issue,
without any fix then available locally each.

This may or may not be intentional.
Just wanted to point it out.


The more global question here obviously is
how this annoying
irq handling side effect
ought to be handled cleanly / symmetrically
(and ideally with minimal effect to hotpaths,
which probably is the usual inherent outcome of
an elegant/clean solution),
especially given that it is early handling vs. normal.
However unfortunately I do not have any final ideas here ATM.



*** try_to_grab_pending() layer violation (asymmetry)
It *internally* does a local_irq_save(*flags)
and then passes these flags to the *external* user,
where the user
after having called try_to_grab_pending()
then has to invoke these same
*internal layer implementation details*
(local_irq_restore(flags))
That's just fugly asymmetric handling.
If an API layer happens to have
some certain
*internal*
"context"/"resource" requirements
[either by having an explicit API function
to establish these requirements,
or by "dirtily"/"implicitly" passing these out
via an out parameter],
then this API layer
*always* should *itself* provide
correspondingly required
resource handling functions
*at this very layer implementation*
Since the API currently is named try_to_grab_pending(),
this suggests that the signature of the
corresponding resource cleanup function
could be something like
    work_grab_context_release(context_reference);

Not properly providing such a public API
will result in e.g. the following major disadvantages:
- a flood of changes necessary
  *at all API users*
  in case use protocol of these
  implementation details
  (local_irq_restore())
  happen to get changed
- all users of this API layer
  having to #include the specific header
  of the inner implementation details layer
  (local_irq_restore())
  rather than the cleanly plain API layer itself only
  (either
   within the API layer header in case of an inline helper, or
   within the API layer implemention internally only in case of non-inline)
  IOW, by getting this wrong,
  #include handling of this header of the internal implementation layer requirements
  is not properly under the control of the API layer implementer
  any more


Also,
comment
"try_to_grab_pending - steal work item from worklist and disable irq"
seems to have wrong order
since I would think that
"disable irq"
is the *pre-requisite* for being able to
reliably (atomically) steal work items from work list.
[[hmm, and let's hope that NMIs are (known to be?) not involved]]

HTH,

Andreas Mohr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ