Date:   Fri, 31 Aug 2018 12:09:47 -0400
From:   Steven Sistare <steven.sistare@...cle.com>
To:     subhra mazumdar <subhra.mazumdar@...cle.com>,
        linux-kernel@...r.kernel.org
Cc:     peterz@...radead.org, dhaval.giani@...cle.com
Subject: Re: [RFC PATCH 1/2] pipe: introduce busy wait for pipe

On 8/30/2018 4:24 PM, subhra mazumdar wrote:
> Introduce pipe_ll_usec field for pipes that indicates the amount of micro
> seconds a thread should spin if pipe is empty or full before sleeping. This
> is similar to network sockets. Workloads like hackbench in pipe mode
> benefits significantly from this by avoiding the sleep and wakeup overhead.
> Other similar usecases can benefit. pipe_wait_flag is used to signal any
> thread busy waiting. pipe_busy_loop_timeout checks if spin time is over.
> 
> Signed-off-by: subhra mazumdar <subhra.mazumdar@...cle.com>
> ---
>  include/linux/pipe_fs_i.h | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
> index e7497c9..fdfd2a2 100644
> --- a/include/linux/pipe_fs_i.h
> +++ b/include/linux/pipe_fs_i.h
> @@ -1,6 +1,8 @@
>  #ifndef _LINUX_PIPE_FS_I_H
>  #define _LINUX_PIPE_FS_I_H
>  
> +#include <linux/sched/clock.h>
> +
>  #define PIPE_DEF_BUFFERS	16
>  
>  #define PIPE_BUF_FLAG_LRU	0x01	/* page is on the LRU */
> @@ -54,6 +56,8 @@ struct pipe_inode_info {
>  	unsigned int waiting_writers;
>  	unsigned int r_counter;
>  	unsigned int w_counter;
> +	unsigned int pipe_ll_usec;
> +	unsigned long pipe_wait_flag;
>  	struct page *tmp_page;
>  	struct fasync_struct *fasync_readers;
>  	struct fasync_struct *fasync_writers;
> @@ -157,6 +161,21 @@ static inline int pipe_buf_steal(struct pipe_inode_info *pipe,
>  	return buf->ops->steal(pipe, buf);
>  }
>  
> +static inline unsigned long pipe_busy_loop_current_time(void)
> +{
> +	return (unsigned long)(local_clock() >> 10);

Why ">> 10"? local_clock() returns nanoseconds, and you compare the result to
the tunable pipe_ll_usec, which is in microseconds.  Should be ">> 3".  Better
yet, redefine the tunable to have nanosecond units.  I suspect you will need
very large values of the tunable to show similar results.
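
To make the units concrete, here is a small userspace sketch (not kernel
code; the helper names are made up for illustration) of the candidate
conversions applied to a nanosecond reading:

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Right shift by 10 divides by 1024. */
static inline uint64_t ns_shr10(uint64_t ns)
{
	return ns >> 10;
}

/* Right shift by 3 divides by 8. */
static inline uint64_t ns_shr3(uint64_t ns)
{
	return ns >> 3;
}

/* Exact nanosecond-to-microsecond conversion, for comparison. */
static inline uint64_t ns_to_usec_exact(uint64_t ns)
{
	return ns / 1000;
}
```

Calling all three on the same reading, for example 1000000 ns (1 ms), and
printing the results side by side shows how each one scales against the
microsecond tunable.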

Also, since this type of optimization consumes extra CPU cycles that could
be used by other tasks, please show the overall CPU utilization before and
after the optimization, such as by using "time hackbench ...".

- Steve

> +}
> +
> +static inline bool pipe_busy_loop_timeout(struct pipe_inode_info *pipe,
> +					  unsigned long start_time)
> +{
> +	unsigned long bp_usec = READ_ONCE(pipe->pipe_ll_usec);
> +	unsigned long end_time = start_time + bp_usec;
> +	unsigned long now = pipe_busy_loop_current_time();
> +
> +	return time_after(now, end_time);
> +}
> +
>  /* Differs from PIPE_BUF in that PIPE_SIZE is the length of the actual
>     memory allocation, whereas PIPE_BUF makes atomicity guarantees.  */
>  #define PIPE_SIZE		PAGE_SIZE
> 
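
For reference, the timeout check in the quoted helpers can be exercised
outside the kernel with something like the sketch below.  It is only an
approximation under stated assumptions: timespec_get() stands in for
local_clock(), the signed-difference comparison imitates time_after(), and
all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/*
 * Stand-in for pipe_busy_loop_current_time(): read a clock in nanoseconds
 * and shift right by 10, as the patch does.  Assumption: timespec_get()
 * replaces the kernel's local_clock(), which is not available in userspace.
 */
static uint64_t busy_loop_current_time(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return ((uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec) >> 10;
}

/*
 * Stand-in for pipe_busy_loop_timeout(), with "now" passed in so the check
 * can be tested deterministically.  The signed difference keeps the
 * comparison correct when the counter wraps, in the spirit of time_after().
 */
static bool busy_loop_timed_out(uint64_t start_time, uint64_t budget,
				uint64_t now)
{
	uint64_t end_time = start_time + budget;

	return (int64_t)(end_time - now) < 0;
}

/* Spin until the budget is exhausted; returns the number of polls made. */
static unsigned long busy_loop_spin(uint64_t budget)
{
	uint64_t start_time = busy_loop_current_time();
	unsigned long polls = 0;

	while (!busy_loop_timed_out(start_time, budget,
				    busy_loop_current_time()))
		polls++;

	return polls;
}
```

A real caller would test the pipe state inside the loop and fall back to
sleeping once busy_loop_timed_out() returns true; the sketch only counts
iterations.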
