lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-id: <2f2083e145b6.456de740@fnal.gov>
Date:	Wed, 29 Nov 2006 20:02:08 -0600
From:	Wenji Wu <wenji@...l.gov>
To:	Andrew Morton <akpm@...l.org>
Cc:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

> That yield() will need to be removed - yield()'s behaviour is truly 
> awfulif the system is otherwise busy.  What is it there for?

Please read the uploaded paper, which has detailed description.

thanks,

wenji

----- Original Message -----
From: Andrew Morton <akpm@...l.org>
Date: Wednesday, November 29, 2006 7:08 pm
Subject: Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

> On Wed, 29 Nov 2006 16:53:11 -0800 (PST)
> David Miller <davem@...emloft.net> wrote:
> 
> > 
> > Please, it is very difficult to review your work the way you have
> > submitted this patch as a set of 4 patches.  These patches have not
> > been split up "logically", but rather they have been split up "per
> > file" with the same exact changelog message in each patch posting.
> > This is very clumsy, and impossible to review, and wastes a lot of
> > mailing list bandwith.
> > 
> > We have an excellent file, called 
> Documentation/SubmittingPatches, in
> > the kernel source tree, which explains exactly how to do this
> > correctly.
> > 
> > By splitting your patch into 4 patches, one for each file touched,
> > it is impossible to review your patch as a logical whole.
> > 
> > Please also provide your patch inline so people can just hit reply
> > in their mail reader client to quote your patch and comment on it.
> > This is impossible with the attachments you've used.
> > 
> 
> Here you go - joined up, cleaned up, ported to mainline and test-
> compiled.
> That yield() will need to be removed - yield()'s behaviour is truly 
> awfulif the system is otherwise busy.  What is it there for?
> 
> 
> 
> From: Wenji Wu <wenji@...l.gov>
> 
> For Linux TCP, when the network applcaiton make system call to move 
> data from
> socket's receive buffer to user space by calling tcp_recvmsg().  
> The socket
> will be locked.  During this period, all the incoming packet for 
> the TCP
> socket will go to the backlog queue without being TCP processed
> 
> Since Linux 2.6 can be inerrupted mid-task, if the network application
> expires, and moved to the expired array with the socket locked, all 
> thepackets within the backlog queue will not be TCP processed till 
> the network
> applicaton resume its execution.  If the system is heavily loaded, 
> TCP can
> easily RTO in the Sender Side.
> 
> 
> 
> include/linux/sched.h |    2 ++
> kernel/fork.c         |    3 +++
> kernel/sched.c        |   24 ++++++++++++++++++------
> net/ipv4/tcp.c        |    9 +++++++++
> 4 files changed, 32 insertions(+), 6 deletions(-)
> 
> diff -puN net/ipv4/tcp.c~tcp-speedup net/ipv4/tcp.c
> --- a/net/ipv4/tcp.c~tcp-speedup
> +++ a/net/ipv4/tcp.c
> @@ -1109,6 +1109,8 @@ int tcp_recvmsg(struct kiocb *iocb, stru
> 	struct task_struct *user_recv = NULL;
> 	int copied_early = 0;
> 
> +	current->backlog_flag = 1;
> +
> 	lock_sock(sk);
> 
> 	TCP_CHECK_TIMER(sk);
> @@ -1468,6 +1470,13 @@ skip_copy:
> 
> 	TCP_CHECK_TIMER(sk);
> 	release_sock(sk);
> +
> +	current->backlog_flag = 0;
> +	if (current->extrarun_flag == 1){
> +        	current->extrarun_flag = 0;
> +        	yield();
> +	}
> +
> 	return copied;
> 
> out:
> diff -puN include/linux/sched.h~tcp-speedup include/linux/sched.h
> --- a/include/linux/sched.h~tcp-speedup
> +++ a/include/linux/sched.h
> @@ -1023,6 +1023,8 @@ struct task_struct {
> #ifdef	CONFIG_TASK_DELAY_ACCT
> 	struct task_delay_info *delays;
> #endif
> +	int backlog_flag; 	/* packets wait in tcp backlog queue flag */
> +	int extrarun_flag;	/* extra run flag for TCP performance */
> };
> 
> static inline pid_t process_group(struct task_struct *tsk)
> diff -puN kernel/sched.c~tcp-speedup kernel/sched.c
> --- a/kernel/sched.c~tcp-speedup
> +++ a/kernel/sched.c
> @@ -3099,12 +3099,24 @@ void scheduler_tick(void)
> 
>         	if (!rq->expired_timestamp)
>                 	rq->expired_timestamp = jiffies;
> -        	if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
> -                	enqueue_task(p, rq->expired);
> -                	if (p->static_prio < rq->best_expired_prio)
> -                        	rq->best_expired_prio = p->static_prio;
> -        	} else
> -                	enqueue_task(p, rq->active);
> +        	if (p->backlog_flag == 0) {
> +                	if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
> +                        	enqueue_task(p, rq->expired);
> +                        	if (p->static_prio < rq->best_expired_prio)
> +                                	rq->best_expired_prio = p-
> >static_prio;+                	} else
> +                        	enqueue_task(p, rq->active);
> +        	} else {
> +                	if (expired_starving(rq)) {
> +                        	enqueue_task(p,rq->expired);
> +                        	if (p->static_prio < rq->best_expired_prio)
> +                                	rq->best_expired_prio = p-
> >static_prio;+                	} else {
> +                        	if (!TASK_INTERACTIVE(p))
> +                                	p->extrarun_flag = 1;
> +                        	enqueue_task(p,rq->active);
> +                	}
> +        	}
> 	} else {
>         	/*
>                  * Prevent a too long timeslice allowing a task to 
> monopolizediff -puN kernel/fork.c~tcp-speedup kernel/fork.c
> --- a/kernel/fork.c~tcp-speedup
> +++ a/kernel/fork.c
> @@ -1032,6 +1032,9 @@ static struct task_struct *copy_process(
> 	clear_tsk_thread_flag(p, TIF_SIGPENDING);
> 	init_sigpending(&p->pending);
> 
> +	p->backlog_flag = 0;
> +	p->extrarun_flag = 0;
> +
> 	p->utime = cputime_zero;
> 	p->stime = cputime_zero;
>  	p->sched_time = 0;
> _
> 
> 

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ