lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <878scz89tl.fsf@nanos.tec.linutronix.de>
Date:   Thu, 24 Sep 2020 17:37:42 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     qianjun.kernel@...il.com, peterz@...radead.org, will@...nel.org,
        luto@...nel.org, linux-kernel@...r.kernel.org
Cc:     laoar.shao@...il.com, qais.yousef@....com, urezki@...il.com,
        jun qian <qianjun.kernel@...il.com>
Subject: Re: [PATCH V7 4/4] softirq: Allow early break the softirq processing loop

On Tue, Sep 15 2020 at 19:56, qianjun kernel wrote:
>  
> +#define SOFTIRQ_PENDING_MASK ((1UL << NR_SOFTIRQS) - 1)
>
> +/*
> + * The pending_next_bit is recorded for the next processing order when
> + * the loop is broken. This per cpu variable is to solve the following
> + * scenarios:
> + * Assume bit 0 and 1 are pending when the processing starts. Now it
> + * breaks out after bit 0 has been handled and stores back bit 1 as
> + * pending. Before ksoftirqd runs bit 0 gets raised again. ksoftirqd
> + * runs and handles bit 0, which takes more than the timeout. As a
> + * result the bit 0 processing can starve all other softirqs.
> + *
> + * so we need the pending_next_bit to record the next process order.
> + */
> +DEFINE_PER_CPU(u32, pending_next_bit);

static if at all.

> +
>  asmlinkage __visible void __softirq_entry __do_softirq(void)
>  {
>  	u64 start = sched_clock();
> @@ -261,8 +277,11 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
>  	unsigned int max_restart = MAX_SOFTIRQ_RESTART;
>  	struct softirq_action *h;
>  	unsigned long pending;
> +	unsigned long pending_left, pending_again;
>  	unsigned int vec_nr;
>  	bool in_hardirq;
> +	int next_bit;
> +	unsigned long flags;
>  
>  	/*
>  	 * Mask out PF_MEMALLOC as the current task context is borrowed for the
> @@ -283,25 +302,66 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
>  
>  	local_irq_enable();
>  
> -	for_each_set_bit(vec_nr, &pending, NR_SOFTIRQS) {
> -		int prev_count;
> -
> -		__clear_bit(vec_nr, &pending);
> -
> -		h = softirq_vec + vec_nr;
> -
> -		prev_count = preempt_count();
> -
> -		kstat_incr_softirqs_this_cpu(vec_nr);
> +	/*
> +	 * pending_left means that the left bits unhandled when the loop is
> +	 * broken without finishing the vectors. These bits will be handled
> +	 * first in the next time. pending_again means that the new bits is
> +	 * generated in the other time. These bits should be handled after
> +	 * the pending_left bits have been handled.
> +	 *
> +	 * For example
> +	 * If the pending bits is 1101010110, and the loop is broken after
> +	 * the bit4 is handled. Then, the pending_next_bit will be 5, and
> +	 * the pending_left is 1101000000, the pending_again is 000000110.
> +	 */

If you need such a comment to explain the meaning of your variables then
you did something fundamentaly wrong.

> +	next_bit = __this_cpu_read(pending_next_bit);
> +	pending_left = pending &
> +		(SOFTIRQ_PENDING_MASK << next_bit);
> +	pending_again = pending &
> +		(SOFTIRQ_PENDING_MASK >> (NR_SOFTIRQS - next_bit));
> +
> +	while (pending_left || pending_again) {
> +		if  (pending_left) {
> +			pending = pending_left;
> +			pending_left = 0;
> +		} else if (pending_again) {
> +			pending = pending_again;
> +			pending_again = 0;
> +		} else
> +			break;

Aside of lacking brackets how is that 'else' patch ever going to be
reached?

But TBH that whole patch is a completely unreviewable maze. 

This can be done without all this pending, pending_left, pending_again,
pending_next_bit, next_bit convolution. It's inconsistent anyway:

__do_softirq()

        pending = 0x25;
        next = 0;

        for (...)
            break after bit 0

        ==> pending == 0x24

        ==> next = 2

now on the next invocation

       pending = 0x35;
       next = 2;

       So the processing order is 2, 4, 5, 0

and there is nothing you can do about that with that approach.

But the whole point is to ensure that the not yet processed bits are
processed first.

Find attached an updated series based on the original one from Peter
with the authorship preserved, intact SOB chains and proper changelogs.

The last one is new and addressing the starvation issue in a readable
way.

All of this is again completely untested.

Thanks,

        tglx


View attachment "peterz-softirq-fix-loop.patch" of type "text/x-diff" (1931 bytes)

View attachment "peterz-softirq-timo.patch" of type "text/x-diff" (2111 bytes)

View attachment "peterz-softirq-needs-break.patch" of type "text/x-diff" (3044 bytes)

View attachment "peterz-softirq-break-more.patch" of type "text/x-diff" (2486 bytes)

View attachment "tglx-softirq-prevent-starvation-of-higher-softirq-vectors.patch" of type "text/x-diff" (3998 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ