lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160603071656.GA7466@gmail.com>
Date:	Fri, 3 Jun 2016 09:16:56 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Wanpeng Li <kernellwp@...il.com>
Cc:	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	Wanpeng Li <wanpeng.li@...mail.com>,
	"Peter Zijlstra (Intel)" <peterz@...radead.org>,
	Rik van Riel <riel@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Paolo Bonzini <pbonzini@...hat.com>, Radim <rkrcmar@...hat.com>
Subject: Re: [PATCH v3] sched/cputime: add steal time support to full
 dynticks CPU time accounting


* Wanpeng Li <kernellwp@...il.com> wrote:

> From: Wanpeng Li <wanpeng.li@...mail.com>
> 
> This patch adds steal guest time support to full dynticks CPU 
> time accounting. After 'commit ff9a9b4c4334 ("sched, time: Switch 
> VIRT_CPU_ACCOUNTING_GEN to jiffy granularity")', time is jiffy 
> based sampling even if it's still listened to ring boundaries, so 
> steal_account_process_tick() is reused to account how much 'ticks' 
> are steal time after the last accumulation. 

WTF? This changelog has 4 grammar errors and it sails through review just like 
that?

 1) What does 'time is jiffy based sampling' mean?
 2) what does 'even if it's still listened to ring boundaries' mean?
 3) "how muck 'ticks'"?
 4) "are steal time"?

So I fixed this to be at least parseable:

  This patch adds guest steal-time support to full dynticks CPU
  time accounting. After the following commit:

    ff9a9b4c4334 ("sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity")

  ... time sampling became jiffy based, even if it's still listened
  to ring boundaries, so steal_account_process_tick() is reused
  to account how many 'ticks' are stolen-time, after the last accumulation.

Although I'm still wondering what this key phrase means:

  even if it's still listened to ring boundaries,

Could someone please explain what this means? (Beyond the 5th grammar error this 
portion has, which I'll fix once it actually makes sense to me...)

Furthermore, the real problem that made me go back and tear the changelog apart is 
that the code flow itself is incredibly ugly and fragile as hell:

>  	write_seqcount_begin(&tsk->vtime_seqcount);
>  	tsk->vtime_snap_whence = VTIME_SYS;
>  	if (vtime_delta(tsk)) {
> +		cputime_t steal_time;
> +		unsigned long delta_st = steal_account_process_tick();
>  		delta_cpu = get_vtime_delta(tsk);
> +		steal_time = jiffies_to_cputime(delta_st);
> +
> +		if (steal_time >= delta_cpu) {
> +			write_seqcount_end(&tsk->vtime_seqcount);
> +			return;
> +		}
> +		delta_cpu -= steal_time;
>  		account_user_time(tsk, delta_cpu, cputime_to_scaled(delta_cpu));
>  	}
>  	write_seqcount_end(&tsk->vtime_seqcount);
> }

Yeah, a return in the middle of a locking critical section, really??

Also, how about basic style details like leaving an extra newline after local 
variable definition sections, like every other scheduler function does?

Also, what's this thing about calling a time unit variable 'delta_cpu'? When I 
reviewed this one of my first reactions was: "Why are we comparing time to CPU 
ID??".

Plus as an added bonus a 'delta_st' variable name to count ticks, which variable 
is not just badly named but single-use. WTF?

Something like this looks much better and shorter:

void vtime_account_user(struct task_struct *tsk)
{
        cputime_t delta_time, steal_time;

        write_seqcount_begin(&tsk->vtime_seqcount);
        tsk->vtime_snap_whence = VTIME_SYS;
        if (vtime_delta(tsk)) {
                delta_time = get_vtime_delta(tsk);
                steal_time = jiffies_to_cputime(steal_account_process_tick());

                if (steal_time < delta_time) {
                        delta_time -= steal_time;
                        account_user_time(tsk, delta_time, cputime_to_scaled(delta_time));
                }
        }
        write_seqcount_end(&tsk->vtime_seqcount);
}

See the consistent, obvious naming of the variables and the clear code flow?

(Totally untested, etc.)

But I'm very annoyed that this was in v3 and still has so many trivial problems: 
incredibly bad and confusing spelling, totally bad, fragile, apparently write-only 
code - and I counted like three acks from 3 other people who should really know 
better ...

This is core scheduler code!

Thanks,

	Ingo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ