lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140320114735.GD10182@pek-khao-d1.corp.ad.wrs.com>
Date:	Thu, 20 Mar 2014 19:47:35 +0800
From:	Kevin Hao <haokexin@...il.com>
To:	Scott Wood <scottwood@...escale.com>
Cc:	Chenhui Zhao <chenhui.zhao@...escale.com>,
	linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
	leoli@...escale.com, Jason.Jin@...escale.com
Subject: Re: [PATCH 9/9] powerpc/pm: support deep sleep feature on T1040

On Tue, Mar 18, 2014 at 06:18:54PM -0500, Scott Wood wrote:
> >  The sequence "write, readback, sync" will guarantee this according to the manual. 
> 
> If you're referring to the section you quoted above, the manual does not
> say that.  It only talks about when accesses "to memory regions affected
> by the configuration register write" can be safely made.

  To guarantee that the results of any sequence of writes to configuration
  registers are in effect, the final configuration register write should be
  immediately followed by a read of the same register, and that should be
  followed by a SYNC instruction. Then accesses can safely be made to memory
  regions affected by the configuration register write.

According to the above description in t4240 RM manual (2.6.1 Accessing CCSR
Memory from the Local Processor), that the writing to configuration register
takes effect is a prerequisite for the memory access to the affected regions.

> 
> > If we just want to
> > order the delay loop after the load, the following sequence should be enough:
> > 	store to CCSR_DDR_SDRAM_CFG_2
> > 	load from CCSR_DDR_SDRAM_CFG_2
> > 	isync or sync
> > 	delay loop
> > 
> > Why do we need the 'date dependency' here? According to the e6500 manual, the
> > instructions can execute out of order, but complete in order. So I am really
> > wondering why we need to order the load and the following delay loop if there
> > is no intention to order the storage access? 
> 
> The data (not date) dependency means that the twi will not execute until
> after the load instruction returns data (thus, after the device has
> responded to the load).  The twi, being a flow control instruction
> rather than a storage instruction, should be fully ordered by isync.
> 
> From the isync description in the ISA: "Except as described in the
> preceding sentence, the isync instruction may complete before storage
> accesses associated with instructions preceding the
> isync instruction have been performed."
> 
> I don't know if that really applies to loads (as opposed to stores), and
> it probably doesn't apply to guarded loads, but I feel safer leaving the
> twi in.
> 
> As for sync instead of isync, I see nothing to indicate that it would be
> adequate (though it may be that the only reason there needed to be a
> delay loop in the first place was the lack of a readback/sync before
> doing other I/O, in which case this is moot).

OK, so the intention of 'twi, isync' following the load is not to order the
following storage access, but order the following delay loop instructions,
right? But according to the e6500 manual, the instructions complete in order.
The following is the definition of 'complete':
  Complete—An instruction is eligible to complete after it finishes executing
  and makes its results available for subsequent instructions. Instructions
  must complete in order from the bottom two entries of the
  completion queue (CQ). The completion unit coordinates how instructions (which
  may have executed out of order) affect architected registers to ensure the
  appearance of serial execution. This guarantees that the completed instruction
  and all previous instructions can cause no exceptions. An instruction completes
  when it is retired, that is, deleted from the CQ.

So the following delay loop instructions should never complete before the
complete of the load instruction. After the complete of load instruction,
the data should already been updated to the architecture register. So we can
guarantee that the load instruction get the data (the device has responded to
the load) before the complete of the following delay loop instructions, why do
we need the additional 'twi, isync'?  Then for a case like this (which don't
need order the following storage access), I think the following sequence should
be enough:
	write to configuration register
	read back
	delay loop

> I mean that I do not understand why it says, "the load on which the
> Branch depends is performed before any loads caused by instructions
> following the isync" rather than "the Load on which the Branch depends
> is performed before any instructions following the isync".

When we talk about the 'performed' here, we should only mean the effect of
storage access. The following is the definition of 'performed':
  performed
  A load or instruction fetch by a processor or mech-
  anism (P1) is performed with respect to any pro-
  cessor or mechanism (P2) when the value to be
  returned by the load or instruction fetch can no
  longer be changed by a store by P2. A store by P1
  is performed with respect to P2 when a load by P2
  from the location accessed by the store will return
  the value stored (or a value stored subsequently).
  An instruction cache block invalidation by P1 is
  performed with respect to P2 when an instruction
  fetch by P2 will not be satisfied from the copy of
  the block that existed in its instruction cache when
  the instruction causing the invalidation was exe-
  cuted, and similarly for a data cache block invalida-
  tion.
  The preceding definitions apply regardless of
  whether P1 and P2 are the same entity.

> 
> > > > So if we want to order all the storage access as well
> > > > as execution synchronization, we should choose sync here.
> > > 
> > > Do we need execution synchronization or context synchronization?
> > 
> > There is no context-altering instruction here, so I think an execution
> > synchronizing instruction should be enough here.
> 
> Is the ISA ever explicit about what constitutes "context"?

The following is the definition of context-altering instruction:
  An instruction that alters the context in which data
  addresses or instruction addresses are interpreted, or
  in which instructions are executed or data accesses are
  performed, is called a context-altering instruction.

So the context should be:
  - in which data addresses or instruction addresses are interpreted
  - in which instructions are executed
  - in which data accesses are performed

Thanks,
Kevin

Content of type "application/pgp-signature" skipped

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ