Date:	Mon, 14 Apr 2014 14:01:54 +0100
From:	Will Deacon <will.deacon@....com>
To:	Ding Tianhong <dingtianhong@...wei.com>
Cc:	Catalin Marinas <Catalin.Marinas@....com>,
	Sukie Peng <Sukie.Peng@....com>,
	"huxinwei@...wei.com" <huxinwei@...wei.com>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] arm64: Flush the process's mm context TLB entries when
 switching

Hi Ding,

On Mon, Apr 14, 2014 at 01:03:12PM +0100, Ding Tianhong wrote:
> I hit a problem when migrating a process with the following steps:
> 
> 1) The process was already running on core 0.
> 2) Set the CPU affinity of the process to 0x02 to move it to core 1;
>    it works fine.
> 3) Set the CPU affinity of the process to 0x01 to move it back to core 0;
>    the problem occurs and the process is killed.

[...]
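
The migration steps quoted above can be approximated from user space. This is a minimal sketch, assuming Linux with glibc and at least two online CPUs. The original report does not say how the affinity was changed, so the use of sched_setaffinity() and the helper names (mask_to_cpuset, set_affinity_mask, bounce_between_cores) are assumptions made for illustration.

```c
#define _GNU_SOURCE		/* needed for sched_setaffinity() and CPU_* macros */
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: expand a bitmask such as 0x02 into a cpu_set_t. */
static void mask_to_cpuset(unsigned long mask, cpu_set_t *set)
{
	assert(set != NULL);
	CPU_ZERO(set);
	for (int cpu = 0; cpu < (int)(8 * sizeof(mask)); cpu++)
		if (mask & (1UL << cpu))
			CPU_SET(cpu, set);
}

/*
 * Hypothetical helper: restrict pid (0 means the calling thread) to the
 * CPUs in mask. Returns 0 on success, -1 with errno set on failure.
 */
static int set_affinity_mask(pid_t pid, unsigned long mask)
{
	cpu_set_t set;

	mask_to_cpuset(mask, &set);
	return sched_setaffinity(pid, sizeof(set), &set);
}

/*
 * Hypothetical helper: repeat steps 2 and 3 of the report, moving the
 * task to core 1 (mask 0x02) and then back to core 0 (mask 0x01).
 */
static int bounce_between_cores(pid_t pid, int rounds)
{
	for (int i = 0; i < rounds; i++) {
		if (set_affinity_mask(pid, 0x02UL) != 0)
			return -1;
		if (set_affinity_mask(pid, 0x01UL) != 0)
			return -1;
	}
	return 0;
}
```

Calling bounce_between_cores(pid, n) on the affected process repeats the core 0 -> core 1 -> core 0 round trip n times. On a system with correctly configured TLB maintenance broadcasting this should be harmless.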

> The strange part is that the PC and LR are both 0 and the ESR is
> 0x83000006, which reports an instruction abort from a lower exception
> level: the class used for MMU faults generated by instruction accesses and
> for synchronous external aborts, including synchronous parity errors.
> 
> I tried to fix the problem by invalidating the process's TLB entries when
> switching; this makes the context stale so that a new one is picked, and
> the process then works correctly.
> 
> So I think that in some situations, after a process switch, a modification of
> the TLB entries on the new core does not tell the other cores to invalidate the
> old TLB entries held in the inner shareable caches, and if the process is then
> scheduled onto another core, the old TLB entries may cause MMU faults.

Yes, it sounds like you don't have your TLBs configured correctly. Can you
confirm that your EL3 firmware is configuring TLB broadcasting correctly
please?
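
For reference, the ESR value quoted above (0x83000006) can be split into its architectural fields. This is a minimal sketch based on the ARMv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25, fault status code in ISS[5:0]). The helper names are hypothetical, and only the four most common fault types are named.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * ESR_ELx field layout (ARMv8-A): EC in bits [31:26], IL in bit 25,
 * ISS in bits [24:0]. For aborts, the fault status code is ISS[5:0].
 */
static unsigned int esr_ec(uint32_t esr)  { return (esr >> 26) & 0x3fu; }
static unsigned int esr_il(uint32_t esr)  { return (esr >> 25) & 0x1u; }
static unsigned int esr_fsc(uint32_t esr) { return esr & 0x3fu; }

/* Hypothetical helper: name the fault type for the common FSC encodings. */
static const char *esr_fsc_type(unsigned int fsc)
{
	switch (fsc >> 2) {
	case 0x0: return "address size fault";
	case 0x1: return "translation fault";
	case 0x2: return "access flag fault";
	case 0x3: return "permission fault";
	default:  return "other";
	}
}

/*
 * For the four fault types named above, FSC[1:0] is the translation
 * table level. It has no such meaning for the "other" encodings.
 */
static unsigned int esr_fsc_level(unsigned int fsc) { return fsc & 0x3u; }

/* Hypothetical helper: print the decoded fields of one ESR value. */
static void esr_print(uint32_t esr)
{
	unsigned int fsc = esr_fsc(esr);

	printf("EC=0x%02x IL=%u FSC=0x%02x (%s, level %u)\n",
	       esr_ec(esr), esr_il(esr), fsc,
	       esr_fsc_type(fsc), esr_fsc_level(fsc));
}
```

esr_print(0x83000006) prints "EC=0x20 IL=1 FSC=0x06 (translation fault, level 2)". EC 0x20 is the instruction abort from a lower exception level, and the status code points at a level 2 translation fault, not an external abort or parity error. Bit 24 is also set in the reported value; it is not decoded here, and it may be a software marker added by the kernel, not a hardware field.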

> Signed-off-by: Ding Tianhong <dingtianhong@...wei.com>
> ---
>  arch/arm64/kernel/process.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 6391485..d7d8439 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -283,6 +283,13 @@ static void tls_thread_switch(struct task_struct *next)
>  	: : "r" (tpidr), "r" (tpidrro));
>  }
>  
> +static void tlb_flush_thread(struct task_struct *prev)
> +{
> +	/* Flush the prev task's TLB entries */
> +	if (prev->mm)
> +		flush_tlb_mm(prev->mm);
> +}
> +
>  /*
>   * Thread switching.
>   */
> @@ -296,6 +303,8 @@ struct task_struct *__switch_to(struct task_struct *prev,
>  	hw_breakpoint_thread_switch(next);
>  	contextidr_thread_switch(next);
>  
> +	tlb_flush_thread(prev);

NAK to the patch -- the architecture certainly doesn't require this, and
it's a huge hammer for what is more likely a firmware initialisation issue.

Will
