linux-kernel - Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YR/m/8PCmCTbogey@zn.tnic>
Date:   Fri, 20 Aug 2021 19:31:43 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     Tony Luck <tony.luck@...el.com>
Cc:     Jue Wang <juew@...gle.com>, Ding Hui <dinghui@...gfor.com.cn>,
        naoya.horiguchi@....com, osalvador@...e.de,
        Youquan Song <youquan.song@...el.com>, huangcun@...gfor.com.cn,
        x86@...nel.org, linux-edac@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user
 recovery

On Tue, Aug 17, 2021 at 05:29:40PM -0700, Tony Luck wrote:
> @@ -1287,17 +1291,34 @@ static void kill_me_maybe(struct callback_head *cb)
>  	}
>  }
>  
> -static void queue_task_work(struct mce *m, int kill_current_task)
> +static void queue_task_work(struct mce *m, char *msg, int kill_current_task)
>  {
> -	current->mce_addr = m->addr;
> -	current->mce_kflags = m->kflags;
> -	current->mce_ripv = !!(m->mcgstatus & MCG_STATUS_RIPV);
> -	current->mce_whole_page = whole_page(m);
> +	int count = ++current->mce_count;
>  
> -	if (kill_current_task)
> -		current->mce_kill_me.func = kill_me_now;
> -	else
> -		current->mce_kill_me.func = kill_me_maybe;
> +	/* First call, save all the details */
> +	if (count == 1) {
> +		current->mce_addr = m->addr;
> +		current->mce_kflags = m->kflags;
> +		current->mce_ripv = !!(m->mcgstatus & MCG_STATUS_RIPV);
> +		current->mce_whole_page = whole_page(m);
> +
> +		if (kill_current_task)
> +			current->mce_kill_me.func = kill_me_now;
> +		else
> +			current->mce_kill_me.func = kill_me_maybe;
> +	}
> +
> +	/* Ten is likley overkill. Don't expect more than two faults before task_work() */

"likely"

> +	if (count > 10)
> +		mce_panic("Too many machine checks while accessing user data", m, msg);

Ok, aren't we too nasty here? Why should we panic the whole box even
with 10 MCEs? It is still user memory...

IOW, why not:

	if (count > 10)
		current->mce_kill_me.func = kill_me_now;

and when we return, that user process dies immediately.

> +	/* Second or later call, make sure page address matches the one from first call */
> +	if (count > 1 && (current->mce_addr >> PAGE_SHIFT) != (m->addr >> PAGE_SHIFT))
> +		mce_panic("Machine checks to different user pages", m, msg);

Same question here.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette