Date:   Mon, 22 Mar 2021 22:06:45 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Kai Huang <kai.huang@...el.com>, kvm@...r.kernel.org,
        x86@...nel.org, linux-sgx@...r.kernel.org,
        linux-kernel@...r.kernel.org, jarkko@...nel.org, luto@...nel.org,
        dave.hansen@...el.com, rick.p.edgecombe@...el.com,
        haitao.huang@...el.com, pbonzini@...hat.com, tglx@...utronix.de,
        mingo@...hat.com, hpa@...or.com
Subject: Re: [PATCH v3 03/25] x86/sgx: Wipe out EREMOVE from
 sgx_free_epc_page()

On Mon, Mar 22, 2021 at 12:37:02PM -0700, Sean Christopherson wrote:
> Yes.  Note, it's still true if you strike out the "too", KVM support is completely
> orthogonal to this code.  The purpose of this patch is to separate out the EREMOVE
> path used for host enclaves (/dev/sgx_enclave), because EPC virtualization for
> KVM will have non-buggy scenarios where EREMOVE can fail.  But the virt EPC code
> is designed to handle that gracefully.

"Gracefully" as in it won't leak EPC pages, which would require a host
reboot? So only host enclaves can leak pages?

> Hmm.  I don't think it warrants BUG.  At worst, leaking EPC pages is fatal only
> to SGX.

Fatal how? If it keeps leaking, at some point it won't have any EPC
pages left?

Btw, I have probably seen this and forgotten again so pls remind me:
is the number of pages available for SGX use static and limited by,
I believe, the BIOS, or can an EPC page leak cause a system memory
shortage?

> If the underlying bug caused other fallout, e.g. didn't release a
> lock, then obviously that could be fatal to the kernel. But I don't
> think there's ever a case where SGX being unusable would prevent the
> kernel from functioning.

This kinda answers my question above but still...

> Probably something in between.  Odds are good SGX will eventually become
> unusable, e.g. either kernel SGX support is completely hosed, or it will soon
> leak the majority of EPC pages.  Something like this?
> 
>   "EREMOVE returned %d (0x%x), kernel bug likely.  EPC page leaked, SGX may become unusable.  Reboot recommended to continue using SGX."

So all this handwaving I'm doing is to provoke a proper response from
you guys as to how an EPC page leak is supposed to be handled by the
users of the technology:

1. Issue a warning message and forget about it, eventual reboot

2. Really scary message to make users reboot sooner

3. Detect when host enclaves are run while guest enclaves are running
and issue a warning then.

4. Fall on knees and pray to not get sued by customers because their
enclaves are not working anymore.

....

Btw, 4. needs to be considered properly so that people can cover asses.

Oh and whatever we end up deciding, we should document that in
Documentation/... somewhere and point users to it from that warning
message, so that a longer treatise can explain the whole deal properly.
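
To make options 1 and 2 above a bit more concrete, here is a minimal
userspace sketch, not kernel code. It assumes the EPC is a fixed-size
pool and that a failed EREMOVE simply means the page never goes back to
it. EPC_PAGES, struct epc_pool, fake_eremove(), epc_free_page() and
epc_usable_pages() are all hypothetical names made up for illustration;
only the message text comes from the proposal quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define EPC_PAGES 4	/* hypothetical pool size, fixed for the whole run */

struct epc_pool {
	int free_pages;		/* pages that can be handed out again */
	int leaked_pages;	/* pages lost because EREMOVE failed */
};

/* Hypothetical stand-in for EREMOVE: 0 on success, error code otherwise. */
static int fake_eremove(int inject_error)
{
	return inject_error;
}

/* Returns true if the page went back to the pool, false if it leaked. */
static bool epc_free_page(struct epc_pool *pool, int inject_error)
{
	int ret;

	/* There must be at least one allocated page left to free. */
	assert(pool->free_pages + pool->leaked_pages < EPC_PAGES);

	ret = fake_eremove(inject_error);
	if (ret) {
		/* The page is neither reusable nor returned to the pool. */
		pool->leaked_pages++;
		fprintf(stderr,
			"EREMOVE returned %d (0x%x), kernel bug likely.  "
			"EPC page leaked, SGX may become unusable.  "
			"Reboot recommended to continue using SGX.\n",
			ret, ret);
		return false;
	}

	pool->free_pages++;
	return true;
}

/* Pages SGX can still use at all; this only shrinks as leaks pile up. */
static int epc_usable_pages(const struct epc_pool *pool)
{
	return EPC_PAGES - pool->leaked_pages;
}
```

In this model every failed free permanently shrinks what
epc_usable_pages() reports, and nothing outside the pool is touched,
which is the "fatal only to SGX" case. Whether the real EPC behaves
like such a static pool is exactly the question I'm asking above.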

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
