lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-Id: <9AD43423-2FF3-422D-A5AD-61CAE6339CCC@linux.vnet.ibm.com> Date: Tue, 19 Aug 2014 00:35:42 +0800 From: Xiao Guangrong <xiaoguangrong.eric@...il.com> To: Paolo Bonzini <pbonzini@...hat.com> Cc: gleb@...nel.org, avi.kivity@...il.com, mtosatti@...hat.com, linux-kernel@...r.kernel.org, kvm@...r.kernel.org, stable@...r.kernel.org, David Matlack <dmatlack@...gle.com> Subject: Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number Hi Paolo, Thank you to review the patch! On Aug 18, 2014, at 9:57 PM, Paolo Bonzini <pbonzini@...hat.com> wrote: > Il 14/08/2014 09:01, Xiao Guangrong ha scritto: >> - update_memslots(slots, new, kvm->memslots->generation); >> + /* ensure generation number is always increased. */ >> + slots->generation = old_memslots->generation; >> + update_memslots(slots, new); >> rcu_assign_pointer(kvm->memslots, slots); >> synchronize_srcu_expedited(&kvm->srcu); >> + slots->generation++; > > I don't trust my brain enough to review this patch. Sorry to make you confused. I should expain it more clearly. What this patch tried to fix is: kvm will generate wrong mmio-exit forever if no luck event cleans mmio spte. (eg. if no memory pressure or no context-sync on that spte.) Note, it is hard to do precise sync between kvm_vm_ioctl_set_memory_region and mmio-exit - that means userspace is able to get mmio-exit even if kvm_vm_ioctl_set_memory_region have finished, for example, kvm identifies a mmio access before issuing the ioctl and injects mmio-exit to userspace after ioctl return. So checking if mmio-exit is a real mmio access in userspace is needed anyway. > kvm_current_mmio_generation seems like a very bad (race-prone) API. One > patch I trust myself reviewing would change a bunch of functions in > kvm_main.c to take a memslots struct. This would make it easy to > respect the hard and fast rule of not dereferencing the same pointer > twice. But it would be a tedious change. kvm_set_memory_region is the only place updating memslot and kvm_current_mmio_generation accesses memslot by rcu-dereference, i do not know why other places need to take into account. I think this patch is auditable, page-fault is always called by holding srcu-lock so that a page fault can’t go across synchronize_srcu_expedited. Only these cases can happen: 1) page fault occurs before synchronize_srcu_expedited. In this case, vcpu will generate mmio-exit for the memslot being registered by the ioctl. That’s ok since the ioctl have not finished. 2) page fault occurs after synchronize_srcu_expedited and during increasing generation-number. In this case, userspace may get wrong mmio-exit (that happen if handing page-fault is slower that the ioctl), that’s ok too since userspace need do the check anyway like i said above. 3) page fault occurs after generation-number update that’s definitely correct. :) > Another alternative could be to use the low bit to mark an in-progress > change, and skip the caching if the low bit is set. Similar to a > seqcount (except if read_seqcount_retry fails, we just punt and not > retry anything), you could use it even though the memory barriers > provided by write_seqcount_begin/end are not too useful in this case. I do not know how the bit works, page fault will cache the memslot before the bit set and cache the generation-number after the bit set. Maybe i missed your idea, could you please detail it? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists