Commit 7031ebb

kvaneesh authored and gregkh committed
powerpc/book3s64/radix: Fix boot failure with large amount of guest memory
[ Upstream commit 103a854 ]

If the hypervisor doesn't support hugepages, the kernel ends up allocating a large number of page table pages. The early page table allocation was wrongly setting the max memblock limit to ppc64_rma_size with radix translation, which resulted in boot failure as shown below.

Kernel panic - not syncing:
early_alloc_pgtable: Failed to allocate 16777216 bytes align=0x1000000 nid=-1 from=0x0000000000000000 max_addr=0xffffffffffffffff
 CPU: 0 PID: 0 Comm: swapper Not tainted 5.8.0-24.9-default+ #2
 Call Trace:
 [c0000000016f3d00] [c0000000007c6470] dump_stack+0xc4/0x114 (unreliable)
 [c0000000016f3d40] [c00000000014c78c] panic+0x164/0x418
 [c0000000016f3dd0] [c000000000098890] early_alloc_pgtable+0xe0/0xec
 [c0000000016f3e60] [c0000000010a5440] radix__early_init_mmu+0x360/0x4b4
 [c0000000016f3ef0] [c000000001099bac] early_init_mmu+0x1c/0x3c
 [c0000000016f3f10] [c00000000109a320] early_setup+0x134/0x170

This was because the kernel was checking for the radix feature before we enable the feature via mmu_features. This resulted in the kernel using hash restrictions on radix.

Rework the early init code such that the kernel boots with memblock restrictions as imposed by hash. At that point, the kernel still hasn't finalized the translation the kernel will end up using.

We have three different ways of detecting radix.

1. dt_cpu_ftrs_scan -> used only in case of PowerNV
2. ibm,pa-features -> used when we don't use dt_cpu_ftrs_scan
3. CAS -> where we negotiate with the hypervisor about the supported translation.

We look at 1 or 2 early in the boot and after that, we look at the CAS vector to finalize the translation the kernel will use. We also support a kernel command line option (disable_radix) to switch to hash.

Update the memblock limit after mmu_early_init_devtree() if the kernel is going to use radix translation. This forces some of the memblock allocations we do before mmu_early_init_devtree() to be within the RMA limit.

Fixes: 2bfd65e ("powerpc/mm/radix: Add radix callbacks for early init routines")
Reported-by: Shirisha Ganta <shiganta@in.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200828100852.426575-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
1 parent 6d035d5 commit 7031ebb
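
The detection order described in the commit message can be sketched as a small standalone model. This is an illustration only, not kernel code: every name below (`model_boot_inputs`, `model_finalize_translation`, and the field names) is hypothetical, and the model assumes that the later steps can only withdraw radix, never enable it after the early scan reported hash.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the translation choice; not kernel symbols. */
enum model_translation { MODEL_HASH, MODEL_RADIX };

struct model_boot_inputs {
	bool early_scan_radix;  /* result of step 1 or step 2 (device tree scan) */
	bool is_guest;          /* CAS (step 3) is consulted only for a guest */
	bool cas_grants_radix;  /* outcome of the negotiation with the hypervisor */
	bool disable_radix;     /* models the disable_radix command line option */
};

/*
 * Decide the final translation. Assumption: CAS and disable_radix can
 * only downgrade an early radix result to hash.
 */
enum model_translation
model_finalize_translation(const struct model_boot_inputs *in)
{
	bool radix;

	assert(in);
	radix = in->early_scan_radix;

	if (in->disable_radix)
		radix = false;
	if (radix && in->is_guest && !in->cas_grants_radix)
		radix = false;

	return radix ? MODEL_RADIX : MODEL_HASH;
}
```

The point of the model is that the answer is not known until the last input has been read, which is why any memory limit chosen before that moment has to be safe for both translations.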

3 files changed, +14 -22 lines changed


arch/powerpc/include/asm/book3s/64/mmu.h

+5 -5
@@ -228,14 +228,14 @@ static inline void early_init_mmu_secondary(void)
 
 extern void hash__setup_initial_memory_limit(phys_addr_t first_memblock_base,
 					 phys_addr_t first_memblock_size);
-extern void radix__setup_initial_memory_limit(phys_addr_t first_memblock_base,
-					 phys_addr_t first_memblock_size);
 static inline void setup_initial_memory_limit(phys_addr_t first_memblock_base,
 					      phys_addr_t first_memblock_size)
 {
-	if (early_radix_enabled())
-		return radix__setup_initial_memory_limit(first_memblock_base,
-						   first_memblock_size);
+	/*
+	 * Hash has more strict restrictions. At this point we don't
+	 * know which translations we will pick. Hence go with hash
+	 * restrictions.
+	 */
 	return hash__setup_initial_memory_limit(first_memblock_base,
 					   first_memblock_size);
 }
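
To see why dropping the radix branch here loses nothing, consider a minimal model of the old and new dispatchers. It is a sketch under stated assumptions, not the kernel implementation: `model_state`, `model_old_initial_limit`, and `model_new_initial_limit` are hypothetical names, and the hash limit is simplified to "the size of the first memory block".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel state; not the real symbols. */
struct model_state {
	bool radix_flag_set;  /* models the MMU feature bit, still unset this early */
	uint64_t rma_size;    /* models ppc64_rma_size */
};

/* Old shape: branch on a feature flag that has not been populated yet. */
void model_old_initial_limit(struct model_state *s, uint64_t base, uint64_t size)
{
	assert(base == 0);  /* models BUG_ON(first_memblock_base != 0) */
	s->rma_size = s->radix_flag_set ? UINT64_MAX : size;
}

/* New shape: always apply the stricter, hash-style limit. */
void model_new_initial_limit(struct model_state *s, uint64_t base, uint64_t size)
{
	assert(base == 0);
	s->rma_size = size;  /* simplified hash restriction (assumption) */
}
```

With `radix_flag_set` still false, both functions leave the same `rma_size`, so the old radix branch was effectively dead at this point in boot. What the old code lacked was a later step that lifts the limit once the translation is known.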

arch/powerpc/mm/book3s64/radix_pgtable.c

-15
@@ -654,21 +654,6 @@ void radix__mmu_cleanup_all(void)
 	}
 }
 
-void radix__setup_initial_memory_limit(phys_addr_t first_memblock_base,
-				phys_addr_t first_memblock_size)
-{
-	/*
-	 * We don't currently support the first MEMBLOCK not mapping 0
-	 * physical on those processors
-	 */
-	BUG_ON(first_memblock_base != 0);
-
-	/*
-	 * Radix mode is not limited by RMA / VRMA addressing.
-	 */
-	ppc64_rma_size = ULONG_MAX;
-}
-
 #ifdef CONFIG_MEMORY_HOTPLUG
 static void free_pte_table(pte_t *pte_start, pmd_t *pmd)
 {

arch/powerpc/mm/init_64.c

+9 -2
@@ -431,9 +431,16 @@ void __init mmu_early_init_devtree(void)
 	if (!(mfmsr() & MSR_HV))
 		early_check_vec5();
 
-	if (early_radix_enabled())
+	if (early_radix_enabled()) {
 		radix__early_init_devtree();
-	else
+		/*
+		 * We have finalized the translation we are going to use by now.
+		 * Radix mode is not limited by RMA / VRMA addressing.
+		 * Hence don't limit memblock allocations.
+		 */
+		ppc64_rma_size = ULONG_MAX;
+		memblock_set_current_limit(MEMBLOCK_ALLOC_ANYWHERE);
+	} else
 		hash__early_init_devtree();
 }
 #endif /* CONFIG_PPC_BOOK3S_64 */
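
The effect of lifting the limit in this hunk can be reproduced with a toy bump allocator. This is a rough sketch, not the memblock implementation: all `model_` names are hypothetical, alignment is ignored, and the 1 GiB limit and 64 GiB memory size used in the note below are illustrative values that do not come from the commit. Only the 16 MiB chunk size matches the failed allocation in the panic message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ALLOC_ANYWHERE UINT64_MAX  /* models MEMBLOCK_ALLOC_ANYWHERE */

/* Hypothetical toy allocator state; memory is assumed to start at 0. */
struct model_memblock {
	uint64_t total;  /* bytes of physical memory */
	uint64_t limit;  /* highest address early allocations may reach */
	uint64_t next;   /* bump pointer: next free address */
};

/* Allocate size bytes below min(limit, total); return false on failure. */
bool model_alloc(struct model_memblock *mb, uint64_t size)
{
	uint64_t ceiling = mb->limit < mb->total ? mb->limit : mb->total;

	assert(size > 0);
	if (size > ceiling || mb->next > ceiling - size)
		return false;
	mb->next += size;
	return true;
}

/* Try to allocate count chunks; return how many succeeded. */
unsigned model_alloc_many(struct model_memblock *mb, uint64_t chunk, unsigned count)
{
	unsigned done = 0;

	while (done < count && model_alloc(mb, chunk))
		done++;
	return done;
}

/* Models the new step: lift the limit once radix has been chosen. */
void model_finalize(struct model_memblock *mb, bool radix)
{
	if (radix)
		mb->limit = MODEL_ALLOC_ANYWHERE;
}
```

Starting from `total = 64 GiB` and `limit = 1 GiB`, a request for 100 chunks of 16 MiB stops after 64 chunks, which corresponds to the point where the real kernel panicked. After `model_finalize(&mb, true)`, the same request for 100 more chunks succeeds in full because only the total memory size bounds the allocation.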
