Buildresult: powerpc-fixes/ppc64_defconfig+NO_RADIX/powerpc-gcc4.9 built on Aug 19 2020, 08:14
Status: Failed
Date/Time: Aug 19 2020, 08:14
Duration: 0:03:53.218546
Builder: blade4b
Revision: powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death (801980f6497946048709b9b09771a1729551d705)
Target: powerpc-fixes/ppc64_defconfig+NO_RADIX/powerpc-gcc4.9
Branch: powerpc-fixes
Compiler: powerpc-gcc4.9 (powerpc64-linux-gcc (GCC) 4.9.4 / GNU ld (GNU Binutils) 2.29.1.20170915)
Config: ppc64_defconfig+NO_RADIX
Possible errors
arch/powerpc/platforms/pseries/lpar.o:(.toc+0x0): undefined reference to `mmu_pid_bits'
make[1]: *** [Makefile:1167: vmlinux] Error 1
make: *** [Makefile:185: __sub-make] Error 2
No warnings found in log.
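A summary like the one under "Possible errors" can be approximated by filtering the full log for linker and make failure lines. The following is a minimal sketch only: the patterns and the `possible_errors` helper are assumptions made for illustration, not the filter kisskb actually uses.

```python
import re

# Hypothetical patterns for this sketch; the real kisskb filter is not shown on this page.
ERROR_PATTERNS = [
    re.compile(r"undefined reference to"),                  # linker errors
    re.compile(r"^make(\[\d+\])?: \*\*\* .* Error \d+$"),   # make failure lines
]


def possible_errors(log_text):
    """Return the log lines that match any error pattern, in their original order."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in ERROR_PATTERNS)
    ]


# A few lines in the shape of the full log on this page (paths shortened for the sample).
sample_log = "\n".join([
    "# make -s -j 24 ARCH=powerpc",
    "arch/powerpc/platforms/pseries/lpar.o:(.toc+0x0): undefined reference to `mmu_pid_bits'",
    "make[1]: *** [/kisskb/src/Makefile:1167: vmlinux] Error 1",
    "make: *** [Makefile:185: __sub-make] Error 2",
    "# Build took: 0:03:53.218546",
])

for line in possible_errors(sample_log):
    print(line)
```

Command echo lines (prefixed with `# `) are left out because the make pattern is anchored at the start of the line.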
Full Log
# git rev-parse -q --verify 801980f6497946048709b9b09771a1729551d705^{commit}
801980f6497946048709b9b09771a1729551d705
already have revision, skipping fetch
# git checkout -q -f -B kisskb 801980f6497946048709b9b09771a1729551d705
# git clean -qxdf
# < git log -1
# commit 801980f6497946048709b9b09771a1729551d705
# Author: Michael Roth <mdroth@linux.vnet.ibm.com>
# Date: Tue Aug 11 11:15:44 2020 -0500
#
# powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death
#
# For a power9 KVM guest with XIVE enabled, running a test loop
# where we hotplug 384 vcpus and then unplug them, the following traces
# can be seen (generally within a few loops) either from the unplugged
# vcpu:
#
# cpu 65 (hwid 65) Ready to die...
# Querying DEAD? cpu 66 (66) shows 2
# list_del corruption. next->prev should be c00a000002470208, but was c00a000002470048
# ------------[ cut here ]------------
# kernel BUG at lib/list_debug.c:56!
# Oops: Exception in kernel mode, sig: 5 [#1]
# LE SMP NR_CPUS=2048 NUMA pSeries
# Modules linked in: fuse nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 ...
# CPU: 66 PID: 0 Comm: swapper/66 Kdump: loaded Not tainted 4.18.0-221.el8.ppc64le #1
# NIP: c0000000007ab50c LR: c0000000007ab508 CTR: 00000000000003ac
# REGS: c0000009e5a17840 TRAP: 0700 Not tainted (4.18.0-221.el8.ppc64le)
# MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28000842 XER: 20040000
# ...
# NIP __list_del_entry_valid+0xac/0x100
# LR __list_del_entry_valid+0xa8/0x100
# Call Trace:
# __list_del_entry_valid+0xa8/0x100 (unreliable)
# free_pcppages_bulk+0x1f8/0x940
# free_unref_page+0xd0/0x100
# xive_spapr_cleanup_queue+0x148/0x1b0
# xive_teardown_cpu+0x1bc/0x240
# pseries_mach_cpu_die+0x78/0x2f0
# cpu_die+0x48/0x70
# arch_cpu_idle_dead+0x20/0x40
# do_idle+0x2f4/0x4c0
# cpu_startup_entry+0x38/0x40
# start_secondary+0x7bc/0x8f0
# start_secondary_prolog+0x10/0x14
#
# or on the worker thread handling the unplug:
#
# pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
# Querying DEAD? cpu 314 (314) shows 2
# BUG: Bad page state in process kworker/u768:3 pfn:95de1
# cpu 314 (hwid 314) Ready to die...
# page:c00a000002577840 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
# flags: 0x5ffffc00000000()
# raw: 005ffffc00000000 5deadbeef0000100 5deadbeef0000200 0000000000000000
# raw: 0000000000000000 0000000000000000 00000000ffffff7f 0000000000000000
# page dumped because: nonzero mapcount
# Modules linked in: kvm xt_CHECKSUM ipt_MASQUERADE xt_conntrack ...
# CPU: 0 PID: 548 Comm: kworker/u768:3 Kdump: loaded Not tainted 4.18.0-224.el8.bz1856588.ppc64le #1
# Workqueue: pseries hotplug workque pseries_hp_work_fn
# Call Trace:
# dump_stack+0xb0/0xf4 (unreliable)
# bad_page+0x12c/0x1b0
# free_pcppages_bulk+0x5bc/0x940
# page_alloc_cpu_dead+0x118/0x120
# cpuhp_invoke_callback.constprop.5+0xb8/0x760
# _cpu_down+0x188/0x340
# cpu_down+0x5c/0xa0
# cpu_subsys_offline+0x24/0x40
# device_offline+0xf0/0x130
# dlpar_offline_cpu+0x1c4/0x2a0
# dlpar_cpu_remove+0xb8/0x190
# dlpar_cpu_remove_by_index+0x12c/0x150
# dlpar_cpu+0x94/0x800
# pseries_hp_work_fn+0x128/0x1e0
# process_one_work+0x304/0x5d0
# worker_thread+0xcc/0x7a0
# kthread+0x1ac/0x1c0
# ret_from_kernel_thread+0x5c/0x80
#
# The latter trace is due to the following sequence:
#
# page_alloc_cpu_dead
# drain_pages
# drain_pages_zone
# free_pcppages_bulk
#
# where drain_pages() in this case is called under the assumption that
# the unplugged cpu is no longer executing. To ensure that is the case,
# and early call is made to __cpu_die()->pseries_cpu_die(), which runs a
# loop that waits for the cpu to reach a halted state by polling its
# status via query-cpu-stopped-state RTAS calls. It only polls for 25
# iterations before giving up, however, and in the trace above this
# results in the following being printed only .1 seconds after the
# hotplug worker thread begins processing the unplug request:
#
# pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
# Querying DEAD? cpu 314 (314) shows 2
#
# At that point the worker thread assumes the unplugged CPU is in some
# unknown/dead state and procedes with the cleanup, causing the race
# with the XIVE cleanup code executed by the unplugged CPU.
#
# Fix this by waiting indefinitely, but also making an effort to avoid
# spurious lockup messages by allowing for rescheduling after polling
# the CPU status and printing a warning if we wait for longer than 120s.
#
# Fixes: eac1e731b59ee ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
# Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
# Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
# Tested-by: Greg Kurz <groug@kaod.org>
# Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
# Reviewed-by: Greg Kurz <groug@kaod.org>
# [mpe: Trim oopses in change log slightly for readability]
# Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
# Link: https://lore.kernel.org/r/20200811161544.10513-1-mdroth@linux.vnet.ibm.com
# < /opt/cross/kisskb/korg/gcc-4.9.4-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc --version
# < /opt/cross/kisskb/korg/gcc-4.9.4-nolibc/powerpc64-linux/bin/powerpc64-linux-ld --version
# < git log --format=%s --max-count=1 801980f6497946048709b9b09771a1729551d705
# < make -s -j 24 ARCH=powerpc O=/kisskb/build/powerpc-fixes_ppc64_defconfig+NO_RADIX_powerpc-gcc4.9 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-4.9.4-nolibc/powerpc64-linux/bin/powerpc64-linux- ppc64_defconfig
# Added to kconfig CONFIG_PPC_RADIX_MMU=n
# < make -s -j 24 ARCH=powerpc O=/kisskb/build/powerpc-fixes_ppc64_defconfig+NO_RADIX_powerpc-gcc4.9 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-4.9.4-nolibc/powerpc64-linux/bin/powerpc64-linux- help
# make -s -j 24 ARCH=powerpc O=/kisskb/build/powerpc-fixes_ppc64_defconfig+NO_RADIX_powerpc-gcc4.9 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-4.9.4-nolibc/powerpc64-linux/bin/powerpc64-linux- olddefconfig
# make -s -j 24 ARCH=powerpc O=/kisskb/build/powerpc-fixes_ppc64_defconfig+NO_RADIX_powerpc-gcc4.9 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-4.9.4-nolibc/powerpc64-linux/bin/powerpc64-linux-
arch/powerpc/platforms/pseries/lpar.o:(.toc+0x0): undefined reference to `mmu_pid_bits'
make[1]: *** [/kisskb/src/Makefile:1167: vmlinux] Error 1
make: *** [Makefile:185: __sub-make] Error 2
Command 'make -s -j 24 ARCH=powerpc O=/kisskb/build/powerpc-fixes_ppc64_defconfig+NO_RADIX_powerpc-gcc4.9 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-4.9.4-nolibc/powerpc64-linux/bin/powerpc64-linux- ' returned non-zero exit status 2
# rm -rf /kisskb/build/powerpc-fixes_ppc64_defconfig+NO_RADIX_powerpc-gcc4.9
# Build took: 0:03:53.218546
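The commit message quoted in the log describes replacing a poll that gave up after 25 iterations with one that waits indefinitely, yields between polls, and warns once the wait exceeds 120s. The idea can be sketched in Python as below. This is an illustration only, not the kernel code: `wait_for_cpu_death`, the status constants, and the injectable `clock`, `yield_cpu`, and `warn` parameters are all hypothetical names introduced for the sketch.

```python
import time

STOPPED = 0      # hypothetical "CPU has halted" status for this sketch
NOT_STOPPED = 2  # assumed meaning of the value printed in the log ("shows 2")


def wait_for_cpu_death(query_state, warn_after=120.0, clock=time.monotonic,
                       yield_cpu=lambda: time.sleep(0), warn=print):
    """Poll query_state() until it returns STOPPED, with no iteration limit.

    Returns the number of polls made. A warning is emitted each time another
    warn_after seconds pass without the CPU reaching the stopped state.
    """
    last_warning = clock()
    polls = 0
    while True:
        polls += 1
        if query_state() == STOPPED:
            return polls
        now = clock()
        if now - last_warning >= warn_after:
            warn(f"still waiting for CPU to stop after {polls} polls")
            last_warning = now
        yield_cpu()  # stand-in for letting the scheduler run other work


# A CPU that needs more than 25 polls to stop is still waited for.
states = iter([NOT_STOPPED] * 40 + [STOPPED])
print(wait_for_cpu_death(lambda: next(states)))
```

With a bounded 25-iteration loop, the simulated CPU above would have been treated as dead while it was still running, which is the race the commit message describes.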
© Michael Ellerman 2006-2018.