Buildresult: powerpc-fixes/ia64-defconfig/ia64-gcc4.9 built on Aug 19 2020, 07:12
kisskb
Status: OK
Date/Time: Aug 19 2020, 07:12
Duration: 0:02:25.126860
Builder: ka3
Revision: powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death (801980f6497946048709b9b09771a1729551d705)
Target: powerpc-fixes/ia64-defconfig/ia64-gcc4.9
Branch: powerpc-fixes
Compiler: ia64-gcc4.9 (ia64-linux-gcc (GCC) 4.9.4 / GNU ld (GNU Binutils) 2.29.1.20170915)
Config: defconfig
Possible warnings (3)
<stdin>:1511:2: warning: #warning syscall clone3 not implemented [-Wcpp]
arch/ia64/include/uapi/asm/cmpxchg.h:57:2: warning: value computed is not used [-Wunused-value]
arch/ia64/include/uapi/asm/cmpxchg.h:57:2: warning: value computed is not used [-Wunused-value]
Full Log
# git rev-parse -q --verify 801980f6497946048709b9b09771a1729551d705^{commit}
801980f6497946048709b9b09771a1729551d705
already have revision, skipping fetch
# git checkout -q -f -B kisskb 801980f6497946048709b9b09771a1729551d705
# git clean -qxdf
# < git log -1
# commit 801980f6497946048709b9b09771a1729551d705
# Author: Michael Roth <mdroth@linux.vnet.ibm.com>
# Date: Tue Aug 11 11:15:44 2020 -0500
#
# powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death
#
# For a power9 KVM guest with XIVE enabled, running a test loop
# where we hotplug 384 vcpus and then unplug them, the following traces
# can be seen (generally within a few loops) either from the unplugged
# vcpu:
#
# cpu 65 (hwid 65) Ready to die...
# Querying DEAD? cpu 66 (66) shows 2
# list_del corruption. next->prev should be c00a000002470208, but was c00a000002470048
# ------------[ cut here ]------------
# kernel BUG at lib/list_debug.c:56!
# Oops: Exception in kernel mode, sig: 5 [#1]
# LE SMP NR_CPUS=2048 NUMA pSeries
# Modules linked in: fuse nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 ...
# CPU: 66 PID: 0 Comm: swapper/66 Kdump: loaded Not tainted 4.18.0-221.el8.ppc64le #1
# NIP: c0000000007ab50c LR: c0000000007ab508 CTR: 00000000000003ac
# REGS: c0000009e5a17840 TRAP: 0700 Not tainted (4.18.0-221.el8.ppc64le)
# MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28000842 XER: 20040000
# ...
# NIP __list_del_entry_valid+0xac/0x100
# LR __list_del_entry_valid+0xa8/0x100
# Call Trace:
# __list_del_entry_valid+0xa8/0x100 (unreliable)
# free_pcppages_bulk+0x1f8/0x940
# free_unref_page+0xd0/0x100
# xive_spapr_cleanup_queue+0x148/0x1b0
# xive_teardown_cpu+0x1bc/0x240
# pseries_mach_cpu_die+0x78/0x2f0
# cpu_die+0x48/0x70
# arch_cpu_idle_dead+0x20/0x40
# do_idle+0x2f4/0x4c0
# cpu_startup_entry+0x38/0x40
# start_secondary+0x7bc/0x8f0
# start_secondary_prolog+0x10/0x14
#
# or on the worker thread handling the unplug:
#
# pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
# Querying DEAD? cpu 314 (314) shows 2
# BUG: Bad page state in process kworker/u768:3 pfn:95de1
# cpu 314 (hwid 314) Ready to die...
# page:c00a000002577840 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
# flags: 0x5ffffc00000000()
# raw: 005ffffc00000000 5deadbeef0000100 5deadbeef0000200 0000000000000000
# raw: 0000000000000000 0000000000000000 00000000ffffff7f 0000000000000000
# page dumped because: nonzero mapcount
# Modules linked in: kvm xt_CHECKSUM ipt_MASQUERADE xt_conntrack ...
# CPU: 0 PID: 548 Comm: kworker/u768:3 Kdump: loaded Not tainted 4.18.0-224.el8.bz1856588.ppc64le #1
# Workqueue: pseries hotplug workque pseries_hp_work_fn
# Call Trace:
# dump_stack+0xb0/0xf4 (unreliable)
# bad_page+0x12c/0x1b0
# free_pcppages_bulk+0x5bc/0x940
# page_alloc_cpu_dead+0x118/0x120
# cpuhp_invoke_callback.constprop.5+0xb8/0x760
# _cpu_down+0x188/0x340
# cpu_down+0x5c/0xa0
# cpu_subsys_offline+0x24/0x40
# device_offline+0xf0/0x130
# dlpar_offline_cpu+0x1c4/0x2a0
# dlpar_cpu_remove+0xb8/0x190
# dlpar_cpu_remove_by_index+0x12c/0x150
# dlpar_cpu+0x94/0x800
# pseries_hp_work_fn+0x128/0x1e0
# process_one_work+0x304/0x5d0
# worker_thread+0xcc/0x7a0
# kthread+0x1ac/0x1c0
# ret_from_kernel_thread+0x5c/0x80
#
# The latter trace is due to the following sequence:
#
# page_alloc_cpu_dead
# drain_pages
# drain_pages_zone
# free_pcppages_bulk
#
# where drain_pages() in this case is called under the assumption that
# the unplugged cpu is no longer executing. To ensure that is the case,
# and early call is made to __cpu_die()->pseries_cpu_die(), which runs a
# loop that waits for the cpu to reach a halted state by polling its
# status via query-cpu-stopped-state RTAS calls. It only polls for 25
# iterations before giving up, however, and in the trace above this
# results in the following being printed only .1 seconds after the
# hotplug worker thread begins processing the unplug request:
#
# pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
# Querying DEAD? cpu 314 (314) shows 2
#
# At that point the worker thread assumes the unplugged CPU is in some
# unknown/dead state and procedes with the cleanup, causing the race
# with the XIVE cleanup code executed by the unplugged CPU.
#
# Fix this by waiting indefinitely, but also making an effort to avoid
# spurious lockup messages by allowing for rescheduling after polling
# the CPU status and printing a warning if we wait for longer than 120s.
#
# Fixes: eac1e731b59ee ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
# Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
# Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
# Tested-by: Greg Kurz <groug@kaod.org>
# Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
# Reviewed-by: Greg Kurz <groug@kaod.org>
# [mpe: Trim oopses in change log slightly for readability]
# Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
# Link: https://lore.kernel.org/r/20200811161544.10513-1-mdroth@linux.vnet.ibm.com
# < /opt/cross/kisskb/korg/gcc-4.9.4-nolibc/ia64-linux/bin/ia64-linux-gcc --version
# < /opt/cross/kisskb/korg/gcc-4.9.4-nolibc/ia64-linux/bin/ia64-linux-ld --version
# < git log --format=%s --max-count=1 801980f6497946048709b9b09771a1729551d705
# < make -s -j 80 ARCH=ia64 O=/kisskb/build/powerpc-fixes_ia64-defconfig_ia64-gcc4.9 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-4.9.4-nolibc/ia64-linux/bin/ia64-linux- defconfig
# < make -s -j 80 ARCH=ia64 O=/kisskb/build/powerpc-fixes_ia64-defconfig_ia64-gcc4.9 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-4.9.4-nolibc/ia64-linux/bin/ia64-linux- help
# make -s -j 80 ARCH=ia64 O=/kisskb/build/powerpc-fixes_ia64-defconfig_ia64-gcc4.9 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-4.9.4-nolibc/ia64-linux/bin/ia64-linux- olddefconfig
# make -s -j 80 ARCH=ia64 O=/kisskb/build/powerpc-fixes_ia64-defconfig_ia64-gcc4.9 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-4.9.4-nolibc/ia64-linux/bin/ia64-linux-
<stdin>:1511:2: warning: #warning syscall clone3 not implemented [-Wcpp]
In file included from /kisskb/src/arch/ia64/include/uapi/asm/intrinsics.h:22:0,
                 from /kisskb/src/arch/ia64/include/asm/intrinsics.h:11,
                 from /kisskb/src/arch/ia64/include/asm/timex.h:14,
                 from /kisskb/src/include/linux/timex.h:65,
                 from /kisskb/src/include/linux/time32.h:13,
                 from /kisskb/src/include/linux/time.h:73,
                 from /kisskb/src/fs/nfs/read.c:11:
/kisskb/src/fs/nfs/read.c: In function 'nfs_read_completion':
/kisskb/src/arch/ia64/include/uapi/asm/cmpxchg.h:57:2: warning: value computed is not used [-Wunused-value]
 ((__typeof__(*(ptr))) __xchg((unsigned long) (x), (ptr), sizeof(*(ptr))))
 ^
/kisskb/src/fs/nfs/read.c:196:5: note: in expansion of macro 'xchg'
 xchg(&nfs_req_openctx(req)->error, error);
 ^
/kisskb/src/fs/nfs/read.c: In function 'nfs_readpage':
/kisskb/src/arch/ia64/include/uapi/asm/cmpxchg.h:57:2: warning: value computed is not used [-Wunused-value]
 ((__typeof__(*(ptr))) __xchg((unsigned long) (x), (ptr), sizeof(*(ptr))))
 ^
/kisskb/src/fs/nfs/read.c:355:2: note: in expansion of macro 'xchg'
 xchg(&ctx->error, 0);
 ^
No errors detected in 22420 functions.
Completed OK
# rm -rf /kisskb/build/powerpc-fixes_ia64-defconfig_ia64-gcc4.9
# Build took: 0:02:25.126860
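Illustration (not part of the build output): the commit message in the log describes replacing a poll loop that gave up after 25 iterations with one that waits indefinitely, allows rescheduling between polls, and warns after a long wait. The following is a minimal user-space sketch of that contrast, not the kernel patch itself. The names fake_cpu, query_cpu_stopped, wait_bounded and wait_until_stopped are hypothetical, and a poll count stands in for the 120s timer mentioned in the commit.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for an unplugged vCPU: it reports "still running"
 * for a fixed number of polls, then reports "stopped". */
struct fake_cpu {
    int polls_until_stopped;
};

/* Hypothetical stand-in for the query-cpu-stopped-state poll.
 * Returns true once the simulated CPU has halted. */
static bool query_cpu_stopped(struct fake_cpu *cpu)
{
    if (cpu->polls_until_stopped > 0) {
        cpu->polls_until_stopped--;
        return false;
    }
    return true;
}

/* Bounded wait, as the commit message describes the old behaviour:
 * give up after max_tries polls. A false return means the caller would
 * continue cleanup while the CPU may still be executing. */
static bool wait_bounded(struct fake_cpu *cpu, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (query_cpu_stopped(cpu))
            return true;
    }
    return false;
}

/* Indefinite wait, as the commit message describes the fix: keep polling
 * until the CPU reports stopped. Returns the number of polls issued.
 * warn_every is an assumption that replaces the time-based warning. */
static long wait_until_stopped(struct fake_cpu *cpu, long warn_every)
{
    long polls = 0;

    for (;;) {
        polls++;
        if (query_cpu_stopped(cpu))
            return polls;
        if (warn_every > 0 && polls % warn_every == 0)
            fprintf(stderr, "still waiting after %ld polls\n", polls);
        /* The kernel fix allows rescheduling at this point; a user-space
         * program could yield the processor here instead. */
    }
}
```

With a simulated CPU that needs 40 polls to halt, wait_bounded(&cpu, 25) returns false, while wait_until_stopped on a fresh copy of that CPU returns 41, because the 41st poll is the first to observe the stopped state.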
© Michael Ellerman 2006-2018.