# git rev-parse -q --verify 734ad0af3609464f8f93e00b6c0de1e112f44559^{commit}
734ad0af3609464f8f93e00b6c0de1e112f44559
already have revision, skipping fetch
# git checkout -q -f -B kisskb 734ad0af3609464f8f93e00b6c0de1e112f44559
# git clean -qxdf
# < git log -1
# commit 734ad0af3609464f8f93e00b6c0de1e112f44559
# Author: Nysal Jan K.A.
# Date:   Thu Aug 29 07:58:27 2024 +0530
#
#     powerpc/qspinlock: Fix deadlock in MCS queue
#
#     If an interrupt occurs in queued_spin_lock_slowpath() after we increment
#     qnodesp->count and before node->lock is initialized, another CPU might
#     see stale lock values in get_tail_qnode(). If the stale lock value happens
#     to match the lock on that CPU, then we write to the "next" pointer of
#     the wrong qnode. This causes a deadlock as the former CPU, once it becomes
#     the head of the MCS queue, will spin indefinitely until it's "next" pointer
#     is set by its successor in the queue.
#
#     Running stress-ng on a 16 core (16EC/16VP) shared LPAR, results in
#     occasional lockups similar to the following:
#
#        $ stress-ng --all 128 --vm-bytes 80% --aggressive \
#                    --maximize --oomable --verify --syslog \
#                    --metrics --times --timeout 5m
#
#        watchdog: CPU 15 Hard LOCKUP
#        ......
#        NIP [c0000000000b78f4] queued_spin_lock_slowpath+0x1184/0x1490
#        LR [c000000001037c5c] _raw_spin_lock+0x6c/0x90
#        Call Trace:
#         0xc000002cfffa3bf0 (unreliable)
#         _raw_spin_lock+0x6c/0x90
#         raw_spin_rq_lock_nested.part.135+0x4c/0xd0
#         sched_ttwu_pending+0x60/0x1f0
#         __flush_smp_call_function_queue+0x1dc/0x670
#         smp_ipi_demux_relaxed+0xa4/0x100
#         xive_muxed_ipi_action+0x20/0x40
#         __handle_irq_event_percpu+0x80/0x240
#         handle_irq_event_percpu+0x2c/0x80
#         handle_percpu_irq+0x84/0xd0
#         generic_handle_irq+0x54/0x80
#         __do_irq+0xac/0x210
#         __do_IRQ+0x74/0xd0
#         0x0
#         do_IRQ+0x8c/0x170
#         hardware_interrupt_common_virt+0x29c/0x2a0
#        --- interrupt: 500 at queued_spin_lock_slowpath+0x4b8/0x1490
#        ......
#        NIP [c0000000000b6c28] queued_spin_lock_slowpath+0x4b8/0x1490
#        LR [c000000001037c5c] _raw_spin_lock+0x6c/0x90
#        --- interrupt: 500
#         0xc0000029c1a41d00 (unreliable)
#         _raw_spin_lock+0x6c/0x90
#         futex_wake+0x100/0x260
#         do_futex+0x21c/0x2a0
#         sys_futex+0x98/0x270
#         system_call_exception+0x14c/0x2f0
#         system_call_vectored_common+0x15c/0x2ec
#
#     The following code flow illustrates how the deadlock occurs.
#     For the sake of brevity, assume that both locks (A and B) are
#     contended and we call the queued_spin_lock_slowpath() function.
#
#             CPU0                                   CPU1
#             ----                                   ----
#       spin_lock_irqsave(A)                          |
#       spin_unlock_irqrestore(A)                     |
#           spin_lock(B)                              |
#                |                                    |
#                ▼                                    |
#        id = qnodesp->count++;                       |
#       (Note that nodes[0].lock == A)                |
#                |                                    |
#                ▼                                    |
#             Interrupt                               |
#       (happens before "nodes[0].lock = B")          |
#                |                                    |
#                ▼                                    |
#       spin_lock_irqsave(A)                          |
#                |                                    |
#                ▼                                    |
#        id = qnodesp->count++                        |
#        nodes[1].lock = A                            |
#                |                                    |
#                ▼                                    |
#       Tail of MCS queue                             |
#                |                          spin_lock_irqsave(A)
#                ▼                                    |
#       Head of MCS queue                             ▼
#                |                          CPU0 is previous tail
#                ▼                                    |
#        Spin indefinitely                            ▼
#       (until "nodes[1].next != NULL")     prev = get_tail_qnode(A, CPU0)
#                                                     |
#                                                     ▼
#                                           prev == &qnodes[CPU0].nodes[0]
#                                         (as qnodes[CPU0].nodes[0].lock == A)
#                                                     |
#                                                     ▼
#                                           WRITE_ONCE(prev->next, node)
#                                                     |
#                                                     ▼
#                                            Spin indefinitely
#                                         (until nodes[0].locked == 1)
#
#     Thanks to Saket Kumar Bhaskar for help with recreating the issue
#
#     Fixes: 84990b169557 ("powerpc/qspinlock: add mcs queueing for contended waiters")
#     Cc: stable@vger.kernel.org # v6.2+
#     Reported-by: Geetika Moolchandani
#     Reported-by: Vaishnavi Bhat
#     Reported-by: Jijo Varghese
#     Signed-off-by: Nysal Jan K.A.
#     Reviewed-by: Nicholas Piggin
#     Signed-off-by: Michael Ellerman
#     Link: https://msgid.link/20240829022830.1164355-1-nysal@linux.ibm.com
# < /opt/cross/kisskb/korg/gcc-5.5.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc --version
# < /opt/cross/kisskb/korg/gcc-5.5.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld --version
# < git log --format=%s --max-count=1 734ad0af3609464f8f93e00b6c0de1e112f44559
# make -s -j 40 ARCH=powerpc O=/kisskb/build/powerpc-fixes_44x_warp_defconfig_powerpc-gcc5 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-5.5.0-nolibc/powerpc64-linux/bin/powerpc64-linux- 44x/warp_defconfig
# < make -s -j 40 ARCH=powerpc O=/kisskb/build/powerpc-fixes_44x_warp_defconfig_powerpc-gcc5 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-5.5.0-nolibc/powerpc64-linux/bin/powerpc64-linux- help
# make -s -j 40 ARCH=powerpc O=/kisskb/build/powerpc-fixes_44x_warp_defconfig_powerpc-gcc5 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-5.5.0-nolibc/powerpc64-linux/bin/powerpc64-linux- olddefconfig
# make -s -j 40 ARCH=powerpc O=/kisskb/build/powerpc-fixes_44x_warp_defconfig_powerpc-gcc5 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-5.5.0-nolibc/powerpc64-linux/bin/powerpc64-linux-
Completed OK
# rm -rf /kisskb/build/powerpc-fixes_44x_warp_defconfig_powerpc-gcc5
# Build took: 0:00:40.866207