# git rev-parse -q --verify 75ecfb49516c53da00c57b9efe48fa3f5504a791^{commit}
75ecfb49516c53da00c57b9efe48fa3f5504a791
already have revision, skipping fetch
# git checkout -q -f -B kisskb 75ecfb49516c53da00c57b9efe48fa3f5504a791
# git clean -qxdf
# < git log -1
# commit 75ecfb49516c53da00c57b9efe48fa3f5504a791
# Author: Mahesh Salgaonkar
# Date: Mon Apr 23 10:29:27 2018 +0530
#
# powerpc/mce: Fix a bug where mce loops on memory UE.
#
# The current code extracts the physical address for UE errors and then
# hooks it up into memory failure infrastructure. On successful
# extraction of physical address it wrongly sets "handled = 1" which
# means this UE error has been recovered. Since MCE handler gets return
# value as handled = 1, it assumes that error has been recovered and
# goes back to same NIP. This causes MCE interrupt again and again in a
# loop leading to hard lockup.
#
# Also, initialize phys_addr to ULONG_MAX so that we don't end up
# queuing undesired page to hwpoison.
#
# Without this patch we see:
# Severe Machine check interrupt [Recovered]
# NIP: [000000001002588c] PID: 7109 Comm: find
# Initiator: CPU
# Error type: UE [Load/Store]
# Effective address: 00007fffd2755940
# Physical address: 000020181a080000
# ...
# Severe Machine check interrupt [Recovered]
# NIP: [000000001002588c] PID: 7109 Comm: find
# Initiator: CPU
# Error type: UE [Load/Store]
# Effective address: 00007fffd2755940
# Physical address: 000020181a080000
# Severe Machine check interrupt [Recovered]
# NIP: [000000001002588c] PID: 7109 Comm: find
# Initiator: CPU
# Error type: UE [Load/Store]
# Effective address: 00007fffd2755940
# Physical address: 000020181a080000
# Memory failure: 0x20181a08: recovery action for dirty LRU page: Recovered
# Memory failure: 0x20181a08: already hardware poisoned
# Memory failure: 0x20181a08: already hardware poisoned
# Memory failure: 0x20181a08: already hardware poisoned
# Memory failure: 0x20181a08: already hardware poisoned
# Memory failure: 0x20181a08: already hardware poisoned
# Memory failure: 0x20181a08: already hardware poisoned
# ...
# Watchdog CPU:38 Hard LOCKUP
#
# After this patch we see:
#
# Severe Machine check interrupt [Not recovered]
# NIP: [00007fffaae585f4] PID: 7168 Comm: find
# Initiator: CPU
# Error type: UE [Load/Store]
# Effective address: 00007fffaafe28ac
# Physical address: 00002017c0bd0000
# find[7168]: unhandled signal 7 at 00007fffaae585f4 nip 00007fffaae585f4 lr 00007fffaae585e0 code 4
# Memory failure: 0x2017c0bd: recovery action for dirty LRU page: Recovered
#
# Fixes: 01eaac2b0591 ("powerpc/mce: Hookup ierror (instruction) UE errors")
# Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors")
# Cc: stable@vger.kernel.org # v4.15+
# Signed-off-by: Mahesh Salgaonkar
# Signed-off-by: Balbir Singh
# Reviewed-by: Balbir Singh
# Signed-off-by: Michael Ellerman
# < /opt/cross/kisskb/br-sparc64-full-2016.08-613-ge98b4dd/bin/sparc64-linux-gcc --version
# < git log --format=%s --max-count=1 75ecfb49516c53da00c57b9efe48fa3f5504a791
# < make -s -j 10 ARCH=sparc64 O=/kisskb/build/powerpc-fixes_sparc64-defconfig_sparc64 CROSS_COMPILE=/opt/cross/kisskb/br-sparc64-full-2016.08-613-ge98b4dd/bin/sparc64-linux- defconfig
# make -s -j 10 ARCH=sparc64 O=/kisskb/build/powerpc-fixes_sparc64-defconfig_sparc64 CROSS_COMPILE=/opt/cross/kisskb/br-sparc64-full-2016.08-613-ge98b4dd/bin/sparc64-linux-
WARNING: EXPORT symbol "_mcount" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "_mcount" [vmlinux] version generation failed, symbol will not be versioned.
kernel: arch/sparc/boot/image is ready
kernel: arch/sparc/boot/zImage is ready
Completed OK
# rm -rf /kisskb/build/powerpc-fixes_sparc64-defconfig_sparc64
# Build took: 0:01:18.386801