# git rev-parse -q --verify 75ecfb49516c53da00c57b9efe48fa3f5504a791^{commit}
75ecfb49516c53da00c57b9efe48fa3f5504a791
already have revision, skipping fetch
# git checkout -q -f -B kisskb 75ecfb49516c53da00c57b9efe48fa3f5504a791
# git clean -qxdf
# < git log -1
# commit 75ecfb49516c53da00c57b9efe48fa3f5504a791
# Author: Mahesh Salgaonkar
# Date:   Mon Apr 23 10:29:27 2018 +0530
#
#     powerpc/mce: Fix a bug where mce loops on memory UE.
#
#     The current code extracts the physical address for UE errors and then
#     hooks it up into memory failure infrastructure. On successful
#     extraction of physical address it wrongly sets "handled = 1" which
#     means this UE error has been recovered. Since MCE handler gets return
#     value as handled = 1, it assumes that error has been recovered and
#     goes back to same NIP. This causes MCE interrupt again and again in a
#     loop leading to hard lockup.
#
#     Also, initialize phys_addr to ULONG_MAX so that we don't end up
#     queuing undesired page to hwpoison.
#
#     Without this patch we see:
#     Severe Machine check interrupt [Recovered]
#     NIP: [000000001002588c] PID: 7109 Comm: find
#     Initiator: CPU
#     Error type: UE [Load/Store]
#     Effective address: 00007fffd2755940
#     Physical address: 000020181a080000
#     ...
#     Severe Machine check interrupt [Recovered]
#     NIP: [000000001002588c] PID: 7109 Comm: find
#     Initiator: CPU
#     Error type: UE [Load/Store]
#     Effective address: 00007fffd2755940
#     Physical address: 000020181a080000
#     Severe Machine check interrupt [Recovered]
#     NIP: [000000001002588c] PID: 7109 Comm: find
#     Initiator: CPU
#     Error type: UE [Load/Store]
#     Effective address: 00007fffd2755940
#     Physical address: 000020181a080000
#     Memory failure: 0x20181a08: recovery action for dirty LRU page: Recovered
#     Memory failure: 0x20181a08: already hardware poisoned
#     Memory failure: 0x20181a08: already hardware poisoned
#     Memory failure: 0x20181a08: already hardware poisoned
#     Memory failure: 0x20181a08: already hardware poisoned
#     Memory failure: 0x20181a08: already hardware poisoned
#     Memory failure: 0x20181a08: already hardware poisoned
#     ...
#     Watchdog CPU:38 Hard LOCKUP
#
#     After this patch we see:
#
#     Severe Machine check interrupt [Not recovered]
#     NIP: [00007fffaae585f4] PID: 7168 Comm: find
#     Initiator: CPU
#     Error type: UE [Load/Store]
#     Effective address: 00007fffaafe28ac
#     Physical address: 00002017c0bd0000
#     find[7168]: unhandled signal 7 at 00007fffaae585f4 nip 00007fffaae585f4 lr 00007fffaae585e0 code 4
#     Memory failure: 0x2017c0bd: recovery action for dirty LRU page: Recovered
#
#     Fixes: 01eaac2b0591 ("powerpc/mce: Hookup ierror (instruction) UE errors")
#     Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors")
#     Cc: stable@vger.kernel.org # v4.15+
#     Signed-off-by: Mahesh Salgaonkar
#     Signed-off-by: Balbir Singh
#     Reviewed-by: Balbir Singh
#     Signed-off-by: Michael Ellerman
# < /opt/cross/kisskb/gcc-5.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc --version
# < git log --format=%s --max-count=1 75ecfb49516c53da00c57b9efe48fa3f5504a791
# < make -s -j 48 ARCH=powerpc O=/kisskb/build/powerpc-fixes_chrp32_defconfig_powerpc-5.3 CROSS_COMPILE=/opt/cross/kisskb/gcc-5.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux- chrp32_defconfig
# make -s -j 48 ARCH=powerpc O=/kisskb/build/powerpc-fixes_chrp32_defconfig_powerpc-5.3 CROSS_COMPILE=/opt/cross/kisskb/gcc-5.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-
Completed OK
# rm -rf /kisskb/build/powerpc-fixes_chrp32_defconfig_powerpc-5.3
# Build took: 0:00:44.172325