vulnerability

SUSE: CVE-2024-35931: SUSE Linux Security Advisory

Severity
5
CVSS
(AV:L/AC:L/Au:S/C:N/I:N/A:C)
Published
05/19/2024
Added
06/24/2024
Modified
02/18/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

drm/amdgpu: Skip do PCI error slot reset during RAS recovery

Why:
The PCI error slot reset maybe triggered after inject ue to UMC multi times, this
caused system hang.
[ 557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 557.373718] [drm] PCIE GART of 512M enabled.
[ 557.373722] [drm] PTB located at 0x0000031FED700000
[ 557.373788] [drm] VRAM is lost due to GPU reset!
[ 557.373789] [drm] PSP is resuming...
[ 557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset
[ 557.547067] [drm] PCI error: detected callback, state(1)!!
[ 557.547069] [drm] No support for XGMI hive yet...
[ 557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter
[ 557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations
[ 557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered
[ 557.610492] [drm] PCI error: slot reset callback!!
...
[ 560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded!
[ 560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded!
[ 560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI
[ 560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G OE 5.15.0-91-generic #101-Ubuntu
[ 560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023
[ 560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu]
[ 560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
[ 560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff [ 560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202
[ 560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0
[ 560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010
[ 560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08
[ 560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000
[ 560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000
[ 560.803889] FS: 0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000
[ 560.812973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0
[ 560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 560.843444] PKRU: 55555554
[ 560.846480] Call Trace:
[ 560.849225]
[ 560.851580] ? show_trace_log_lvl+0x1d6/0x2ea
[ 560.856488] ? show_trace_log_lvl+0x1d6/0x2ea
[ 560.861379] ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
[ 560.867778] ? show_regs.part.0+0x23/0x29
[ 560.872293] ? __die_body.cold+0x8/0xd
[ 560.876502] ? die_addr+0x3e/0x60
[ 560.880238] ? exc_general_protection+0x1c5/0x410
[ 560.885532] ? asm_exc_general_protection+0x27/0x30
[ 560.891025] ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
[ 560.898323] amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
[ 560.904520] process_one_work+0x228/0x3d0
How:
In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected
all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure.

Solution(s)

suse-upgrade-cluster-md-kmp-64kbsuse-upgrade-cluster-md-kmp-azuresuse-upgrade-cluster-md-kmp-defaultsuse-upgrade-cluster-md-kmp-rtsuse-upgrade-dlm-kmp-64kbsuse-upgrade-dlm-kmp-azuresuse-upgrade-dlm-kmp-defaultsuse-upgrade-dlm-kmp-rtsuse-upgrade-dtb-allwinnersuse-upgrade-dtb-alterasuse-upgrade-dtb-amazonsuse-upgrade-dtb-amdsuse-upgrade-dtb-amlogicsuse-upgrade-dtb-apmsuse-upgrade-dtb-applesuse-upgrade-dtb-armsuse-upgrade-dtb-broadcomsuse-upgrade-dtb-caviumsuse-upgrade-dtb-exynossuse-upgrade-dtb-freescalesuse-upgrade-dtb-hisiliconsuse-upgrade-dtb-lgsuse-upgrade-dtb-marvellsuse-upgrade-dtb-mediateksuse-upgrade-dtb-nvidiasuse-upgrade-dtb-qcomsuse-upgrade-dtb-renesassuse-upgrade-dtb-rockchipsuse-upgrade-dtb-socionextsuse-upgrade-dtb-sprdsuse-upgrade-dtb-xilinxsuse-upgrade-gfs2-kmp-64kbsuse-upgrade-gfs2-kmp-azuresuse-upgrade-gfs2-kmp-defaultsuse-upgrade-gfs2-kmp-rtsuse-upgrade-kernel-64kbsuse-upgrade-kernel-64kb-develsuse-upgrade-kernel-64kb-extrasuse-upgrade-kernel-64kb-livepatch-develsuse-upgrade-kernel-64kb-optionalsuse-upgrade-kernel-azuresuse-upgrade-kernel-azure-develsuse-upgrade-kernel-azure-extrasuse-upgrade-kernel-azure-livepatch-develsuse-upgrade-kernel-azure-optionalsuse-upgrade-kernel-azure-vdsosuse-upgrade-kernel-debugsuse-upgrade-kernel-debug-develsuse-upgrade-kernel-debug-livepatch-develsuse-upgrade-kernel-debug-vdsosuse-upgrade-kernel-defaultsuse-upgrade-kernel-default-basesuse-upgrade-kernel-default-base-rebuildsuse-upgrade-kernel-default-develsuse-upgrade-kernel-default-extrasuse-upgrade-kernel-default-livepatchsuse-upgrade-kernel-default-livepatch-develsuse-upgrade-kernel-default-optionalsuse-upgrade-kernel-default-vdsosuse-upgrade-kernel-develsuse-upgrade-kernel-devel-azuresuse-upgrade-kernel-devel-rtsuse-upgrade-kernel-docssuse-upgrade-kernel-docs-htmlsuse-upgrade-kernel-kvmsmallsuse-upgrade-kernel-kvmsmall-develsuse-upgrade-kernel-kvmsmall-livepatch-develsuse-upgrade-kernel-kvmsmall-vdsosuse-upgrade-kernel-macrossuse-upgrade-kernel-obs-buildsuse-upgrade-kernel-obs-qasuse-upgrade-kernel-rtsuse-upgrade-kernel-rt-develsuse-upgrade-kernel-rt-extrasuse-upgrade-kernel-rt-livepatch-develsuse-upgrade-kernel-rt-optionalsuse-upgrade-kernel-rt-vdsosuse-upgrade-kernel-rt_debugsuse-upgrade-kernel-rt_debug-develsuse-upgrade-kernel-rt_debug-livepatch-develsuse-upgrade-kernel-rt_debug-vdsosuse-upgrade-kernel-sourcesuse-upgrade-kernel-source-azuresuse-upgrade-kernel-source-rtsuse-upgrade-kernel-source-vanillasuse-upgrade-kernel-symssuse-upgrade-kernel-syms-azuresuse-upgrade-kernel-syms-rtsuse-upgrade-kernel-zfcpdumpsuse-upgrade-kselftests-kmp-64kbsuse-upgrade-kselftests-kmp-azuresuse-upgrade-kselftests-kmp-defaultsuse-upgrade-kselftests-kmp-rtsuse-upgrade-ocfs2-kmp-64kbsuse-upgrade-ocfs2-kmp-azuresuse-upgrade-ocfs2-kmp-defaultsuse-upgrade-ocfs2-kmp-rtsuse-upgrade-reiserfs-kmp-64kbsuse-upgrade-reiserfs-kmp-azuresuse-upgrade-reiserfs-kmp-defaultsuse-upgrade-reiserfs-kmp-rt
Title
NEW

Explore Exposure Command

Confidently identify and prioritize exposures from endpoint to cloud with full attack surface visibility and threat-aware risk context.