Windows BSOD 0xCA Analysis. According to the WinDbg dump, this BSOD was caused by a failed IOMMU operation. In detail:
The error occurred during device initialization:
DevNode State = DeviceNodeResourcesAssigned (0x306)
Previous State = DeviceNodeDriversAdded (0x305)
Key findings:
BUGCHECK_CODE: ca (PNP_DETECTED_FATAL_ERROR)
Arg1: 13 (IOMMU operation failure)
Arg2: 1000 (Unblock operation)
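The third bugcheck argument in the dump is an NT status code (ffffffffc0350066), and splitting it into its fields can help narrow down which component rejected the operation. The following is a minimal sketch, assuming the standard NTSTATUS bit layout (2-bit severity, customer bit, 12-bit facility, 16-bit code). The names `ntstatus_fields`, `decode_ntstatus`, and `ntstatus_severity_name` are hypothetical helpers for illustration and are not part of any Windows header.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a 32-bit NTSTATUS value:
 * bits 31-30 severity, bit 29 customer flag,
 * bits 27-16 facility, bits 15-0 code. */
typedef struct {
    unsigned severity;  /* 0 success, 1 informational, 2 warning, 3 error */
    unsigned customer;  /* 1 if customer-defined, 0 if Microsoft-defined */
    unsigned facility;  /* component that produced the status */
    unsigned code;      /* facility-specific status code */
} ntstatus_fields;

/* WinDbg prints bugcheck arguments as 64-bit values. The NTSTATUS is the
 * low 32 bits; the upper 32 bits are only sign extension. */
ntstatus_fields decode_ntstatus(uint64_t bugcheck_arg)
{
    uint32_t status = (uint32_t)bugcheck_arg;
    ntstatus_fields f;

    f.severity = (status >> 30) & 0x3u;
    f.customer = (status >> 29) & 0x1u;
    f.facility = (status >> 16) & 0xFFFu;
    f.code     = status & 0xFFFFu;
    return f;
}

/* Map the 2-bit severity field to a readable name. */
const char *ntstatus_severity_name(unsigned severity)
{
    static const char *const names[] = {
        "success", "informational", "warning", "error"
    };

    assert(severity < 4);
    return names[severity];
}
```

Applied to Arg3 from this dump, `decode_ntstatus(0xffffffffc0350066)` gives severity 3 (error), customer 0, facility 0x35, and code 0x66. Facility 0x35 is defined as FACILITY_HYPERVISOR in ntstatus.h, which is consistent with the Hyper-V flags recorded in the analysis log.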
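To resolve the IOMMU operation failure, the main steps are:
Initialize the DMA/IOMMU mappings correctly (the current code already supports this; no change needed)
Ensure that DMA operations complete safely across state transitions (the current code needs to be changed)
Code change: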
Before the device enters D3 or D0, complete all outstanding I/O requests (SRBs), using the API that Storport requires of miniport drivers, RequestComplete. The Microsoft documentation states: "We do not recommend that writers of Storport miniport drivers use this particular Storport interface routine. Instead, the miniport driver should call StorPortNotification(RequestComplete) for each outstanding request."
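The per-request completion described above can be sketched as a small user-mode model. This is not the driver's actual code: every type and function name here (`model_srb`, `model_dev_ext`, `model_track`, `model_complete_all`, `model_notify_stub`) is a hypothetical stand-in, and the callback marks the place where a real miniport would call StorPortNotification(RequestComplete, HwDeviceExtension, Srb). The choice of completion status is also an assumption; the constant below mirrors the value of SRB_STATUS_BUS_RESET (0x0E).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors SRB_STATUS_BUS_RESET (0x0E) from the WDK headers. */
#define MODEL_SRB_STATUS_BUS_RESET 0x0Eu
#define MODEL_MAX_OUTSTANDING 32

/* Stand-in for the only SRB field this sketch touches. */
typedef struct {
    uint8_t srb_status;
} model_srb;

typedef struct model_dev_ext model_dev_ext;

/* Hands one finished request back to the port driver. A real miniport
 * would call StorPortNotification(RequestComplete, ...) here. */
typedef void (*complete_fn)(model_dev_ext *ext, model_srb *srb);

/* Hypothetical device extension: tracks requests still in flight. */
struct model_dev_ext {
    model_srb *outstanding[MODEL_MAX_OUTSTANDING];
    complete_fn notify_request_complete;
    size_t notified;  /* number of completion notifications sent */
};

/* Test double for the completion callback: only counts notifications. */
void model_notify_stub(model_dev_ext *ext, model_srb *srb)
{
    (void)srb;
    ext->notified++;
}

/* Record a request as in flight. Returns 0 on success, -1 if the table is full. */
int model_track(model_dev_ext *ext, model_srb *srb)
{
    for (size_t i = 0; i < MODEL_MAX_OUTSTANDING; i++) {
        if (ext->outstanding[i] == NULL) {
            ext->outstanding[i] = srb;
            return 0;
        }
    }
    return -1;
}

/* Complete every in-flight request, one notification per request, and
 * return how many were completed. Meant to run before a power transition. */
size_t model_complete_all(model_dev_ext *ext, uint8_t srb_status)
{
    size_t done = 0;

    assert(ext != NULL && ext->notify_request_complete != NULL);
    for (size_t i = 0; i < MODEL_MAX_OUTSTANDING; i++) {
        model_srb *srb = ext->outstanding[i];

        if (srb == NULL)
            continue;
        /* Clear the slot first so a request is never completed twice. */
        ext->outstanding[i] = NULL;
        srb->srb_status = srb_status;
        ext->notify_request_complete(ext, srb);
        done++;
    }
    return done;
}
```

The point of the design is that each request is removed from the tracking table before its notification is sent, so a second call made during the same transition finds nothing left to complete.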
Appendix 1: Documentation related to BSOD 0xCA
https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check-0xca--pnp-detected-fatal-error?redirectedfrom=MSDN
https://www.sysnative.com/forums/threads/debugging-stop-0xca-dddriver-sys-dddriver64dcsa-sys.35039/
Appendix 2: WinDbg analysis log for BSOD 0xCA
!analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

PNP_DETECTED_FATAL_ERROR (ca)
PnP encountered a severe error, either as a result of a problem in a driver or
a problem in PnP itself. The first argument describes the nature of the
problem, the second argument is the address of the PDO. The other arguments
vary depending on argument 1.
Arguments:
Arg1: 0000000000000013, IOMMU operation failure
    A critical IOMMU operation has failed.
Arg2: 0000000000001000, Unblock operation
Arg3: ffffffffc0350066, NT status code.
Arg4: ffffb986629afc40, DevNode of the device.

Debugging Details:
------------------

KEY_VALUES_STRING: 1

    Key  : Analysis.CPU.mSec
    Value: 4546
    Key  : Analysis.Elapsed.mSec
    Value: 4604
    Key  : Analysis.IO.Other.Mb
    Value: 17
    Key  : Analysis.IO.Read.Mb
    Value: 2
    Key  : Analysis.IO.Write.Mb
    Value: 24
    Key  : Analysis.Init.CPU.mSec
    Value: 968
    Key  : Analysis.Init.Elapsed.mSec
    Value: 4411983
    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 92
    Key  : Analysis.Version.DbgEng
    Value: 10.0.27725.1000
    Key  : Analysis.Version.Description
    Value: 10.2408.27.01 amd64fre
    Key  : Analysis.Version.Ext
    Value: 1.2408.27.1
    Key  : Bugcheck.Code.KiBugCheckData
    Value: 0xca
    Key  : Bugcheck.Code.LegacyAPI
    Value: 0xca
    Key  : Bugcheck.Code.TargetModel
    Value: 0xca
    Key  : Dump.Attributes.AsUlong
    Value: 21800
    Key  : Dump.Attributes.DiagDataWrittenToHeader
    Value: 1
    Key  : Dump.Attributes.ErrorCode
    Value: 0
    Key  : Dump.Attributes.LastLine
    Value: Dump completed successfully.
    Key  : Dump.Attributes.ProgressPercentage
    Value: 100
    Key  : Failure.Bucket
    Value: 0xCA_13_nt!PiDmaGuardProcessPreStart
    Key  : Failure.Hash
    Value: {b367b2d8-0cc5-f3e0-e733-3787841dcdd2}
    Key  : Hypervisor.Enlightenments.ValueHex
    Value: 7497cf94
    Key  : Hypervisor.Flags.AnyHypervisorPresent
    Value: 1
    Key  : Hypervisor.Flags.ApicEnlightened
    Value: 1
    Key  : Hypervisor.Flags.ApicVirtualizationAvailable
    Value: 0
    Key  : Hypervisor.Flags.AsyncMemoryHint
    Value: 0
    Key  : Hypervisor.Flags.CoreSchedulerRequested
    Value: 0
    Key  : Hypervisor.Flags.CpuManager
    Value: 1
    Key  : Hypervisor.Flags.DeprecateAutoEoi
    Value: 0
    Key  : Hypervisor.Flags.DynamicCpuDisabled
    Value: 1
    Key  : Hypervisor.Flags.Epf
    Value: 0
    Key  : Hypervisor.Flags.ExtendedProcessorMasks
    Value: 1
    Key  : Hypervisor.Flags.HardwareMbecAvailable
    Value: 1
    Key  : Hypervisor.Flags.MaxBankNumber
    Value: 0
    Key  : Hypervisor.Flags.MemoryZeroingControl
    Value: 0
    Key  : Hypervisor.Flags.NoExtendedRangeFlush
    Value: 0
    Key  : Hypervisor.Flags.NoNonArchCoreSharing
    Value: 1
    Key  : Hypervisor.Flags.Phase0InitDone
    Value: 1
    Key  : Hypervisor.Flags.PowerSchedulerQos
    Value: 0
    Key  : Hypervisor.Flags.RootScheduler
    Value: 0
    Key  : Hypervisor.Flags.SynicAvailable
    Value: 1
    Key  : Hypervisor.Flags.UseQpcBias
    Value: 0
    Key  : Hypervisor.Flags.Value
    Value: 38408431
    Key  : Hypervisor.Flags.ValueHex
    Value: 24a10ef
    Key  : Hypervisor.Flags.VpAssistPage
    Value: 1
    Key  : Hypervisor.Flags.VsmAvailable
    Value: 1
    Key  : Hypervisor.RootFlags.AccessStats
    Value: 1
    Key  : Hypervisor.RootFlags.CrashdumpEnlightened
    Value: 1
    Key  : Hypervisor.RootFlags.CreateVirtualProcessor
    Value: 1
    Key  : Hypervisor.RootFlags.DisableHyperthreading
    Value: 0
    Key  : Hypervisor.RootFlags.HostTimelineSync
    Value: 1
    Key  : Hypervisor.RootFlags.HypervisorDebuggingEnabled
    Value: 0
    Key  : Hypervisor.RootFlags.IsHyperV
    Value: 1
    Key  : Hypervisor.RootFlags.LivedumpEnlightened
    Value: 1
    Key  : Hypervisor.RootFlags.MapDeviceInterrupt
    Value: 1
    Key  : Hypervisor.RootFlags.MceEnlightened
    Value: 1
    Key  : Hypervisor.RootFlags.Nested
    Value: 0
    Key  : Hypervisor.RootFlags.StartLogicalProcessor
    Value: 1
    Key  : Hypervisor.RootFlags.Value
    Value: 1015
    Key  : Hypervisor.RootFlags.ValueHex
    Value: 3f7
    Key  : SecureKernel.HalpHvciEnabled
    Value: 1
    Key  : WER.OS.Branch
    Value: ge_release
    Key  : WER.OS.Version
    Value: 10.0.26100.1

BUGCHECK_CODE: ca
BUGCHECK_P1: 13
BUGCHECK_P2: 1000
BUGCHECK_P3: ffffffffc0350066
BUGCHECK_P4: ffffb986629afc40
FILE_IN_CAB: MEMORY.DMP
TAG_NOT_DEFINED_202b: *** Unknown TAG in analysis list 202b
DUMP_FILE_ATTRIBUTES: 0x21800
FAULTING_THREAD: ffffb986625ef040
DEVICE_OBJECT: 0000000000001000
BLACKBOXBSD: 1 (!blackboxbsd)
BLACKBOXNTFS: 1 (!blackboxntfs)
BLACKBOXPNP: 1 (!blackboxpnp)
BLACKBOXWINLOGON: 1
PROCESS_NAME: System
LOCK_ADDRESS: fffff8019cd8a380 -- (!locks fffff8019cd8a380)

KD: Scanning for held locks........................................................
Resource @ nt!PiEngineLock (0xfffff8019cd8a380) Exclusively owned
    Contention Count = 23
    Threads: ffffb986625ef040-01<*>
1 total locks

PNP_TRIAGE_DATA:
    Lock address : 0xfffff8019cd8a380
    Thread Count : 1
    Thread address: 0xffffb986625ef040
    Thread wait : 0x281e98

STACK_TEXT:
ffffdf84`8a6b7158 fffff801`9c8e5bca : 00000000`000000ca 00000000`00000013 00000000`00001000 ffffffff`c0350066 : nt!KeBugCheckEx
ffffdf84`8a6b7160 fffff801`9c7db310 : ffffb986`629afc40 00000000`00000000 00000000`00000001 ffffb986`629afc40 : nt!PiDmaGuardProcessPreStart+0x10a7f6
ffffdf84`8a6b71a0 fffff801`9c697621 : ffffb986`629afc40 ffffdf84`8a6b7261 00000000`00000000 00000000`00000001 : nt!PipProcessStartPhase1+0x4c
ffffdf84`8a6b71e0 fffff801`9c8187a7 : ffffb986`36694b20 ffffb986`626da790 ffffdf84`8a6b7300 fffff801`00000002 : nt!PipProcessDevNodeTree+0x645
ffffdf84`8a6b72b0 fffff801`9c2404bd : 00000001`00000003 ffffb986`36694b20 ffffb986`626da790 00000000`00000000 : nt!PiProcessReenumeration+0x9f
ffffdf84`8a6b7300 fffff801`9c1249d2 : ffffb986`625ef040 ffffb986`366b6cb0 fffff801`9c23fe80 ffffb986`00000000 : nt!PnpDeviceActionWorker+0x63d
ffffdf84`8a6b73c0 fffff801`9c25a9ea : ffffb986`625ef040 ffffb986`625ef040 fffff801`9c124820 ffffb986`366b6cb0 : nt!ExpWorkerThread+0x1b2
ffffdf84`8a6b7570 fffff801`9c4736f4 : ffffce81`b4759180 ffffb986`625ef040 fffff801`9c25a990 00320033`006d0065 : nt!PspSystemThreadStartup+0x5a
ffffdf84`8a6b75c0 00000000`00000000 : ffffdf84`8a6b8000 ffffdf84`8a6b1000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x34

SYMBOL_NAME: nt!PiDmaGuardProcessPreStart+10a7f6
MODULE_NAME: nt
IMAGE_NAME: ntkrnlmp.exe
STACK_COMMAND: .process /r /p 0xffffb98636697040; .thread 0xffffb986625ef040 ; kb
BUCKET_ID_FUNC_OFFSET: 10a7f6
FAILURE_BUCKET_ID: 0xCA_13_nt!PiDmaGuardProcessPreStart
OS_VERSION: 10.0.26100.1
BUILDLAB_STR: ge_release
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {b367b2d8-0cc5-f3e0-e733-3787841dcdd2}
Followup: MachineOwner
---------

1: kd> !devnode ffffb986629afc40
DevNode 0xffffb986629afc40 for PDO 0xffffb98665085060
  Parent 0xffffb986629adc40 Sibling 0000000000 Child 0000000000
  InstancePath is "PCI\VEN_1217&DEV_8621&SUBSYS_00021217&REV_01\4&32cd076f&0&0013"
  ServiceName is "bhtsddr"
  State = DeviceNodeResourcesAssigned (0x306) @ 2024 Oct 24 21:26:21.411
  Previous State = DeviceNodeDriversAdded (0x305) @ 2024 Oct 24 21:26:21.411
  StateHistory[02] = DeviceNodeDriversAdded (0x305)
  StateHistory[01] = DeviceNodeInitialized (0x304)
  StateHistory[00] = DeviceNodeUninitialized (0x301)
  StateHistory[19] = Unknown State (0x0)
  StateHistory[18] = Unknown State (0x0)
  StateHistory[17] = Unknown State (0x0)
  StateHistory[16] = Unknown State (0x0)
  StateHistory[15] = Unknown State (0x0)
  StateHistory[14] = Unknown State (0x0)
  StateHistory[13] = Unknown State (0x0)
  StateHistory[12] = Unknown State (0x0)
  StateHistory[11] = Unknown State (0x0)
  StateHistory[10] = Unknown State (0x0)
  StateHistory[09] = Unknown State (0x0)
  StateHistory[08] = Unknown State (0x0)
  StateHistory[07] = Unknown State (0x0)
  StateHistory[06] = Unknown State (0x0)
  StateHistory[05] = Unknown State (0x0)
  StateHistory[04] = Unknown State (0x0)
  StateHistory[03] = Unknown State (0x0)
  Flags (0x6c0000f0) DNF_ENUMERATED, DNF_IDS_QUERIED,
                     DNF_HAS_BOOT_CONFIG, DNF_BOOT_CONFIG_RESERVED,
                     DNF_NO_LOWER_DEVICE_FILTERS, DNF_NO_LOWER_CLASS_FILTERS,
                     DNF_NO_UPPER_DEVICE_FILTERS, DNF_NO_UPPER_CLASS_FILTERS
  CapabilityFlags (0x00002000) WakeFromD3