Differences

This shows you the differences between two versions of the page.

en:pcie:disable-fatal [2019/04/17 04:16] alex created
en:pcie:disable-fatal [2019/04/17 04:21] alex
Line 21: Line 21:
 </code>
  
-To work around this crash, PCIe fatal error reporting must be disabled on the switch or root port upstream of the FPGA.  Specifically, two bits must be cleared - SERR in the command register, and the fatal error reporting enable bit in the device control register in the PCIe capability.  The following script performs these operations on the switch port upstream of the specified PCIe device ID.
+The device falling off the bus triggers a PCIe fatal error that causes the kernel to panic and the iDRAC to reboot the machine.  The iDRAC is totally independent from the operating system; it will still reboot the machine even if the operating system ignores the error.
+
+To work around this crash, the error must not be reported to the OS or to the iDRAC.  To that end, PCIe fatal error reporting must be disabled on the switch or root port upstream of the FPGA.  Specifically, two bits must be cleared - SERR in the command register, and the fatal error reporting enable bit in the device control register in the PCIe capability.  The following script performs these operations on the switch port upstream of the specified PCIe device ID.
  
 <code sh>
Line 62: Line 64:
 echo "Device control:" $ctrl
 
-# clear non-fatal error reporting enable bit in device control register
+# clear fatal error reporting enable bit in device control register
 setpci -s $port CAP_EXP+8.w=$(printf "%04x" $(("0x$ctrl" & ~0x0004)))
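
The masking step in the hunk above can be tried in isolation, without touching any hardware. The following is a minimal sketch, not part of the page's script: the helper names `clear_bits`, `clear_serr` and `clear_fatal` are hypothetical, and the bit positions assumed are SERR# Enable at bit 8 (0x0100) of the PCI command register and Fatal Error Reporting Enable at bit 2 (0x0004) of the PCIe device control register.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; they do not call setpci.

# clear_bits VALUE MASK
# VALUE and MASK are hex strings without the 0x prefix.
# Prints VALUE with every bit in MASK cleared, as 4 hex digits.
clear_bits() {
    printf '%04x\n' $(( 0x$1 & ~0x$2 & 0xffff ))
}

# Assumed: SERR# Enable is bit 8 (0x0100) of the PCI command register.
clear_serr() {
    clear_bits "$1" 0100
}

# Assumed: Fatal Error Reporting Enable is bit 2 (0x0004) of the
# PCIe device control register (CAP_EXP+8.w in the diff above).
clear_fatal() {
    clear_bits "$1" 0004
}

clear_serr 0147     # prints 0047
clear_fatal 2936    # prints 2932
```

In the script itself the input value would come from reading the register with `setpci` and the result would be written back, as the `CAP_EXP+8.w=` line in the diff does; the helpers here only reproduce the arithmetic.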