

Kernel driver: handling the "device's health compromised" fault reported by mlnx NICs

2023-11-20 05:45:48


                  [613405.736532] mlx5_core 0000:2a:00.0: poll_health:971:(pid 0): device's health compromised - reached miss count

                  [613405.737166] mlx5_core 0000:2a:00.0: print_health_info:491:(pid 0): Health issue observed, firmware internal error, severity(3) ERROR:

                  [613405.738196] mlx5_core 0000:2a:00.0: print_health_info:495:(pid 0): assert_var[0] 0x00000000

                  [613405.738781] mlx5_core 0000:2a:00.0: print_health_info:495:(pid 0): assert_var[1] 0x00000000

                  [613405.739334] mlx5_core 0000:2a:00.0: print_health_info:495:(pid 0): assert_var[2] 0x00000000

                  [613405.739904] mlx5_core 0000:2a:00.0: print_health_info:495:(pid 0): assert_var[3] 0x00000000

                  [613405.740465] mlx5_core 0000:2a:00.0: print_health_info:495:(pid 0): assert_var[4] 0x00000000

                  [613405.741018] mlx5_core 0000:2a:00.0: print_health_info:495:(pid 0): assert_var[5] 0x00000000

                  [613405.741550] mlx5_core 0000:2a:00.0: print_health_info:498:(pid 0): assert_exit_ptr 0x20a202c8

                  [613405.742070] mlx5_core 0000:2a:00.0: print_health_info:499:(pid 0): assert_callra 0x20a26488

                  [613405.742589] mlx5_core 0000:2a:00.0: print_health_info:500:(pid 0): fw_ver 26.35.2000

                  [613405.743089] mlx5_core 0000:2a:00.0: print_health_info:502:(pid 0): time 0

                  [613405.743575] mlx5_core 0000:2a:00.0: print_health_info:503:(pid 0): hw_id 0x00000216

                  [613405.744054] mlx5_core 0000:2a:00.0: print_health_info:504:(pid 0): rfr 0

                  [613405.744522] mlx5_core 0000:2a:00.0: print_health_info:505:(pid 0): severity 3 (ERROR)

                  [613405.744989] mlx5_core 0000:2a:00.0: print_health_info:506:(pid 0): irisc_index 7

                  [613405.745405] mlx5_core 0000:2a:00.0: print_health_info:507:(pid 0): synd 0x1: firmware internal error

                  [613405.745840] mlx5_core 0000:2a:00.0: print_health_info:509:(pid 0): ext_synd 0x8a02

                  [613405.746281] mlx5_core 0000:2a:00.0: print_health_info:510:(pid 0): raw fw_ver 0x1a2307d0

                  [613406.278016] mlx5_core 0000:be:00.1 ens6f1np1: Link up

                  [613406.285205] 8021q: adding VLAN 0 to HW filter on device ens6f1np1

                  [613406.325260] IPv6: ADDRCONF(NETDEV_CHANGE): ens6f1np1: link becomes ready

                  [613406.824530] mlx5_core 0000:2a:00.1: poll_health:971:(pid 0): device's health compromised - reached miss count

                  [613406.825059] mlx5_core 0000:2a:00.1: print_health_info:491:(pid 0): Health issue observed, firmware internal error, severity(3) ERROR:

                  [613406.825930] mlx5_core 0000:2a:00.1: print_health_info:495:(pid 0): assert_var[0] 0x00000000

                  [613406.826328] mlx5_core 0000:2a:00.1: print_health_info:495:(pid 0): assert_var[1] 0x00000000

                  [613406.826758] mlx5_core 0000:2a:00.1: print_health_info:495:(pid 0): assert_var[2] 0x00000000

                  [613406.827118] mlx5_core 0000:2a:00.1: print_health_info:495:(pid 0): assert_var[3] 0x00000000

                  [613406.827503] mlx5_core 0000:2a:00.1: print_health_info:495:(pid 0): assert_var[4] 0x00000000

                  [613406.827840] mlx5_core 0000:2a:00.1: print_health_info:495:(pid 0): assert_var[5] 0x00000000

                  [613406.828167] mlx5_core 0000:2a:00.1: print_health_info:498:(pid 0): assert_exit_ptr 0x20a202c8

                  [613406.828489] mlx5_core 0000:2a:00.1: print_health_info:499:(pid 0): assert_callra 0x20a26488

                  [613406.828819] mlx5_core 0000:2a:00.1: print_health_info:500:(pid 0): fw_ver 26.35.2000

                  [613406.829126] mlx5_core 0000:2a:00.1: print_health_info:502:(pid 0): time 0

                  [613406.829434] mlx5_core 0000:2a:00.1: print_health_info:503:(pid 0): hw_id 0x00000216

                  [613406.829781] mlx5_core 0000:2a:00.1: print_health_info:504:(pid 0): rfr 0

                  [613406.830129] mlx5_core 0000:2a:00.1: print_health_info:505:(pid 0): severity 3 (ERROR)

                  [613406.830479] mlx5_core 0000:2a:00.1: print_health_info:506:(pid 0): irisc_index 7

                  [613406.830827] mlx5_core 0000:2a:00.1: print_health_info:507:(pid 0): synd 0x1: firmware internal error

                  [613406.831150] mlx5_core 0000:2a:00.1: print_health_info:509:(pid 0): ext_synd 0x8a02

                  [613406.831485] mlx5_core 0000:2a:00.1: print_health_info:510:(pid 0): raw fw_ver 0x1a2307d0

                  [613406.888534] mlx5_core 0000:be:00.0: poll_health:971:(pid 0): device's health compromised - reached miss count

                  [613406.888971] mlx5_core 0000:be:00.0: print_health_info:491:(pid 0): Health issue observed, firmware internal error, severity(3) ERROR:

                  [613406.889684] mlx5_core 0000:be:00.0: print_health_info:495:(pid 0): assert_var[0] 0x00000000

                  [613406.890047] mlx5_core 0000:be:00.0: print_health_info:495:(pid 0): assert_var[1] 0x00000000

                  [613406.890392] mlx5_core 0000:be:00.0: print_health_info:495:(pid 0): assert_var[2] 0x00000000

                  [613406.890720] mlx5_core 0000:be:00.0: print_health_info:495:(pid 0): assert_var[3] 0x00000000

                  [613406.891010] mlx5_core 0000:be:00.0: print_health_info:495:(pid 0): assert_var[4] 0x00000000

                  [613406.891308] mlx5_core 0000:be:00.0: print_health_info:495:(pid 0): assert_var[5] 0x00000000

                  [613406.891605] mlx5_core 0000:be:00.0: print_health_info:498:(pid 0): assert_exit_ptr 0x20a202c8

                  [613406.891893] mlx5_core 0000:be:00.0: print_health_info:499:(pid 0): assert_callra 0x20a26488

                  [613406.892171] mlx5_core 0000:be:00.0: print_health_info:500:(pid 0): fw_ver 26.35.2000

                  [613406.892438] mlx5_core 0000:be:00.0: print_health_info:502:(pid 0): time 0

                  [613406.892705] mlx5_core 0000:be:00.0: print_health_info:503:(pid 0): hw_id 0x00000216

                  [613406.892999] mlx5_core 0000:be:00.0: print_health_info:504:(pid 0): rfr 0

                  [613406.893296] mlx5_core 0000:be:00.0: print_health_info:505:(pid 0): severity 3 (ERROR)

                  [613406.893602] mlx5_core 0000:be:00.0: print_health_info:506:(pid 0): irisc_index 7

                  [613406.893909] mlx5_core 0000:be:00.0: print_health_info:507:(pid 0): synd 0x1: firmware internal error

                  [613406.894213] mlx5_core 0000:be:00.0: print_health_info:509:(pid 0): ext_synd 0x8a02

                  [613406.894519] mlx5_core 0000:be:00.0: print_health_info:510:(pid 0): raw fw_ver 0x1a2307d0

                  [613407.976530] mlx5_core 0000:be:00.1: poll_health:971:(pid 0): device's health compromised - reached miss count

                  [613407.976887] mlx5_core 0000:be:00.1: print_health_info:491:(pid 0): Health issue observed, firmware internal error, severity(3) ERROR:

                  [613407.977530] mlx5_core 0000:be:00.1: print_health_info:495:(pid 0): assert_var[0] 0x00000000

                  [613407.977875] mlx5_core 0000:be:00.1: print_health_info:495:(pid 0): assert_var[1] 0x00000000

                  [613407.978170] mlx5_core 0000:be:00.1: print_health_info:495:(pid 0): assert_var[2] 0x00000000

                  [613407.978499] mlx5_core 0000:be:00.1: print_health_info:495:(pid 0): assert_var[3] 0x00000000

                  [613407.978816] mlx5_core 0000:be:00.1: print_health_info:495:(pid 0): assert_var[4] 0x00000000

                  [613407.979092] mlx5_core 0000:be:00.1: print_health_info:495:(pid 0): assert_var[5] 0x00000000

                  [613407.979379] mlx5_core 0000:be:00.1: print_health_info:498:(pid 0): assert_exit_ptr 0x20a202c8

                  [613407.979656] mlx5_core 0000:be:00.1: print_health_info:499:(pid 0): assert_callra 0x20a26488

                  [613407.979932] mlx5_core 0000:be:00.1: print_health_info:500:(pid 0): fw_ver 26.35.2000

                  [613407.980196] mlx5_core 0000:be:00.1: print_health_info:502:(pid 0): time 0

                  [613407.980464] mlx5_core 0000:be:00.1: print_health_info:503:(pid 0): hw_id 0x00000216

                  [613407.980729] mlx5_core 0000:be:00.1: print_health_info:504:(pid 0): rfr 0

                  [613407.980995] mlx5_core 0000:be:00.1: print_health_info:505:(pid 0): severity 3 (ERROR)

                  [613407.981273] mlx5_core 0000:be:00.1: print_health_info:506:(pid 0): irisc_index 7

                  [613407.981559] mlx5_core 0000:be:00.1: print_health_info:507:(pid 0): synd 0x1: firmware internal error

                  [613407.981840] mlx5_core 0000:be:00.1: print_health_info:509:(pid 0): ext_synd 0x8a02

                  [613407.982121] mlx5_core 0000:be:00.1: print_health_info:510:(pid 0): raw fw_ver 0x1a2307d0
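The same block of fields repeats for each of the four PCI functions, so it can help to group them per device before comparing. The following is a minimal sketch, not part of the original troubleshooting: the helper names `parse_health` and `decode_raw_fw_ver` and the field whitelist are assumptions made for illustration. The decoding of `raw fw_ver` as one byte for the major version, one byte for the minor version, and 16 bits for the sub-minor version is inferred from this log, where `0x1a2307d0` lines up with `fw_ver 26.35.2000`.

```python
import re
from collections import defaultdict

# Two lines copied from the log above, used as sample input.
SAMPLE = """\
[613405.745840] mlx5_core 0000:2a:00.0: print_health_info:509:(pid 0): ext_synd 0x8a02
[613405.746281] mlx5_core 0000:2a:00.0: print_health_info:510:(pid 0): raw fw_ver 0x1a2307d0
"""

# Matches print_health_info lines and captures the PCI address and the payload.
HEALTH_RE = re.compile(
    r"mlx5_core (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.\d): "
    r"print_health_info:\d+:\(pid \d+\): (?P<body>.+)$"
)

# Fields kept in the summary (an illustrative choice, not an official list).
FIELDS = {"fw_ver", "hw_id", "severity", "irisc_index", "synd", "ext_synd", "raw_fw_ver"}


def parse_health(log_text):
    """Group selected print_health_info fields by PCI address."""
    devices = defaultdict(dict)
    for line in log_text.splitlines():
        match = HEALTH_RE.search(line.strip())
        if not match:
            continue
        body = match.group("body")
        if body.startswith("raw fw_ver "):
            # The only field whose name contains a space.
            key, value = "raw_fw_ver", body[len("raw fw_ver "):]
        else:
            key, _, value = body.partition(" ")
        if key in FIELDS:
            devices[match.group("pci")][key] = value
    return dict(devices)


def decode_raw_fw_ver(raw):
    """Split a raw 32-bit fw_ver into major.minor.subminor (layout inferred from the log)."""
    value = int(raw, 16)
    major = (value >> 24) & 0xFF
    minor = (value >> 16) & 0xFF
    sub = value & 0xFFFF
    return f"{major}.{minor}.{sub}"


if __name__ == "__main__":
    for pci, fields in parse_health(SAMPLE).items():
        print(pci, fields, decode_raw_fw_ver(fields["raw_fw_ver"]))
```

Feeding the full log to `parse_health` makes it easy to confirm that all four functions report the same `synd`, `ext_synd`, and firmware version.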

                   

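About the QuadEn parameter:

QuadEn = 1 means the flash operates in quad (four-line) mode; QuadEn = 0 means the flash operates in dual (two-line) mode.

Quad mode and dual mode are the ways the flash communicates with the SPI flash programmer and with the NIC firmware (FW). Quad mode is faster than dual mode. In some cases, when the FW reads data from the flash while the flash is in dual mode, the limited transfer rate means the flash may not answer the FW's requests in time, which can cause the FW to malfunction.

While the NIC powers on, the FW reads data from the flash. The FW first checks whether the flash supports quad mode; if it does, the FW communicates in quad mode, otherwise it communicates in dual mode.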
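To make the speed difference concrete, here is a small sketch of the mode selection and an idealized transfer-time estimate. It assumes that "four-line" and "two-line" refer to the number of SPI data lines, with one bit moved per data line per clock cycle. The function names, the 50 MHz clock, and the 4096-byte read size are hypothetical values chosen for illustration; command, address, and dummy cycles are ignored.

```python
# Number of data lines per mode (assumption: quad = 4, dual = 2).
DATA_LINES = {"quad": 4, "dual": 2}


def select_flash_mode(flash_supports_quad):
    """Mirror the power-on decision described above: quad if supported, else dual."""
    return "quad" if flash_supports_quad else "dual"


def read_time_us(num_bytes, clock_mhz, mode):
    """Idealized data-phase time in microseconds for reading num_bytes."""
    bits = num_bytes * 8
    cycles = bits / DATA_LINES[mode]
    return cycles / clock_mhz  # 1 MHz equals one cycle per microsecond


if __name__ == "__main__":
    for supports_quad in (True, False):
        mode = select_flash_mode(supports_quad)
        # 4096 bytes at a hypothetical 50 MHz SPI clock.
        print(mode, read_time_us(4096, 50, mode), "us")
```

Under these assumptions a dual-mode read takes twice as long as a quad-mode read at the same clock, which is the kind of margin that could make the flash miss a firmware deadline.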

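Conclusion:

Enable the firmware's QuadEn parameter.

Solution:

The NICs used during testing had not gone through the production FT stage; QuadEn is enabled during the production FT stage.

How to change it:

Following the article 《ip link set down關閉后link燈依然點亮》 ("link LED stays lit after ip link set down"), install the mft tools and modify the firmware parameter QuadEn. The change takes effect after a reboot.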
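The article does not list the exact commands. As a rough sketch, and assuming the MFT `flint` tool's `hw query` and `hw set QuadEn=1` subcommands are the intended way to read and change the parameter, the sequence could be assembled as below. The helper name `quad_en_commands` and the device path are hypothetical; check the MFT documentation for your version and use the device name reported by `mst status` before running anything.

```python
import shlex
from typing import List


def quad_en_commands(mst_device: str) -> List[List[str]]:
    """Build (but do not run) the assumed command sequence for enabling QuadEn."""
    return [
        ["mst", "start"],                                      # load the MFT access modules
        ["flint", "-d", mst_device, "hw", "query"],            # assumed: shows the current QuadEn value
        ["flint", "-d", mst_device, "hw", "set", "QuadEn=1"],  # assumed: enables quad mode
        ["flint", "-d", mst_device, "hw", "query"],            # confirm the new value
    ]


if __name__ == "__main__":
    # Placeholder device path, not taken from the article.
    for argv in quad_en_commands("/dev/mst/EXAMPLE_pciconf0"):
        print(shlex.join(argv))
```

The sketch only prints the commands so they can be reviewed first. As the article notes, the change takes effect after a reboot.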

                   
