While containerizing a fastreid service, I used PyTorch's DataLoader class in the data-loading code.
While data was being read, the following errors occurred:
Traceback (most recent call last):
  File "/dl/python/python/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/dl/python/python/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "/dl/python/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 321, in reduce_storage
    fd, size = storage.share_fd()
RuntimeError: unable to write to file </torch_108661_22634063>
Traceback (most recent call last):
  File "/dl/python/python/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/dl/python/python/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "/dl/python/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 321, in reduce_storage
    fd, size = storage.share_fd()
RuntimeError: unable to write to file </torch_108661_1392190158>
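Cause:
The error occurs inside the container, most likely because the container's shared memory is set up differently from the host's. DataLoader worker processes hand tensors to the main process through shared memory (the storage.share_fd() call in the traceback), and Docker gives a container its own /dev/shm, 64 MB by default, so the write fails once that space runs out and the data cannot be transferred.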
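To check whether shared memory really is the bottleneck, it can help to look at how much space the shared-memory mount has inside the container. The following is a minimal sketch, assuming the mount is at /dev/shm (the usual location on Linux); the helper name shm_usage is hypothetical and not part of PyTorch or fastreid.

```python
import shutil


def shm_usage(path="/dev/shm"):
    """Return total and free space of the given mount point in MiB.

    The default path assumes a Linux container where shared memory
    is mounted at /dev/shm.
    """
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return {"total_mib": usage.total // mib, "free_mib": usage.free // mib}


if __name__ == "__main__":
    try:
        print(shm_usage())
    except OSError:
        # Raised when the path does not exist (e.g. on non-Linux systems).
        print("/dev/shm is not available on this system")
```

Running this inside the container and on the host should show the difference: a total of about 64 MiB in the container would be consistent with the default Docker limit described above.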
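Solution:
Start the container with the --ipc=host option so that the container shares the host's inter-process communication (IPC) resources, including shared memory. With this option the error no longer appears.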
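As an illustration of the launch command, the sketch below assembles a docker run invocation with --ipc=host. The helper docker_run_command and the image name fastreid-service:latest are hypothetical placeholders. The --shm-size branch is an assumption beyond the original note: it is another Docker option that enlarges the container's /dev/shm without sharing the host IPC namespace, and it was not tested here.

```python
def docker_run_command(image, ipc_host=True, shm_size=None):
    """Build the argument list for `docker run`.

    ipc_host=True adds --ipc=host (the fix described above).
    Otherwise, if shm_size is given (e.g. "8g"), --shm-size is added instead.
    """
    cmd = ["docker", "run"]
    if ipc_host:
        cmd.append("--ipc=host")
    elif shm_size is not None:
        cmd.append(f"--shm-size={shm_size}")
    cmd.append(image)  # hypothetical image name supplied by the caller
    return cmd


if __name__ == "__main__":
    # Print the command rather than executing it.
    print(" ".join(docker_run_command("fastreid-service:latest")))
```

The list can be passed to subprocess.run as-is. Note that --ipc=host removes IPC isolation between the container and the host, so --shm-size may be preferable where that isolation matters.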