Why does importing a module in '__main__' not allow multiprocessing to use the module?
Problem description
I've already solved my problem by moving the import to the top-level declarations, but it left me wondering: why can't I use a module that was imported under '__main__' in functions that are the targets of multiprocessing?
For example:
import os
import multiprocessing as mp

def run(in_file, out_dir, out_q):
    arcpy.RasterToPolygon_conversion(in_file, out_dir, "NO_SIMPLIFY", "Value")
    status = str("Done with " + os.path.basename(in_file))
    out_q.put(status, block=False)

if __name__ == '__main__':
    raw_input("Program may hang, press Enter to import ArcPy...")
    import arcpy
    q = mp.Queue()
    _file = "path/to/file"
    _dir = "path/to/dir"
    # There are actually lots of files in a loop to build
    # processes but I just do one for context here
    p = mp.Process(target=run, args=(_file, _dir, q))
    p.start()
    # I do stuff with Queue below to status user
When you run this in IDLE it doesn't error at all; it just keeps doing a Queue check (which is good, so that's not the problem). The problem is that when you run it in the CMD terminal (either OS or Python), it produces an error saying that arcpy is not defined!
Just a curious topic.
Recommended answer
The situation is different on Unix-like systems and on Windows. On Unix-like systems, multiprocessing uses fork to create child processes that share a copy-on-write view of the parent's memory space. The child sees the parent's imports, including anything the parent imported under if __name__ == "__main__":.
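Which start method is actually in play depends on the platform and the Python version, so it can help to check rather than assume. A minimal sketch using only the standard multiprocessing API; asking for a "spawn" context is one way to reproduce the Windows behaviour on a Unix-like machine:

```python
import multiprocessing as mp

# Start methods this interpreter supports on this platform.
# "spawn" is available everywhere; "fork" only on Unix-like systems.
available = mp.get_all_start_methods()
print("available start methods:", available)
print("fork supported here:", "fork" in available)

# A context pins one start method without changing the global default.
# ctx.Process and ctx.Queue then behave like mp.Process and mp.Queue,
# but always start children the way Windows does.
ctx = mp.get_context("spawn")
print("context start method:", ctx.get_start_method())
```

No processes are started here; the snippet only reports what is available and builds a context you could use in place of mp.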
On Windows there is no fork, so a new process has to be executed. But simply rerunning the parent process doesn't work: it would run the whole program again. Instead, multiprocessing runs its own Python program that imports the parent's main script and then pickles/unpickles a view of the parent's object space that is, hopefully, sufficient for the child process.
That program is the __main__ for the child process, and the __main__ of the parent script doesn't run. The main script is just imported like any other module. The reason is simple: running the parent's __main__ would just run the full parent program again, which mp must avoid. That is also why arcpy, imported only under the guard, is never defined in the child.
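You can see the effect without starting any processes at all. The sketch below writes a tiny script to a temporary file and executes it twice with runpy.run_path: once under the name __main__, and once under another name, which is roughly what the child does when it imports the parent's script. The file name demo_script.py is made up for illustration, json is only a stand-in for arcpy, and __mp_main__ is used as the non-main name.

```python
import os
import runpy
import tempfile

# A stand-in for the parent script: one top-level import, one guarded import.
SCRIPT = '''
import os                      # top level: runs under any module name

if __name__ == "__main__":
    import json                # guarded: runs only when executed as a script
'''


def names_defined(run_name):
    """Execute SCRIPT under the given module name and return its globals."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_script.py")  # hypothetical file name
        with open(path, "w") as f:
            f.write(SCRIPT)
        return runpy.run_path(path, run_name=run_name)


as_script = names_defined("__main__")     # like the parent process
as_child = names_defined("__mp_main__")   # like a child that imports the script

print("json" in as_script)  # True: the guarded import ran
print("json" in as_child)   # False: the guarded import was skipped
print("os" in as_child)     # True: the top-level import still ran
```

Anything the worker function needs therefore has to be defined by code that runs outside the guard.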
Here is a test to show what is going on: a main module called testmp.py and a second module, test2.py, that is imported by the first.
testmp.py
import os
import multiprocessing as mp

print("importing test2")
import test2

def worker():
    print('worker pid: {}, module name: {}, file name: {}'.format(os.getpid(),
        __name__, __file__))

if __name__ == "__main__":
    print('main pid: {}, module name: {}, file name: {}'.format(os.getpid(),
        __name__, __file__))
    print("running process")
    proc = mp.Process(target=worker)
    proc.start()
    proc.join()
test2.py
import os

print('test2 pid: {}, module name: {}, file name: {}'.format(os.getpid(),
    __name__, __file__))
When run on Linux, test2 is imported once and the worker runs in the main module.
importing test2
test2 pid: 17840, module name: test2, file name: /media/td/USB20FD/tmp/test2.py
main pid: 17840, module name: __main__, file name: testmp.py
running process
worker pid: 17841, module name: __main__, file name: testmp.py
Under Windows, notice that "importing test2" is printed twice: testmp.py was run two times. But "main pid" was printed only once, because its __main__ block wasn't run. That's because multiprocessing changed the module name to __mp_main__ during import.
E:\tmp>py testmp.py
importing test2
test2 pid: 7536, module name: test2, file name: E:\tmp\test2.py
main pid: 7536, module name: __main__, file name: testmp.py
running process
importing test2
test2 pid: 7544, module name: test2, file name: E:\tmp\test2.py
worker pid: 7544, module name: __mp_main__, file name: E:\tmp\testmp.py
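Given all that, the fix the question already landed on (a module-level import) works whichever way the child is started, because module-level code runs every time the script is imported. A minimal sketch of that layout, with json standing in for arcpy and make_status being a hypothetical helper added only to keep the example small:

```python
import multiprocessing as mp
import os
import json as heavy_module  # stand-in for arcpy; deliberately at module level


def make_status(in_file):
    # heavy_module is defined here even in a child that only imports this
    # file, because the import above is outside the __main__ guard.
    return heavy_module.dumps({"done": os.path.basename(in_file)})


def run(in_file, out_q):
    out_q.put(make_status(in_file))


if __name__ == "__main__":
    q = mp.Queue()
    p = mp.Process(target=run, args=("path/to/file", q))
    p.start()
    print(q.get(timeout=30))  # read the result before joining the child
    p.join()
```

Importing the module inside run itself would also work, at the cost of repeating the import in every worker function that needs it.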