Python multiprocessing: why are large chunksizes slower?

Problem description

I've been profiling some code using Python's multiprocessing module (the 'job' function just squares the number).

import multiprocessing
import time

def job(x):
    # Square the input number.
    return x * x

if __name__ == "__main__":
    data = range(100000000)
    n = 4
    time1 = time.time()
    processes = multiprocessing.Pool(processes=n)
    results_list = processes.map(func=job, iterable=data, chunksize=10000)
    processes.close()
    time2 = time.time()
    print(time2 - time1)
    print(results_list[0:10])

One thing I found odd is that the optimal chunksize appears to be around 10k elements - this took 16 seconds on my computer. If I increase the chunksize to 100k or 200k, then it slows to 20 seconds.
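To put those chunksizes in perspective, it may help to count how many separate tasks each one produces for the 100,000,000-element input. The sketch below assumes that `map` cuts the iterable into consecutive chunks of at most `chunksize` items and sends each chunk to a worker as one task; `n_chunks` is a hypothetical helper written for this illustration, not part of `multiprocessing`.

```python
N_ITEMS = 100_000_000  # len(data) in the snippet above
N_WORKERS = 4          # n in the snippet above


def n_chunks(n_items, chunksize):
    """Number of chunks needed to cover n_items (integer ceiling division)."""
    return -(-n_items // chunksize)


for chunksize in (100, 10_000, 100_000, 200_000):
    chunks = n_chunks(N_ITEMS, chunksize)
    # Average number of tasks each worker would handle if the load were even.
    per_worker = chunks / N_WORKERS
    print(f"chunksize={chunksize:>7}: {chunks:>9} chunks, ~{per_worker:,.0f} per worker")
```

A chunksize of 100 yields a million tasks, while 200k yields only 500, so the per-task overhead and the size of each message differ by several orders of magnitude across the settings tried.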

Could this difference be due to the amount of time required for pickling being longer for longer lists? A chunksize of 100 elements takes 62 seconds which I'm assuming is due to the extra time required to pass the chunks back and forth between different processes.
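One rough way to test the pickling hypothesis in isolation is to time a pickle round trip for sequences of different lengths and compare the cost per element. This is only a sketch under assumptions: it approximates a chunk as a tuple of consecutive integers, it ignores the cost of moving the bytes between processes, and `pickle_seconds_per_element` is a hypothetical helper, not a `multiprocessing` API.

```python
import pickle
import time


def pickle_seconds_per_element(chunk_len, repeats=5):
    """Best-of-`repeats` time to pickle and unpickle a tuple of
    `chunk_len` integers, divided by `chunk_len`."""
    chunk = tuple(range(chunk_len))  # assumed stand-in for one chunk
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        restored = pickle.loads(pickle.dumps(chunk))
        best = min(best, time.perf_counter() - start)
    assert restored == chunk  # sanity check: the round trip is lossless
    return best / chunk_len


if __name__ == "__main__":
    for chunk_len in (100, 10_000, 100_000, 200_000):
        per_element = pickle_seconds_per_element(chunk_len)
        print(f"{chunk_len:>7} elements: {per_element * 1e9:.1f} ns per element")
```

If the per-element cost stays roughly flat as the length grows, pickling alone would not account for the slowdown at 100k or 200k, and other factors (for example how evenly the chunks spread over the workers) would be worth measuring next.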

Recommended answer