trio-parallel: CPU parallelism for Trio
Do you have CPU-bound work that just keeps slowing down your Trio event loop no matter what you try? Do you need to get all those cores humming at once? This is the library for you!
The aim of trio-parallel is to use the lightest-weight, lowest-overhead, lowest-latency method to achieve CPU parallelism of arbitrary Python code with a dead-simple API.
Example
import functools
import multiprocessing
import trio
import trio_parallel


def loop(n):
    # Arbitrary CPU-bound work
    for _ in range(n):
        pass
    print("Loops completed:", n)


async def amain():
    t0 = trio.current_time()
    async with trio.open_nursery() as nursery:
        # Do CPU-bound work in parallel
        for i in [6, 7, 8] * 4:
            nursery.start_soon(trio_parallel.run_sync, loop, 10 ** i)
        # Event loop remains responsive
        t1 = trio.current_time()
        await trio.sleep(0)
        print("Scheduling latency:", trio.current_time() - t1)
        # This job could take far too long, make it cancellable!
        nursery.start_soon(
            functools.partial(
                trio_parallel.run_sync, loop, 10 ** 20, kill_on_cancel=True
            )
        )
        await trio.sleep(2)
        # Only explicit kill_on_cancel jobs are terminated
        nursery.cancel_scope.cancel()
    print("Total runtime:", trio.current_time() - t0)


if __name__ == "__main__":
    multiprocessing.freeze_support()
    trio.run(amain)
Additional examples and the full API are available in the documentation.
Features
Bypasses the GIL for CPU-bound work
Minimal API complexity: looks and feels like Trio threads
Minimal internal complexity: no reliance on multiprocessing.Pool, ProcessPoolExecutor, or any background threads
Cross-platform
print just works
Seamless interoperation with tools such as cloudpickle
Automatic LIFO caching of subprocesses
Cancel seriously misbehaving code via SIGKILL/TerminateProcess
Convert segfaults and other scary things to catchable errors
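To illustrate what "LIFO caching" means in the feature list above, here is a minimal conceptual sketch using only the standard library. It is not trio-parallel's internal implementation: the class LifoIdleCache and its method names are hypothetical, and plain strings stand in for worker processes. The idea shown is that the most recently released worker is reused first, so the workers that have been idle longest sit at the other end and can be retired first.

```python
from collections import deque


class LifoIdleCache:
    """Hypothetical illustration only; not trio-parallel's internal cache."""

    def __init__(self):
        self._idle = deque()

    def release(self, worker):
        # A worker that just finished a job goes on top of the stack.
        self._idle.append(worker)

    def acquire(self):
        # Reuse the most recently released worker; None means "spawn a new one".
        return self._idle.pop() if self._idle else None

    def prune_oldest(self):
        # The worker at the bottom has been idle longest, so it is retired first.
        return self._idle.popleft() if self._idle else None


cache = LifoIdleCache()
for name in ["worker-a", "worker-b", "worker-c"]:
    cache.release(name)

print(cache.acquire())       # worker-c (released last, reused first)
print(cache.prune_oldest())  # worker-a (idle the longest)
```

With this ordering, a steady trickle of jobs keeps hitting the same few warm workers while surplus ones age out.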
FAQ
How does trio-parallel run Python code in parallel?
Currently, this project is based on multiprocessing subprocesses and has all the usual multiprocessing caveats (freeze_support, pickleable objects only, executing the __main__ module).
The case for basing these workers on multiprocessing is that it keeps a lot of
complexity outside of the project while offering a set of quirks that users are
likely already familiar with.
The pickling limitations can be partially alleviated by installing cloudpickle.
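To make the "pickleable objects only" caveat concrete, the sketch below uses the standard library's pickle module to check whether a callable could be serialized for a worker. The helper is_picklable is a hypothetical name and not part of trio-parallel. It only approximates what happens when a job is sent to a subprocess.

```python
import functools
import pickle


def is_picklable(obj):
    """Hypothetical helper: True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A partial of a builtin is pickled by reference to the builtin, so it works.
print(is_picklable(functools.partial(pow, 2)))  # True

# A lambda has no importable name, so the standard pickler rejects it.
print(is_picklable(lambda x: x + 1))  # False
```

In practice this means jobs should be module-level functions (optionally wrapped in functools.partial) rather than lambdas or closures, unless a more permissive pickler is installed.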
Can I have my workers talk to each other?
This is currently possible through the use of multiprocessing.Manager, but we don’t and will not officially support it.
This package focuses on providing
a flat hierarchy of worker subprocesses to run synchronous, CPU-bound functions.
If you are looking to create a nested hierarchy of processes communicating
asynchronously with each other, while preserving the power, safety, and convenience of
structured concurrency, look into tractor.
Or, if you are looking for a more customized solution, try using trio.run_process
to spawn additional Trio runs and have them talk to each other over sockets.
Can I let my workers outlive the main Trio process?
No. Trio’s structured concurrency strictly bounds job runs to within a given trio.run call, while cached idle workers are shut down, and killed if necessary, by our atexit handler, so this use case is not supported.
How should I map a function over a collection of arguments?
This is fully possible, but we leave the implementation up to you. Think of us as a loky for your joblib, but natively async and Trionic. We take care of the worker handling so that you can focus on the best concurrency for your application. That said, some example parallelism patterns can be found in the documentation.
Also, consider aiometer.
Contributing
If you notice any bugs, need any help, or want to contribute any code, GitHub issues and pull requests are very welcome! Please read the code of conduct.