Created
January 25, 2023 23:37
| { | |
| "cells": [ | |
| { | |
| "cell_type": "markdown", | |
| "id": "6aff8511-b94c-476e-85e6-06eec5057638", | |
| "metadata": {}, | |
| "source": [ | |
| "For this notebook I'm using `multiprocess` instead of `multiprocessing` (the standard-library module). The code works with either, but `multiprocess` serializes with `dill`, so it tends to be more reliable for functions defined inside a notebook. You can `pip install multiprocess` or change the import." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 1, | |
| "id": "22ea3160", | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "import tqdm\n", | |
| "from multiprocess import Pool" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "61d43ff7-7d6f-4232-b5c0-9f3092a6dcc9", | |
| "metadata": {}, | |
| "source": [ | |
| "Here's a simple example first. We have something we want to iterate over and perform an operation on. Adding a progress bar around this is straightforward." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 2, | |
| "id": "a28db714-8e3d-4236-9597-1dbfb7ae8bbf", | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "def some_function(x):\n", | |
| "    return x ** 2\n", | |
| "\n", | |
| "# The things we want to pass into our function\n", | |
| "inputs = list(range(0, 100, 1))" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "3054ed19-8f71-4a47-9485-60b3ad9653e9", | |
| "metadata": {}, | |
| "source": [ | |
| "To add the progress bar, we wrap the `tqdm` function around the iterable in the for loop, like so:" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 3, | |
| "id": "22f08d2f-f141-474e-860b-056b2a617906", | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stderr", | |
| "output_type": "stream", | |
| "text": [ | |
| "100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 675411.27it/s]\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "for inp in tqdm.tqdm(inputs):\n", | |
| "    some_function(inp)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "c97fdfcb-8aed-4a04-9cb3-b0a02d9d2bbf", | |
| "metadata": {}, | |
| "source": [ | |
| "If we want to multiprocess something it becomes more complicated, but not infeasible. Let's look at a case of multiprocessing without a progress bar first." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 4, | |
| "id": "556142e0-040e-481a-8f9e-4084d921e363", | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "with Pool(4) as p:\n", | |
| "    r = p.imap(some_function, inputs)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 5, | |
| "id": "50310866-8d3a-4dd8-8250-3b0f320df4fd", | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/plain": [ | |
| "<multiprocess.pool.IMapIterator at 0x1311d1100>" | |
| ] | |
| }, | |
| "execution_count": 5, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "r" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "3297a4fa-d70b-4b55-bbda-5be879bb0e6b", | |
| "metadata": {}, | |
| "source": [ | |
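| "By using `.imap` instead of `.map`, we get back an iterator rather than a finished list, so no results have been collected yet. Since it's an iterator, we can loop over it to yield results as they become available. The loop has to run inside the `with` block, because the pool is shut down on exit." | |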
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 6, | |
| "id": "2e4f187f-ace6-4b9c-bce3-d7fca973174b", | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stderr", | |
| "output_type": "stream", | |
| "text": [ | |
| "100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 76566.34it/s]\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "# iterate over the results from imap\n", | |
| "r = []\n", | |
| "with Pool(4) as p:\n", | |
| "    for result in tqdm.tqdm(p.imap(some_function, inputs), total=len(inputs)):\n", | |
| "        r.append(result)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "b7fc2a72-a4dc-43a5-a151-fe9d4bd2611f", | |
| "metadata": {}, | |
| "source": [ | |
| "The only difference in the `tqdm` wrapping now is that we must pass `total`, the number of items we're iterating over. `tqdm` can't work that out by itself because the iterator returned by `.imap` has no length." | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "dabd7651-a201-4ad9-b546-88fb24b15166", | |
| "metadata": {}, | |
| "source": [ | |
| "The snippet below is how I normally do it. It does exactly the same as above, just more compactly." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 7, | |
| "id": "4102f63f", | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stderr", | |
| "output_type": "stream", | |
| "text": [ | |
| "100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 42065.03it/s]\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "with Pool(4) as p:\n", | |
| "    r = list(tqdm.tqdm(p.imap(some_function, inputs), total=len(inputs)))" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "id": "279f42e2", | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [] | |
| } | |
| ], | |
| "metadata": { | |
| "kernelspec": { | |
| "display_name": "Python 3 (ipykernel)", | |
| "language": "python", | |
| "name": "python3" | |
| }, | |
| "language_info": { | |
| "codemirror_mode": { | |
| "name": "ipython", | |
| "version": 3 | |
| }, | |
| "file_extension": ".py", | |
| "mimetype": "text/x-python", | |
| "name": "python", | |
| "nbconvert_exporter": "python", | |
| "pygments_lexer": "ipython3", | |
| "version": "3.9.12" | |
| } | |
| }, | |
| "nbformat": 4, | |
| "nbformat_minor": 5 | |
| } |
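
The compact pattern from the last code cell could be wrapped into a small reusable helper. This is only a minimal sketch: the name `imap_with_progress` is hypothetical (it is not part of `tqdm` or `multiprocess`), and the two import fallbacks are my own assumption so that the script still runs when `multiprocess` or `tqdm` is not installed.

```python
# Sketch only: `imap_with_progress` is a hypothetical helper name.
try:
    from multiprocess import Pool  # the package used in this notebook
except ImportError:
    from multiprocessing import Pool  # standard-library fallback

try:
    from tqdm import tqdm
except ImportError:
    # Assumption for portability: without tqdm, run with no progress bar.
    def tqdm(iterable, total=None):
        return iterable


def some_function(x):
    return x ** 2


def imap_with_progress(pool, func, items):
    """Apply func to items via pool.imap, showing a progress bar."""
    items = list(items)  # materialise so len() is available for `total`
    # imap yields results in input order as they become available;
    # list() drains the iterator while the pool is still alive.
    return list(tqdm(pool.imap(func, items), total=len(items)))


if __name__ == "__main__":
    inputs = list(range(0, 100, 1))
    with Pool(4) as p:
        r = imap_with_progress(p, some_function, inputs)
    print(r[:5])  # [0, 1, 4, 9, 16]
```

The helper takes the pool as an argument rather than creating one, so the caller keeps control of the `with Pool(...)` block and the number of workers. The `if __name__ == "__main__":` guard only matters when this runs as a script with the standard-library `multiprocessing`; in a notebook the body can go straight into a cell.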