@danhey
Created January 25, 2023 23:37
{
"cells": [
{
"cell_type": "markdown",
"id": "6aff8511-b94c-476e-85e6-06eec5057638",
"metadata": {},
"source": [
"For this notebook I'm using `multiprocess` instead of `multiprocessing` (the standard-library package). The code works with either, but `multiprocess` serializes functions with `dill`, which tends to be more reliable for functions defined inside a notebook. You can `pip install multiprocess` or change the import to `from multiprocessing import Pool`."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "22ea3160",
"metadata": {},
"outputs": [],
"source": [
"import tqdm\n",
"from multiprocess import Pool"
]
},
{
"cell_type": "markdown",
"id": "61d43ff7-7d6f-4232-b5c0-9f3092a6dcc9",
"metadata": {},
"source": [
"Here's a simple example first. We have a list of inputs that we want to iterate over, applying an operation to each one. Adding a progress bar to this is straightforward."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a28db714-8e3d-4236-9597-1dbfb7ae8bbf",
"metadata": {},
"outputs": [],
"source": [
"def some_function(x):\n",
"    return x ** 2\n",
"\n",
"# The things we want to pass into our function\n",
"inputs = list(range(0, 100, 1))"
]
},
{
"cell_type": "markdown",
"id": "3054ed19-8f71-4a47-9485-60b3ad9653e9",
"metadata": {},
"source": [
"To add the progress bar, we wrap the iterable in the for loop with `tqdm.tqdm`, like so:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "22f08d2f-f141-474e-860b-056b2a617906",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 675411.27it/s]\n"
]
}
],
"source": [
"for inp in tqdm.tqdm(inputs):\n",
"    some_function(inp)"
]
},
{
"cell_type": "markdown",
"id": "c97fdfcb-8aed-4a04-9cb3-b0a02d9d2bbf",
"metadata": {},
"source": [
"If we want to multiprocess something it becomes more complicated, but not infeasible. Let's look at a case of multiprocessing without a progress bar first."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "556142e0-040e-481a-8f9e-4084d921e363",
"metadata": {},
"outputs": [],
"source": [
"with Pool(4) as p:\n",
"    r = p.imap(some_function, inputs)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "50310866-8d3a-4dd8-8250-3b0f320df4fd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<multiprocess.pool.IMapIterator at 0x1311d1100>"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"r"
]
},
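{
"cell_type": "markdown",
"id": "3297a4fa-d70b-4b55-bbda-5be879bb0e6b",
"metadata": {},
"source": [
"By using `.imap` instead of `.map`, we get back an iterator instead of a list. The results have not been collected yet; they are yielded one at a time, in input order, as we iterate. Since it's an iterator, we can wrap it in a loop to retrieve the results, as long as we do so inside the `with` block, because the pool is terminated when the block exits."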
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "2e4f187f-ace6-4b9c-bce3-d7fca973174b",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 76566.34it/s]\n"
]
}
],
"source": [
"# Iterate over the results from imap as they become available\n",
"r = []\n",
"with Pool(4) as p:\n",
"    for result in tqdm.tqdm(p.imap(some_function, inputs), total=len(inputs)):\n",
"        r.append(result)"
]
},
{
"cell_type": "markdown",
"id": "b7fc2a72-a4dc-43a5-a151-fe9d4bd2611f",
"metadata": {},
"source": [
"The only difference in the `tqdm` wrapping now is that we must specify how many items we're iterating over via `total`. `tqdm` can't figure that out by itself here, because the iterator returned by `imap` has no length."
]
},
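{
"cell_type": "markdown",
"id": "added-sketch-imap-no-len-md",
"metadata": {},
"source": [
"As a quick check of the point above (a minimal sketch, not part of the original notebook): the object returned by `imap` should not define `__len__`, which is why `tqdm` needs `total`. The names `it` and `squares` are only illustrative."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-sketch-imap-no-len-code",
"metadata": {},
"outputs": [],
"source": [
"with Pool(4) as p:\n",
"    it = p.imap(some_function, inputs)\n",
"    # Check whether the imap iterator exposes a length that tqdm could use\n",
"    print(hasattr(it, '__len__'))\n",
"    # Consume the iterator while the pool is still alive\n",
"    squares = list(it)"
]
},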
{
"cell_type": "markdown",
"id": "dabd7651-a201-4ad9-b546-88fb24b15166",
"metadata": {},
"source": [
"The snippet below is how I normally do it. It does exactly the same as above, just more compactly."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "4102f63f",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 42065.03it/s]\n"
]
}
],
"source": [
"with Pool(4) as p:\n",
"    r = list(tqdm.tqdm(p.imap(some_function, inputs), total=len(inputs)))"
]
},
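{
"cell_type": "markdown",
"id": "added-sketch-parallel-map-md",
"metadata": {},
"source": [
"If this pattern comes up often, it could be wrapped in a small helper. The cell below is one possible sketch, assuming the imports and `some_function` from earlier cells; `parallel_map` and its `processes` parameter are hypothetical names, not part of `tqdm` or `multiprocess`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-sketch-parallel-map-code",
"metadata": {},
"outputs": [],
"source": [
"def parallel_map(func, items, processes=4):\n",
"    # Hypothetical helper: apply func to every item in a pool, with a progress bar\n",
"    items = list(items)  # materialise so len() is available for total\n",
"    with Pool(processes) as p:\n",
"        # Collect results inside the with block, before the pool is terminated\n",
"        return list(tqdm.tqdm(p.imap(func, items), total=len(items)))\n",
"\n",
"r = parallel_map(some_function, inputs)"
]
},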
{
"cell_type": "code",
"execution_count": null,
"id": "279f42e2",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}