Created
January 30, 2023 12:51
#!/bin/bash

# Extract one .tar.bz2 archive into a directory named after the archive.
extract_tar() {
    local file="$1"
    local folder
    folder="$(basename "$file" .tar.bz2)"
    # -p: do not fail if the directory is left over from an earlier run
    mkdir -p "$folder"
    tar -xjf "$file" -C "$folder"
}

# get the number of CPUs (Linux only: reads /proc/cpuinfo)
num_cpu=$(grep -c ^processor /proc/cpuinfo)

# create a semaphore to limit the number of parallel processes:
# a FIFO opened read-write on fd 3 and preloaded with one token (line) per CPU
semaphore=$(mktemp -u)
mkfifo "$semaphore"
exec 3<>"$semaphore"
for ((i=0;i<num_cpu;i++)); do
    echo >&3
done

# loop through all .tar.bz2 files in the current directory
# (nullglob: skip the loop entirely when nothing matches)
shopt -s nullglob
for file in *.tar.bz2; do
    # take a token; blocks while num_cpu extractions are already running
    read -r -u 3
    (
        extract_tar "$file"
        # give the token back so the next extraction can start
        echo >&3
    ) &
done

# wait for all background jobs to finish
wait

# clean up the semaphore
exec 3>&-
rm "$semaphore"
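The token-passing idea in the script can also be pulled out into a reusable function. The following is only a sketch of that pattern, not part of the original gist: `run_limited` is a hypothetical name, and it assumes bash on a system where a FIFO can be opened read-write (as on Linux). It creates the FIFO inside a private `mktemp -d` directory rather than with `mktemp -u`, and unlinks it right after opening, since the open file descriptor keeps the pipe alive.

```shell
#!/bin/bash

# Hypothetical helper (not from the gist):
#   run_limited MAX CMD ARG...
# runs CMD once per ARG, with at most MAX copies running at the same time.
run_limited() {
    local max="$1" cmd="$2"
    shift 2
    local dir i arg

    # Private directory for the FIFO, so no other process can grab the name.
    dir="$(mktemp -d)"
    mkfifo "$dir/tokens"
    exec 3<>"$dir/tokens"
    rm -r "$dir"               # fd 3 keeps the pipe usable after unlinking

    # Preload one token (an empty line) per allowed job.
    for ((i = 0; i < max; i++)); do
        echo >&3
    done

    for arg in "$@"; do
        read -r -u 3           # take a token; blocks while MAX jobs are running
        (
            "$cmd" "$arg"
            echo >&3           # return the token, even if CMD failed
        ) &
    done

    wait                       # let the remaining jobs finish
    exec 3>&-                  # close the semaphore
}
```

With the script's `extract_tar` function defined, a call such as `run_limited 4 extract_tar *.tar.bz2` would behave much like the main loop above, with the job limit passed in explicitly.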
I really like how you implemented the max thread count. That's clever.
BTW, if you have GNU parallel installed, I think this gives basically the same results as your script:
Both the for loop that makes the directories and the extract command should work on any `*.tar*` archives, compressed or not, but will not work on archives named like `.tgz` instead of `.tar.gz`. I use bsdtar because it does single-threaded decompression (GNU tar may use multiple threads for `xz`, `zstd`, and other compressors, which could get excessive as we're already running one process per CPU core).
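For illustration, here is one way the approach described in this comment might look. It is a sketch under assumptions, not the commenter's original commands: `archive_dir_name` and `extract_all` are hypothetical names, GNU parallel and bsdtar are assumed to be installed, and the archives are assumed to sit in the current directory. The directory name is derived by cutting the file name at the first `.tar`, which is also why a `.tgz` name is left unchanged and would not work.

```shell
#!/bin/bash

# Hypothetical helper: derive the target directory from an archive name by
# removing the first ".tar" and everything after it.
#   a.tar.bz2 -> a      data.tar -> data      old.tgz -> old.tgz (no match)
archive_dir_name() {
    printf '%s\n' "${1%%.tar*}"
}

# Hypothetical wrapper; assumes GNU parallel and bsdtar are installed.
#   extract_all ARCHIVE...
extract_all() {
    (($#)) || return 0         # nothing to do without arguments

    # Create every target directory up front.
    local f
    for f in "$@"; do
        mkdir -p "$(archive_dir_name "$f")"
    done

    # GNU parallel runs one job per CPU core by default. The {= ... =}
    # replacement string applies the same ".tar and after" cut in Perl.
    # bsdtar detects the compression format on its own.
    parallel 'bsdtar -xf {} -C {= s/\.tar.*// =}' ::: "$@"
}
```

It could then be invoked as `extract_all *.tar*` from the directory that holds the archives.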