GNU Parallel is incredibly useful for running the same command many times in parallel. However, I spent an unreasonable amount of time trying to find out how to use it with commands that take an arbitrary number of arguments.
The manual and online discussions explain how to pipe the output of a command into another while splitting it into chunks. If you just want to read multiple lines and feed them as space-separated arguments to a command, you can simply use the -N argument. -N max-args reads max-args arguments (or fewer) and inserts them on the command line.
Here is an example:
❯ seq 3
1
2
3
seq produces one integer on each line of its standard output. Now, we can pipe this output into echo, in groups of 2:
❯ seq 5 | parallel -N 2 echo
1 2
3 4
5
Note that the last line has only one argument, because the input ran out! We can also use a different number of arguments:
❯ seq 6 | parallel -N 3 echo
1 2 3
4 5 6
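When the position of the arguments matters, you can use positional placeholders like {1}, and -N makes the later ones ({2}, {3}, and so on) available as well. With a single argument per job, {1} is simply that argument: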
❯ seq 4 | parallel echo "Hello {1}! "
Hello 1!
Hello 2!
Hello 3!
Hello 4!
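To see the placeholders combined with -N, here is a small sketch that reads two arguments per job and swaps them on the command line. It assumes GNU Parallel is installed. The -k flag is only there to keep the output in input order, and the out variable is just for illustration:

```shell
# Read two arguments per job; {1} is the first one and {2} the second.
# -k keeps the output in the same order as the input.
out=$(seq 4 | parallel -k -N 2 echo 'Hello {2} and {1}!')
printf '%s\n' "$out"
```

This should print "Hello 2 and 1!" followed by "Hello 4 and 3!". The single quotes keep an interactive shell from interpreting the exclamation mark.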
Finally, here is an example use-case where I scrub metadata in parallel:
fd JPEG | parallel -N50 -j15 exiftool -All= -overwrite_original
I get the list of images with fd and feed it to exiftool in chunks of 50 files, with up to 15 instances running at the same time.
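Since -overwrite_original modifies files in place, it can be worth previewing the chunking first. The sketch below uses GNU Parallel's --dry-run option, which prints each command line instead of running it. The file names are made up for the example, and the chunk size is reduced to 2 so the grouping is visible:

```shell
# Hypothetical file names standing in for the output of fd.
# --dry-run prints the commands without executing them, so no file is touched.
cmds=$(printf '%s\n' a.jpeg b.jpeg c.jpeg |
  parallel -k --dry-run -N 2 exiftool -All= -overwrite_original)
printf '%s\n' "$cmds"
```

With three files and -N 2, this should show two exiftool invocations: one receiving a.jpeg and b.jpeg, and one receiving only c.jpeg. Once the grouping looks right, dropping --dry-run runs the commands for real.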