7. Performance

video-pipeline is a package designed to take advantage of the Python multiprocessing library to speed up image manipulations and filters applied to a PiCamera video stream in real time. The intent of this approach is to spread image manipulation operations across many processes, parallelizing the work done on each frame. This holds under the assumption that the operations do not depend on the order in which frames are captured, so multiple frames can be processed in parallel and reordered chronologically after the processing step.
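This ordering guarantee is easy to get from the standard library. As a minimal sketch (illustrating the general idiom, not necessarily video-pipeline's internals), multiprocessing.Pool.imap hands frames to idle workers as they arrive but yields results in submission order:

```python
from multiprocessing import Pool

def process_frame(frame):
    # Stand-in for any per-frame operation that does not depend on
    # neighboring frames (e.g. a grayscale or edge filter).
    return frame

if __name__ == "__main__":
    frames = range(100)  # stand-in for a stream of captured frames
    with Pool(processes=4) as pool:
        # Workers process frames concurrently; imap yields results in
        # the original (chronological) order regardless of which
        # worker finishes first.
        for result in pool.imap(process_frame, frames):
            pass  # a real pipeline would send the frame to a client here
```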
Using video-pipeline to process many frames in parallel should yield higher throughput than processing each frame sequentially (capture, process, send, repeat), although the code architecture needed to support multiprocessing is considerably more complex and likely introduces additional overhead compared to serial processing. The true performance gains of using video-pipeline have not yet been quantified.
7.1. Testing approach

With this set of tests, I will quantify the performance of video-pipeline versus serial image processing on a PiCamera video stream. I will also quantify the operational overhead of the pipeline itself compared to no image processing. The primary metric for performance is frames per second (fps) of the video output.
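As a point of reference for how fps might be measured (the actual instrumentation is not shown in this document), a counter that divides frames delivered by elapsed wall-clock time could look like:

```python
import time

class FpsCounter:
    """Counts frames over wall-clock time to report throughput."""

    def __init__(self):
        self.start = time.monotonic()
        self.frames = 0

    def tick(self):
        """Call once per frame delivered to the client."""
        self.frames += 1

    def fps(self):
        elapsed = time.monotonic() - self.start
        return self.frames / elapsed if elapsed > 0 else 0.0
```

Calling tick() once per output frame and fps() at the end of a run gives the throughput for that configuration.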
Isolate image processing performance

I am interested only in the image processing part of video-pipeline's performance, so I will create a "control" script that uses video-pipeline's interfaces for capturing from PiCamera and outputting a video stream to a client. The control script will NOT use video-pipeline tools to handle the captured images, but it WILL apply the same operations to each frame, processing frames directly and in order. In other words, the control script quantifies the performance of the test setup itself and establishes a baseline (see the script in section 7.2).
Quantify performance impacts from overhead

While the main benefit of using video-pipeline is its multiprocessing support, it is possible to run video-pipeline with one process, which effectively forces it to operate on frames in series. This is not a realistic use case for the package, but it provides an opportunity to quantify performance losses from any additional overhead the package introduces versus plain serial processing. Ideally there would be little to no overhead, and a single-process video-pipeline would perform the same as any other script sitting between frame capture and display.
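This intuition can be demonstrated outside video-pipeline entirely: even a single-worker multiprocessing.Pool pays pickling and inter-process transfer costs that a plain function call does not. A minimal sketch, using a no-op stand-in for the filter:

```python
import time
from multiprocessing import Pool

import numpy as np

def noop(frame):
    # Identity operation: stands in for the no-op filter.
    return frame

if __name__ == "__main__":
    frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(100)]

    # Plain serial processing: direct function calls.
    start = time.monotonic()
    for frame in frames:
        noop(frame)
    direct = time.monotonic() - start

    # Serial processing through a single-worker pool.
    with Pool(processes=1) as pool:
        start = time.monotonic()
        for _ in pool.imap(noop, frames):
            pass
        pooled = time.monotonic() - start

    # Every frame is pickled to the worker and back, so the pooled
    # loop is expected to be slower even though both are serial.
    print(f"direct: {direct:.3f}s  single-worker pool: {pooled:.3f}s")
```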
Quantify gains from parallel processing

video-pipeline allows the user to select an arbitrary number of parallel processes to use for image processing. The number of truly concurrent processes is bounded by hardware capabilities, but we can still assess performance gains compared to a single-process baseline; even on hardware with one CPU core, the Python multiprocessing module abstracts this away so we can specify an arbitrary number of processes. For these tests we will compare the performance of video-pipeline with 1, 2, 4, 8, and 16 processes in the pool. The expectation is that performance improves with more than one process, with diminishing returns as the process count increases.
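A sketch of how such a sweep might be run, using a synthetic CPU-bound filter in place of the real operations (illustrative of the method only; busy_filter is not part of video-pipeline):

```python
import time
from multiprocessing import Pool

import numpy as np

def busy_filter(frame):
    # Synthetic CPU-bound stand-in for a per-frame image operation.
    return np.sqrt(frame.astype(np.float32)).astype(np.uint8)

if __name__ == "__main__":
    frames = [np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
              for _ in range(64)]
    for n in (1, 2, 4, 8, 16):
        with Pool(processes=n) as pool:
            start = time.monotonic()
            for _ in pool.imap(busy_filter, frames):
                pass
            elapsed = time.monotonic() - start
        print(f"{n:2d} processes: {len(frames) / elapsed:.1f} fps")
```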
Try various image processing operations

The performance of an image processor is heavily dependent on the operations it must perform on each frame: the more operations, or the more complex the per-pixel computation, the lower the expected throughput (fps). I will therefore subject the serial baseline and video-pipeline to the following operations, each sketched in code below:

- No-op. Output frames exactly match the captured frames. Any performance losses are attributed to overhead.
- Grayscale filter. Convert captured frames (RGB) to single-channel grayscale, then output the grayscale frame as an equivalent 3-channel (RGB) frame. The number of operations is proportional to the number of pixels in the frame. This conversion is built in to PIL.
- Sobel filter. Run Sobel edge detection on the captured frame and output the filtered, grayscale result as an equivalent 3-channel (RGB) frame. The number of operations is roughly 8x the number of pixels in the frame, since a 3x3 kernel is convolved with every pixel. This can be expressed with PIL's convolution filters.
- Color select filter. Convert captured frames (RGB) to HSV color space. Create a binary mask of the pixels that are within the desired HSV bounds. Apply the binary mask to the original frame as a logical AND, then output the result as a 3-channel RGB frame. The per-pixel cost is harder to estimate, but it is likely higher than the Sobel filter's. These methods are built in to OpenCV.
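Hedged sketches of the four operations follow, operating on RGB numpy frames. Note that PIL has no named Sobel filter, so the Sobel step is expressed with ImageFilter.Kernel, and the HSV bounds in color_select are illustrative placeholders rather than values from the test plan:

```python
import cv2
import numpy as np
from PIL import Image, ImageChops, ImageFilter

# Sobel kernels expressed as PIL convolution filters. scale=1 keeps the
# raw response; in 8-bit mode negative values clip to 0, so the result
# approximates the true gradient magnitude.
SOBEL_X = ImageFilter.Kernel((3, 3), (-1, 0, 1, -2, 0, 2, -1, 0, 1), scale=1)
SOBEL_Y = ImageFilter.Kernel((3, 3), (-1, -2, -1, 0, 0, 0, 1, 2, 1), scale=1)

def noop(frame):
    """Pass the captured frame through untouched."""
    return frame

def grayscale(frame):
    """RGB -> single-channel gray -> back to 3-channel RGB, via PIL."""
    return np.asarray(Image.fromarray(frame).convert("L").convert("RGB"))

def sobel(frame):
    """Approximate Sobel edge magnitude via two PIL convolutions."""
    gray = Image.fromarray(frame).convert("L")
    edges = ImageChops.add(gray.filter(SOBEL_X), gray.filter(SOBEL_Y))
    return np.asarray(edges.convert("RGB"))

def color_select(frame, lo=(35, 80, 40), hi=(85, 255, 255)):
    """Keep only pixels inside the HSV bounds, via OpenCV.

    The default bounds roughly select greens and are placeholders,
    not values from the test plan.
    """
    hsv = cv2.cvtColor(frame, cv2.COLOR_RGB2HSV)
    mask = cv2.inRange(hsv, np.array(lo, np.uint8), np.array(hi, np.uint8))
    return cv2.bitwise_and(frame, frame, mask=mask)
```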
7.2. Baseline Script

```python
# TODO@phil: write me
# the source for the baseline serial, single-process image processor will go here
```
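Until the final source lands here, a minimal sketch of what the serial baseline might look like, assuming picamera's buffer-capture path; process() and send() are hypothetical placeholders for the filter under test and the client output step:

```python
import time

import numpy as np
import picamera

def process(frame):
    # Placeholder for the operation under test (no-op, grayscale, ...).
    return frame

def send(frame):
    # Placeholder for streaming the frame out to a client.
    pass

if __name__ == "__main__":
    with picamera.PiCamera(resolution=(640, 480), framerate=30) as camera:
        time.sleep(2)  # let the sensor warm up
        buffer = np.empty((480, 640, 3), dtype=np.uint8)
        frames, start = 0, time.monotonic()
        # capture_continuous with use_video_port=True captures raw RGB
        # frames into the same buffer as fast as the camera allows.
        for _ in camera.capture_continuous(buffer, format="rgb",
                                           use_video_port=True):
            send(process(buffer))  # capture -> process -> send, in order
            frames += 1
            if frames == 300:  # sample a fixed number of frames
                break
        print(f"{frames / (time.monotonic() - start):.1f} fps")
```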
7.3. Test Execution

TODO@phil: explain the scene and how the test is conducted. Use gifs where applicable.
7.4. Test Results

TODO@phil: include plots showing diminishing returns from increasing number of processes.
TODO@phil: include gif of sample video.
Raspberry Pi 2

640x480 | # processes | No-op | Grayscale | Sobel | Color Select |
---|---|---|---|---|---|
Baseline | 1 | ?? fps | ?? fps | ?? fps | ?? fps |
video-pipeline | 1 | ?? fps | ?? fps | ?? fps | ?? fps |
video-pipeline | 2 | ?? fps | ?? fps | ?? fps | ?? fps |
video-pipeline | 4 | ?? fps | ?? fps | ?? fps | ?? fps |
video-pipeline | 8 | ?? fps | ?? fps | ?? fps | ?? fps |
video-pipeline | 16 | ?? fps | ?? fps | ?? fps | ?? fps |
Raspberry Pi 3 B+

640x480 | # processes | No-op | Grayscale | Sobel | Color Select |
---|---|---|---|---|---|
Baseline | 1 | ?? fps | ?? fps | ?? fps | ?? fps |
video-pipeline | 1 | ?? fps | ?? fps | ?? fps | ?? fps |
video-pipeline | 2 | ?? fps | ?? fps | ?? fps | ?? fps |
video-pipeline | 4 | ?? fps | ?? fps | ?? fps | ?? fps |
video-pipeline | 8 | ?? fps | ?? fps | ?? fps | ?? fps |
video-pipeline | 16 | ?? fps | ?? fps | ?? fps | ?? fps |