How to use an IDS uEye Camera to analyze live video with Darknet YOLO using OpenCV-Python
NOTE: This article is a work in progress. I will update it regularly when I have time.
Introduction
In a recent machine learning project I had to use a uEye camera to analyze experimental data with the Darknet/YOLO neural network. At work, my predecessor had set up Darknet so that it ran without a problem on video files and images, but my task was to implement live analysis. I set about doing this in Python, as it has a great implementation of OpenCV and there is also an open source uEye library (pyueye) available for Python, so it seemed a natural choice. Moreover, I like coding in Python because it's an easygoing language, so there's that too.
To my amazement, there seemed to be very little information available online on how to use a uEye camera for this purpose. Darknet supplies a terminal command to run Darknet on a webcam, which works for most simple cameras, but it doesn't work for uEye cameras because these require more settings to be configured than simple webcams, and those settings aren't configured automatically by the 'standard' command. So a custom code implementation was a must. I also saw a lot of webcam-based code examples that use OpenCV-Python directly to open the camera with cv2.VideoCapture(), which I think is what Darknet also does behind the scenes, and I even found an example where some uEye camera settings could be configured through OpenCV, but other very critical settings could not. This made me rather wary at first, because it seemed there might be no neatly implemented way to couple a uEye camera to Darknet, which would have meant a deep dive into Darknet's source code.
I had to do a lot of digging, but at last I found a solution, and a very simple one too. Contrary to my fears, it turned out everything was indeed already implemented quite well, though it was documented quite terribly every step of the way, if I may let my frustration shine through a little. Darknet already supplies a Python interface in its GitHub repository. There are a few different forks of Darknet, and the interface I liked the most was AlexeyAB's version. It's not really explained how to use it, and there was one major obstacle to overcome concerning the image type to be passed from Python to Darknet, which I will cover later, but I found a satisfying solution to that.
In this blog post I will explain how to control your uEye camera using the pyueye module together with pypyueye, a pyueye wrapper, and I'll show you how to use separate threads for camera input and display so that these processes aren't run in series with the image detection. Finally, I will cover how to hook up Darknet/YOLO to this for live image detection. I will assume that you already have Darknet, pyueye, and OpenCV installed and will not cover any of the installation steps, as they are explained well enough elsewhere. Since pypyueye is an abandoned project, I'll cover its installation and some changes that I made. Note that I have installed OpenCV with GPU support, which is not available in the standard pip opencv-python package. That package does work as well, though with lower performance, so I can't guarantee that you will have the same satisfying results if you're not using a GPU.
Pyueye and pypyueye
pyueye is a "low level" Python interface and has all the tools needed to control a uEye camera. However, it's also tedious to use because of this. Luckily, someone already implemented quite a few ease-of-life functions in the pyueye wrapper pypyueye. This project is by no means perfect and, as mentioned before, it has been abandoned by its only developer and may even be out of date for newer cameras for all I know. Still, it's a good starting point for working with uEye cameras so as not to have to start from scratch. I myself had to make some adjustments to get a satisfying live camera feed, which I will also explain.
Installing pypyueye
Installing pypyueye is very simple and is also described on its GitHub page. Using pip you can install it by simply running:
$ git clone https://github.com/galaunay/pypyueye
$ cd pypyueye
$ pip install .
If you're on Windows or otherwise don't have access to these commands, simply download the repository from GitHub as a zip file, unzip it, open a terminal in the unzipped location, and run
python setup.py install
and it should install.
If you want to check if you installed it correctly simply try an import in Python:
# PYTHON CODE
import pypyueye
As an alternative to installing, you can also simply place the cloned/downloaded repository in your project and import it locally.
Initializing a pypyueye Camera object
pypyueye provides the Camera class, which lets you easily interface with your uEye camera. Simply import Camera from pypyueye and get going! The camera handle id (device_id) is 0 by default, but if you have multiple cameras hooked up to your computer you might have to pass 1, 2, 3 and so on. From there you can access the interface options via the camera object, such as setting the uEye color mode, fps, exposure, etc.
# PYTHON CODE
from pypyueye import Camera
from pyueye import ueye

with Camera(device_id=0, buffer_count=3) as cam:
    cam.set_colormode(ueye.IS_CM_RGB8_PACKED)
    cam.set_exposure_auto(1)  # set automatic exposure
    cam.set_gain_auto(1)      # set automatic gain
    cam.set_fps(5)
You can grab a single frame using the capture_image() method:
# PYTHON CODE
img = cam.capture_image()
The img object is nothing but a simple numpy array which holds light intensity values between 0 and 255 for every pixel in three color channels. The numpy array has a shape of (height, width, channels). We can use OpenCV to start displaying video by simply passing this img object to cv2.imshow() like so:
# PYTHON CODE
import cv2

cv2.imshow("window name", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
The cv2.waitKey(0) function freezes the window until any key is pressed (while the display window is selected). This way, you can display a single frame for an indeterminate amount of time. If you don't call waitKey(), the window displaying the image would close again instantly. For proper clean-up we also need to end with the cv2.destroyAllWindows() command.
But what if we don’t want to display a single frame but instead want to display a live stream? Easy, have a look at the example below:
# PYTHON CODE
wait_ms = 1
while True:
    img = cam.capture_image()
    cv2.imshow("window name", img)
    if cv2.waitKey(wait_ms) != -1:  # waitKey returns -1 while no key is pressed
        break
cv2.destroyAllWindows()
Here we simply open a loop which is permanently set to True so that it runs indefinitely and keeps capturing new images. cv2.imshow() displays the newest image on every pass, and cv2.waitKey(wait_ms) then waits at least wait_ms milliseconds before the image is updated. It returns -1 as long as no key is pressed, so the loop breaks on any key press, after which we destroy all window instances during clean-up. We can also wait for a specific key by substituting the key check with if cv2.waitKey(wait_ms) & 0xFF == ord(YOURKEY): where YOURKEY could for instance be "q". Note that ord() only accepts single characters, so for special keys such as Esc you compare against the key code (27) instead, as shown below. This is the most basic way to use pypyueye with OpenCV to get a live feed of your camera, but we can do better!
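For example, a minimal variant of the loop that exits on either "q" or the Esc key could look like this (my own sketch, not part of pypyueye):
# PYTHON CODE
wait_ms = 1
while True:
    img = cam.capture_image()
    cv2.imshow("window name", img)
    key = cv2.waitKey(wait_ms) & 0xFF
    if key == ord("q") or key == 27:  # 27 is the key code of the Esc key
        break
cv2.destroyAllWindows()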
Some changes for better live-streaming
There's a problem with the capture_image() function implemented in the Camera class which makes getting live images this way significantly slower than necessary. Even though a buffer preparation command (cam.capture_video()) is supplied which initializes one or more image buffers, for some reason no utility function for reading frames directly from this buffer is implemented. The capture_image() function instead creates a new image buffer each time the function is run, as shown in the code below, which is inefficient.
# PYTHON CODE
# Camera.capture_image
class Camera():
    # ... other source code ...
    def capture_image(self, timeout=None):
        if timeout is None:
            timeout = self.__get_timeout()
        self.capture_video()
        img_buffer = ImageBuffer()  # <- THIS IS THE PROBLEM FOR LIVE VIDEO!
        ret = ueye.is_WaitForNextImage(self.handle(),
                                       timeout,
                                       img_buffer.mem_ptr,
                                       img_buffer.mem_id)
        if ret == ueye.IS_SUCCESS:
            imdata = ImageData(self.handle(), img_buffer)
            data = imdata.as_1d_image()
            imdata.unlock()
            self.stop_video()
        else:
            data = None
        return data
    # ... other source code ...
Luckily, it's not hard to implement a live-frame readout function ourselves, as it is practically the same as capture_image(). For this reason I created a new class called LCamera which inherits from Camera with the following changes. You could also implement this in the pypyueye source code, but I didn't want to change anything there. This can no doubt be implemented a little better, and I am thinking about forking pypyueye to add this functionality, but for now this worked for me.
# PYTHON CODE
from pyueye import ueye
from pypyueye import Camera
from pypyueye.utils import ImageData


class LCamera(Camera):
    def __init__(self, device_id=0, buffer_count=3, verbose=False):
        super().__init__(device_id=device_id, buffer_count=buffer_count)
        self.__b_CaptureVideo = False
        self.verbose = verbose

    def start_video_live_capture(self, wait=False):
        """
        Start video capture.
        Wrapper for Camera.capture_video().
        """
        ret = self.capture_video(wait)
        self.__b_CaptureVideo = True
        return ret

    def stop_video_live_capture(self):
        """
        Stop capturing the video.
        Wrapper for Camera.stop_video().
        """
        ret = self.stop_video()
        self.__b_CaptureVideo = False
        return ret

    def capture_live_frame(self, img_buffer=None, timeout=None):
        """
        Grab a frame; only works if start_video_live_capture was called.
        """
        if self.__b_CaptureVideo:
            if timeout is None:
                # name-mangled access, because __get_timeout is private to Camera
                timeout = self._Camera__get_timeout()
            if img_buffer is None:
                img_buffer = self.img_buffers[0]
            ret = ueye.is_WaitForNextImage(self.handle(),
                                           timeout,
                                           img_buffer.mem_ptr,
                                           img_buffer.mem_id)
            if ret == ueye.IS_SUCCESS:
                imdata = ImageData(self.handle(), img_buffer)
                image = imdata.as_1d_image()
                imdata.unlock()
            else:
                if self.verbose:
                    print("WARNING: image could not be received from camera")
                image = None
            return image
        else:
            if self.verbose:
                print("WARNING: live image capture is turned off, returning 'None'")
            return None

    def capture_live_frames(self, n_images, timeout=None):
        """
        Grab multiple frames; only works if start_video_live_capture was called.
        """
        if self.__b_CaptureVideo:
            if n_images > self.buffer_count:
                print(f"WARNING: n_images ({n_images}) > buffer_count ({self.buffer_count}), "
                      f"setting number of returned images to {self.buffer_count}")
                n_images = self.buffer_count
            images = []
            for i in range(n_images):
                buff = self.img_buffers[i]
                images.append(self.capture_live_frame(img_buffer=buff, timeout=timeout))
            return images
This way, the pre-defined image buffer is actually used and can easily be read out, which for me meant a significant speed-up in image capture (from about 400 ms down to the minimum time possible for a given frame rate).
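To illustrate, here is a minimal sketch (my own, assuming the LCamera class from above; the color mode and frame-rate values are just examples) of how the earlier display loop looks when reading from the pre-initialized buffer instead of calling capture_image():
# PYTHON CODE
from pyueye import ueye
import cv2

with LCamera(device_id=0, buffer_count=3) as cam:
    cam.set_colormode(ueye.IS_CM_RGB8_PACKED)
    cam.set_fps(20)
    cam.start_video_live_capture()      # initialize the image buffers once
    while True:
        img = cam.capture_live_frame()  # read straight from the existing buffer
        if img is not None:
            cv2.imshow("window name", img)
        if cv2.waitKey(1) != -1:        # break on any key press
            break
    cam.stop_video_live_capture()
cv2.destroyAllWindows()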
Camera threading for even better performance
Because image capture and image display are executed in series in the current setup, the live video feed updates more slowly than necessary. This is especially true once we add image detection with Darknet which, depending on your GPU, may be even slower if it is run in series with image capture and image display. You won't get around the fact that Darknet needs some time to analyze your images before you can display an image containing your detections, but we can get a small boost by simply running image capture and image display on their own threads. For this I implemented the code below.
# PYTHON CODE
from threading import Thread
import time
import cv2


class VideoCaptureThread:
    """
    Class that continuously gets frames from an LCamera object
    with a dedicated thread.
    """
    def __init__(self, LCameraObj):
        self.cam = LCameraObj  # <- this *must* be an LCamera object!
        self.stopped = False
        self.frame = None
        self.__thread = None

    def start(self):
        self.cam.start_video_live_capture()
        self.stopped = False
        self.__thread = Thread(target=self.get, args=())
        self.__thread.start()

    def get(self):
        while not self.stopped:
            self.frame = self.cam.capture_live_frame()

    def stop(self):
        self.stopped = True
        self.cam.stop_video_live_capture()


class VideoDisplayerThread:
    def __init__(self, frame=None, window_handle="Video Stream", wait=0):
        self.window_handle = window_handle
        self.frame = frame
        self.stopped = False
        self.wait = wait
        self.__thread = None

    def start(self):
        self.__thread = Thread(target=self.show, args=())
        self.__thread.start()
        return self

    def show(self):
        while not self.stopped:
            if self.frame is not None:
                cv2.imshow(self.window_handle, self.frame)
            if self.wait > 0:
                # slow down the video stream...
                time.sleep(self.wait)
            if cv2.waitKey(1) == ord("q"):
                self.stopped = True

    def stop(self):
        self.stopped = True
Now we can get our live images from a separate thread, make our detections, draw the detections on the images, and pass the changed image to the display thread like in the code below. We’ll get to YOLO/Darknet detection in the next section.
# PYTHON CODE
from lib import darknet as dn
from src import utils
from src import Analysis as anal
from src.CamPypyueye import LCamera as Camera
from threading import Thread
from pyueye import ueye
import cv2 as cv
import time

with Camera() as cam:
    cam.start_video_live_capture()
    cam.set_exposure_auto(1)
    cam.set_gain_auto(1)
    cam.set_fps(20)

    # create a video capture thread (VideoCaptureThread and
    # VideoDisplayerThread are the classes defined above)
    liveCam = VideoCaptureThread(cam)
    # start the live feed
    liveCam.start()
    wait_time = 5 / cam.get_fps()  # slow down the image displayer thread
    viewWindow = VideoDisplayerThread(wait=wait_time)
    viewWindow.start()
    while not viewWindow.stopped:
        img = liveCam.frame
        # img_processing ...
        viewWindow.frame = img
    viewWindow.stop()
    liveCam.stop()
Passing live images to YOLO/Darknet
Now that we understand how to efficiently use our uEye camera in Python with OpenCV, we can get to the bread and butter of this article: YOLO/Darknet image detection using our uEye camera. As I mentioned in the introduction, this wasn't as straightforward to implement as I had initially thought, but it's actually not that difficult!
Initializing a YOLO network in OpenCV through the Darknet interface
The Darknet interface allows us to open a pre-trained YOLO model in Python. The model is loaded into the network object with which we can run our detections. Loading a network takes some time, but this only needs to happen once at the beginning of our code. Different implementations of the Python interface (darknet.py) have slightly different ways of initializing a network, but all work essentially the same way. Our pre-trained model is essentially saved in three files:
1. A configuration file
2. A (meta-)data file
3. A weights file
The standard "out of the box" model provided with Darknet can detect people and many everyday objects such as cars and bicycles, but for many applications you will have to train your own model. I will not cover how to train your model here, as it is described well elsewhere. AlexeyAB's Python interface supplies the load_network(config_file, data_file, weights, batch_size=1) function, which initializes our network and returns the network, a class_names list, and a class_colors mapping, the last of which can be used when drawing our detections onto the video. A simple example of this step is listed below.
# PYTHON CODE
import darknet as dn

net, class_names, class_colors = dn.load_network("path/to/config.cfg",
                                                 "path/to/meta.data",
                                                 "path/to/weights.weights")
Be sure to set your paths accordingly!
Converting OpenCV-Python numpy images to Darknet’s IMAGE type for detection
One last obstacle remains: passing our OpenCV ndarray image to the network. For this we have to convert the numpy array into a format the network can read, namely the IMAGE type defined in Darknet's source code. This proved to be a bit more complicated than I had initially thought, as it isn't documented very well and I didn't find any satisfactory solutions online. That is, until I had a look at the source code and also found an issue on GitHub where people were asking exactly this question. It contained a solution by the user TheMikeyR, which was to change Darknet's source code on the C side to convert a numpy array to the right format, something I incidentally noticed was already implemented in the source code in the same way. Here the image is stored in a float pointer (i.e. as a 1D array of floats). I'm sure this solution works, but I didn't really feel like recompiling Darknet just for this purpose, and I felt there must be a way to do it all in Python. Indeed, I found one, and I left a comment explaining how to do it. For the sake of completeness, I'll reiterate the points I made there below.
When testing I realized that the loop below, as found in TheMikeyR's answer, was essentially doing the same thing as could be done in Python in a single line of code using numpy!
/* C/C++ CODE */
for(i = 0; i < h; ++i){
    for(k = 0; k < c; ++k){
        for(j = 0; j < w; ++j){
            index1 = k*w*h + i*w + j;
            index2 = step_h*i + step_w*j + step_c*k;
            //fprintf(stderr, "w=%d h=%d c=%d step_w=%d step_h=%d step_c=%d \n", w, h, c, step_w, step_h, step_c);
            //fprintf(stderr, "im.data[%d]=%u data[%d]=%f \n", index1, src[index2], index2, src[index2]/255.);
            im.data[index1] = src[index2]/255.;
        }
    }
}
I initially implemented the code above exactly in Python. It worked, but it was rather slow. After testing my code a little, I found that the above for loop is simply converting the (h x w x c) ndarray to a one-dimensional array with the data strung together column-wise (the columns as they are visually represented by print(np_image)). For this reason, all you have to do is use np.transpose() and np.flatten(). Since it is not entirely trivial how transpose works on a 3-dimensional matrix/array (and I didn't know how the function was implemented either), I will also give a short explanation of that. What you're doing is permuting the axes from 0, 1, 2 (h, w, c) to 2, 0, 1 (c, h, w), if I understand it correctly. Then flattening the data using np.flatten() gives you exactly the data you need, and all that's left to do is cast this numpy array to a float pointer using ctypes. Of course, using numpy sped it up significantly.
The transpose and flatten step essentially does the same thing as the triple loop in TheMikeyR's code shown above.
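Putting this together, a conversion helper could look like the sketch below. I name it np_image_to_c_IMAGE to match the helper from my utils module that appears in the final script, but the body here is only a minimal sketch of the transpose/flatten/ctypes approach, assuming the IMAGE ctypes structure exposed by AlexeyAB's darknet.py. Note that the numpy buffer must stay alive for as long as Darknet reads from the pointer, and that if your frames are BGR (OpenCV's default) rather than RGB you may want to swap the channels first.
# PYTHON CODE
import ctypes
import numpy as np
import darknet as dn  # AlexeyAB's darknet.py, which defines the IMAGE ctypes structure

def np_image_to_c_IMAGE(np_img):
    """Convert an (h, w, c) uint8 ndarray into Darknet's IMAGE struct."""
    h, w, c = np_img.shape
    # permute the axes from (h, w, c) to (c, h, w), scale to [0, 1] and flatten
    flat = (np_img.transpose(2, 0, 1).astype(np.float32) / 255.0).flatten()
    # cast the float buffer to a C float pointer
    data_p = flat.ctypes.data_as(ctypes.POINTER(ctypes.c_float))
    c_image = dn.IMAGE(w, h, c, data_p)
    # keep a reference to the numpy buffer so it isn't garbage collected
    # while Darknet is still reading from the pointer
    c_image._np_buffer = flat
    return c_image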
Running Darknet detect and OpenCV visualization
The Darknet Python interface by AlexeyAB also has some OpenCV drawing
functions included for ease of use. To detect objects in our frame we
simply pass the network, the class names, and the IMAGE
object we learned how to create in the last section to Darknet with the
help of the
detect_image(network, class_names, image, thresh=.5, hier_thresh=.5, nms=.45)
function, which returns a list of detections. The list of class labels
that can be passed here is the same class_names
that was
returned upon the network initialization. The thresh
,
hier_thresh
and nms
keyword arguments all
concern which detections get discarded, but I won’t go any deeper into
the details of that here.
The returned list contains one entry per detected object, and each detection has the following structure:
detection = [label(str), confidence(float), [x, y, w, h]]
where the label is the object's name (e.g. car, person), the confidence is the confidence of correct classification by the YOLO network, and the coordinates [x, y, w, h] are the x, y coordinates and the box width and height w, h. Note that x, y are the coordinates of the center of the box.
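To make this concrete, here is a small sketch (my own illustration) of how you might call detect_image() and unpack the result; net, class_names, and darknet_img are assumed to come from the previous sections:
# PYTHON CODE
dets = dn.detect_image(net, class_names, darknet_img, thresh=.5)
for label, confidence, (x, y, w, h) in dets:
    print(f"{label}: {confidence} at center ({x:.0f}, {y:.0f}), box {w:.0f}x{h:.0f}")
    # convert the center-based box to corner points, e.g. for cv2.rectangle()
    pt1 = (int(x - w / 2), int(y - h / 2))
    pt2 = (int(x + w / 2), int(y + h / 2))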
The interface furthermore provides the draw_boxes(detections, image, colors) function, which allows us to draw the detection boxes onto the (OpenCV/ndarray) image captured by our camera, provided we pass it the detections. A complete script using everything we have learned so far may now look as follows. Note that I have implemented the LCamera class, the threading classes, and the image conversion function in my own project modules, so the imports below reflect my project structure; you will have to adjust them to your own.
# PYTHON CODE
from lib import darknet as dn
from src import utils
from src import Analysis as anal
from src.CamPypyueye import LCamera as Camera
from threading import Thread
from pyueye import ueye
import cv2 as cv
import time

# load the pre-trained network once at the start
net, class_names, class_colors = dn.load_network(
    utils.P_CONFIG,
    utils.P_META,
    utils.P_WEIGHTS
)
print(net, class_names, class_colors)

with Camera() as cam:
    cam.set_exposure_auto(1)
    cam.set_gain_auto(1)
    cam.set_fps(20)

    # create a video capture thread (VideoCaptureThread and
    # VideoDisplayerThread are the classes defined earlier)
    liveCam = VideoCaptureThread(cam)
    liveCam.start()
    wait_time = 5 / cam.get_fps()  # slow down the image displayer thread
    viewWindow = VideoDisplayerThread(wait=wait_time)
    viewWindow.start()

    start_time = time.time()
    while not viewWindow.stopped:
        img = liveCam.frame
        if img is None:
            # the capture thread may not have produced a frame yet
            continue
        t = time.time() - start_time
        # convert the numpy image to Darknet's IMAGE type
        C_IMAGE = utils.np_image_to_c_IMAGE(img)
        # detect tips
        dets = dn.detect_image(net, class_names, C_IMAGE)
        # draw detections
        img = dn.draw_boxes(dets, img, class_colors)
        # set output frame
        viewWindow.frame = img
    viewWindow.stop()
    liveCam.stop()