MOVING OBJECT DETECTION FROM VIDEO CAPTURED
BY MOVING SURVEILLANCE CAMERA
Hoa Tat Thang1, Tran Binh Minh2,*, Doan Van Hoa2, Nguyen Van Trung3
Abstract: This paper presents an effective method for the detection of multiple moving objects from a video sequence captured by a moving surveillance camera. Moving object detection from a moving camera is difficult since camera motion and object motion are mixed. In the proposed method, we create a panoramic picture from the moving camera. After that, for each frame captured from this camera, we use the template matching method to find its location in the panoramic picture. Finally, using the image differencing method, we find the moving objects. Experimental results have shown that the proposed method has good performance, with a true detection rate of more than 80% on average.
Keywords: Moving object detection; Moving camera; Object tracking; Panoramic image; Image difference.
1. INTRODUCTION
Motion analysis in video has wide applications in many fields of life, such as moving object
detection [1], object tracking [2], motion segmentation [3], and event detection [4]. Many real
applications are based on videos taken either by static or moving cameras, such as in video
surveillance of human activities, visual observation of animals, home care, optical motion
capture, and multimedia applications [5]. These applications often require a moving object
detection step followed by tracking and recognition steps. Moving object detection is thus among the most investigated topics in computer vision. At first, methods were developed for static cameras, but in the last two decades, approaches with moving cameras have attracted much interest, as they present more challenging situations to handle.
There are three main categories of approaches to detect moving objects from static cameras:
consecutive frame difference, background subtraction, and optical flow [5]. The most commonly
used method is background subtraction. In this method, a background image that does not include any moving objects is first created. Moving objects are then detected by subtracting this background image from each incoming frame. The method requires two conditions: constant illumination and a static background. Its advantages are low computational cost and easy implementation,
but it cannot be applied to moving cameras.
Moving object detection from a moving camera is difficult since camera motion and object
motion are mixed. There are eight different categories of approaches to detect moving objects in video captured by moving cameras: panoramic background subtraction, dual cameras, motion compensation, subspace segmentation, motion segmentation, plane+parallax, multiple planes, and splitting the image into blocks. Panoramic background subtraction is one of the most effective approaches for detecting moving objects in video captured by moving cameras [5]. In this approach, the images captured by a moving camera are stitched together to form a larger image, the so-called panoramic picture. This panoramic picture can be used to model the background and detect moving objects as for a static camera.
In this study, we propose a method to detect multiple moving objects from a video captured
by a moving surveillance camera. This method belongs to the panoramic background subtraction
approach to detect moving objects in video captured by moving cameras. It is suitable for real-time detection and works well for fast-moving objects. A panoramic background picture is first created from the frames of the moving camera. After that, for each captured frame, a template matching method is used to find its location in the panoramic picture, and the image differencing method is used to find the moving objects.
The rest of this paper is organized as follows: Related work and methodology are described in
Section 2 and Section 3, respectively. Section 4 and Section 5 describe the experiments, results, and discussion. Finally, the conclusions are given in Section 6.
2. RELATED WORK
D. Avola et al. [6] proposed a keypoint-based method for background modeling and foreground detection. In particular, the method uses spatio-temporal tracking of sets of keypoints to distinguish the background from the foreground. It analyses these sets with a grid strategy to
estimate both camera movements and scale changes. The same sets are also used to construct a
panoramic background model and to delete the possible initial foreground elements from it.
Choi et al. [7] proposed a solution using infrared cameras. This method can track moving objects in all three dimensions. Its advantage is the ability to accurately determine the moving object and its distance from the cameras. Its disadvantage, however, is that it relies on depth information, which is not available in common cameras but only in some specialized ones.
Jodoin et al. [8] proposed a robust moving object detection method for video sequences captured by both fixed and moving cameras, which uses a Markov random field (MRF) for label field fusion.
Wang [9] proposed a joint random field (JRF) model for moving vehicle detection in video sequences under different weather conditions. The model, however, is limited because it applies only to grayscale images.
Hu et al. [10] proposed a method to detect moving objects for videos captured by a moving
camera. Feature points in frames are classified into the foreground and background with the
assistance of multiple-view geometry. The image difference is computed with the aid of an affine transformation based on background feature points. Moving object contours are then obtained by consolidating the foreground areas and the image differences. Finally, the movement history of contiguous frames and refinement schemes are used to detect moving objects. This method gives relatively good results in detecting moving objects, but it cannot show the moving objects in a whole panoramic picture.
3. METHODOLOGY
This study proposes a method for moving object detection from video captured by a moving camera, without additional sensors. In the proposed method, a panoramic picture from the moving camera is created. After that, for each frame captured from this camera, template matching is used to find its location in the panoramic picture, and the image differencing method is used to find the moving objects.
The method involves the following steps, as shown in figure 1:
1. Create a panoramic picture from the moving camera;
2. Find the location of the current frame in the panoramic picture;
3. Find moving objects by the image differencing method and mask these objects.
Figure 1. Steps of the proposed method.
3.1. Create a panoramic picture from a moving camera
To construct the panoramic picture, we utilize computer vision and image processing
techniques such as keypoint detection and local invariant descriptors; keypoint matching;
RANSAC; and perspective warping.
Our panoramic stitching algorithm consists of the following steps:
1. From the first images of the video, select a number of images such that any two adjacent images satisfy the overlap conditions required for stitching. The first image is the leftmost image, and the last image is the rightmost image;
2. Apply image stitching and panoramic construction to the first and the second images;
3. Replace the first and the second images with the panoramic image created in step 2;
4. Repeat step 2 until the last (rightmost) image is reached.
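A minimal sketch of this pairwise stitching loop is given below, assuming OpenCV with SIFT keypoints and local invariant descriptors, brute-force matching with Lowe's ratio test, RANSAC homography estimation, and perspective warping. The helper names (stitch_pair, build_panorama) and parameters (e.g. the 0.75 ratio, the 4.0 RANSAC threshold) are illustrative assumptions, not the authors' exact implementation.

```python
import cv2
import numpy as np

def stitch_pair(left, right, ratio=0.75, ransac_thresh=4.0):
    """Stitch `right` onto `left` and return the combined image."""
    gray_l = cv2.cvtColor(left, cv2.COLOR_BGR2GRAY)
    gray_r = cv2.cvtColor(right, cv2.COLOR_BGR2GRAY)

    sift = cv2.SIFT_create()                      # keypoints and local invariant descriptors
    kp_l, des_l = sift.detectAndCompute(gray_l, None)
    kp_r, des_r = sift.detectAndCompute(gray_r, None)

    # Keypoint matching with Lowe's ratio test.
    matches = cv2.BFMatcher().knnMatch(des_r, des_l, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    if len(good) < 4:
        raise RuntimeError("adjacent images do not overlap enough for stitching")

    # RANSAC homography mapping the right image into the left image's frame.
    src = np.float32([kp_r[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_l[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, ransac_thresh)

    # Perspective warping onto a canvas wide enough for both images.
    h, w = left.shape[:2]
    pano = cv2.warpPerspective(right, H, (w + right.shape[1], h))
    pano[0:h, 0:w] = left
    return pano

def build_panorama(frames):
    """Steps 2-4 of the algorithm: repeatedly stitch the next frame onto the panorama."""
    pano = frames[0]
    for frame in frames[1:]:
        pano = stitch_pair(pano, frame)
    return pano
```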
3.2. Find the location of the current frame in the panoramic picture
To find the location of the current frame in the panoramic picture, we apply the template matching technique to find the area of the panoramic picture that matches the current frame.
Suppose that:
I is the panoramic picture and T is the current frame;
W × H is the size of the panoramic picture;
w × h is the size of the current frame.
We slide T over I, comparing the overlapping patch of size w × h against the panoramic picture of size W × H, moving the patch one pixel at a time (from left to right, top to bottom). The summation is done over the template image: x′ = 0…w−1, y′ = 0…h−1.
At each location, a metric is calculated to represent how "good" or "bad" the match at that
location is (or how similar the patch is to that particular area of the panoramic picture). The
result is a matrix R of size (W−w+1)×(H−h+1). Each location (x, y) in R contains the match metric, defined as in Eq. (1):
R(x,y) = \frac{\sum_{x',y'} \left( T(x',y') \cdot I(x+x',\, y+y') \right)}{\sqrt{\sum_{x',y'} T(x',y')^{2} \cdot \sum_{x',y'} I(x+x',\, y+y')^{2}}} \qquad (1)
The location of the current frame in the panoramic picture is (x*, y*), where R(x*, y*) attains its maximum value and is greater than a threshold. We usually use a threshold value of 0.85. This value is selected based on experience: 0.85 is just large enough to find the location of the current frame in the panoramic picture. If the threshold is too small, a wrong position may be chosen; if it is too large, the position of the current frame may not be found in the panoramic picture.
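As an illustrative sketch, the search described above can be written with OpenCV's matchTemplate; TM_CCORR_NORMED is assumed here as the normalized correlation metric of Eq. (1), and locate_frame and the default threshold of 0.85 are illustrative choices rather than the authors' exact code.

```python
import cv2

def locate_frame(panorama, frame, threshold=0.85):
    """Return the top-left corner (x*, y*) of the best match, or None if no good match."""
    pano_gray = cv2.cvtColor(panorama, cv2.COLOR_BGR2GRAY)
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # R has size (W - w + 1) x (H - h + 1); each entry is the match metric of Eq. (1).
    R = cv2.matchTemplate(pano_gray, frame_gray, cv2.TM_CCORR_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(R)

    if max_val < threshold:      # no sufficiently good match found
        return None
    return max_loc               # (x*, y*) where R attains its maximum
```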
3.3. Find moving objects by image differencing method
To find motion in the current frame, we use the image differencing method, which consists of the following steps:
- Convert frame to a grayscale image;
- Subtract the current frame from the previous frame (difference image = current image –
previous image);
- Convert the difference image into binary images using an optimal threshold;
- Find foreground pixel (moving pixel);
- Apply bounding boxes to mark the moving object;
- Update the panoramic picture by replacing the old region with the current frame.
The image difference between two images is defined as the sum of the absolute differences over all pixels. The current image I_t is compared with the previous image I_{t-1}. The difference value is defined as:
Công nghệ thông tin & Cơ sở toán học cho tin học
142 H. T. Thang, , N. V. Trung, “Moving object detection moving surveillance camera.”
D(t) = \sum_{i=1}^{M} \left| I_t(i) - I_{t-1}(i) \right| \qquad (2)
where M is the resolution or number of pixels in the image.
I_t and I_{t-1} are converted to grayscale images before image differencing; this reduces the amount of computation without affecting the results.
Because the image difference D(t) is noisy and sensitive to camera motion and image degradation, some morphological transformations are applied to reduce noise. The difference image is then converted to a binary image using an optimal threshold: if a pixel of the difference image is greater than the threshold, it is considered a foreground (moving) pixel. The erosion operator is applied once, and then the dilation operator is applied three times, to obtain more complete moving object regions; the kernel is a 3×3 matrix of ones.
Finally, minimum bounding boxes are drawn to mark the detected moving objects, and the panoramic picture is updated by replacing the old region with the current frame for the next detection.
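The following minimal sketch (not the authors' exact code) illustrates this subsection with OpenCV: grayscale conversion, absolute image difference, binarization with a fixed threshold standing in for the optimal threshold, one erosion followed by three dilations with a 3×3 kernel of ones, and minimum bounding boxes taken from the remaining contours. The name detect_moving_objects and the value diff_thresh = 25 are illustrative assumptions.

```python
import cv2
import numpy as np

def detect_moving_objects(prev_frame, curr_frame, diff_thresh=25):
    """Return bounding boxes (x, y, w, h) of moving regions between two frames."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)

    # Absolute per-pixel difference, the pixel-wise quantity summed in Eq. (2).
    diff = cv2.absdiff(curr_gray, prev_gray)

    # Binarize: pixels above the threshold become foreground (moving) pixels.
    _, binary = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)

    # Morphology: erode once, then dilate three times with a 3x3 kernel of ones.
    kernel = np.ones((3, 3), np.uint8)
    binary = cv2.erode(binary, kernel, iterations=1)
    binary = cv2.dilate(binary, kernel, iterations=3)

    # Mark each connected foreground region with its minimum bounding box.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours]
```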
4. EXPERIMENTS
Experiments were conducted on a computer with an Intel Core i7 2.4-GHz CPU and 8GB of
RAM. The algorithms were implemented in Python with the Open Source Computer Vision Library (OpenCV).
In the first experiment, video sequences were obtained from the surveillance camera on Church Street Marketplace, Burlington (a live camera from a Youtube channel), and the results are presented as follows:
Figure 2. Panoramic picture created by frames from moving camera.
Figure 3. Current frame (from image sequence).
Figure 4. Difference frame (noisy and degraded).
Figure 5. Binary image after applying morphological operations to reduce noise.
Figure 6. Movement detection on panoramic picture
(Moving object was masked by a bounding box).
After applying the steps in section 3, a moving object was detected and shown on the panoramic picture (masked by a bounding box). For noise removal, we show only objects whose contour size is greater than a given value. In this case, for detecting moving people, we use a threshold value of 1000 together with the condition that the contour height is greater than the contour width.
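A short sketch of this filtering rule, assuming that "contour size" refers to the contour area and using filter_person_boxes as an illustrative helper name:

```python
import cv2

def filter_person_boxes(contours, min_area=1000):
    """Keep contours that plausibly correspond to a moving person:
    area above a threshold (assumed 1000) and height greater than width."""
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if cv2.contourArea(c) > min_area and h > w:
            boxes.append((x, y, w, h))
    return boxes
```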
In other experiments, we used images from moving cameras on a ski field in Colorado and on a Japanese street (live cameras from Youtube channels) to check the program's performance. The program detected the moving people and ignored the non-moving people in the pictures.
Figure 7. Movement detection on a Colorado ski field (two moving objects were masked by bounding boxes).
Figure 8. Movement detection on a Japanese street (three moving objects were masked by bounding boxes).
5. RESULTS AND DISCUSSION
In order to evaluate the detection performance, the true detection rate TR and the false detection rate FR were adopted, as defined in Eqs. (3) and (4), respectively, where N is the total number of moving objects, TP is the total number of correctly detected objects, and FN is the total number of falsely detected objects.
TR = \frac{TP}{N} \times 100\% \qquad (3)

FR = \frac{FN}{N} \times 100\% \qquad (4)
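For example, for the Church Street Marketplace sequence in table 1, TR = 279/352 ≈ 79.26% and FR = 8/352 ≈ 2.27%.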
Table 1. Performance evaluation for self-made datasets.
Sequence N TP FN TR FR
Church Street Marketplace 352 279 8 79.26% 2.27%
Colorado ski field 459 371 10 80.83% 2.18%
Japanese street 540 431 11 79.81% 2.04%
All frames 1351 1081 29 80.01% 2.15%
Experimental results show that the proposed method performs better than the method given by Hu et al. [10]. The average true detection rate TR and false detection rate FR obtained using the proposed method are 80.01% and 2.15%, respectively, as shown in table 1, while the state-of-the-art method proposed by Hu et al. [10] has an average TR of 65.69% and an average FR of 13.27%.
6. CONCLUSION
In this paper, we present a new method to detect multiple moving objects from a video sequence captured by a moving camera. The experiments have shown that the method has
good performance and efficiency. In addition, the results have shown that the given method not
only helps us to detect multiple moving objects but also shows all moving objects in a full view
panoramic picture. Therefore, the proposed method is a useful tool for intelligent surveillance.
The main weakness of the proposed method is that the parameter values used in this paper (a template matching threshold of 0.85, one application of the erosion operator followed by three applications of the dilation operator) were chosen for our test videos; for video sequences from other sources, these parameters may need to be different. In the future, we will improve our method to determine these parameters automatically.
REFERENCES
[1]. X. Zhou, C. Yang, W. Yu, “Moving object detection by detecting contiguous outliers in the low-rank
representation”, IEEE Transactions on Pattern Analysis and Machine Intelligence 35(3) (2013) 597-
610.
[2]. W.-C. Hu, C.-Y. Yang, D.-Y. Huang, “Robust real-time ship detection and tracking for visual
surveillance of cage aquaculture”, Journal of Visual Communication and Image Representation
22(6) (2011) 543-556.
[3]. F.-L. Lian, Y.-C. Lin, C.-T. Kuo, J.-H. Jean, “Voting-based motion estimation for real-time video
transmission in networked mobile camera systems”, IEEE Transactions on Industrial Informatics
9(1) (2013) 172-180.
[4]. D. Tran, J. Yuan, D. Forsyth, “Video event detection: From subvolume localization to spatiotemporal
path search”, IEEE Transactions on Pattern Analysis and Machine Intelligence 36(2) (2014) 404-
416.
[5]. M.-N. Chapel, T. Bouwmans, “Moving objects detection with a moving camera: A comprehensive review”, Computer Science Review 38 (2020) 1-2.
[6]. D. Avola, L. Cinque, G. Foresti, C. Massaroni, D. Pannone, “A keypoint-based method for
background modeling and foreground detection using a PTZ camera”, Pattern Recognition Letters
96 (2017) 96–105.
[7]. W. Choi, C. Pantofaru, S. Savarese, “A general framework for tracking multiple people from a
moving camera”, IEEE Transactions on Pattern Analysis and Machine Intelligence 35(7) (2013).
[8]. P.M. Jodoin, M. Mignotte, C. Rosenberger, “Segmentation framework based on label field fusion”,
IEEE Transactions on Image Processing 16(10) (2007) 2535-2550.
[9]. Y. Wang, “Joint random field model for all-weather moving vehicle detection”, IEEE Transactions
on Image Processing 19(9) (2010) 2491-2501.
[10]. W.-C. Hu, C.-H. Chen, C.-M. Chen, T.-Y. Chen, “Effective moving object detection from videos
captured by a moving camera”, in: Proceedings of the First Euro-China Conference on Intelligent
Data Analysis and Applications, Vol. 1, 2014, pp. 343-353.
SUMMARY
A METHOD FOR DETECTING MOVING OBJECTS FROM IMAGES CAPTURED
BY A PANNING SURVEILLANCE CAMERA
This paper introduces an effective method for detecting multiple moving objects from a sequence of frames captured by a moving camera. Detecting moving objects from a moving (panning) camera is a difficult problem because the motion of the camera and the motion of the objects are mixed together. In the proposed method, the authors create a panoramic image from the moving camera. Next, for each frame captured by the camera, the authors use the template matching method to find the position of the frame within the panoramic image. Finally, the image differencing method is used to find the moving objects. Experimental results show that the proposed method achieves good results, with an average true detection rate of over 80%.
Keywords: Moving object detection; Moving camera; Object tracking; Panoramic image; Image difference.
Received 30th December 2020
Revised 27th January 2021
Published 5th February 2021
Author affiliations:
1 Military Technique Academy;
2 MITI, Military Academy of Science and Technology;
3 Military Academy of Logistics.
*Corresponding author: minhchip79@gmail.com.