Parabolic Curve Fitting as a 3D Trajectory Estimation of the Soccer Ball

Kyuhyoung Choi and Yongduek Seo
Sogang University, {Kyu, Yndk}@sogang.ac.kr

Abstract

A soccer ball is projected onto an image as a small, whitish, round blob. With multiple synchronized views, its 3D position can be estimated. In this paper, the ball trajectory is modeled as a sequence of 3D parabolic (ballistic) curves estimated from multiple views. With this trajectory estimation method, lossless tracking is achieved for fairly long video clips, as shown in our experiments.

1 Introduction

The main difficulty in visual tracking of a soccer ball comes from its size. Because the ball is small and moves fast, its image features are of little use: it appears as a small, whitish, noisy ellipse, which makes it hard to keep tracking the ball without the aid of noise removal [1, 2]. Another difficulty comes from its interaction with players. It is therefore generally accepted that ball tracking has two modes, visible and invisible. In the visible mode, the ball shows the ballistic motion of an elastic sphere; in the invisible mode, it is occluded by players and can hardly be separated as an independent image blob. Ball tracking thus becomes finding the trajectory for each visible mode.

To extract a 2D ball trajectory, [3] grew triplet seeds into trajectory candidates and selected, as the trajectory for the segment, the candidate that best satisfied a constant-acceleration model. However, constant acceleration in 2D does not correspond to the ball's physics, even though it is the best choice available in the image plane. In our setting, we model the 3D ball trajectory as a 3D parabolic curve under the acceleration of gravity. Landmarks on the pitch give clues for reconstructing 3D information, which leads to 3D ball tracking [4]. If the ball is viewed from multiple cameras, the 3D rays from the camera centers ideally meet at one point, the center of the ball. However, the same holds for clutter.
We have to exclude the less probable 3D points (clutter) one by one until we are left with the most probable one (the true 3D ball point). This can be done by evaluating the consistency of a sequence of 3D points. As applications, 3D tracking results can be further processed for enriched broadcast and game analysis [5]. The purpose of our trajectory estimation approach is to achieve lossless tracking given a batch of long video, such as one half of a soccer match.

The rest of this paper is organized as follows. Section 2 deals with our soccer tracking environment and image preprocessing. The ball trajectory extraction algorithm is discussed in Section 3. Section 4 provides experimental results, and Section 5 concludes the paper.

2 Getting Observation Data

The input to ball tracking is taken from N synchronized views (four in our case, as in Figure 1) of static cameras, which makes 3D tracking possible. Each camera is static and covers some part of the pitch. All camera images are background-subtracted and connected-component-labeled, so that an object is observed as one or more image blobs (middle and right columns of Figure 2). Each camera is calibrated using the lines drawn on the pitch, so that the corresponding 3D ray can be computed when a point in the 2D image is given.

Figure 1: Multiple cameras surrounding the pitch.
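To make the ray computation of Section 2 concrete, the following is a minimal sketch, assuming a standard pinhole camera with intrinsics K and extrinsics R, t (a world point X maps to K(RX + t)) and assuming the pitch is the plane z = 0. The paper does not state its camera model, so the function name `pixel_to_ray` and its parameters are illustrative only. It returns the camera center and the point where the ray meets the pitch, the two points that define a ray in Section 3.

```python
import numpy as np

def pixel_to_ray(K, R, t, pixel):
    """Back-project a pixel to two points on its 3D ray.

    Assumes a pinhole camera x ~ K (R X + t) and a pitch plane z = 0
    (both are assumptions of this sketch, not stated in the paper).
    Returns (q_c, q_g): the camera center and the ray's ground point.
    """
    K = np.asarray(K, dtype=float)
    R = np.asarray(R, dtype=float)
    t = np.asarray(t, dtype=float)
    # Camera center in world coordinates: R q_c + t = 0.
    q_c = -R.T @ t
    # Ray direction in world coordinates for the homogeneous pixel (u, v, 1).
    u, v = pixel
    direction = R.T @ np.linalg.solve(K, np.array([u, v, 1.0]))
    if abs(direction[2]) < 1e-12:
        raise ValueError("ray is parallel to the pitch plane")
    # Scale the direction so that the z coordinate reaches 0.
    s = -q_c[2] / direction[2]
    q_g = q_c + s * direction
    return q_c, q_g
```

Representing each ray by the pair (q_c, q_g) keeps the later mid-point computation independent of the particular calibration method.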
Figure 2: Extracting ball candidate blobs. The n-th row corresponds to the n-th camera. The left, middle and right columns show the original, foreground and ball-candidate images, respectively. Ball positions are manually marked for the reader's better understanding.

3 Ball Trajectory Estimation

Ball tracking is a process of filtering the single most likely sequence of 3D positions out of a sequence of noisy multi-view 2D observation sets. Before going further, let us define some sets.

\[ Q_{1:T} = \{ q_t \}_{t=1}^{T} \tag{1} \]
\[ S = \{ Q_{t_1:t_2} \mid 1 \le t_1 \le t_2 \le T \} \tag{2} \]
\[ U = \left\{ \{ q_t^i \}_{i=1}^{I_t} \right\}_{t=1}^{T} \tag{3} \]
\[ P = \left\{ \left\{ \{ p_t^n(h) \}_{h=1}^{H_t^n} \right\}_{n=1}^{N} \right\}_{t=1}^{T} \tag{4} \]

where p and q denote points in 2D and 3D respectively, N is the number of cameras, H is the number of observations (H_t^n for camera n at time t), and I_t is the number of 3D candidates at time t. $Q_{1:T}$ is the optimal trajectory, S is a set of 3D trajectory segments, U is a set of 3D ball candidates, and P is a sequence of noisy multi-view 2D observation sets. In reverse order, U is built from P, S is built from U, and $Q_{1:T}$ is extracted from S.

To build U from P, for each time t, 3D ball candidates are generated from all possible pairs, $\binom{N}{2}$ in number, of the N synchronized views. Given the camera parameters, every point on a ray from the camera center $q^c$, for example the point $q^g$ where the ray meets the pitch ground, is projected onto the same image point. Ideally, if a 3D object (as a point) is projected by a pair of cameras, the rays from each camera center through the projected points meet at the 3D point. However, due to noise, the rays may not meet and instead pass each other at some distance. If the distance is tolerable, the mid-point between the rays is taken as a 3D ball candidate. The mid-point $q^{\mathrm{mid}}$ between two rays, passing through $q_1^c$ and $q_1^g$ and through $q_2^c$ and $q_2^g$ respectively, is computed as follows.
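\[ q^{\mathrm{mid}}_{1,2} = \frac{q_1^{\mathrm{closest}} + q_2^{\mathrm{closest}}}{2} \tag{5} \]
\[ \begin{bmatrix} q_1^{\mathrm{closest}} \\ q_2^{\mathrm{closest}} \end{bmatrix} = A^{-1} B \tag{6} \]

where

\[ A = \begin{bmatrix}
q_1^c(z) & 0 & \mu_1^{g,c}(x) & 0 & 0 & 0 \\
0 & q_1^c(z) & \mu_1^{g,c}(y) & 0 & 0 & 0 \\
0 & 0 & 0 & q_2^c(z) & 0 & \mu_2^{g,c}(x) \\
0 & 0 & 0 & 0 & q_2^c(z) & \mu_2^{g,c}(y) \\
\mu_1^{c,g}(x) & \mu_1^{c,g}(y) & \mu_1^{c,g}(z) & \mu_1^{g,c}(x) & \mu_1^{g,c}(y) & \mu_1^{g,c}(z) \\
\mu_2^{c,g}(x) & \mu_2^{c,g}(y) & \mu_2^{c,g}(z) & \mu_2^{g,c}(x) & \mu_2^{g,c}(y) & \mu_2^{g,c}(z)
\end{bmatrix} \tag{7} \]

\[ B = \left( q_1^g(x)\,q_1^c(z),\; q_1^g(y)\,q_1^c(z),\; q_2^g(x)\,q_2^c(z),\; q_2^g(y)\,q_2^c(z),\; 0,\; 0 \right)^T \tag{8} \]

$q_1^{\mathrm{closest}}$ and $q_2^{\mathrm{closest}}$ are the points on the two rays closest to each other, and $\mu_i^{a,b} = q_i^a - q_i^b$. With the pitch taken as the plane z = 0, so that $q_i^g(z) = 0$, the first four rows of A state that each closest point lies on its own ray, and the last two rows state that the segment joining the two closest points is perpendicular to both rays. Then U is the set of mid-points:

\[ U = \left\{ \left\{ q^{\mathrm{mid}}_{m_N(n),\,m_N(n+1)} \left( p^{j}_{m_N(n)}(t),\; p^{k}_{m_N(n+1)}(t) \right) \right\}_{n=1}^{N} \right\}_{t=1}^{T} \tag{9} \]

where $1 \le j \le H_t^{m_N(n)}$, $1 \le k \le H_t^{m_N(n+1)}$ and $m_N(n) = \mathrm{MAX}(\mathrm{mod}(n, N+1), 1)$.

From U, S is built by extending all possible triplets, that is, three consecutive 3D ball candidates, as far as possible to give trajectory segment candidates. A sequence of three ball candidates in U qualifies as a triplet if its acceleration and velocities are those of ballistic motion under gravity:

\[ S = \left\{ \{ q_t \}_{t=t_1}^{t_2} \;\middle|\; \beta_t\,\delta_t\,\delta_{t+1}\,\varphi_t > 0,\ \forall t : t_1 \le t \le t_2 \right\} \tag{10} \]

where

\[ \beta_t = \begin{cases} 1 & \text{if } T_\beta^l < q_{t+2}(z) - 2\,q_{t+1}(z) + q_t(z) < T_\beta^u \\ 0 & \text{otherwise} \end{cases} \tag{11} \]
\[ \delta_t = \begin{cases} 1 & \text{if } T_\delta^l < \lVert q_{t+1} - q_t \rVert < T_\delta^u \\ 0 & \text{otherwise} \end{cases} \tag{12} \]
\[ \varphi_t = \begin{cases} 1 & \text{if } \cos^{-1}\!\left( \dfrac{(q_{t+1} - q_t) \cdot (q_{t+2} - q_{t+1})}{\lVert q_{t+1} - q_t \rVert\, \lVert q_{t+2} - q_{t+1} \rVert} \right) < T_\varphi \\ 0 & \text{otherwise} \end{cases} \tag{13} \]

Here $T^l$ and $T^u$ denote lower and upper thresholds. To get $Q_{1:T}$ from S, the longest one among the segment candidates is chosen and fitted to a parabolic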
curve parameterized by $\Theta$:

\[ \Theta = \left\{ a, b, c, d, e, f, g \;\middle|\; q_t(x) = at + b,\ q_t(y) = c\,q_t(x) + d,\ q_t(z) = e\,q_t(x)^2 + f\,q_t(x) + g \right\} \tag{14} \]

The segments nearest to both ends of the longest segment are merged into it if their fitness to the curve is tolerable, and the curve is then updated to take the new support into account. After the iterations shown in Algorithm 1, we obtain a parabolic representative of the ball motion for a certain period, both ends of which correspond to the points where the 3D curve meets the pitch ground. Since the coefficients of the 3D parabolic curve equation are estimated, $Q_{1:T}$ is a function of time.

Algorithm 1 Growing a sequence of 3D points supporting a 3D parabolic curve
Require: $\{q\}^S = \arg\max_{\{q_t\}_{t=i}^{j} \in S} (j - i)$, with $\Theta^S$ fitted by Equation 14
Ensure: AreOverlapped(a, b) returns T (true) if the sequences a and b are temporally overlapped. AreConsistent(a, b) returns T if the sequences a and b qualify as parts of the same sequence.
  while ischanged = T and isoutofrange = F do
    ischanged := F
    isoutofrange := F
    for i, $\{q\}^i \in S$ do
      if AreOverlapped($\{q\}^i$, $\{q\}^S$) = T and AreConsistent($\{q\}^i$, $\{q\}^S$) = T then
        $\{q\}^S := \{q\}^S \cup \{q\}^i$
        $\Theta^S := \arg\max_{\Theta} p(\{q\}^S \mid \Theta)$
        ischanged := T
        if $S = \emptyset$ or ($q^S_1(z) \le 0$ and $q^S_{|\{q\}^S|}(z) \le 0$) then
          isoutofrange := T
        end if
        break
      end if
    end for
  end while

4 Experiments

Experiments were carried out on two video sequences. One is 500 frames long and of SD size, 720 × 480 (Figure 3), while the other has 6000 frames and is of HD size, 1280 × 720 (Figure 4). The entire length of the second sequence is actually about 50 minutes, covering one half of a soccer match, which is our target.

Clutter caused by spectators can mimic a ball in the air and produce false positives. However, it is likely to be filtered out, since its long-term behavior hardly satisfies the ballistic motion constraints; for example, it never comes down to the ground. When the ball rolls on the ground, its trajectory is theoretically an extreme case of a 3D parabola. In that case the estimation of the parabolic parameters is very sensitive to noise, so most such cases are not estimated.

5 Conclusion

This paper proposed a 3D soccer ball tracking method for a multi-camera environment. A soccer ball is projected onto an image as a small, whitish, round blob. With multiple synchronized views, its 3D position can be estimated. In this paper, the ball trajectory is modeled as a sequence of 3D parabolic (ballistic) curves estimated from multiple views. With this trajectory estimation method, lossless tracking is achieved for fairly long video clips, as shown in our experiments. Currently our method uses a greedy approach, which finds the best trajectory segment first, then the second best, and so forth. A globally optimal approach within a probabilistic framework would be desirable.

Acknowledgements

This research was carried out as part of the research project for culture contents technology development supported by KOCCA.

References

[1] Yu, X., Xu, C., Leong, H., Tian, Q., Tang, Q., Wan, K.: Trajectory-based ball detection and tracking with applications to semantic analysis of broadcast soccer video. In: ACM MM03, Berkeley. (2003) 11-20
[2] Tong, X.F., Lu, H.Q., Liu, Q.S.: An effective and fast soccer ball detection and tracking method. In: ICPR (4). (2004) 795-798
[3] Yan, F., Kostin, A., Christmas, W., Kittler, J.: A novel data association algorithm for object tracking in clutter with application to tennis video analysis. In: CVPR '06: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, IEEE Computer Society (2006) 634-641
[4] Kang, J., Cohen, I., Medioni, G.: Soccer player tracking across uncalibrated camera streams. In: Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance (VS-PETS). (2003)
[5] Yan, X., Yu, X., Hay, T.S.: A 3d reconstruction and enrichment system for broadcast soccer video. In: MULTIMEDIA '04: Proceedings of the 12th annual ACM international conference on Multimedia, ACM Press (2004) 746-747
[6] Seo, Y., Choi, S., Kim, H., Hong, K.: Where are the ball and players? soccer game analysis with color-based tracking and image mosaick. In: Proc. Int. Conf. on Image Analysis and Processing, Florence, Italy. (1997)

(a) 30 th frame (b) 130 th frame (c) 40 th frame (d) 300 th frame
Figure 3: Result images of ball tracking. The projected position of the estimated ball in each camera is marked with a red circle.
(a) 00 th frame (b) 850 th frame (c) 1000 th frame (d) 1300 th frame (e) 300 th frame (f) 5000 th frame (g) 5400 th frame (h) 5800 th frame
Figure 4: Another example of ball tracking. Four cameras are located around the pitch almost symmetrically. The four displayed numbers are, in order, the frame index and the X, Y and Z coordinates. When the ball position is not estimated, the word "Disappeared" is displayed.
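The mid-point construction and the parabola fit of Section 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. `ray_midpoint` solves the same two perpendicularity conditions as the last two rows of A, but in a parametric 2 × 2 form instead of the 6 × 6 system, and `fit_parabola` uses ordinary least squares via `numpy.polyfit` in place of the likelihood maximization of Algorithm 1. Both function names are hypothetical.

```python
import numpy as np

def ray_midpoint(c1, g1, c2, g2):
    """Mid-point of the shortest segment between two rays, and its length.

    Each ray passes through a camera center c_i and a second point g_i.
    Raises numpy.linalg.LinAlgError when the rays are parallel.
    """
    c1, g1, c2, g2 = (np.asarray(v, dtype=float) for v in (c1, g1, c2, g2))
    d1, d2 = g1 - c1, g2 - c2
    # Closest points p1 = c1 + s*d1 and p2 = c2 + u*d2 satisfy
    # (p1 - p2) . d1 = 0 and (p1 - p2) . d2 = 0.
    A = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    b = np.array([(c2 - c1) @ d1, (c2 - c1) @ d2])
    s, u = np.linalg.solve(A, b)
    p1, p2 = c1 + s * d1, c2 + u * d2
    return 0.5 * (p1 + p2), float(np.linalg.norm(p1 - p2))

def fit_parabola(times, points):
    """Least-squares estimate of (a, b, c, d, e, f, g) in Equation 14.

    points is an array of shape (n, 3) holding (x, y, z) per time stamp.
    Assumes the ball moves along x (a != 0); otherwise the fit degenerates.
    """
    t = np.asarray(times, dtype=float)
    q = np.asarray(points, dtype=float)
    a, b = np.polyfit(t, q[:, 0], 1)           # x = a t + b
    x_hat = a * t + b                          # smoothed x used as regressor
    c, d = np.polyfit(x_hat, q[:, 1], 1)       # y = c x + d
    e, f, g = np.polyfit(x_hat, q[:, 2], 2)    # z = e x^2 + f x + g
    return a, b, c, d, e, f, g
```

In such a sketch, the distance returned by `ray_midpoint` would be compared against a tolerance to decide whether the mid-point is kept as a 3D ball candidate, and the residuals of `fit_parabola` could serve as the fitness measure when merging neighboring segments.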