# calib3d.calib
# Theoretical introduction
Camera calibration is based on the pinhole camera model, which approximates how light travels between the scene and the camera sensor. There are two sets of parameters:

- The **extrinsic parameters** define the camera position and orientation relative to the world coordinate system.
- The **intrinsic parameters** define the camera sensor and lens parameters.
There are 3 different coordinate systems:

- The world 3D coordinate system
- The camera 3D coordinate system
- The 2D pixel positions where 3D positions are projected.
## Extrinsic parameters
The extrinsic parameters define the transformation from the world 3D coordinates \(\left[x_O,y_O,z_O\right]^T\) to the camera 3D coordinates \(\left[x_C,y_C,z_C\right]^T\). The camera 3D coordinate system has the following **conventions**:

- The point \((0,0,0)\) is the center of projection of the camera, also called the *camera center*.
- The \(z\) axis of the camera points *towards the scene*, the \(x\) axis is along the sensor width pointing towards the right, and the \(y\) axis is along the sensor height pointing towards the bottom.
The camera coordinate system is therefore a *transformation* of the world coordinate system with:

- A **rotation** defined by a rotation matrix \(R\) using Euler angles in a right-hand orthogonal system. The rotation is applied to the world coordinate system to obtain the camera orientation.
- A **translation** defined by a translation vector \(T\) representing the position of the center of the world in the camera coordinate system.
Hence,

$$\lambda\left[\begin{matrix}x_C\\ y_C\\ z_C\\ 1\end{matrix}\right] = \left[\begin{matrix}R_{3\times 3} & T_{3\times 1}\\{\bf 0}_{1\times 3}&1\end{matrix}\right] \left[\begin{matrix}x_O\\y_O\\z_O\\1\end{matrix}\right]$$
**Important notes:**

- The rotation matrix represents a passive (or alias) transformation: it is the coordinate system that rotates, not the objects.
- Euler angles define a 3D rotation starting with a rotation around \(x\), followed by a rotation around \(y\), followed by a rotation around \(z\) (the order matters).
- If \(T\) is expressed in the camera coordinate system, the position of the camera expressed in the world coordinate system is \(C:=-R^{-1}T = -R^{T}T\) (since \(R\) is a rotation matrix), as illustrated in the sketch after this list.
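A minimal NumPy sketch of these conventions; the rotation and camera position are made-up illustrative values, not part of the library:

```python
import numpy as np

theta = np.pi/2                                     # rotate 90° around the x axis
R = np.array([[1, 0,              0            ],
              [0, np.cos(theta), -np.sin(theta)],
              [0, np.sin(theta),  np.cos(theta)]])
C = np.array([[0], [0], [10]])                      # camera position in world coordinates
T = -R @ C                                          # world origin expressed in camera coordinates

point3D_world = np.array([[1], [2], [0]])
point3D_camera = R @ point3D_world + T              # the same point, in camera coordinates

# The camera position is recovered from R and T with C = -R^T T
assert np.allclose(C, -R.T @ T)
```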
## Intrinsic parameters
The intrinsic parameters define the transformation between the 3D coordinates relative to the camera center \(\left[x_C,y_C,z_C\right]^T\) and the 2D coordinates in the camera sensor \(\left[i,j\right]^T\). This transformation is called a *projection* and includes:

- the scale produced by the focal length, with \(f\) being the distance between the camera center and the plane on which the image is projected;
- the scale factors \((m_x,m_y)\) relating pixel units to distance units (usually \(m_x=m_y\) because pixels are square);
- the translation from the camera *principal point* to a top-left origin, with \((u_0,v_0)\) being the position of the *principal point* expressed in the image coordinate system;
- a skew coefficient \(\gamma\) between the \(x\) and \(y\) axes of the sensor (usually \(\gamma=0\) because pixels are square).
Those transformations can be aggregated into one single matrix called the **camera matrix**:

$$K := \left[\begin{matrix}f\cdot m_x & \gamma & u_0 \\ 0 & f\cdot m_y & v_0 \\ 0 & 0 & 1\end{matrix}\right]$$

Therefore,

$$\lambda\left[\begin{matrix}i\\ j\\ 1\end{matrix}\right]= \left[\begin{matrix}K_{3\times 3}&{\bf 0}_{3\times 1}\end{matrix}\right]\left[\begin{matrix}x_C\\y_C\\z_C\\1\end{matrix}\right]$$
**Notes:**

- The width and height of the image complete those parameters; they delimit the sensor width and height in pixels.
- When applying the **direct** projection of a given 3D point, different values of \(\lambda\) will always give the **same** 2D point.
- When applying the **inverse** projection on a given 2D point, different values of \(\lambda\) will give **different** 3D points.
This is obvious when simplifying the relation between the two points (the column \({\bf 0}_{3\times 1}\) cancels the homogeneous component of the 3D point):

$$\lambda\left[\begin{matrix}i\\j\\1\end{matrix}\right]= \left[\begin{matrix}K_{3\times 3}\end{matrix}\right]\left[\begin{matrix}x_C\\y_C\\z_C\end{matrix}\right]$$

The 2D vector in homogeneous coordinates is not affected by the value of \(\lambda\), while the 3D vector is.
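As a small illustration of these notes, here is a sketch with made-up intrinsic parameters showing that the scale factor \(\lambda\) vanishes in the direct projection:

```python
import numpy as np

f_mx = f_my = 1000                  # focal length in pixels (f·m_x and f·m_y)
u0, v0 = 960, 540                   # principal point, here the center of a 1920x1080 image
K = np.array([[f_mx,    0, u0],
              [   0, f_my, v0],
              [   0,    0,  1]])

point3D_camera = np.array([[0.2], [0.1], [5.0]])   # a point in camera coordinates (z > 0)

# Scaling the 3D point by any λ > 0 yields the same pixel after normalization
for lam in (1.0, 2.0, 10.0):
    i, j, w = (K @ (lam * point3D_camera)).flatten()
    print(i/w, j/w)                                # always (1000.0, 560.0)
```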
## Projection model
Therefore, by combining

- the transformation from the world coordinate system to the camera coordinate system (defined by \(R\) and \(T\))
- with the projection from the camera coordinate system to the image pixels (defined by \(K\)),

we obtain a projection model, described by the matrix \(P\), that computes the coordinates of a 2D point in the image \(\left(i,j\right)\) from a 3D point in the real world \(\left(x,y,z\right)\):

$$P := \left[\begin{matrix}K_{3\times 3}&{\bf 0}_{3\times 1}\end{matrix}\right] \left[\begin{matrix}R_{3\times 3}&T_{3\times 1}\\{\bf 0}_{1\times 3}&1\end{matrix}\right]=K_{3\times 3}\left[\begin{matrix}R_{3\times 3}&T_{3\times 1}\end{matrix}\right]$$

The opposite operation requires inverting \(P\); since \(P\) is rectangular, this is done with the pseudo-inverse.
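A sketch of the full chain, reusing the made-up \(K\), \(R\) and \(C\) from the sketches above: project a world point with \(P\), then back-project the pixel with the pseudo-inverse. The pseudo-inverse alone only yields one point on the viewing ray; an extra constraint (here, the plane \(z=0\)) is needed to pick the right 3D point:

```python
import numpy as np

K = np.array([[1000, 0, 960], [0, 1000, 540], [0, 0, 1]])
R = np.array([[1, 0, 0], [0, 0, -1], [0, 1, 0]])      # 90° rotation around x
C = np.array([[0.], [0.], [10.]])                     # camera center in world coordinates
T = -R @ C
P = K @ np.hstack((R, T))                             # 3x4, hence not invertible

point3D = np.array([[1.], [2.], [0.]])                # a world point on the plane z=0
point2D_H = P @ np.vstack((point3D, [[1.]]))
point2D_H /= point2D_H[2]                             # homogeneous pixel (i, j, 1)

X_H = np.linalg.pinv(P) @ point2D_H                   # one 3D point on the viewing ray
X = X_H[:3] / X_H[3]
d = X - C                                             # direction of the viewing ray
t = -C[2] / d[2]                                      # intersection with the plane z=0
print((C + t*d).flatten())                            # ≈ [1. 2. 0.]
```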
## Limitations

The pinhole model does not account for the distortions introduced by the lens, so we need a model of them:

- **radial** distortion causes straight lines far from the principal point to appear curved;
- **tangential** distortion occurs when the lens is not perfectly aligned with the \(z\) axis of the camera.

Many lens distortion models exist. Here, we will use the model used in [OpenCV](https://docs.opencv.org/3.1.0/d4/d94/tutorial_camera_calibration.html).
## Distortion models
There are actually 2 different models:

- The direct model, which applies distortion: find the 2D image (distorted) coordinates of a 3D point given its 2D projected (undistorted) coordinates.
- The inverse model, which rectifies distortion: find the 2D (undistorted) coordinates allowing to project a 2D (distorted) image coordinate into 3D.
.. image:: https://github.com/ispgroupucl/calib3d/blob/main/assets/distortion_steps.png?raw=true
### Direct model: "distort"
The following relationship allows computing the distorted 2D coordinates \((x_d,y_d)\) on an image from the 2D coordinates provided by the linear projection model \((x_u,y_u)\). Those coordinates are expressed in the camera coordinate system, i.e. they are distances (not pixels) and the point \((0,0)\) is the principal point of the camera.

$$\begin{align}
x_d &= x_u\overbrace{\left(1 + k_1 {r_u}^2 + k_2 {r_u}^4 + k_3 {r_u}^6+\cdots\right)}^{\text{radial component}} + \overbrace{\left[2p_1 x_uy_u + p_2\left({r_u}^2 + 2{x_u}^2\right)\right]}^{\text{tangential component}}\\
y_d &= y_u\left(1 + k_1 {r_u}^2 + k_2 {r_u}^4 + k_3 {r_u}^6+\cdots\right) + \left[2p_2 x_uy_u + p_1\left({r_u}^2 + 2{y_u}^2\right)\right]
\end{align}$$
Where:

- \(k_1, k_2, k_3, \cdots\) are the radial distortion coefficients,
- \(p_1, p_2\) are the tangential distortion coefficients,
- \({r_u}^2 := {x_u}^2 + {y_u}^2\).

We usually use only 3 radial distortion coefficients, which makes a total of 5 coefficients. Those coefficients are found by running an optimisation algorithm on a set of 2D point - 3D point correspondences, as we did with `cv2.calibrateCamera`. They are stored in the `k` vector.
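A minimal sketch of the direct model on normalized (distance) coordinates, with made-up coefficients in the OpenCV ordering \((k_1, k_2, p_1, p_2, k_3)\):

```python
k1, k2, p1, p2, k3 = 0.1, 0.01, 0.001, 0.001, 0.0   # illustrative values

def distort(x_u, y_u):
    r2 = x_u**2 + y_u**2
    radial = 1 + k1*r2 + k2*r2**2 + k3*r2**3
    x_d = x_u*radial + 2*p1*x_u*y_u + p2*(r2 + 2*x_u**2)
    y_d = y_u*radial + 2*p2*x_u*y_u + p1*(r2 + 2*y_u**2)
    return x_d, y_d

print(distort(0.3, 0.2))   # the point is slightly pushed away from the center
```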
### Inverse model: "rectify"
The distortion operation cannot be inverted analytically using the coefficients \(k_1\), \(k_2\), \(k_3\), \(p_1\), \(p_2\) (i.e. to have \(x_u=f_{k_{1,2,3},p_{1,2}}(x_d,y_d)\) and \(y_u=f_{k_{1,2,3},p_{1,2}}(x_d,y_d)\)). We either need another model with another set of coefficients, or we make an approximation.

Here, we will use the following approximation: we will assume that the distortion at point \((x_d,y_d)\) is the same as the distortion at point \((x_u,y_u)\). Therefore:
$$\left\{\begin{align}
2p_1 x_uy_u + p_2\left({r_u}^2 + 2{x_u}^2\right) &\approx 2p_1 x_dy_d + p_2\left({r_d}^2 + 2{x_d}^2\right)\\
2p_2 x_uy_u + p_1\left({r_u}^2 + 2{y_u}^2\right) &\approx 2p_2 x_dy_d + p_1\left({r_d}^2 + 2{y_d}^2\right)\\
\left(1 + k_1 {r_u}^2 + k_2 {r_u}^4 + k_3 {r_u}^6+\cdots\right) &\approx \left(1 + k_1 {r_d}^2 + k_2 {r_d}^4 + k_3 {r_d}^6+\cdots\right)
\end{align}\right.$$
If this approximation holds, it is much easier to get an analytical expression of \(x_u=f_{k_{1,2,3},p_{1,2}}(x_d,y_d)\) and \(y_u=f_{k_{1,2,3},p_{1,2}}(x_d,y_d)\).
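Continuing the sketch above, rectification under this approximation evaluates the radial and tangential terms at the *distorted* point, then solves for the undistorted one:

```python
def rectify(x_d, y_d):
    r2 = x_d**2 + y_d**2                       # approximated with the distorted radius
    radial = 1 + k1*r2 + k2*r2**2 + k3*r2**3
    dx = 2*p1*x_d*y_d + p2*(r2 + 2*x_d**2)
    dy = 2*p2*x_d*y_d + p1*(r2 + 2*y_d**2)
    return (x_d - dx)/radial, (y_d - dy)/radial

print(rectify(*distort(0.3, 0.2)))             # ≈ (0.3, 0.2) when the approximation holds
```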
# Implementation
This library defines a [Calib](./calib3d.calib#Calib) object to represent a calibrated camera. Its constructor receives as arguments the intrinsic and extrinsic parameters:

- the image dimensions `width` and `height`,
- the translation vector `T`,
- the rotation matrix `R`,
- the camera matrix `K`.
The method `project_3D_to_2D` allows computing the position in the image of a 3D point in the world. The opposite operation `project_2D_to_3D` requires an additional parameter `Z` giving the \(z\) coordinate of the 3D point.
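For example, a minimal round trip might look as follows (assuming `Calib` and `Point3D` are importable from the package root, and using made-up camera parameters):

```python
import numpy as np
from calib3d import Calib, Point3D   # assumed top-level exports

K = np.array([[1000,    0, 960],     # made-up intrinsics for a 1920x1080 sensor
              [   0, 1000, 540],
              [   0,    0,   1]])
R = np.eye(3)                        # camera axes aligned with the world axes
T = np.array([[0], [0], [10]])       # the world origin lies 10 units in front of the camera
calib = Calib(width=1920, height=1080, K=K, R=R, T=T)

point3D = Point3D(1, 2, 0)
point2D = calib.project_3D_to_2D(point3D)          # pixel coordinates of the 3D point
recovered = calib.project_2D_to_3D(point2D, Z=0)   # back to 3D, on the plane z=0
print(point2D, recovered)                          # recovered ≈ (1, 2, 0)
```

`project_2D_to_3D` needs the `Z=0` constraint because, as explained above, a 2D point only defines a ray in 3D.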
## The `Calib` class

```python
import pickle
import numpy as np
import cv2
from .points import Point2D, Point3D, HomogeneousCoordinatesPoint
import warnings

EPS = 1e-5

class Calib():
    def __init__(self, *, width: int, height: int, T: np.ndarray, R: np.ndarray, K: np.ndarray, k=None, **_) -> None:
        """ Represents a calibrated camera.

        Args:
            width (int): image width
            height (int): image height
            T (np.ndarray): translation vector
            R (np.ndarray): rotation matrix
            K (np.ndarray): camera matrix holding intrinsic parameters
            k (np.ndarray, optional): lens distortion coefficients. Defaults to None.
        """
        if 'kc' in _:
            warnings.warn("The 'kc' argument is deprecated. Use 'k' instead.", Warning)
            k = _['kc']
        self.w = self.width = int(width)
        self.h = self.height = int(height)
        self.T = T
        self.K = K
        self.k = np.array((k if k is not None else [0,0,0,0,0]), dtype=np.float64)
        self.R = R
        self.C = Point3D(-R.T@T)                       # camera center in world coordinates
        self.P = self.K @ np.hstack((self.R, self.T))  # 3x4 projection matrix
        self.Pinv = np.linalg.pinv(self.P)
        self.Kinv = np.linalg.pinv(self.K)
```
### `update`

```python
    def update(self, **kwargs) -> 'Calib':
        """ Creates another Calib object with the given keyword arguments updated
        Args:
            **kwargs : Any of the arguments of `Calib`. Other arguments remain unchanged.
        Returns:
            A new Calib object
        """
        return self.__class__(**{**self.dict, **kwargs})
```
### `from_P`

```python
    @classmethod
    def from_P(cls, P, width, height) -> 'Calib':
        """ Create a `Calib` object from a given projection matrix `P` and image dimensions `width` and `height`.

        Args:
            P (np.ndarray): a 3x4 projection matrix
            width (int): image width
            height (int): image height
        Returns:
            A Calib object
        """
        K, R, T, Rx, Ry, Rz, angles = cv2.decomposeProjectionMatrix(P)  # pylint: disable=unused-variable
        return cls(K=K, R=R, T=Point3D(-R@Point3D(T)), width=width, height=height)
```
### `load`

```python
    @classmethod
    def load(cls, filename) -> 'Calib':
        """ Loads a Calib object from a file (using the pickle library)
        Args:
            filename (str): the file that stores the Calib object
        Returns:
            The `Calib` object previously saved in `filename`.
        """
        with open(filename, "rb") as f:
            return cls(**pickle.load(f))
```
### `dict`

```python
    @property
    def dict(self) -> dict:
        """ Gets a dictionary representing the calib object (allowing easier serialization)
        """
        return {k: getattr(self, k) for k in self.__dict__}
```
### `dump`

```python
    def dump(self, filename) -> None:
        """ Saves the current calib object to a file (using the pickle library)
        Args:
            filename (str): the file that will store the calib object
        """
        with open(filename, "wb") as f:
            pickle.dump(self.dict, f)
```
### `project_3D_to_2D`

```python
    def project_3D_to_2D(self, point3D: Point3D) -> Point2D:
        """ Using the calib object, project a 3D point in the 2D image space.
        Args:
            point3D (Point3D): the 3D point to be projected
        Returns:
            The point in the 2D image space on which point3D is projected by calib
        """
        point2D_H = self.P @ point3D.H  # returns a np.ndarray object
        point2D_H[2] = point2D_H[2] * np.sign(point2D_H[2])  # correct projection of points being projected behind the camera
        point2D = Point2D(point2D_H)
        # avoid distortion of points too far away
        excluded_points = np.logical_or(np.logical_or(point2D.x < -self.width, point2D.x > 2*self.width),
                                        np.logical_or(point2D.y < -self.height, point2D.y > 2*self.height))
        return Point2D(np.where(excluded_points, point2D, self.distort(point2D)))
```
### `project_2D_to_3D`

```python
    def project_2D_to_3D(self, point2D: Point2D, X: float = None, Y: float = None, Z: float = None) -> Point3D:
        """ Using the calib object, project a 2D point into the 3D space,
        given one of its 3D coordinates (X, Y or Z). One and only one
        coordinate must be given.
        Args:
            point2D (Point2D): the 2D point to be projected
            X (float): the X coordinate of the 3D point
            Y (float): the Y coordinate of the 3D point
            Z (float): the Z coordinate of the 3D point
        Returns:
            The point in the 3D world that projects on `point2D` and whose
            corresponding coordinate equals the one given.
        """
        v = [X, Y, Z]
        assert sum([x is not None for x in v]) == 1, "One and only one of X, Y, Z must be given"
        assert isinstance(point2D, Point2D), "Wrong argument type '{}'. Expected {}".format(type(point2D), Point2D)

        P = Point3D([0 if x is None else x for x in v])
        v = np.array([[0 if x is None else 1 for x in v]]).T
        return self.project_2D_to_3D_plane(point2D, P, v)
```
### `project_2D_to_3D_plane`

```python
    def project_2D_to_3D_plane(self, point2D: Point2D, point3D: Point3D, n) -> Point3D:
        """ Using the calib object, project a 2D point into the 3D space,
        given a point of a 3D plane and its normal vector, with which the
        projection ray intersects.
        Args:
            point2D (Point2D): the 2D point to be projected
            point3D (Point3D): a Point3D on the plane
            n (np.ndarray): the normal vector of the plane
        Returns:
            The point in the 3D world that projects on `point2D` and lies on
            the given plane.
        """
        assert isinstance(point2D, Point2D), "Wrong argument type '{}'. Expected {}".format(type(point2D), Point2D)
        assert isinstance(point3D, Point3D), "Wrong argument type '{}'. Expected {}".format(type(point3D), Point3D)

        point2D = self.rectify(point2D)
        X = Point3D(self.Pinv @ point2D.H)
        d = (X - self.C)
        return line_plane_intersection(self.C, d, point3D, n)
```
### `distort`

```python
    def distort(self, point2D: Point2D) -> Point2D:
        """ Applies lens distortions to the given `point2D`.
        """
        if np.any(self.k):
            rad1, rad2, tan1, tan2, rad3 = self.k.flatten()
            # Convert image coordinates to camera coordinates (with z=1 which is the projection plane)
            point2D = Point2D(self.Kinv @ point2D.H)

            r2 = point2D.x*point2D.x + point2D.y*point2D.y
            delta = 1 + rad1*r2 + rad2*r2*r2 + rad3*r2*r2*r2
            dx = np.array([
                2*tan1*point2D.x*point2D.y + tan2*(r2 + 2*point2D.x*point2D.x),
                2*tan2*point2D.x*point2D.y + tan1*(r2 + 2*point2D.y*point2D.y)
            ]).reshape(2, -1)

            point2D = point2D*delta + dx
            # Convert camera coordinates to pixel coordinates
            point2D = Point2D(self.K @ point2D.H)
        return point2D
```
### `rectify`

```python
    def rectify(self, point2D: Point2D) -> Point2D:
        """ Removes lens distortion from the given `Point2D`.
        """
        if np.any(self.k):
            rad1, rad2, tan1, tan2, rad3 = self.k.flatten()
            point2D = Point2D(self.Kinv @ point2D.H)  # to camera coordinates

            r2 = point2D.x*point2D.x + point2D.y*point2D.y
            delta = 1 + rad1*r2 + rad2*r2*r2 + rad3*r2*r2*r2
            dx = np.array([
                2*tan1*point2D.x*point2D.y + tan2*(r2 + 2*point2D.x*point2D.x),
                2*tan2*point2D.x*point2D.y + tan1*(r2 + 2*point2D.y*point2D.y)
            ]).reshape(2, -1)

            point2D = (point2D - dx)/delta
            point2D = Point2D(self.K @ point2D.H)  # to pixel coordinates
        return point2D
```
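A quick round-trip check: `rectify` should approximately invert `distort` for points that are not too far from the image center. The distortion coefficients below are purely illustrative, and it is assumed here that `update` accepts a `k` keyword:

```python
# Same hypothetical intrinsics as above, with a mild radial distortion term.
calib_dist = calib.update(k=np.array([[-0.2], [0.05], [0], [0], [0]]))

p = Point2D(700, 400)
p_dist = calib_dist.distort(p)       # where the ideal pixel lands on the distorted image
p_back = calib_dist.rectify(p_dist)  # ≈ p, up to the one-step approximation
```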
```python
def crop(self, x_slice: slice, y_slice: slice) -> 'Calib':
    """ Returns a calib corresponding to a camera whose image was cropped
        to the given `x_slice` and `y_slice`.
    """
    x0 = x_slice.start
    y0 = y_slice.start
    width = x_slice.stop - x_slice.start
    height = y_slice.stop - y_slice.start
    T = np.array([[1, 0,-x0], [0, 1,-y0], [0, 0, 1]])
    return self.update(width=width, height=height, K=T@self.K)
```
```python
def scale(self, output_width: int=None, output_height: int=None, sx: float=None, sy: float=None) -> 'Calib':
    """ Returns a calib corresponding to a camera that was scaled by horizontal and vertical scale
        factors `sx` and `sy`. If `sx` and `sy` are not given, the scale factors are computed with the current
        width and height and the given `output_width` and `output_height`.
    """
    sx = sx or output_width/self.width
    sy = sy or output_height/self.height
    width = output_width or int(self.width*sx)
    height = output_height or int(self.height*sy)
    S = np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]])
    return self.update(width=width, height=height, K=S@self.K)
```
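Cropping and scaling compose naturally, e.g. to obtain the calib of a 640×360 thumbnail of the top-left quadrant (a sketch with the hypothetical `calib` above):

```python
# Calib of the top-left 960x540 quadrant, resized to 640x360.
quadrant = calib.crop(slice(0, 960), slice(0, 540))
thumb = quadrant.scale(output_width=640, output_height=360)
# Any visible 3D point now projects at 2/3 of its quadrant pixel coordinates.
```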
```python
def flip(self) -> 'Calib':
    """ Returns a calib corresponding to a camera that was flipped horizontally.
    """
    F = np.array([[-1, 0, self.width-1], [0, 1, 0], [0, 0, 1]])
    return self.update(K=F@self.K)
```
```python
def rotate(self, angle) -> 'Calib':
    """ Returns a calib corresponding to a camera that was rotated by `angle` degrees.
        angle (float) : Angle of rotation in degrees.
    """
    if angle == 0:
        return self
    A, new_width, new_height = compute_rotate(self.width, self.height, angle, degrees=True)
    return self.update(K=A@self.K, width=new_width, height=new_height)
```
```python
def rot90(self, k) -> 'Calib':
    """ Returns a calib corresponding to a camera that was rotated `k` times around its main axis.
        k (int) : Number of times the array is rotated by 90 degrees.

        .. todo:: This method should be tested.
    """
    raise NotImplementedError("Method not tested")
    R = np.linalg.matrix_power(np.array([[0, 1, 0], [-1, 0, 0], [0, 0, 1]]), k)  # `**k` would be element-wise
    transpose = k % 2
    return self.update(K=R@self.K, width=self.height if transpose else self.width, height=self.width if transpose else self.height)
```
```python
def compute_length2D(self, point3D: Point3D, length3D: float) -> np.ndarray:
    """ Returns the length in pixels of a 3D length `length3D` located at `point3D`.

        .. versionchanged:: 2.8.0: `length3D` and `point3D` were swapped.
    """
    assert np.isscalar(length3D), f"This function expects a scalar `length3D` argument. Received {length3D}"
    point3D_c = Point3D(np.hstack((self.R, self.T)) @ point3D.H)  # Point3D expressed in camera coordinates system
    orth = np.dot(point3D_c.flatten(), Point3D(point3D_c.x, point3D_c.y, 0).flatten())
    orth = orth*length3D/np.linalg.norm(orth)
    point3D_c.y += orth  # add the 3D length orthogonally to the point in camera coordinates
    point2D = self.distort(Point2D(self.K @ point3D_c))  # go in the 2D world
    length = np.linalg.norm(point2D - self.project_3D_to_2D(point3D), axis=0)
    return length
```
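This is handy to convert physical sizes into expected pixel sizes, e.g. the apparent diameter of an object of known width (a sketch with the hypothetical `calib` above; lengths are in whatever unit the calibration uses):

```python
# Apparent size, in pixels, of a 0.5-unit length located at Point3D(1, 0, 0).
pixel_length = calib.compute_length2D(Point3D(1, 0, 0), 0.5)
# ≈ 50 px here: 0.5 units at depth 10, with a focal length of 1000 px.
```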
```python
def compute_length3D(self, point3D: Point3D, length2D: float) -> np.ndarray:
    """ Returns the (maximum) distance of the 3D point located `length2D` pixels away from `point3D` in the image plane.
    """
    assert np.isscalar(length2D), f"This function expects a scalar `length2D` argument. Received {length2D}"
    point2D_shifted = self.project_3D_to_2D(point3D)
    point2D_shifted.x += length2D
    point2D_shifted = self.rectify(point2D_shifted)
    point3D_shifted_c = Point3D(self.Kinv @ point2D_shifted.H)
    point3D_c = Point3D(np.hstack((self.R, self.T)) @ point3D.H)  # Point3D expressed in camera coordinates system
    point3D_shifted_c = point3D_shifted_c*point3D_c.z/point3D_shifted_c.z
    length = np.linalg.norm(point3D_shifted_c - point3D_c, axis=0)
    return length
```
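Conversely, a tolerance expressed in pixels can be converted back into world units at a given depth:

```python
# World distance corresponding to a 50-pixel offset at Point3D(1, 0, 0).
world_length = calib.compute_length3D(Point3D(1, 0, 0), 50)  # ≈ 0.5 with the calib above
```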
```python
def projects_in(self, point3D: Point3D) -> np.ndarray:
    """ Checks whether `point3D` projects inside the `Calib` object's image.
        Returns `True` for points that project inside the image and `False` otherwise.
    """
    point2D = self.project_3D_to_2D(point3D)
    cond = (point2D.x >= 0, point2D.y >= 0, point2D.x <= self.width-1, point2D.y <= self.height-1, np.squeeze(~self.is_behind(point3D)))
    return np.all(cond, axis=0)
```
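Like most of this API, the check is vectorized: a `Point3D` holding several points as columns (assuming the constructor accepts a 3×N array) yields one boolean per point:

```python
# Three points: near the optical axis, far off to the side, and behind the camera.
points = Point3D(np.array([[0, 100,   0],
                           [0,   0,   0],
                           [0,   0, -20]], dtype=float))
mask = calib.projects_in(points)  # array([ True, False, False]) with the calib above
```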
```python
def dist_to_border(self, point3D: Point3D) -> tuple:
    """ Returns the distance, in both dimensions, between the projection of
        each point given in `point3D` and the `Calib` object's image borders.
    """
    point2D = self.project_3D_to_2D(point3D)
    distx = np.min(np.stack((self.width - point2D.x, point2D.x)), axis=0)
    disty = np.min(np.stack((self.height - point2D.y, point2D.y)), axis=0)
    return distx, disty
```
```python
def is_behind(self, point3D: Point3D) -> np.ndarray:
    """ Returns `True` for points that are behind the camera and `False` otherwise.
    """
    n = Point3D(0, 0, 1)  # normal to the camera plane, expressed in the camera 3D coordinates system
    point3D_c = Point3D(np.hstack((self.R, self.T)) @ point3D.H)  # point3D expressed in the camera 3D coordinates system
    return np.asarray((n.T @ point3D_c)[0] < 0)
```
```python
def line_plane_intersection(C: Point3D, d, P: Point3D, n) -> Point3D:
    """ Finds the intersection between a line and a plane.
        Args:
            C (Point3D): a point on the line
            d (np.ndarray): the direction-vector of the line
            P (Point3D): a Point3D on the plane
            n (np.ndarray): the normal vector of the plane
        Returns the Point3D at the intersection between the line and the plane
        (NaN where the line is parallel to the plane).
    """
    d = d/np.linalg.norm(d, axis=0)
    dot = d.T @ n
    with np.errstate(divide='ignore', invalid='ignore'):
        dist = ((P-C).T @ n) / dot  # signed distance from C to the plane, along d
        dist[np.isinf(dist)] = np.nan
    return Point3D(C + dist.T*d)
```
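A standalone sanity check, using only numpy and `Point3D`:

```python
# Ray from the origin along (1, 1, 1), intersected with the plane Z=3.
hit = line_plane_intersection(Point3D(0, 0, 0), np.array([[1], [1], [1]]),
                              Point3D(0, 0, 3), np.array([[0], [0], [1]]))
# hit ≈ Point3D(3, 3, 3)
```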
```python
def point_line_distance(P, d, X) -> float:
    """ Finds the distance between a line and a point.
        Args:
            P (HomogeneousCoordinatesPoint): a point on the line
            d (np.ndarray): the line direction-vector
            X (HomogeneousCoordinatesPoint): a point
        Returns the distance between the line and the point.
    """
    return np.linalg.norm(np.cross(d.flatten(), (P-X).flatten()))/np.linalg.norm(d)
```
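For example, the distance between the X axis and a point one unit away from it:

```python
# Line through the origin with direction (1, 0, 0), point at (5, 0, 1): distance 1.
dist = point_line_distance(Point3D(0, 0, 0), np.array([1, 0, 0]), Point3D(5, 0, 1))
```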
```python
def compute_rotate(width, height, angle, degrees=True):
    """ Computes the rotation matrix and the new width and height for a rotation
        of `angle` degrees of a `width`×`height` image.
    """
    assert degrees, "Angles in radians are not supported (yet)"
    # Convert the OpenCV 2x3 rotation matrix to 3x3
    R = np.eye(3)
    R[0:2,:] = cv2.getRotationMatrix2D((width/2, height/2), angle, 1.0)
    R2D = R[0:2,0:2]

    # Obtain the rotated coordinates of the image corners
    rotated_coords = [
        np.array([-width/2,  height/2]) @ R2D,
        np.array([ width/2,  height/2]) @ R2D,
        np.array([-width/2, -height/2]) @ R2D,
        np.array([ width/2, -height/2]) @ R2D
    ]

    # Find the size of the new image
    right_bound = max([pt[0] for pt in rotated_coords])
    left_bound  = min([pt[0] for pt in rotated_coords])
    top_bound   = max([pt[1] for pt in rotated_coords])
    bot_bound   = min([pt[1] for pt in rotated_coords])

    new_width = int(abs(right_bound - left_bound))
    new_height = int(abs(top_bound - bot_bound))

    # A translation matrix is required to keep the image centred in the enlarged canvas
    T = np.array([
        [1, 0, new_width/2 - width/2],
        [0, 1, new_height/2 - height/2],
        [0, 0, 1]
    ])
    return T@R, new_width, new_height
```
```python
def rotate_image(image, angle):
    """ Rotates an image around its center by the given angle (in degrees).
        The returned image will be large enough to hold the entire new image, with a black background.
    """
    height, width = image.shape[0:2]
    A, new_width, new_height = compute_rotate(width, height, angle)
    return cv2.warpAffine(image, A[0:2,:], (new_width, new_height), flags=cv2.INTER_LINEAR)
```
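Rotating an image and its calib together keeps projections consistent. A sketch, where `frame` stands for any hypothetical image matching `calib`'s dimensions:

```python
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # placeholder image
rotated_frame = rotate_image(frame, 15)
rotated_calib = calib.rotate(15)
# A world point visible in `frame` now projects in `rotated_frame` where
# rotated_calib.project_3D_to_2D(...) predicts.
```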
```python
def parameters_to_affine_transform(angle: float, x_slice: slice, y_slice: slice,
                                   output_shape: tuple, do_flip: bool=False):
    """ Computes the affine transformation matrix that corresponds to a
        - horizontal flip if `do_flip` is `True`, followed by a
        - rotation of `angle` degrees around the output image center, followed by a
        - crop defined by `x_slice` and `y_slice`, followed by a
        - scale to recover `output_shape`.
    """
    # Rotation
    R = np.eye(3)
    center = ((x_slice.start + x_slice.stop)/2, (y_slice.start + y_slice.stop)/2)
    R[0:2,:] = cv2.getRotationMatrix2D(center, angle, 1.0)

    # Crop
    x0 = x_slice.start
    y0 = y_slice.start
    width = x_slice.stop - x_slice.start
    height = y_slice.stop - y_slice.start
    C = np.array([[1, 0,-x0], [0, 1,-y0], [0, 0, 1]])

    # Scale
    sx = output_shape[0]/width
    sy = output_shape[1]/height
    S = np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]])

    # Random flip (currently disabled: as written, `F` mirrors around x=0
    # rather than the image's vertical axis, which would require the input
    # image width)
    assert not do_flip, "There is a bug with random flip"
    f = np.random.randint(0,2)*2-1 if do_flip else 1  # random sample in {-1,1}
    F = np.array([[f, 0, 0], [0, 1, 0], [0, 0, 1]])

    return S@C@R@F
```
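A sketch of how such a matrix would typically be consumed with OpenCV (the slices and angle are illustrative):

```python
A = parameters_to_affine_transform(angle=10,
                                   x_slice=slice(200, 1160),
                                   y_slice=slice(100, 640),
                                   output_shape=(640, 360))
patch = cv2.warpAffine(frame, A[0:2, :], (640, 360), flags=cv2.INTER_LINEAR)
```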
```python
def compute_rotation_matrix(point3D: Point3D, camera3D: Point3D):
    """ Computes the rotation matrix of a camera located at `camera3D` and
        pointing towards the point `point3D`. Both are expressed in world coordinates.
        The convention is that Z is pointing down.
        Credits: François Ledent
    """
    point3D = camera3D - point3D
    x, y, z = point3D.x, point3D.y, point3D.z
    d = np.sqrt(x**2 + y**2)
    D = np.sqrt(x**2 + y**2 + z**2)
    h = d / D
    l = z / D

    # camera basis `B` expressed in the world basis `O`
    _x = np.array([y / d, -x / d, 0])
    _y = np.array([- l * x / d, - l * y / d, h])
    _z = np.array([- h * x / d, - h * y / d, -l])
    B = np.stack((_x, _y, _z), axis=-1)

    # `O = R @ B` (where `O` := `np.identity(3)`)
    R = B.T  # inv(B) == B.T since B is orthonormal
    return R
```
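A quick sanity check with a camera placed above and to the side of its target (Z pointing down, so negative Z means above the ground):

```python
R = compute_rotation_matrix(Point3D(0, 0, 0), Point3D(5, 0, -10))
assert np.allclose(R @ R.T, np.eye(3))  # R is a valid rotation matrix

# Expressed in camera coordinates, the target lies straight ahead (along +z).
target_c = R @ (Point3D(0, 0, 0) - Point3D(5, 0, -10))  # ≈ (0, 0, 11.18)
```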
```python
def compute_shear_rectification_matrix(calib: Calib, point2D: Point2D):
    """ Computes the transformation matrix that rectifies the image shear due to
        projection in the image plane, at the point `point2D`.
    """
    vector = Point2D(point2D.x - calib.K[0,2], point2D.y - calib.K[1,2])  # offset from the principal point
    # Rotate around `point2D` so that `vector` is aligned with the x axis
    R1 = np.eye(3)
    R1[0:2,:] = cv2.getRotationMatrix2D(point2D.to_int_tuple(), np.arctan2(vector.y, vector.x)*180/np.pi, 1.0)
    scale = np.cos(np.arctan(np.linalg.norm(vector)/calib.K[0,0]))
    # Compress along x at `point2D`, then undo the rotation and translation
    T1 = np.array([[1, 0, -point2D.x], [0, 1, -point2D.y], [0, 0, 1]])
    S = np.array([[scale, 0, 0], [0, 1, 0], [0, 0, 1]])
    R2 = np.eye(3)
    R2[0:2,:] = cv2.getRotationMatrix2D((0, 0), -np.arctan2(vector.y, vector.x)*180/np.pi, 1.0)
    T2 = np.array([[1, 0, point2D.x], [0, 1, point2D.y], [0, 0, 1]])
    return T2@R2@S@T1@R1
```