Camera Model. Part 2


I am continuing this article from Camera Model Part 1, where I last mentioned homogeneous coordinates. So why should we use homogeneous coordinates? “Geometric intuition useful but not well suited to calculation”. An image captured from a real object is not the end of the story: this is computer vision, so the observed image has to be processed computationally for whatever the task requires. Personally, I am studying the camera model so that I can move on to stereo vision in the near future, or at least master two-view geometry. I hope I can share something about stereo vision in the future.

First, homogeneous coordinates are naturally an over-parameterized space. For example, a point in Euclidean representation (left) is transformed to its homogeneous representation (right):

  \begin{pmatrix}  x_{1}, & x_{2}, & x_{3}  \end{pmatrix}^{T}  \rightarrow  \begin{bmatrix}  x_{1}\\  x_{2}\\  x_{3}\\  1  \end{bmatrix}

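Second, a homogeneous vector obtained from a Euclidean point always has last coordinate w = 1, so w ≠ 0. The reverse direction is different: a general homogeneous vector can have any w ≠ 0, and converting it to Euclidean form divides by w, so the Euclidean representation naturally loses one parameter compared with the homogeneous one.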

   \begin{bmatrix}  x\\  y\\  z\\  w  \end{bmatrix}  \rightarrow   \begin{pmatrix}  x/w, & y/w, & z/w  \end{pmatrix}^{T}
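The two conversions above can be sketched in a few lines of Python. This is only a minimal illustration, not a library API: the function names `to_homogeneous` and `from_homogeneous` are hypothetical, and points are assumed to be plain sequences of numbers.

```python
def to_homogeneous(point):
    """Append w = 1 to a Euclidean point given as a sequence of numbers."""
    return [float(c) for c in point] + [1.0]


def from_homogeneous(vector):
    """Divide by the last coordinate w and drop it; w must be non-zero."""
    *coords, w = vector
    if w == 0:
        # w = 0 is a point at infinity and has no Euclidean equivalent.
        raise ValueError("cannot convert a vector with w = 0")
    return [c / w for c in coords]


p = to_homogeneous([2.0, 4.0, 6.0])          # [2.0, 4.0, 6.0, 1.0]
q = from_homogeneous([4.0, 8.0, 12.0, 2.0])  # [2.0, 4.0, 6.0]
```

Note that `p` and `[4.0, 8.0, 12.0, 2.0]` map back to the same Euclidean point: scaling a homogeneous vector by any non-zero factor does not change the point it represents, which is the over-parameterization mentioned earlier.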

Okay, I think that is enough explanation of homogeneous coordinates for now; I may write an article just about homogeneous coordinates next time. Let’s get back to the camera model and its relation to homogeneous coordinates.

Central projection using homogeneous coordinates. If the world point (which I called the real object before) and the image point are represented by homogeneous vectors, then central projection is very simply expressed as a linear mapping between their homogeneous coordinates. In particular, the pinhole camera geometry (see the image in Part 1) may be written in terms of matrix multiplication as

  \begin{pmatrix}  X\\  Y\\  Z\\  1  \end{pmatrix}  \rightarrow  \begin{pmatrix}  fX\\  fY\\  Z  \end{pmatrix}= \begin{bmatrix}  f & & & 0\\  & f & & 0\\  & & 1 & 0  \end{bmatrix}  \begin{pmatrix}  X\\  Y\\  Z\\  1  \end{pmatrix}

The matrix in this expression may be written as diag\bigl(\begin{smallmatrix}  f, & f, & 1  \end{smallmatrix}\bigr)  \begin{bmatrix}  I|0  \end{bmatrix} where diag\bigl(\begin{smallmatrix}  f, & f, & 1  \end{smallmatrix}\bigr)  is a diagonal matrix and \begin{bmatrix}  I|0  \end{bmatrix} represents a matrix divided up into a 3 × 3 block (the identity matrix) plus a column vector, here the zero vector.
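To see the mapping at work, here is a small sketch in plain Python. The names `projection_matrix` and `project` and the numeric values are only illustrative assumptions; in practice an array library would usually do the matrix product.

```python
def projection_matrix(f):
    """Return the 3x4 matrix diag(f, f, 1) [I | 0] as nested lists."""
    return [
        [f, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ]


def project(f, world_point):
    """Map a homogeneous world point (X, Y, Z, 1) to a homogeneous image point."""
    P = projection_matrix(f)
    # Plain matrix-vector product: one dot product per matrix row.
    return [sum(p * x for p, x in zip(row, world_point)) for row in P]


f = 2.0                                      # focal length, illustrative value
image_h = project(f, [1.0, 2.0, 4.0, 1.0])   # (fX, fY, Z) = [2.0, 4.0, 4.0]

# Dividing by the last coordinate gives the Euclidean image point (fX/Z, fY/Z).
u, v = image_h[0] / image_h[2], image_h[1] / image_h[2]   # (0.5, 1.0)
```

The projection itself is a single linear step; the non-linear division by Z happens only at the end, when the homogeneous image point is converted back to Euclidean coordinates.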


  • The camera model is naturally described in Euclidean coordinates, but that form is not well suited to calculation
  • Homogeneous coordinates solve this by turning the projection into a linear (matrix) operation
  • diag\bigl(\begin{smallmatrix}  f, & f, & 1  \end{smallmatrix}\bigr)  \begin{bmatrix}  I|0  \end{bmatrix} is the matrix that represents the camera model (central projection) in homogeneous coordinates

Reference :

  4. Richard Hartley and Andrew Zisserman, “Multiple View Geometry in Computer Vision”, Cambridge University Press, 2000, 2003