Camera Model. Part 1

Posted in Artificial Intelligence, Computer Science, Computer Vision

In this article I want to explain the first topic of computer vision, which is the camera model. Image processing is usually the first material to study, but I think we can jump straight to this topic. Both fields deal with visual data, but they focus on different things: image processing is about how to process a raw two-dimensional image according to the rules of digital images, while computer vision is about artificial vision that imitates how the eye works, so it mostly deals with visual perspective. We do not need to start from a sophisticated algorithm to achieve the purpose of computer vision; an algorithm built on the camera model becomes a sophisticated computer vision algorithm as a natural result.

Besides, the book “Multiple View Geometry in Computer Vision” places this material before the reader moves on to the geometry of more than one view. I also think it is the fundamental and very first material needed to understand augmented reality that uses markers or image tracking as the bridge between the real and the virtual environment. If this material is not implemented, the augmented reality will not be immersive.

What is Camera Model?

1. Abstract pinhole camera

The camera model is a model of the pinhole camera, represented geometrically.

2. Pinhole camera represented geometrically: (a) 3D perspective, (b) 2D perspective


  • C is the camera centre.
  • I, the image plane, is the plane that holds the single-view geometry of the captured object. I is perpendicular to the Z axis.
  • X, Y, Z are the axes of space.
  • f is the focal length, the distance from C to I.
  • X is the captured object point; the line from X to C passes through the image plane. This point has three space coordinates: X = (X,Y,Z)^{T}

Before I explain the mathematical model of the pinhole camera, I want to point out its purpose: it lets us keep computing with a captured object after perspective projection has transformed it onto the image plane. Dan Huttenlocher said in his Computer Vision presentation material on camera geometry that “Geometric intuition useful but not well suited to calculation”.

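The point (X,Y,Z)^{T} in space is mapped to the point (f\frac{X}{Z},f\frac{Y}{Z},f)^{T} on the image plane. Ignoring the final coordinate f, which is the same for every point on the image plane, gives (X,Y,Z)^{T} \mapsto (f\frac{X}{Z},f\frac{Y}{Z})^{T}. This describes the central projection mapping from world to image coordinates, which means it is a mapping from Euclidean 3-space \mathbb{R}^{3} to Euclidean 2-space \mathbb{R}^{2}. The centre of projection is called the camera centre. It is also known as the optical centre.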
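To make the mapping concrete, here is a minimal Python sketch of the central projection. The function name `project_point` and the sample numbers are my own, chosen only for illustration. The sketch assumes the point is already expressed in camera coordinates, with the camera centre C at the origin and the image plane at Z = f.

```python
def project_point(X, Y, Z, f):
    """Project the space point (X, Y, Z)^T onto the image plane Z = f.

    Assumes camera coordinates: camera centre C at the origin and the
    image plane perpendicular to the Z axis at distance f from C.
    Returns the image coordinates (f*X/Z, f*Y/Z).
    """
    if Z == 0:
        # The point lies in the plane through C parallel to the image
        # plane, so the division by Z is undefined.
        raise ValueError("Z must be non-zero for central projection")
    return (f * X / Z, f * Y / Z)


if __name__ == "__main__":
    # A point 4 units in front of the camera, focal length 1.
    print(project_point(2.0, 1.0, 4.0, f=1.0))  # (0.5, 0.25)
    # The same point twice as far away appears half as large.
    print(project_point(2.0, 1.0, 8.0, f=1.0))  # (0.25, 0.125)
```

The division by Z is what makes far objects appear smaller, and it is also why this mapping is not linear in ordinary Euclidean coordinates.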

To make the image plane well suited to calculation, it should be represented in homogeneous coordinates. (Continued in Part 2)

Reference:

  1. Richard Hartley and Andrew Zisserman, “Multiple View Geometry in Computer Vision”, Cambridge University Press, 2000, 2003