General information

About this site

This page is a reference guide to the exercise program in the course INF4300 - Digital Image Analysis at the University of Oslo. It is an additional resource provided by me, your teaching assistant, and should not be confused with the official course site.

I will post exercises and assignments, together with solutions, both code and some explanatory text, at a semi-regular frequency, following the course progression. My solutions will be in python, but take a look at Ole Marius’ excellent page from a previous years class for a similar resource, using matlab.

As a disclaimer, I must emphasis that most of the content on this page is my understanding of the different topics. The solution proposals, are excactly that; proposals, so if there are things you do not understand, or you have ideas on how to improve the solutions, I am happy to hear from you.

Image representation

In digital imaging, an image $f$ is essentially a function mapping indices to real (or complex) values

$\begin{align} f &: I_m \times I_n \times I_c \to \mathbb{R} \\ &: [i, j, k] \mapsto f[i, j, k], \end{align}$

where the index set $I_r = \{0, 1, ..., r - 1\}$ for $r = m, n, c$ . Normally we think of $m$ and $n$ as the height and with of the image, respectively, and $c$ as the number of channels. If we collect all the values for all indices in an array, we get the usual array-representation of an image $f \in \mathbb{R}^{m\times n\times c}$ .

Common image categories are $b$ bits one-channel (i.e. graylevel) images, taking values in $[0, 2^b - 1]$ , and rgb(a) images taking real values in $[0, 1]$ for the red, blue, and green channel.

Coordinate conventions

In a conventional 2D Cartesian coordinate system, we use coordinates $(x, y)$ to specify a certain location on the plane. We have an origin at $(0, 0)$ , a positive x-axis horizontally to the right, and a positive y-axis vertically upwards.

When we are dealing with digital images that are represented as an array with numbers, (e.g. a 2D graylevel image with $m$ vertical pixels, and $n$ horizontal pixels), this will be represented with a $m\times n$ matrix (which values are the graylevel intensity values in the image). I will denote a pixel location with $[x, y]$ (or $[i, j]$ ), so that the positive $x$ (or $i$ ) -axis is vertically downwards, and the positive y (or $j$ ) -axis is horizontally to the right (that is, the conventional cartesian system is rotated 90 degrees clockwise).

Note that this conventionis not neccessarily followed by every programming language or library. Most uses the same convention with [row_index, column_index], but when naming the axis, there can be some discrepancies (e.g. matplotlib.pyplot in python uses positive $y$ downwards and positive $x$ horizontally to the right). You can choose whatever convention you want, as long as you know what you are doing, and are consistent, but this is the way that works best for me.

I will explicitly use square brackets $[ ]$ when dealing with images, and parenthesis $( )$ when dealing with arguments of general mathematical functions.