Image filtering

This task is about filtering an image with a filter kernel. Mathematically, this is the operation of convolution (or correlation) applied to an image with some filter kernel in order to transform the original image. Different filter kernels can e.g. blur the image, highlight edges in the image, or discriminate between different textures in the image.

For an image $I$ of size $M \times N$, and a kernel $K$ of size $P \times Q$, the convolution of $I$ with $K$, at pixel coordinate $(x, y)$, is

$$(I * K)(x, y) = \sum_{i} \sum_{j} K(i, j) \, I(x - i, y - j)$$

and similarly for the closely related correlation

$$(I \star K)(x, y) = \sum_{i} \sum_{j} K(i, j) \, I(x + i, y + j)$$

which is identical to a convolution when the kernel is symmetric under a 180° rotation about its centre, i.e. $K(i, j) = K(-i, -j)$. When applied to a whole image, we often read the term “sliding”, which means that we apply the above definitions at every pixel in the image (with special treatment at the boundary).
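To make the relationship between the two operations concrete, here is a minimal sketch (with made-up example values): SciPy's `correlate2d` gives the same result as `convolve2d` with the kernel rotated by 180°.

```python
import numpy as np
from scipy import signal

# Small example image and a deliberately asymmetric kernel (arbitrary values)
img = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1.0, 2.0, 0.0],
                   [0.0, 3.0, 0.0],
                   [0.0, 4.0, 5.0]])

# Correlation equals convolution with the kernel rotated by 180 degrees
corr = signal.correlate2d(img, kernel, mode='same')
conv = signal.convolve2d(img, kernel[::-1, ::-1], mode='same')
print(np.allclose(corr, conv))  # True
```

For a kernel that is invariant under this rotation (like the mean kernel used later), convolution and correlation therefore give the same filtered image.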

Observing the definition of the convolution, we see that for pixels close to the boundary (where $x - i < 0$ or $x - i > M - 1$, and similarly for $y - j$), we are outside the domain of the image. There are three main ways to resolve this.

  • One way is to apply the full discrete linear convolution of the inputs, in which case the convolved image has size $(M + P - 1) \times (N + Q - 1)$. See the 1D demonstration further below for what is meant by a full convolution.
  • A second way is to only use valid values, in which case the convolved image has size $(M - P + 1) \times (N - Q + 1)$. In this case the whole “sliding window” is always inside the original image boundaries.
  • A third way is to force the convolved image to have the same size as the original by padding the image with $(P - 1)/2$ pixels at each horizontal border, and with $(Q - 1)/2$ pixels at each vertical border. This means that when we are in the infeasible boundary region, we extend the boundary with a desired number of pixels, and give them some values, before applying the convolution. Common boundary treatments are demonstrated in the illustration further below.
import numpy as np
from scipy import signal
import cv2

# Blurring of a 2D graylevel image

img_filename = '/path/to/some/image.jpg'
img = cv2.imread(img_filename, cv2.IMREAD_GRAYSCALE) # image shape is (M x N)

# Construct a filter kernel used for blurring, here P = Q = 11
mean_kernel = np.ones((11, 11)) / (11 * 11)

# Convolutions with different boundary treatments (uncomment the ones you want
# to try)
#mean_img = signal.convolve2d(img, mean_kernel, mode='full')
#mean_img = signal.convolve2d(img, mean_kernel, mode='valid')
#mean_img = signal.convolve2d(img, mean_kernel, mode='same', boundary='fill', fillvalue=0)
#mean_img = signal.convolve2d(img, mean_kernel, mode='same', boundary='wrap')
mean_img = signal.convolve2d(img, mean_kernel, mode='same', boundary='symm')

print('Shape of original:  ', img.shape)
print('Shape of convolved: ', mean_img.shape)
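As a quick sanity check of the three output sizes discussed above, here is a minimal sketch using an arbitrary 64×64 random image (the concrete numbers are example values, not from the original) and the 11×11 mean kernel:

```python
import numpy as np
from scipy import signal

M, N = 64, 64                        # example image size
P, Q = 11, 11                        # kernel size
img = np.random.rand(M, N)
mean_kernel = np.ones((P, Q)) / (P * Q)

full = signal.convolve2d(img, mean_kernel, mode='full')
valid = signal.convolve2d(img, mean_kernel, mode='valid')
same = signal.convolve2d(img, mean_kernel, mode='same')

print(full.shape)   # (M + P - 1, N + Q - 1) = (74, 74)
print(valid.shape)  # (M - P + 1, N - Q + 1) = (54, 54)
print(same.shape)   # (M, N) = (64, 64)
```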

Below follows a demonstration of what is meant by a full convolution. I chose to illustrate it in 1D, as it is simpler, and I think that it illustrates the point quite well.

# :::::::::::::::::::::::::::::::: #
# 1D illustrations of convolutions #
# :::::::::::::::::::::::::::::::: #


# Demonstration of full convolution in one dimension #
# -------------------------------------------------- #

# a = [1, 2, 3, 4]
# b = [5, 6, 7]
#
# convolve(a, b) = [1*5,
#                   2*5 + 1*6,
#                   3*5 + 2*6 + 1*7,
#                   4*5 + 3*6 + 2*7,
#                   4*6 + 3*7,
#                   4*7]
#
#                 = [5, 16, 34, 52, 45, 28]

# Check
import numpy as np
from scipy import signal

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7])
c = signal.convolve(a, b, mode='full')
print(c)  # should output [ 5 16 34 52 45 28]


# Demonstration of boundary extensions #
# ------------------------------------ #

# Image boundaries are denoted with |, so in this case, the boundary is
# extended with three pixels at each side, and the original image is of length
# 5.

# Constant with value x:          xxx|abcde|xxx
# Mirror (symmetric, reflection): cba|abcde|edc
# Wrap:                           cde|abcde|abc
# Replicate:                      aaa|abcde|eee
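The same extensions can be reproduced with `np.pad` (the numbers below simply stand in for the pixels a–e):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])  # the "image" abcde

print(np.pad(a, 3, mode='constant', constant_values=9))  # xxx|abcde|xxx with x = 9
print(np.pad(a, 3, mode='symmetric'))                    # cba|abcde|edc
print(np.pad(a, 3, mode='wrap'))                         # cde|abcde|abc
print(np.pad(a, 3, mode='edge'))                         # aaa|abcde|eee
```

These mode names match scipy.signal.convolve2d's `boundary='fill'`, `'symm'`, and `'wrap'` options (replicate has no direct convolve2d counterpart).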