Acoustic imaging allows the detection and localisation of sound sources in industrial applications. Commonly, acoustic imaging combines an array of microphones with a digital camera module. While the camera module creates images with a lens and an array of light sensors, the microphone array creates audio images through beamforming, which estimates the audio level at each point of a pixel grid. Locating sound sources then involves overlaying the audio image onto the camera image. Proper alignment between the two images is crucial for correctly locating the sound source. This thesis proposes a low-complexity setup for calibrating the alignment between the camera and audio images. A sound source with a known colour is placed in front of an acoustic imaging array and plays a known signal. The location of the sound source is then found using both the digital camera and the microphone array. By combining several recordings of the sound source in different locations, we can measure any differences in alignment between the camera and audio images. Alignment errors are corrected with a least-squares estimator of the camera sensor offset and camera rotation. The estimated offset and rotation are then applied to the camera image, giving near-perfect alignment.
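As a rough illustration of the least-squares step, the sketch below estimates a rotation and an offset from paired source locations. It is only a minimal sketch under assumptions not stated in the abstract: the alignment error is modelled as a 2D rigid transform in pixel coordinates, the paired locations are already available as arrays, and the function name `estimate_rigid_alignment`, the variable names, and the synthetic data are all hypothetical. The estimator used in the thesis itself may differ.

```python
import numpy as np


def estimate_rigid_alignment(camera_pts, audio_pts):
    """Least-squares 2D rotation angle and offset mapping camera_pts onto audio_pts.

    Assumed model (not taken from the thesis): audio = R(theta) @ camera + offset.
    """
    p = np.asarray(camera_pts, dtype=float)
    q = np.asarray(audio_pts, dtype=float)

    # Centre both point sets so the rotation can be solved independently of the offset.
    p_mean, q_mean = p.mean(axis=0), q.mean(axis=0)
    pc, qc = p - p_mean, q - q_mean

    # Closed-form least-squares rotation angle for 2D point pairs.
    cross = np.sum(pc[:, 0] * qc[:, 1] - pc[:, 1] * qc[:, 0])
    dot = np.sum(pc[:, 0] * qc[:, 0] + pc[:, 1] * qc[:, 1])
    theta = np.arctan2(cross, dot)

    # The offset follows from the centroids once the rotation is known.
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, -s], [s, c]])
    offset = q_mean - rotation @ p_mean
    return theta, offset


# Synthetic, noise-free example: eight hypothetical source positions in pixels.
rng = np.random.default_rng(0)
camera_pts = rng.uniform(0.0, 640.0, size=(8, 2))
true_theta = np.deg2rad(2.0)
true_offset = np.array([5.0, -3.0])
c, s = np.cos(true_theta), np.sin(true_theta)
true_rotation = np.array([[c, -s], [s, c]])
audio_pts = camera_pts @ true_rotation.T + true_offset

theta, offset = estimate_rigid_alignment(camera_pts, audio_pts)
```

With real recordings, both sets of locations are noisy, so the recovered angle and offset are the values that minimise the summed squared distance between the transformed camera locations and the audio locations. Using more source positions spread across the image generally makes the estimate more stable.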