Over the last decade, machine learning has changed the way we carry out our daily tasks. In this new age of machine intelligence, the use of computer assistance has skyrocketed in fields ranging from education to health care. In recent years, computer-assisted medical diagnosis has improved significantly, and as computing power increases, the models used by medical professionals become more and more accurate. Within the medical field, automated disease detection in videos and images of the gastrointestinal tract has received much attention in recent years. However, the quality of the image data is often reduced by text overlays, personal data, and black corners around the medical images.
To help improve computer-aided diagnosis, our work explores ways to help existing models find anomalies in medical images more accurately. In this thesis, we tackle the misclassification of data caused by overlays and other artefacts in medical image data.
We look at how machine learning can be used to develop a system that increases the classification accuracy of existing models, and we examine preprocessing in depth to see whether it has a place in modern classification models based on deep learning.
Throughout this thesis, we look at different tools for removing dataset-specific artefacts and at the consequences of removing them. Our primary focus is the use of generative adversarial networks to cover up the parts of our medical images that we have deemed unwanted.
In the end, we demonstrate that our system can be of great use as a tool for preprocessing medical data, showing that with our tools, pretrained networks can be generalised to a much greater extent. With our preprocessing, our models saw an increase in classification accuracy of 29.5% when training on new, unseen data.