coreclean - Sam Hutton

coreclean is a Python utility that uses convolutional neural networks and standard image-processing techniques to remove backgrounds, cracks, and burrows from images of sediment cores.

This is important because sediment colour contains information about the Earth's climate. At the Iberian Margin (where the cores I used were taken), the record stretches back over 1.4 million years.

Abstract

Sediment cores contain a long-timescale record of important climatological parameters. Even common, expensive, time-consuming sampling methods cannot match the resolution of colour data taken from digital photographs, but this data is noisy, in part due to bioturbation, drilling disturbance, and sampling artifacts. This report presents a number of improvements to existing neural-network-based methods of segmenting ‘undisturbed’ sediment from core images, and applies the newly created model to produce a filtered record of lightness over 160 m (∼1400 ka). These improvements include a novel morphological-process background removal, selection of an improved model architecture (SegNet), and three methods for synthetic data augmentation. Data augmentation did not have a significant effect on model performance. With all of these improvements, our new method produces an Intersection-over-Union (IoU) of 0.53 ± 0.09. This model likely underpredicts disturbance, despite selecting 26.6% of the average core. This suggests a need for more training on a larger dataset to improve the model, and a need for stronger statistical tools to address the gaps that disturbance creates in the record. Cross-correlation analysis suggests that the resolution available in undisturbed sediment may be lower than previous estimates. Tools for using this model, along with other common stitching and averaging operations on digital photographs, are presented in the coreclean module for Python, developed alongside this report.
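For readers unfamiliar with the Intersection-over-Union score quoted above, the following is a minimal sketch of how it is commonly computed for a pair of binary segmentation masks. It is not taken from the coreclean module: the function name `intersection_over_union`, the toy masks, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration, and the report may compute or average the metric differently.

```python
import numpy as np


def intersection_over_union(predicted, target):
    """Return the IoU of two binary masks of the same shape.

    Hypothetical helper for illustration only; not part of coreclean.
    """
    predicted = np.asarray(predicted, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if predicted.shape != target.shape:
        raise ValueError("masks must have the same shape")

    # Pixels flagged in either mask.
    union = np.logical_or(predicted, target).sum()
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0

    # Pixels flagged in both masks.
    intersection = np.logical_and(predicted, target).sum()
    return float(intersection / union)


# Toy 3x4 masks: 1 marks a pixel labelled as the class of interest.
predicted = np.array([[1, 1, 0, 0],
                      [1, 1, 0, 0],
                      [0, 0, 0, 0]])
target = np.array([[0, 1, 1, 0],
                   [0, 1, 1, 0],
                   [0, 0, 0, 0]])

# 2 pixels overlap and 6 pixels are flagged in at least one mask.
iou = intersection_over_union(predicted, target)
print(f"IoU = {iou:.3f}")
```

A score of 1 means the predicted and reference masks coincide exactly, and 0 means they share no pixels, so the reported 0.53 ± 0.09 indicates partial overlap between the model output and the hand-labelled masks.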

Selected Figures

Full Report