Marker detection¶

  • Student: Balthazar Neveu
  • Master MVA, Introduction to geoscience

Raw data exploration¶

  • 600 wells.
  • Location: south of Paris


Raw data captures¶

  • Depth
  • Gamma rays (numerical outliers clamped to zero).


  • Most of the large variations occur at great depth (6-7 km).
  • Most of the shallow signal looks like plain noise. Characteristics?

Characteristics¶

  • Noise
  • Regular sampling rate
  • Spikes = outliers
  • Missing data.

Curation alternatives (pre-processing):

  • 🎯 Interpolate missing data and spikes
  • 💡 Denoise (bilateral or bandpass?)
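
The interpolation option can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function name `interpolate_invalid`, the `spike_threshold` parameter and the global median/MAD spike test are all assumptions made for the example. Samples that are NaN or that lie far from the median are flagged as invalid and replaced by linear interpolation over depth.

```python
import numpy as np


def interpolate_invalid(depth, gamma, spike_threshold=5.0):
    """Replace NaNs and spike outliers by linear interpolation over depth.

    Hypothetical helper: the spike test (global median / MAD) and the
    threshold value are assumptions, not the notebook's method.
    """
    depth = np.asarray(depth, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    finite = np.isfinite(gamma)

    # Robust spread estimate: median absolute deviation (MAD), rescaled so
    # that it matches the standard deviation for Gaussian noise.
    median = np.median(gamma[finite])
    mad = np.median(np.abs(gamma[finite] - median))
    scale = 1.4826 * mad if mad > 0 else 1.0

    deviation = np.zeros_like(gamma)
    deviation[finite] = np.abs(gamma[finite] - median) / scale
    valid = finite & (deviation <= spike_threshold)

    # Fill invalid samples from their valid neighbours.
    cleaned = gamma.copy()
    cleaned[~valid] = np.interp(depth[~valid], depth[valid], gamma[valid])
    return cleaned, valid


depth = np.arange(10.0)
gamma = np.array([10, 11, np.nan, 13, 14, 500, 16, 17, 18, 19], dtype=float)
cleaned, valid = interpolate_invalid(depth, gamma)
```

A global median is a crude reference for logs whose level changes a lot with depth; a rolling median would probably be a better fit there, at the cost of one more parameter (the window size).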

Labelled data exploration¶

  • Data has been labeled by experts. We know the depth of 3 types of events.
  • Labels are sometimes missing (⚠️ depth=NaN).

dataset balance

Depth exploration¶

Absolute depths Relative depths
  • The same depth histogram seems to appear 3 times, each with an offset.
  • If the 3 depths are related, the relationship should be visible in the 2 histograms of differences.
  • Once one of the events is found, the distribution of differences tells us almost exactly where to search for the other ones.
  • Jointly estimating the 3 events with a prior on their relative depths sounds like a good idea.
  • The events we're looking for are located between 6 and 7 km.
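
The relative depths can be summarized with a few lines of NumPy. The sketch below assumes the labels are stored as an array of shape `(n_wells, 3)` with NaN for missing labels; that layout, the function name `relative_depth_stats` and the numbers in the usage example are made up for illustration.

```python
import numpy as np


def relative_depth_stats(marker_depths):
    """Mean and standard deviation of the depth differences between
    consecutive markers, ignoring wells where a label is missing (NaN).

    marker_depths: array of shape (n_wells, 3), an assumed layout.
    """
    marker_depths = np.asarray(marker_depths, dtype=float)
    # Differences marker2 - marker1 and marker3 - marker2, per well.
    diffs = np.diff(marker_depths, axis=1)
    return np.nanmean(diffs, axis=0), np.nanstd(diffs, axis=0)


# Made-up depths (in metres) for three wells; one label is missing.
depths = np.array([
    [6000.0, 6100.0, 6250.0],
    [6500.0, 6600.0, np.nan],
    [6200.0, 6300.0, 6450.0],
])
mean_diff, std_diff = relative_depth_stats(depths)
```

A small standard deviation relative to the mean offset is what would justify using the relative depths as a prior: once one marker is located, the search window for the others can be narrowed to a few standard deviations around the expected offset.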

Patterns exploration¶


neighborhood 100¶

neighborhood 200¶

neighborhood 400¶

neighborhood 64 - 100 samples¶

neighborhood 64 - all samples¶

Templates extraction¶

Major template definition

  • Many other markers cannot be explained by this first major template.
  • More templates should be extracted (proceed by iterative elimination).

Template retrieval¶

  • Find the best match between the template and the full log signal.
  • Find the shift that minimizes the L2 cost function, which is roughly equivalent to maximizing the correlation.
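
The link between the two criteria comes from expanding the cost for a window w and a template t: ‖w − t‖² = ‖w‖² + ‖t‖² − 2⟨w, t⟩. The template energy ‖t‖² is constant, so minimizing the L2 cost matches maximizing the correlation ⟨w, t⟩ only as long as the window energy ‖w‖² does not vary much along the log. A minimal sketch of the sliding L2 search, where the function name `match_template` and the toy signal are assumptions for illustration:

```python
import numpy as np


def match_template(signal, template):
    """Return the shift minimizing the sum of squared differences between
    the template and each window of the signal, plus the full cost curve."""
    signal = np.asarray(signal, dtype=float)
    template = np.asarray(template, dtype=float)
    # All windows of the template length, shape (n_shifts, template.size).
    windows = np.lib.stride_tricks.sliding_window_view(signal, template.size)
    costs = np.sum((windows - template) ** 2, axis=1)
    return int(np.argmin(costs)), costs


# Toy log: flat signal with the template pasted at sample 20.
template = np.array([1.0, 3.0, 5.0, 3.0, 1.0])
signal = np.zeros(50)
signal[20:25] = template
shift, costs = match_template(signal, template)
```

`sliding_window_view` requires NumPy 1.20 or later. For long logs and long templates, computing the correlation term with an FFT-based convolution would likely be faster than materializing every window.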

Template matching

  • The Euclidean distance should be normalized; this was omitted due to lack of time.
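
One common way to normalize is to bring the template and every window to zero mean and unit variance before computing the distance, which makes the match insensitive to local offset and gain. The sketch below shows that option only as an illustration of what the omitted step could look like; the function name `match_template_normalized` and the `eps` guard are assumptions.

```python
import numpy as np


def match_template_normalized(signal, template, eps=1e-12):
    """Sliding L2 search after z-normalizing the template and each window
    (zero mean, unit standard deviation). Hypothetical helper."""
    signal = np.asarray(signal, dtype=float)
    template = np.asarray(template, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(signal, template.size)

    # eps avoids a division by zero on perfectly flat windows.
    t = (template - template.mean()) / (template.std() + eps)
    w = windows - windows.mean(axis=1, keepdims=True)
    w = w / (windows.std(axis=1, keepdims=True) + eps)

    costs = np.sum((w - t) ** 2, axis=1)
    return int(np.argmin(costs)), costs


# Toy log: the template appears with an offset of 100 and a gain of 10.
template = np.array([1.0, 3.0, 5.0, 3.0, 1.0])
signal = np.full(50, 100.0)
signal[30:35] = 100.0 + 10.0 * template
shift, costs = match_template_normalized(signal, template)
```

With this normalization the cost is an affine, decreasing function of the Pearson correlation between window and template, so the lowest cost is exactly the highest normalized correlation.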