Marker detection¶
- Student: Balthazar Neveu
- Master MVA, Introduction to geoscience
Raw data exploration¶
- 600 wells.
- Location: south of Paris
Raw data captures¶
- Depth
- Gamma rays (numerical outliers clamped to zero).
- Most of the big "variations" occur in very deep areas (6-7 km).
- Most of the shallow signal looks like plain noise. Characteristics?
Characteristics¶
- Noise
- Regular sampling rate
- Spikes = outliers
- Missing data.
Curation alternatives (pre-processing):
- 🎯 interpolate missing data (spikes)
- 💡 Denoise (bilateral or bandpass?)
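Both curation options can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the notebook's actual pre-processing: samples are regularly spaced, missing values are stored as NaN, and exact zeros are treated as clamped outliers. The function names and the parameters `radius`, `sigma_space` and `sigma_range` are hypothetical.

```python
import math


def is_missing(value):
    """Assumption: NaN marks missing data, exact zero marks a clamped outlier."""
    return math.isnan(value) or value == 0.0


def interpolate_missing(signal, missing=is_missing):
    """Linearly interpolate flagged samples; hold the nearest valid value at the edges."""
    out = list(signal)
    valid = [i for i, v in enumerate(out) if not missing(v)]
    if not valid:
        raise ValueError("no valid sample to interpolate from")
    for i in range(valid[0]):
        out[i] = out[valid[0]]
    for i in range(valid[-1] + 1, len(out)):
        out[i] = out[valid[-1]]
    # Fill every gap between two consecutive valid samples.
    for a, b in zip(valid, valid[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            out[i] = (1 - t) * out[a] + t * out[b]
    return out


def bilateral_filter(signal, radius=5, sigma_space=2.0, sigma_range=10.0):
    """1D bilateral filter: smooths noise while keeping sharp transitions."""
    n = len(signal)
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Weight decreases with distance in position and in amplitude.
            w = math.exp(
                -((j - i) ** 2) / (2 * sigma_space ** 2)
                - ((signal[j] - signal[i]) ** 2) / (2 * sigma_range ** 2)
            )
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out
```

Interpolating before filtering keeps the zero-clamped spikes from leaking into their neighbors. The bilateral filter is shown because it preserves abrupt changes, which is where the markers are expected to be; a bandpass filter would be the alternative to compare against.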
Labelled data exploration¶
- Data has been labeled by experts. We know the depth of 3 types of events.
- Labels are sometimes missing (⚠️ depth=`NaN`).
Depth exploration¶
Absolute depths | Relative depths |
---|---|
The same depth histogram seems to appear 3 times with an offset. If the 3 depths are related, the relationship should be visible in the 2 histograms of differences. | Once one of the events is found, the distribution of relative depths tells us roughly where to search for the others. Jointly estimating the 3 events with a prior on their relative depths sounds like a good idea. |
- The events we're looking for are located between 6 and 7 km.
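The relative-depth idea can be sketched as follows: compute the offset between two markers over the wells where both labels exist, then use its mean and spread to bound the search for the second marker once the first is found. This is an illustrative sketch only. The function names, the `k` factor and the Gaussian-style window are assumptions, and any depths passed in an example would be synthetic rather than taken from the dataset.

```python
import math
import statistics


def relative_depth_stats(depths_a, depths_b):
    """Mean and standard deviation of (b - a), skipping wells with a NaN label."""
    diffs = [
        b - a
        for a, b in zip(depths_a, depths_b)
        if not (math.isnan(a) or math.isnan(b))
    ]
    if len(diffs) < 2:
        raise ValueError("need at least two wells with both labels")
    return statistics.mean(diffs), statistics.stdev(diffs)


def search_window(depth_a, mean_offset, std_offset, k=3.0):
    """Depth interval in which to look for marker b once marker a is known."""
    center = depth_a + mean_offset
    return center - k * std_offset, center + k * std_offset
```

A narrow window is only justified if the histogram of differences is tight. With a wide or multi-modal histogram, a full joint estimate over the 3 events would be more appropriate than this pairwise shortcut.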
Patterns exploration¶
Neighborhood sizes compared: 100, 200, 400.
neighborhood 100¶
neighborhood 200¶
neighborhood 400¶
neighborhood 64 - 100 samples¶
neighborhood 64 - all samples¶
Templates extraction¶
- Many other markers cannot be explained by this first major template.
- More templates should be extracted, proceeding by iterative elimination.
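One simple way to build a template from labelled neighborhoods is sketched below. The notes do not state how the first template was obtained, so the pointwise mean used here is an assumption, as are the function names and the convention that each marker is given as a sample index into its log.

```python
def extract_neighborhood(signal, center, half_width):
    """Samples in [center - half_width, center + half_width], or None if out of range."""
    start, stop = center - half_width, center + half_width + 1
    if start < 0 or stop > len(signal):
        return None
    return signal[start:stop]


def mean_template(windows):
    """Pointwise average of equally sized windows aligned on the marker."""
    return [sum(column) / len(column) for column in zip(*windows)]


def build_template(logs, marker_indices, half_width):
    """Collect one window per well and average them into a single template."""
    windows = []
    for signal, center in zip(logs, marker_indices):
        window = extract_neighborhood(signal, center, half_width)
        if window is not None:  # skip markers too close to the ends of the log
            windows.append(window)
    return mean_template(windows)
```

For iterative elimination, one option is to remove the wells well explained by the current template and rebuild a template from the remaining ones.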
Template retrieval¶
- Find the best match between the template and the full log signals.
- Find the shift which minimizes the L2 cost function, which roughly amounts to maximizing the correlation.
- The Euclidean distance should be normalized. This was omitted due to lack of time.
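The retrieval step can be sketched as a sliding-window search. Since ‖x − t‖² = ‖x‖² + ‖t‖² − 2⟨x, t⟩, minimizing the L2 cost matches maximizing the correlation only when the window energy ‖x‖² is constant across shifts, which is why normalization matters. The code below is a minimal sketch, not the notebook's implementation. The `normalize` option uses a per-window z-score, which is one possible normalization among others, and all names are hypothetical.

```python
import math


def znorm(values):
    """Zero-mean, unit-variance version of a window (all zeros if it is flat)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]


def best_shift(signal, template, normalize=False):
    """Return (shift, cost) minimizing the squared L2 distance to the template."""
    m = len(template)
    if m == 0 or m > len(signal):
        raise ValueError("template must be non-empty and no longer than the signal")
    ref = znorm(template) if normalize else list(template)
    best, best_cost = None, math.inf
    for shift in range(len(signal) - m + 1):
        window = signal[shift:shift + m]
        if normalize:
            window = znorm(window)
        cost = sum((w - r) ** 2 for w, r in zip(window, ref))
        if cost < best_cost:
            best, best_cost = shift, cost
    return best, best_cost
```

Without normalization, a low-amplitude flat region can beat a correctly shaped but rescaled pattern, because the cost is dominated by the difference in amplitude rather than in shape.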