Part V: Real-World Applications

Chapter 19

Capstone Projects: From Raw Data to a Signed Decision

15 min read · 6 exercises

Every chapter so far taught one skill on clean data. The field does not hand you clean data, and it does not ask for a skill; it asks for a decision, signed, with your name on it: perforate here, book this many barrels, drill the next well or walk away. This chapter is where the book pays off. Four projects take messy, broken, real-shaped data and carry it all the way to a number an engineer would stake a development on.

The four are deliberately separate (a log interpretation, a reserves forecast, a drilling benchmark, a volumetric estimate), but each obeys the same discipline the rest of the book has drilled into you. (1) Quality-control the data before trusting it, because the first reading you believe is usually the broken one. (2) Report a range, not a point: P10/P50/P90, where by petroleum convention P90 is the conservative low estimate, P50 the middle, and P10 the optimistic high, because a single number off uncertain data is a lie with a decimal place. (3) Translate the answer into dollars and a decision, because barrels are not the deliverable. (4) Reconcile the statistical answer against the physics, because no reservoir engineer signs a model that disagrees with Archie or material balance. (5) Degrade honestly on a blind well, because a model that only works on the data it has seen has not been tested. That discipline, not any single algorithm, is what separates a portfolio from a pile of notebooks.
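The percentile convention in point (2) is easy to get backwards in code, because the petroleum P90 is the 10th percentile of the distribution, not the 90th. A minimal sketch, assuming a hypothetical lognormal set of outcomes (the distribution, its parameters, and the function name `petroleum_percentiles` are illustrative, not taken from the projects):

```python
import random
import statistics

def petroleum_percentiles(samples):
    """Return (P90, P50, P10) under the exceedance convention used in the text."""
    # quantiles(n=10) returns the nine cut points at 10%, 20%, ..., 90% cumulative probability
    deciles = statistics.quantiles(samples, n=10, method="inclusive")
    p90 = deciles[0]                  # exceeded by 90% of outcomes: the conservative low case
    p50 = statistics.median(samples)  # exceeded by half of the outcomes
    p10 = deciles[8]                  # exceeded by only 10% of outcomes: the optimistic high case
    return p90, p50, p10

rng = random.Random(19)
# Hypothetical recoverable volumes in MMbbl, drawn from an assumed lognormal
outcomes = [rng.lognormvariate(2.0, 0.3) for _ in range(10_000)]
p90, p50, p10 = petroleum_percentiles(outcomes)
print(f"P90 (low) = {p90:.2f}   P50 = {p50:.2f}   P10 (high) = {p10:.2f} MMbbl")
```

Returning the three values in low-to-high order keeps the naming honest: whatever the labels say, the first number is always the one to book conservatively.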

What You'll Learn

  • Build a QC-gated interpretation pipeline that detects, repairs, and quarantines bad log data before it poisons the answer
  • Carry uncertainty end-to-end: porosity → net pay → reserves → NPV, all as P10/P50/P90, never a single number
  • Translate every technical result into dollars and a go/no-go, with the price and discount sensitivities that flip the call
  • Reconcile machine-learning and statistical answers against the deterministic physics, and diagnose where they diverge
  • Benchmark and characterise at field scale: difficulty-normalised drilling performance and Monte-Carlo volumetrics with a value-of-information decision

Datasets Used in This Chapter

Project 2 uses the field's real 24-month production export, embedded inline (faults and all). The other three projects generate synthetic-but-physical data with the verified generators from earlier chapters, so every cell runs offline. Each project is self-contained: run its cells in order.

Project 1: Automated Well Log Interpretation Pipeline

A new well, OD-007, has been logged. Before it can be added to the field model someone has to turn four curves into an answer: how many feet of pay, where, and how confident are you? Do it by hand and it takes a petrophysicist a day; do it wrong (miss a thin gas sand, or trust a washed-out density reading) and you perforate water or leave pay behind. We build the pipeline that does it in seconds, and refuses to trust data it should not.

The first cell builds the field: a forward rock-physics model that turns each rock type into the four wireline logs a tool would record, namely GR (gamma ray, high in shale), RHOB (bulk density), NPHI (neutron porosity), and RT (deep resistivity, high in hydrocarbons), giving us a known ground truth to invert against. Skim it top-down; you do not need to trace every term to use what it produces.

main.py

Facies at 0.83 accuracy and porosity at R² 0.89, on a well the models never saw. Those are honest numbers. A pipeline that reports 0.99 here has almost certainly leaked adjacent half-foot samples across its train/test split (depth-correlated logs make a random split cheat); we scored on a separate well precisely to avoid that. Now the data fights back.

main.py

Read the scorecard, then the band, then the reconciliation. The QC step catches the magnitude faults outright (the washout, where the borehole has caved and the density tool reads mud instead of rock, and the resistivity spikes, both flagged by the isolation forest) and catches most of the stuck-tool flatline with a contextual rolling-standard-deviation rule. That is about 74% of the injected bad samples overall; the rolling window inevitably misses a few at the dead zone's edges, which is exactly the precision/recall trade-off Exercise 19.1 makes you tune. It then repairs what it flags by interpolating across the good neighbours, so the interpretation runs on recovered curves, not on a conveniently regenerated well. Net pay, the cumulative feet that clear the reservoir cutoffs, comes back as 33.5 ft, bracketed 32–34.5: the porosity model's own uncertainty propagated through those cutoffs, so the asset team books a range, not a false-precision number. The reconciliation earns trust, and the direction is the whole point: gas is light, so it lowers the density log (RHOB), which makes the naive density-porosity transform read higher porosity than is really there. The ML porosity sits below that transform in the gas sand, and lower is the correct answer, because the model was trained on core, not fooled by the gas effect. And on a completely blind well the pipeline lands within 1.5 ft of truth, not the zero-foot error a leaky split would fake.
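The flatline rule and the interpolation repair described above can be sketched with nothing but the standard library. This is an illustration under stated assumptions, not the pipeline's own code: the window length, the standard-deviation threshold, the function names, and the sample density values are all hypothetical, and the isolation-forest step that handles the magnitude faults is left out.

```python
import statistics

def flag_flatline(curve, window=5, min_std=1e-3):
    """Flag every sample that sits inside a window whose standard deviation is near zero."""
    flags = [False] * len(curve)
    for start in range(len(curve) - window + 1):
        if statistics.pstdev(curve[start:start + window]) < min_std:
            for i in range(start, start + window):
                flags[i] = True
    return flags

def repair_by_interpolation(curve, flags):
    """Replace flagged samples by linear interpolation between the nearest good neighbours."""
    good = [i for i, bad in enumerate(flags) if not bad]
    if not good:
        raise ValueError("no good samples to interpolate from")
    repaired = list(curve)
    for i, bad in enumerate(flags):
        if not bad:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:        # bad run at the top of the log: hold the first good value
            repaired[i] = curve[right]
        elif right is None:     # bad run at the bottom: hold the last good value
            repaired[i] = curve[left]
        else:
            w = (i - left) / (right - left)
            repaired[i] = curve[left] + w * (curve[right] - curve[left])
    return repaired

# Hypothetical RHOB samples (g/cc): the tool sticks at 2.47 for six readings
rhob = [2.40, 2.42, 2.45, 2.47, 2.47, 2.47, 2.47, 2.47, 2.47, 2.58, 2.60, 2.62]
flags = flag_flatline(rhob)
repaired = repair_by_interpolation(rhob, flags)
print(flags)
print([round(v, 3) for v in repaired])
```

On noisy real curves a stuck tool rarely repeats to the last decimal, so `min_std` and `window` become the tuning knobs: loosen them and the rule starts flagging genuinely quiet shale, tighten them and the edges of the dead zone slip through.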

The signable output is the composite track above plus one line a geologist can initial: "OD-007: 33.5 ft net pay (32–34.5), gas-bearing sands flagged; recommend perforating the upper pay; blind-well net-pay error ~1.5 ft." That is the project, not the R².
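The blind-well figure in that line depends on how the hold-out set was chosen. The sketch below contrasts a random per-sample split with a whole-well hold-out and measures how many test samples have a training sample from the same well half a foot away. The well names, depths, and helper functions are hypothetical; only the half-foot sample spacing comes from the discussion above.

```python
import random

def blind_well_split(samples, blind_well):
    """Hold out every sample from one well; train on all the others."""
    train = [s for s in samples if s["well"] != blind_well]
    test = [s for s in samples if s["well"] == blind_well]
    return train, test

def random_split(samples, test_fraction, seed=0):
    """Shuffle individual samples and cut: the split that leaks on depth-correlated logs."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def leaked_fraction(train, test, step=0.5):
    """Share of test samples with a training sample from the same well one depth step away."""
    train_keys = {(s["well"], s["depth"]) for s in train}
    leaked = sum(
        1 for s in test
        if (s["well"], s["depth"] - step) in train_keys
        or (s["well"], s["depth"] + step) in train_keys
    )
    return leaked / len(test)

# Hypothetical field: four wells, 200 half-foot samples each
samples = [
    {"well": well, "depth": 5000.0 + 0.5 * i}
    for well in ("W1", "W2", "W3", "W4")
    for i in range(200)
]
rand_train, rand_test = random_split(samples, test_fraction=0.25)
blind_train, blind_test = blind_well_split(samples, blind_well="W4")
print(f"random split leak: {leaked_fraction(rand_train, rand_test):.0%}")
print(f"blind-well leak:   {leaked_fraction(blind_train, blind_test):.0%}")
```

With a random split, most test samples have a near-identical neighbour in training, so the score measures interpolation rather than prediction; the whole-well hold-out has no such neighbours by construction.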

Project 2: Production Forecasting and Reserves Estimation System

The reserves meeting is next week. Four producing wells, twenty-four months of history each, and the question on the table is worth nine figures: how many barrels do we book, and is the development still economic if oil drops to $55? The data is the field's actual export, and like every real export, it is dirty. We build the system that QCs it, forecasts each well with honest uncertainty, and puts a P10/P50/P90 NPV on the table with the sensitivities that flip the decision.

main.py

The cell prints the three faults it caught (a missing month, a negative rate, and a 15,000-bopd spike) and removes all three before a single curve is fit. That last one matters most: a misplaced decimal, left in, drags the whole decline fit upward and over-books the well. The terminal-decline cap is the other safeguard, though on this field it is insurance that never has to pay out: all four wells decline steeply (fitted b ≈ 0.01–0.10), so they hit the economic limit long before the cap would ever fire, and it changes the booked EUR by nothing. It earns its place for the other kind of well: an unconstrained hyperbolic with b near 1 forecasts a tail that never dies and books reserves that will never be produced (the classic over-booking trap that Exercise 19.3 builds on purpose). Capping the decline at a 6%/yr floor keeps that trap shut. Now turn barrels into the only thing the meeting cares about.
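To see why the cap is idle on a steep well and decisive on a long-tailed one, here is a minimal sketch of an Arps hyperbolic decline that switches to exponential once the instantaneous decline falls to the floor. Everything numeric is an assumption for illustration: the initial rates, the initial decline, the 10-bopd economic limit, the 50-year horizon, and the reading of the 6%/yr floor as a nominal decline. The function names are hypothetical.

```python
import math

def rate(t_years, qi, di, b, d_term=None):
    """Rate in bopd: hyperbolic until the decline falls to d_term, exponential afterwards."""
    if d_term is None:                       # unconstrained hyperbolic
        return qi / (1.0 + b * di * t_years) ** (1.0 / b)
    if di <= d_term:                         # already at the floor from day one
        return qi * math.exp(-d_term * t_years)
    # Instantaneous decline is di / (1 + b*di*t); solve for the time it reaches d_term
    t_switch = (di / d_term - 1.0) / (b * di)
    if t_years <= t_switch:
        return qi / (1.0 + b * di * t_years) ** (1.0 / b)
    q_switch = qi / (1.0 + b * di * t_switch) ** (1.0 / b)
    return q_switch * math.exp(-d_term * (t_years - t_switch))

def eur(qi, di, b, d_term=None, q_limit=10.0, max_years=50):
    """Sum monthly volumes (bbl) until the rate drops below q_limit or max_years elapse."""
    total = 0.0
    for month in range(max_years * 12):
        q = rate((month + 0.5) / 12.0, qi, di, b, d_term)   # mid-month rate
        if q < q_limit:
            break
        total += q * 365.25 / 12.0
    return total

steep = dict(qi=1000.0, di=0.6, b=0.05)   # hypothetical steeply declining well
flat = dict(qi=1000.0, di=0.6, b=1.0)     # hypothetical long-tailed well
for name, well in (("steep", steep), ("flat", flat)):
    uncapped = eur(**well)
    capped = eur(**well, d_term=0.06)
    print(f"{name}: uncapped {uncapped / 1e6:.2f} MMbbl, capped {capped / 1e6:.2f} MMbbl")
```

The steep well reaches the economic limit centuries before its decline would fall to the floor, so both numbers are identical; the long-tailed well books visibly less once the tail is forced to keep declining.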

main.py

The reserves-review packet writes itself from here: book P50 ≈ 7.5 MMbbl across the four wells (range 7.1–9.0), field NPV ≈ $341 MM at $75/bbl. The tornado is the headline an asset manager reads first: oil price is the swing factor (±$125 MM), the discount rate barely matters, and the project stays positive down to roughly $21/bbl, so yes, it survives the $55 stress case comfortably. That sentence, not the EUR, is what gets signed.
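The mechanics behind a tornado and a break-even price fit in a few lines. The sketch below uses hypothetical yearly volumes, operating cost, and capital, so it will not reproduce the field figures quoted above; it shows only how the price swing, the discount-rate swing, and the zero-NPV price are computed. The function names and the end-of-year discounting convention are assumptions.

```python
def npv(price, volumes_bbl, opex_per_bbl, capex, discount_rate):
    """Discount yearly net cash flows (end-of-year convention); capex is spent at time zero."""
    value = -capex
    for year, volume in enumerate(volumes_bbl, start=1):
        cash = volume * (price - opex_per_bbl)
        value += cash / (1.0 + discount_rate) ** year
    return value

def break_even_price(volumes_bbl, opex_per_bbl, capex, discount_rate, lo=0.0, hi=200.0):
    """Bisect for the price at which NPV crosses zero (NPV rises with price)."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if npv(mid, volumes_bbl, opex_per_bbl, capex, discount_rate) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical inputs: five years of production, $15/bbl opex, $100 MM capex, 10% discount
base = dict(
    volumes_bbl=[1.5e6, 1.2e6, 1.0e6, 0.8e6, 0.6e6],
    opex_per_bbl=15.0,
    capex=100e6,
    discount_rate=0.10,
)

for price in (55.0, 75.0, 95.0):             # price leg of the tornado
    print(f"price ${price:.0f}/bbl -> NPV ${npv(price, **base) / 1e6:.1f} MM")
for rate_ in (0.08, 0.12):                   # discount-rate leg of the tornado
    case = {**base, "discount_rate": rate_}
    print(f"discount {rate_:.0%} -> NPV ${npv(75.0, **case) / 1e6:.1f} MM")

be = break_even_price(**base)
print(f"break-even price: ${be:.2f}/bbl")
```

Because this toy NPV is linear in price, the break-even could be solved in closed form; bisection is used so the same function still works once taxes, royalties, or price-dependent costs make the relationship nonlinear.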

Project 3: Drilling Performance Benchmarking Tool

Logs and production told us how much is down there and what it is worth; drilling asks a different question: how well we got to it. Six wells, six different drillers, and a capital meeting that wants to know which well to copy and which to never repeat. The trap is brutal and common: rank them by raw cost-per-foot and you will crucify the team that drilled the deepest, hardest hole and reward the one that drilled a shallow, soft one slowly. Real benchmarking normalises for difficulty first, then ranks what is left (the part the crew actually controls) and puts a dollar figure on the gap, with a confidence interval so one lucky well does not set the target.

main.py

The two bars tell the whole story. By raw cost-per-foot, well B looks worst, but B drilled the second-deepest, second-hardest hole, and punishing it would teach the wrong lesson. Normalise difficulty out and the real laggard is well E: a shallow hole drilled at just 70% efficiency with 52% non-productive time, the least efficient hole in the field, ranked 83rd-percentile despite being the easiest. The technical limit is set, fittingly, by well D, the deepest, hardest hole, drilled best. Two distinct findings fall out: E is the crew to coach (worst efficiency), while B carries the largest absolute recoverable spend (≈$1.6 MM, because it is both deep and inefficiently drilled). What the capital meeting gets is the ranked table with a bootstrap CI on each rank and a next-well target: coach E's practices toward the technical limit, chase B's ≈$1.6 MM of recoverable spend, and do not punish the depth that B and D earned.
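The rank reversal and the bootstrap interval on each rank can be illustrated on a toy field. This is a sketch under loud assumptions: the three wells, their per-bit-run costs, and the single difficulty index are invented, and dividing mean cost-per-foot by that index is only one simple way to normalise; the project's own difficulty model is not reproduced here.

```python
import random
import statistics

# Hypothetical data: per-bit-run cost per foot ($/ft) and a well-level difficulty index
wells = {
    "W1": {"difficulty": 1.0, "cpf": [210, 230, 220, 240]},
    "W2": {"difficulty": 1.6, "cpf": [330, 360, 350, 340]},   # deep, hard hole
    "W3": {"difficulty": 0.8, "cpf": [250, 270, 260, 280]},   # shallow, soft hole
}

def rank_wells(scores):
    """Rank 1 is the lowest score (cheapest per foot)."""
    ordered = sorted(scores, key=scores.get)
    return {name: i + 1 for i, name in enumerate(ordered)}

def normalised_cost(record, cpf=None):
    """Mean cost per foot with the difficulty index divided out."""
    cpf = record["cpf"] if cpf is None else cpf
    return statistics.mean(cpf) / record["difficulty"]

def bootstrap_rank_ci(wells, n_boot=2000, seed=0):
    """Resample bit runs within each well and return a 5%-95% interval on each rank."""
    rng = random.Random(seed)
    draws = {name: [] for name in wells}
    for _ in range(n_boot):
        scores = {
            name: normalised_cost(rec, rng.choices(rec["cpf"], k=len(rec["cpf"])))
            for name, rec in wells.items()
        }
        for name, rank in rank_wells(scores).items():
            draws[name].append(rank)
    lo_idx, hi_idx = n_boot * 5 // 100, n_boot * 95 // 100 - 1
    return {name: (sorted(r)[lo_idx], sorted(r)[hi_idx]) for name, r in draws.items()}

raw_rank = rank_wells({n: statistics.mean(r["cpf"]) for n, r in wells.items()})
norm_rank = rank_wells({n: normalised_cost(r) for n, r in wells.items()})
ci = bootstrap_rank_ci(wells)
for name in wells:
    print(f"{name}: raw rank {raw_rank[name]}, normalised rank {norm_rank[name]}, 90% CI {ci[name]}")
```

In this toy field the hard hole W2 goes from last to first once difficulty is divided out, while the interval shows that its lead over W1 is not secure; only W3's last place is stable across resamples.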

Project 4: Reservoir Characterization and Volumetric Uncertainty

The first three projects each answered a question; the last one decides whether to go looking for a better answer. The final question is the biggest: how much oil is in this reservoir, and should we drill the next well? Stripped down, the volumetric equation is grade-school (area × thickness × porosity × oil saturation ÷ shrinkage) and its product is the STOIIP (Stock-Tank Oil Initially In Place: the total barrels in the ground before any is produced). The engineering is in admitting that every one of those inputs is uncertain, propagating that uncertainty with Monte Carlo, reconciling the answer against a deterministic check, and, the part that separates a reservoir engineer from a calculator, turning the uncertainty into a decision about acquiring more data.
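Written out in oilfield units, the volumetric equation takes the form below. The units are an assumption, since the text does not state them: area $A$ in acres, net thickness $h$ in feet, porosity $\phi$ and oil saturation $S_o$ as fractions, and the shrinkage term taken to be the oil formation volume factor $B_o$ in reservoir barrels per stock-tank barrel.

```latex
\mathrm{STOIIP}\;[\mathrm{STB}] \;=\; \frac{7758 \, A \, h \, \phi \, S_o}{B_o}
```

The constant 7758 is the number of barrels in one acre-foot, so the numerator is the oil-filled pore volume in reservoir barrels and dividing by $B_o$ brings it to surface conditions.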

main.py

A manager who asks "how much oil is there?" wants to hear "44 million barrels." The honest answer is "between 30 and 60, most likely 43", and the gap between those numbers, not the midpoint, is the decision. The freeze-one-input test makes the value-of-information argument concrete rather than asserted: knowing the area exactly removes far more of the P10–P90 range than any other input, because well control is sparse and the mapped extent is the real unknown. (These shares do not sum to 100% because STOIIP is a product of uncertain factors, so their ranges compound rather than add; it is the ranking, not the absolute percentages, that points to what is worth measuring.) That converts directly into a recommendation: the uncertainty is worth more to reduce than to ignore, so drill one appraisal well to pin the area before committing hundreds of millions to the full development. What ships to the drill-or-wait decision is the distribution above plus that value-of-information call. A coloured map would have looked more finished; this is more useful.
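The freeze-one-input test is simple to sketch: run the Monte Carlo once with every input uncertain, then once per input with that input pinned at its most likely value, and compare the P10–P90 widths. The triangular ranges below are hypothetical and chosen only so that area is the widest input; the units (acres, feet, 7758 bbl per acre-foot) and the function names are likewise assumptions, so the output will not match the figures quoted above.

```python
import random
import statistics

BBL_PER_ACRE_FT = 7758.0

# Hypothetical triangular inputs as (low, mode, high)
INPUTS = {
    "area_acres":   (800.0, 1500.0, 2600.0),
    "thickness_ft": (35.0, 40.0, 46.0),
    "porosity":     (0.16, 0.18, 0.20),
    "oil_sat":      (0.65, 0.70, 0.75),
    "bo":           (1.15, 1.20, 1.25),
}

def stoiip_mmbbl(v):
    """Volumetric STOIIP in MMbbl for one set of input values."""
    barrels = (BBL_PER_ACRE_FT * v["area_acres"] * v["thickness_ft"]
               * v["porosity"] * v["oil_sat"] / v["bo"])
    return barrels / 1e6

def simulate(n=20_000, frozen=None, seed=0):
    """Monte Carlo STOIIP; the input named in `frozen` is pinned at its mode."""
    rng = random.Random(seed)
    results = []
    for _ in range(n):
        draw = {}
        for name, (low, mode, high) in INPUTS.items():
            # random.triangular takes its arguments in the order (low, high, mode)
            draw[name] = mode if name == frozen else rng.triangular(low, high, mode)
        results.append(stoiip_mmbbl(draw))
    return results

def spread(results):
    """Width between the 10th and 90th percentiles (petroleum P90 to P10)."""
    deciles = statistics.quantiles(results, n=10)
    return deciles[8] - deciles[0]

base_spread = spread(simulate())
shares = {name: 1.0 - spread(simulate(frozen=name)) / base_spread for name in INPUTS}
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"freeze {name:<13} removes {share:6.1%} of the P10-P90 range")
```

The shares are computed one input at a time against the same baseline, which is why they need not sum to 100%; what carries over to the appraisal decision is the ordering.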

Exercises

These extend the four projects. Each one forces a decision to change, not just a number to recompute.

Exercise 19.1 (Practice): Tune the QC Net

In Project 1, the isolation-forest contamination controls how aggressively the pipeline quarantines log samples. Sweep it from 0.02 to 0.15, and for e...

Exercise 19.2 (Practice): Does the Stress Case Still Clear?

Re-run Project 2's economics across a sweep of oil prices and find the break-even, the lowest price at which the field NPV is still positive. If your ...

Exercise 19.3 (Practice): The Over-Booking Trap

Project 2's wells decline steeply enough that the terminal-decline cap never bites: confirm that removing it barely moves the field EUR. Then build th...

Exercise 19.4 (Practice): Blind-Well Honesty

Project 1 scored porosity on a held-out well for a reason. Reproduce both protocols on a field whose logs carry realistic per-well tool drift: an hone...

Exercise 19.5 (Practice): Find the Real Underperformer

In Project 3, change well E's formation hardness to match well D's (make it a hard hole too) and re-rank. Does E stay the worst once it is no longer t...

Exercise 19.6 (Practice): What Is the Appraisal Well Worth?

In Project 4, halve the area uncertainty (tighten the triangular distribution, as a successful appraisal well would) and recompute the STOIIP P10/P50/...


Summary

  • A project is a decision, not a model. Each of the four ends on something an engineer signs (a net-pay range, a reserves booking, a recoverable-spend target, a drill-or-appraise call), not on an accuracy score.
  • QC is step zero. Real exports carry missing months, negative rates, stuck tools, and 10x spikes; the pipeline that does not detect and quarantine them produces a confident, wrong answer.
  • Report ranges, never points. Porosity uncertainty propagated to net pay, bootstrap to EUR and NPV, Monte Carlo to STOIIP, and in every case the width of the range, not its midpoint, is where the decision actually lives.
  • Translate to dollars and stress-test. Barrels become NPV; NPV gets a tornado; the deliverable is whether the call survives the low-price case, not the base case.
  • Reconcile against the physics and degrade honestly. ML porosity checked against density-porosity and diagnosed in the gas sand; difficulty divided out of cost-per-foot; a blind well that the pipeline never saw. A number no one can reconcile is a number no one will sign.

You have now run the full arc of the book, from a raw, broken log to a signed development decision, using the same Python and the same engineering judgement throughout. Part V continues from here: putting these pipelines in front of users as dashboards, and deploying them beyond a notebook.