Career Paths: Proving What You Can Do

15 min read · 3 exercises

Twenty-one chapters ago this book opened with a decline plot that decides whether a field gets developed or abandoned. Everything since has been the same move in different clothes: take messy, real-shaped data and carry it to a decision an engineer would put their name on. You can now do that: log interpretation, reserves, drilling benchmarks, volumetrics, the machine-learning models behind them, and the dashboards and services that deliver them. The skills are real.

The question that decides a career is not "what else should I learn?"; there is always more. It is "can you prove what you already know to someone deciding whether to hire or promote you?" That proof is not a certificate or a list of libraries on a résumé; it is a portfolio that shows you turning data into decisions, and the self-awareness to say which roles your skills actually fit. Both of those are measurable, and engineers measure things. So this short closing chapter does what the rest of the book did: it builds the small tools that score them, and turns them honestly on you.

What You'll Learn

Find the role you are actually closest to, and the specific gaps between you and the others
Score a portfolio project the way a hiring engineer skims it, and see what is missing
Contribute to the open-source tools you have been using
Read the future of AI in oil and gas without the hype

The Job Is the Intersection

You are not going to out-code a computer-science graduate on algorithms, and you are not going to out-reservoir a thirty-year petroleum engineer on material balance. Your value is the overlap: the person who understands Archie and a confusion matrix, who can take a drilling engineer's vague worry and turn it into a difficulty-normalised benchmark with a confidence interval. That overlap shows up as a handful of distinct roles, each needing a different slice of what this book taught. The honest first step is to see which slice you have.

main.py

# Three real petroleum-data roles, mapped to the book chapters that build each skill.
ROLES = {
    "Reservoir/Production Engineer who codes": [2, 3, 4, 5, 7, 8, 9, 10, 12, 13],
    "Petroleum Data Analyst":                  [2, 3, 4, 5, 6, 9, 15, 20],
    "Petroleum Data Scientist / ML Engineer":  [2, 3, 4, 5, 16, 17, 18, 19, 21],
}
CHAPTER_NAME = {
    2: "Python", 3: "Data structures", 4: "NumPy/pandas", 5: "Visualization", 6: "Data sources",
    7: "Well logs", 8: "PVT", 9: "Decline curves", 10: "Material balance", 12: "Nodal analysis",
    13: "Production optimization", 15: "Gas engineering", 16: "ML fundamentals", 17: "Supervised learning",
    18: "Unsupervised learning", 19: "Capstone projects", 20: "Dashboards", 21: "Deployment",
}


def role_readiness(completed):
    """Given the chapters you've worked through, score readiness for each role:
    the fraction of its skills you've covered, and the specific gaps that remain."""
    done = set(completed)
    out = {}
    for role, need in ROLES.items():
        have = [c for c in need if c in done]
        gaps = [CHAPTER_NAME[c] for c in need if c not in done]
        out[role] = dict(readiness_pct=round(100 * len(have) / len(need), 1), gaps=gaps)
    return out


def best_fit_role(completed):
    r = role_readiness(completed)
    return max(r, key=lambda role: r[role]["readiness_pct"])


# Example: someone who took the fundamentals + the ML/data track but skipped most
# classical reservoir engineering.
mine = [2, 3, 4, 5, 6, 9, 16, 17, 18, 19]
for role, v in role_readiness(mine).items():
    print(f"{v['readiness_pct']:5.1f}%  {role}")
    if v["gaps"]:
        print(f"         gaps: {', '.join(v['gaps'])}")
print(f"\nclosest fit: {best_fit_role(mine)}")

The map does something a résumé cannot: it turns "I know some Python and some petroleum" into a direction. This profile is 89% of the way to a petroleum data scientist (one chapter, deployment, short of the whole set) and 75% of the way to a data analyst, but only halfway to a classical reservoir-engineering role, missing well logs, PVT, material balance, nodal analysis, and production optimization. That is not a verdict, it is a choice: lean into the strength and become the ML specialist, or close the reservoir-engineering gaps and become the rarer hybrid who can do both. Either is a real job. What kills careers is not having a gap; it is not knowing which one you have.
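The choice between specialist and hybrid can be put in numbers too. The sketch below is one possible way to do it, not part of the chapter's tooling: `remaining` is a hypothetical helper, and `ROLES` is copied from the role map so the block runs on its own. It counts chapters only and assumes every chapter costs the same effort, which is rarely true.

```python
# Hypothetical helper: how many chapters stand between you and each path?
# ROLES is copied from the role map above so this sketch is self-contained.
ROLES = {
    "Reservoir/Production Engineer who codes": [2, 3, 4, 5, 7, 8, 9, 10, 12, 13],
    "Petroleum Data Analyst":                  [2, 3, 4, 5, 6, 9, 15, 20],
    "Petroleum Data Scientist / ML Engineer":  [2, 3, 4, 5, 16, 17, 18, 19, 21],
}


def remaining(completed, roles):
    """Chapters still needed to cover EVERY role in `roles`, sorted ascending."""
    done = set(completed)
    needed = set().union(*(ROLES[r] for r in roles))
    return sorted(needed - done)


mine = [2, 3, 4, 5, 6, 9, 16, 17, 18, 19]
ml_role = "Petroleum Data Scientist / ML Engineer"
re_role = "Reservoir/Production Engineer who codes"

specialist = remaining(mine, [ml_role])            # lean into the strength
hybrid = remaining(mine, [ml_role, re_role])       # close the reservoir gaps as well

print(f"specialist path: {len(specialist)} chapter(s) to go -> {specialist}")
print(f"hybrid path    : {len(hybrid)} chapter(s) to go -> {hybrid}")
```

Counting chapters does not say which path is better; it only makes the price of each visible before you commit to one.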

A Portfolio Is a Measurable Deliverable

Nobody hires from a list of libraries. They hire from evidence: a few projects that show you doing the actual work. The good news is that you already built four of them: the capstone projects in Chapter 19 are exactly the kind of end-to-end, decision-ending work a reviewer wants to see. The trap is that engineers under-sell them, posting a notebook with a great model and no README, no real data, no statement of what the number means. A hiring reviewer spends ninety seconds on a repo. Here is what they are actually checking, and you can score yourself against it before they do.

main.py

RUBRIC = {                  # what a reviewer skims for, and what each is worth (out of 100)
    "real_data": 25,        # a real or realistic dataset, not toy random numbers
    "states_decision": 25,  # ends on an engineering decision/result, not just an accuracy score
    "reproducible": 20,     # someone can clone it and run it (pinned deps, clear steps)
    "has_readme": 15,       # explains the problem and the result up front
    "has_tests": 15,        # at least some asserts/validation on the key results
}


def score_project(project):
    """Score a portfolio project the way a hiring reviewer skims it, and name what is
    missing. `project` is a dict of {criterion: bool}."""
    score = sum(w for k, w in RUBRIC.items() if project.get(k))
    missing = [k for k in RUBRIC if not project.get(k)]
    return dict(score=score, missing=missing, hireable=score >= 80)   # 80 = has both 25-point items and misses at most one of the other three


strong = {"real_data": True, "states_decision": True, "reproducible": True, "has_readme": True, "has_tests": True}
typical = {"real_data": True, "states_decision": False, "reproducible": False, "has_readme": True, "has_tests": False}
print("a finished project :", score_project(strong))
print("the usual notebook :", score_project(typical))

The two scores are the gap between a project that gets you an interview and one that gets a polite "thanks." The "usual notebook" has a real dataset and a README and still scores 40, mainly because it never states the decision and nobody else can run it: two items that together carry 45 of the rubric's 100 points (the missing tests cost the other 15). Notice what is not on the list: how fancy the model is. A linear regression that ends on "perforate the upper sand, blind-well error ±2 ft" beats a neural network that ends on "test R² = 0.94." That has been the whole book's argument, and it is the argument your portfolio should make for you. The fastest way to raise your score is rarely a better model; it is a README and a sentence about the decision.
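Once a project has a score, the natural next question is which fixes to make first. The sketch below is one possible approach, not the chapter's own method: `fixes_to_clear` is a hypothetical helper that adds the missing criteria in order of weight until the project reaches the threshold. `RUBRIC` is repeated so the block runs alone. Weight is used as a stand-in for payoff; the effort each fix takes is not modelled.

```python
# Hypothetical helper: which missing items, highest weight first, lift a
# project over the threshold? RUBRIC is copied from the scoring code above.
RUBRIC = {
    "real_data": 25,
    "states_decision": 25,
    "reproducible": 20,
    "has_readme": 15,
    "has_tests": 15,
}


def fixes_to_clear(project, threshold=80):
    """Return (plan, projected_score): the criteria to fix, biggest weight first,
    stopping as soon as the projected score reaches `threshold`."""
    score = sum(w for k, w in RUBRIC.items() if project.get(k))
    missing = sorted(
        ((k, w) for k, w in RUBRIC.items() if not project.get(k)),
        key=lambda kw: kw[1],
        reverse=True,
    )
    plan = []
    for criterion, weight in missing:
        if score >= threshold:
            break
        plan.append(criterion)
        score += weight
    return plan, score


typical = {"real_data": True, "states_decision": False, "reproducible": False,
           "has_readme": True, "has_tests": False}
plan, projected = fixes_to_clear(typical)
print("fix in this order:", plan)
print("projected score  :", projected)
```

For the usual notebook, the plan names the stated decision and reproducibility before tests, which matches the rubric's weighting. In practice a one-sentence decision statement is also the cheapest item on the list.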

Contribute to the Tools You Use

You have spent this book standing on pandas, scikit-learn, NumPy, and lasio (the LAS log reader from Chapter 7), all open source, all maintained by people who started exactly where you are. A merged pull request signals something a personal repo never can: that a stranger with commit rights read your code and judged it worth shipping. That is why contributing back is the most credible portfolio entry there is; it is public, reviewed by people who owe you nothing, and proves you can work inside someone else's codebase. Start small: a documentation fix, a clearer error message, a test for an edge case you hit. The workflow is always the same few steps, and the first pull request is the only hard one.

main.py

# The open-source contribution loop -- the first PR is the only hard one.
# git clone https://github.com/<you>/lasio        # 1. fork, then clone your fork
# cd lasio
# git checkout -b fix-las-step-warning            # 2. a branch named for the change
#   ...edit code, ADD A TEST that fails before and passes after...
# pytest tests/ -k las_step                        # 3. prove it works
# git add -A                                       #    stage the fix and the new test
# git commit -m "Clarify warning when LAS STEP is zero"
# git push origin fix-las-step-warning             # 4. push to your fork
#   ...open a Pull Request describing the problem and the fix...
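The step "add a test that fails before and passes after" is the one newcomers most often skip, so here is a minimal sketch of what such a test can look like. Everything in it is hypothetical: `check_step` is a stand-in written for this example and is not lasio's actual code or API. The shape is what matters: the test pins down the new behaviour, and a second test guards the case that must stay unchanged.

```python
import warnings


def check_step(step):
    """Hypothetical validator (NOT lasio's code): warn when the depth step is zero."""
    if step == 0:
        warnings.warn("STEP is zero: depth spacing is undefined", UserWarning)
    return step


def test_zero_step_warns():
    # Before the fix this test fails, because no warning is raised for STEP = 0.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        check_step(0)
    assert len(caught) == 1
    assert "STEP is zero" in str(caught[0].message)


def test_nonzero_step_is_silent():
    # Guard the unchanged case: a normal step must not start warning.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        assert check_step(0.5) == 0.5
    assert caught == []
```

pytest collects any function whose name starts with `test_`, so a file like this runs with the `pytest` command from the loop above. A real contribution would follow the project's existing test layout and naming instead.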

The Future, Without the Hype

It is tempting to end a book like this with a flourish about AI transforming the industry. The honest version is more useful. Machine learning in oil and gas is real and growing, but its hard limit is not algorithms; it is data. The models in this book worked because the inputs were clean and physical; most real field data is sparse, mislabelled, and full of the −999 null codes you learned to refuse. The engineers who matter over the next decade will not be the ones who can call the fanciest model. They will be the ones who can tell when a model is fooling them, who reconcile the statistics against Archie and material balance, who report a P10–P90 range (the 10th-to-90th-percentile spread) instead of a false-precision point, who know that a confident answer on bad data is the most dangerous output there is.
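A small sketch makes the last two points concrete: refuse the null codes, then report a range instead of a single number. The rates below are invented for illustration, the −999.0 sentinel is an assumption about how nulls are coded, and P10 and P90 follow this chapter's definition as the 10th and 90th percentiles. Only the standard library is used.

```python
import statistics

NULL = -999.0  # assumed null sentinel; check the file header for the real one

# Illustrative (made-up) rates with two null codes mixed in.
rates = [120.0, -999.0, 95.0, 143.0, -999.0, 110.0,
         131.0, 88.0, 102.0, 125.0, 117.0, 99.0]

# A confident answer on bad data: the nulls drag the mean below zero.
naive_mean = statistics.mean(rates)

# Quality-control first, then summarise.
clean = [v for v in rates if v != NULL]

# quantiles(n=10) returns the nine decile cut points; index 0 is the 10th
# percentile, index 4 the median, index 8 the 90th percentile.
deciles = statistics.quantiles(clean, n=10)
p10, p50, p90 = deciles[0], deciles[4], deciles[8]

print(f"naive mean with nulls : {naive_mean:.1f}")
print(f"after QC              : P10 = {p10:.1f}, P50 = {p50:.1f}, P90 = {p90:.1f}")
```

The naive mean is a precise-looking number that no well could produce. The range after QC says less and can be defended.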

That skepticism, more than any single technique, is what this book tried to teach. The tools will keep changing: the deep-learning libraries, the cloud platforms, the model of the month. The discipline does not: quality-control the data before you trust it, carry the uncertainty all the way to the decision, translate the result into dollars and barrels, and never sign a number you cannot reconcile with the physics. Carry that, and you will still be useful long after today's tools are footnotes. That is the job. Go build something, and put your name on it.

Exercises

These run the self-assessment tools on real inputs, including, ideally, your own.

Exercise 22.1 (Practice): Map Your Own Gaps

Build the list of chapters you have genuinely worked through (not just read) and run role_readiness on it. Which role are you closest to, and what are...

Exercise 22.2 (Practice): Score, Then Fix, a Project

Take one of your own notebooks (or a Chapter 19 capstone) and score it with score_project, being honest about each criterion. For every missing point,...

Exercise 22.3 (Practice): Weight the Rubric for a Real Employer

The rubric weights (real_data and states_decision at 25 each) reflect what most reviewers value. Pick a specific kind of employer (a service company, ...

Summary

Your value is the intersection. Not the best coder or the best engineer, but the rare person who is fluent in both, and a role map turns that overlap into a direction with named gaps.
A portfolio is the proof, and it is measurable. Reviewers skim for real data, a stated decision, reproducibility, a README, and tests, not for a fancy model. Score yourself before they do.
The decision beats the accuracy score. A simple model that ends on a signable recommendation outranks a complex one that ends on R²; your projects should make that argument for you.
Contribute to the tools you stand on. A reviewed, public pull request is the most credible portfolio entry there is; start with a doc fix or a test.
The scarce skill is skepticism, not algorithms. The data is the limit, not the model. QC it, carry the uncertainty, translate to dollars, reconcile against the physics, and never sign a number you cannot defend. That discipline outlasts every tool.
