docs: add utility doctest examples #804

Open
EKtheSage wants to merge 3 commits into casact:main from EKtheSage:docs/704-utility-examples

Conversation

Contributor

EKtheSage commented May 16, 2026

Summary: Add Sphinx doctest examples for the PatsyFormula utility docs. This PR is split from the larger #792 work and intentionally excludes .github/workflows/sync-main-to-docs.yml. Refs #704.


Note

Low Risk
Low risk: changes are documentation-only (docstrings/doctest examples) and do not modify runtime behavior.

Overview
Adds Sphinx-friendly docstrings with doctest examples to several utilities in utility_functions.py, including read_pickle, read_json, concat, minimum, maximum, and PatsyFormula.

The new examples demonstrate common workflows (e.g., serializing/restoring estimators and pipelines, combining triangles for MunichAdjustment, comparing ultimate scenarios, and building formula-based design matrices) to improve documentation coverage and testability.

Reviewed by Cursor Bugbot for commit cf48166.

cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 0a7c2f9.

Comment thread chainladder/utils/utility_functions.py

codecov Bot commented May 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.23%. Comparing base (72b270c) to head (cf48166).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #804   +/-   ##
=======================================
  Coverage   86.23%   86.23%           
=======================================
  Files          86       86           
  Lines        4947     4947           
  Branches      643      643           
=======================================
  Hits         4266     4266           
  Misses        484      484           
  Partials      197      197           
Flag Coverage Δ
unittests 86.23% <ø> (ø)

@henrydingliu
Collaborator

please pull main and incorporate recent changes

EKtheSage force-pushed the docs/704-utility-examples branch from 0a7c2f9 to 9175ae7 on May 16, 2026 20:31
Comment thread chainladder/utils/utility_functions.py Outdated

.. testcode::

    clrd = cl.load_sample("clrd").groupby("LOB").sum().iloc[:2]
Collaborator

the test demonstrates that concatenating identical columns doesn't do anything, which doesn't match the example text.

def minimum(x1, x2):
    """Element-wise minimum of two triangles (delegates to ``Triangle.minimum``).

    Examples
Collaborator

we need a more basic docstring before a doctest. what's x1? what's x2?

Comment thread chainladder/utils/utility_functions.py Outdated

Examples
--------
Cap a triangle cell-by-cell by comparing it with another triangle of limits.
Collaborator

are we certain this is true? can x2 be a scalar?

Comment thread chainladder/utils/utility_functions.py
def read_json(json_str, array_backend=None):
    """Deserialize JSON produced by ``to_json`` (triangle, estimator, or pipeline).

    Examples
Collaborator

this example feels empty without seeing the actual json string. please follow the example from pandas
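
One way to read this request is that the example should print the serialized string before parsing it, so readers can see what is being deserialized. The sketch below illustrates that pattern with the standard library's `json` module only. The payload and its keys are hypothetical and are not a claim about the layout that `to_json` actually produces.

```python
import json

# Hypothetical payload for illustration; the real to_json layout may differ.
payload = {"name": "Development", "params": {"average": "simple", "n_periods": 4}}

# Serialize, then show the string itself so the example is not "empty".
json_str = json.dumps(payload, sort_keys=True)
print(json_str)

# Parse the string back and show that the content survived the round trip.
restored = json.loads(json_str)
print(restored["params"])
```

In a doctest, the printed string would go in the expected-output block ahead of the restored object, which keeps the example readable without running it.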

print(round(float(by_dev.ldf_.values[0, 0, 0, 0]), 6))
print(round(float(by_both.ldf_.values[0, 0, 0, 0]), 6))

.. testoutput::
Collaborator

should we be showing all the numbers?

…henrydingliu

- read_pickle: show fitted Development estimator round-trip via pickle, verify transform works after restore
- read_json: show full Pipeline serialization round-trip with step names and params
- concat: show paid+incurred column join enabling MunichAdjustment directly
- minimum: compare volume vs simple CL ultimates, pick element-wise lower for low-side scenario
- maximum: same comparison, pick element-wise higher for high-side scenario
- PatsyFormula: clarify when to use custom DevelopmentML pipeline vs TweedieGLM; show ldf_ output instead of coefficient count
import chainladder as cl

tri = cl.load_sample("raa")
dev = cl.Development(average="volume").fit(tri)
Collaborator

to demonstrate that to_pickle does something, we should use non-default parameters. something like avg = simple, n = 4.
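
The reasoning behind this suggestion can be sketched without chainladder. The example below uses only the standard library, with a `SimpleNamespace` standing in for an estimator. The helper names and the default values shown are assumptions for illustration, not a description of what `to_pickle` stores. With default settings a restored object is indistinguishable from a freshly constructed one, so the round trip proves little. With non-default settings the match can only have come from the file.

```python
import os
import pickle
import tempfile
from types import SimpleNamespace


def make_estimator(average="volume", n_periods=-1):
    # Hypothetical stand-in for an estimator; the defaults are illustrative only.
    return SimpleNamespace(average=average, n_periods=n_periods)


def round_trip(obj):
    # Write the object to a temporary pickle file, read it back, then clean up.
    fd, path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)
    try:
        with open(path, "wb") as fh:
            pickle.dump(obj, fh)
        with open(path, "rb") as fh:
            return pickle.load(fh)
    finally:
        os.remove(path)


default_restored = round_trip(make_estimator())
custom_restored = round_trip(make_estimator(average="simple", n_periods=4))

# A default round trip cannot be told apart from a brand-new default object.
print(default_restored == make_estimator())
# A non-default round trip differs from the defaults, so the state came from the file.
print(custom_restored != make_estimator())
print(custom_restored.average, custom_restored.n_periods)
```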

dev.to_pickle(p)
restored = cl.read_pickle(p)
os.remove(p)
print(restored.transform(tri).ldf_.values[0, 0, 0, :4].round(4))
Collaborator

can we print the full ldf_ from both the original and the restored estimators?

combined = cl.concat([paid, incurred], axis=1)
adj = cl.MunichAdjustment(paid_to_incurred=("CumPaidLoss", "IncurLoss"))
result = adj.fit_transform(combined)
print(result.ldf_["CumPaidLoss"].values[0, 0, 0, :4].round(4))
Collaborator

good use case for concat. can we focus the test output around concat only?

2 participants