Unified version 25.01
January 31, 2025
This release includes our new unified versioning scheme for our software, support for thresholds in unit metrics and custom context for test descriptions within the ValidMind Library, and many more enhancements.
25.01
Our documentation now follows the new unified versioning scheme for our software, starting with this 25.01 release. Included in this release are:

- v2.7.7
- v1.29.10
We manage multiple repositories, each with its own version tags. The new versioning scheme replaces the ValidMind Library version in the documentation to clarify that each release includes code from multiple repositories rather than a single source.
This change simplifies tracking changes for each ValidMind release and streamlines version management for you. Release frequency and the upgrade process remain unchanged.
When logging metrics using `log_metric()`, you can now include a `thresholds` dictionary. For example, use `thresholds={"target": 0.8, "minimum": 0.6}` to define multiple reference levels.
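For illustration, a minimal sketch of logging a metric with thresholds, assuming the library has already been initialized with your model's credentials; apart from the `thresholds` dictionary described above, the parameter names are assumptions:

```python
import validmind as vm

# Assumes vm.init(...) has already been called with your API credentials and model

vm.log_metric(
    key="AUC",                                   # metric name (assumed parameter name)
    value=0.72,                                  # latest metric value
    thresholds={"target": 0.8, "minimum": 0.6},  # reference levels for the chart
)
```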
These thresholds automatically appear as horizontal reference lines when you add a Metric Over Time block to the documentation.
The visualization uses a distinct color palette to differentiate between thresholds. It displays only the most recent threshold configuration and includes threshold information in both the chart legend and data table.
This enhancement provides immediate visual context for metric values, helping you track metric performance against multiple defined thresholds over time.
You can now include contextual information to enhance LLM-based generation of test result descriptions and interpretations. The additional context is specified through environment variables and incorporated into the generated descriptions.
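For example, a rough sketch of supplying context before running tests; the environment variable names below are hypothetical, so check the new notebook for the exact ones the library reads:

```python
import os

# Hypothetical variable names for illustration -- the library may use different ones
os.environ["VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED"] = "1"
os.environ["VALIDMIND_LLM_DESCRIPTIONS_CONTEXT"] = (
    "This model is an application scorecard used for retail credit decisions. "
    "Flag any AUC below 0.7 as a concern in the interpretation."
)
```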
A new notebook demonstrates adding context to LLM-based descriptions with examples of:
We’ve introduced enhancements to the ValidMind Library that focus on documenting credit risk scorecard models:
New notebooks: Learn how to document application scorecard models using the library. These notebooks provide a step-by-step guide for loading a demo dataset, preprocessing data, training models, and documenting the model.
You can choose from three different approaches: running individual tests, running a full test suite, or using a single function to document a model.
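As a sketch of the individual-test approach, assuming a pandas DataFrame `df` from the demo dataset and placeholder identifiers (the test ID path is an assumption):

```python
import validmind as vm

# Register the dataset with the library (target column name is a placeholder)
vm_dataset = vm.init_dataset(dataset=df, target_column="default")

# Run a single test and log the result to your model documentation
result = vm.tests.run_test(
    "validmind.data_validation.MutualInformation",
    inputs={"dataset": vm_dataset},
)
result.log()

# Alternatively, document the model in one call using the template's full test plan
vm.run_documentation_tests()
```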
New tests:

- `MutualInformation`: Evaluates feature relevance by calculating mutual information scores between features and the target variable.
- `ScoreBandDefaultRates`: Analyzes default rates and population distribution across credit score bands.
- `CalibrationCurve`: Assesses calibration by comparing predicted probabilities against observed frequencies.
- `ClassifierThresholdOptimization`: Visualizes threshold optimization methods for binary classification models.
- `ModelParameters`: Extracts and displays model parameters for transparency and reproducibility.
- `ScoreProbabilityAlignment`: Evaluates alignment between credit scores and predicted probabilities.

Modifications have also been made to existing tests to improve functionality and accuracy. The `TooManyZeroValues` test now includes a row count and applies a percentage threshold for zero values.
The `split` function in `lending_club.py` has been enhanced to support an optional validation set, allowing for more flexible dataset splitting. A new utility function, `get_demo_test_config`, has been added to generate a default test configuration for demo purposes.
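A rough sketch of the enhanced dataset utilities; the loading and preprocessing calls are assumptions based on the demo dataset module, and the keyword that enables the optional validation split (`validation_size` here) is an assumption as well:

```python
from validmind.datasets.credit_risk import lending_club

# Load and preprocess the demo dataset (function names assumed from the demo module)
df = lending_club.load_data()
preprocessed_df = lending_club.preprocess(df)

# Split into train/validation/test sets -- the validation-related keyword is assumed
train_df, validation_df, test_df = lending_club.split(preprocessed_df, validation_size=0.2)

# Generate a default test configuration for demo purposes
test_config = lending_club.get_demo_test_config()
```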
Several enhancements to the ValidMind Library focus on ongoing monitoring capabilities:
New notebook: Learn how to use ongoing monitoring with credit risk datasets in this step-by-step guide for the ValidMind Library.
Custom tests: Define and run your own tests using the library:

- `ScoreBandDiscriminationMetrics.py`: Evaluates discrimination metrics across different score bands.

New tests:

- `CalibrationCurveDrift`: Evaluates changes in probability calibration.
- `ClassDiscriminationDrift`: Compares classification discrimination metrics.
- `ClassImbalanceDrift`: Evaluates drift in class distribution.
- `ClassificationAccuracyDrift`: Compares classification accuracy metrics.
- `ConfusionMatrixDrift`: Compares confusion matrix metrics.
- `CumulativePredictionProbabilitiesDrift`: Compares cumulative prediction probability distributions.
- `FeatureDrift`: Evaluates changes in feature distribution.
- `PredictionAcrossEachFeature`: Assesses prediction distributions across features.
- `PredictionCorrelation`: Assesses correlation changes between predictions and features.
- `PredictionProbabilitiesHistogramDrift`: Compares prediction probability distributions.
- `PredictionQuantilesAcrossFeatures`: Assesses prediction distributions across features using quantiles.
- `ROCCurveDrift`: Compares ROC curves.
- `ScoreBandsDrift`: Analyzes drift in score bands.
- `ScorecardHistogramDrift`: Compares score distributions.
- `TargetPredictionDistributionPlot`: Assesses differences in prediction distributions.

We also improved dataset loading, preprocessing, and feature engineering functions with verbosity control for cleaner output.
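For illustration, a rough sketch of running one of the new drift tests against a reference and a monitoring dataset; the test ID path and input names are assumptions based on the library’s usual conventions:

```python
import validmind as vm

# vm_reference_dataset, vm_monitoring_dataset, and vm_model are placeholders for
# datasets and a model already initialized with vm.init_dataset() / vm.init_model()
result = vm.tests.run_test(
    "validmind.ongoing_monitoring.FeatureDrift",
    inputs={
        "datasets": [vm_reference_dataset, vm_monitoring_dataset],
        "model": vm_model,
    },
)
result.log()
```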
Want to create your own code samples using ValidMind? We’ve made it easier for contributors to submit custom code samples.
Our end-to-end notebook template generation notebook will generate a new file with all the bits and pieces of a standard ValidMind notebook to get you started.
The same functionality is also accessible from our Makefile:
The template generation notebook draws from a number of mini-templates, should you need to revise them or grab the information from them manually:

- `about-validmind.ipynb`: Conceptual overview of ValidMind & prerequisites.
- `install-initialize-validmind.ipynb`: ValidMind Library installation & initialization instructions.
- `next-steps.ipynb`: Directions to review the generated documentation within the ValidMind Platform & additional learning resources.
- `upgrade-validmind.ipynb`: Instructions for comparing & upgrading versions of the ValidMind Library.

We’ve streamlined dashboard configuration with dedicated view and edit modes. Click **Edit Mode** to make changes, then click **Done Editing** to save and return to view mode:
To prevent any confusion when multiple people are working on the same dashboard, we’ve added some helpful safeguards:
Risk assessment generation has been enhanced to allow you to provide an optional prompt before starting text generation. This feature lets you guide the output, ensuring that the generated text aligns more closely with your specific requirements.
The `TestResult` class now exposes pre-populated test descriptions through the `doc` property, separating them from dynamically generated GenAI descriptions:

- `result.doc`: Contains the original docstring of the test.
- `result.description`: Contains the dynamically generated description.

This enhancement makes it easier to distinguish between ValidMind’s standard test documentation and the dynamic, context-aware descriptions generated for your specific test results.
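For example, after running any test you can compare the two properties; the test ID and dataset input here are placeholders:

```python
import validmind as vm

result = vm.tests.run_test(
    "validmind.data_validation.ClassImbalance",
    inputs={"dataset": vm_dataset},  # vm_dataset is an already-initialized VM dataset
)

print(result.doc)          # standard docstring shipped with the test
print(result.description)  # dynamically generated, context-aware description
```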
You can browse the full catalog of official test descriptions in our test documentation:
We added raw data storage across all ValidMind Library tests. Every test now returns a `RawData` object, allowing post-processing functions to recreate any test output. This feature enhances flexibility and customizability.
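A rough sketch of how this might be used from a post-processing function; the `post_process_fn` hook and the `raw_data` attribute shown here are assumptions, so check the library documentation for the exact interface:

```python
import validmind as vm

def reshape_output(result):
    # The RawData object captured by the test (attribute name assumed)
    raw = result.raw_data
    # ...rebuild tables or figures from the raw inputs here...
    return result

result = vm.tests.run_test(
    "validmind.data_validation.ClassImbalance",
    inputs={"dataset": vm_dataset},   # placeholder initialized dataset
    post_process_fn=reshape_output,   # hook name assumed
)
```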
print_env function

We’ve added a new diagnostic `print_env()` utility function that displays comprehensive information about your running environment. This function is particularly useful when diagnosing issues or sharing your setup with others: it outputs key details such as the Python version, installed package versions, and relevant environment variables.
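A minimal usage sketch, assuming the function is exposed at the top level of the package:

```python
import validmind as vm

# Print Python version, installed package versions, and relevant environment variables
vm.print_env()
```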
Workflows are now easier to read when zoomed out, helped by a larger modal window and simplified nodes:
Zooming in reveals more details:
Hovering over a node highlights all **in** and **out** connections, making relationships clearer:
We replaced the plugin used for editing mathematical equations and formulas. The new plugin provides an improved interface for adding and editing LaTeX expressions in your documentation.
The new editor also includes a real-time preview and common mathematical symbols for easier equation creation.
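For illustration, an expression of the kind you might enter in the editor (the formula itself is only an example, not part of the release):

```latex
P(\text{default} \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
```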
To access the latest version of the ValidMind Platform, hard refresh your browser tab:

- Windows: `Ctrl` + `Shift` + `R` OR `Ctrl` + `F5`
- macOS: `⌘ Cmd` + `Shift` + `R` OR hold down `⌘ Cmd` and click the **Reload** button

To upgrade the ValidMind Library:
In your Jupyter Notebook:
Then within a code cell or your terminal, run:
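A minimal sketch of the upgrade command, assuming the library is published on PyPI as `validmind` (the exact commands from the original instructions are not reproduced here):

```python
# Run in a notebook code cell; in a terminal, drop the leading "%"
%pip install --upgrade validmind
```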
You may need to restart your kernel after running the upgrade for the changes to be applied.