Iguanas is a library built on top of Polars that streamlines the entire rule-based system development workflow, from raw data to production-ready rules, using Polars' multi-core processing.
Built by the PSP Data Team at PayPal, Iguanas makes rule generation, evaluation, and selection both faster and simpler.
- 🚀 Lightning Fast: Built on Polars for multi-core parallel processing
- 🎯 End-to-End: Generate, evaluate, combine, and select rules in one library
- 📦 Production Ready: Lightweight rule strings that deploy anywhere
- 🔧 Flexible: Sequential and parallel grid search strategies
- 🔗 Composable: Chain generation → evaluation → selection with a few function calls
- 🎓 Easy to Learn: Simple functional API with clear, consistent signatures
Generate interpretable rules from labelled datasets using XGBoost tree extraction:
- rule_grid_search_sequential - Single-process grid search over weight transformations and scale_pos_weight values
- rule_grid_search_parallel_weights - Parallel grid search parallelised over weight transformations
- rule_grid_search_parallel_scales - Parallel grid search parallelised over scale_pos_weight values
- extract_rules - Extract rules from a fitted XGBoost model (with optional monotone constraints)
- extract_rule_by_max_gain - Extract the highest-gain rule path from a single tree
- extract_rule_with_monotone_constraints - Extract a rule path respecting monotone constraints
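To illustrate the general idea of turning a tree path into a rule, here is a minimal sketch in plain Python. It is not the Iguanas implementation: the nested-dict tree layout, the helper name `best_leaf_path`, the choice of "leaf with the highest value" as the selection criterion, and the `&`-joined rule string format are all assumptions made for illustration. The only XGBoost convention relied on is that the "yes" branch of a split holds the rows where `feature < threshold`.

```python
def best_leaf_path(node, path=()):
    """Return (leaf_value, conditions) for the leaf with the highest value."""
    if "leaf" in node:
        return node["leaf"], list(path)
    feature, threshold = node["feature"], node["threshold"]
    # "yes" branch: feature < threshold; "no" branch: feature >= threshold.
    left = best_leaf_path(node["yes"], path + (f"({feature} < {threshold})",))
    right = best_leaf_path(node["no"], path + (f"({feature} >= {threshold})",))
    return max(left, right, key=lambda item: item[0])


# Hypothetical hand-written tree (not produced by a real model).
tree = {
    "feature": "income", "threshold": 60000,
    "yes": {"leaf": -0.4},
    "no": {
        "feature": "age", "threshold": 38,
        "yes": {"leaf": 0.1},
        "no": {"leaf": 0.6},
    },
}

value, conditions = best_leaf_path(tree)
rule = " & ".join(conditions)  # assumed rule string format
print(rule)
```

Each split on the way down contributes one condition, so the resulting rule is a conjunction that a person can read and audit.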
Compute classification performance metrics for rule predictions:
- compute_metrics - Compute a full metrics table (accuracy, precision, recall, F-beta, TP/FP/TN/FN, flagged %) for a set of rules
- compute_single_metric - Compute a single scalar metric (accuracy, precision, recall or F-beta), optimised for hot-path evaluation
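For reference, the metrics named above follow the standard confusion-matrix definitions. The sketch below computes them in plain Python for a single rule; the helper names `confusion_counts` and `rule_metrics`, the output keys, and the example rule `income >= 70000` are hypothetical and do not reflect the Iguanas API.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels and boolean rule predictions."""
    tp = fp = tn = fn = 0
    for truth, flagged in zip(y_true, y_pred):
        if flagged and truth:
            tp += 1
        elif flagged and not truth:
            fp += 1
        elif not flagged and truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def rule_metrics(y_true, y_pred, beta=1.0):
    """Return the standard classification metrics for one rule."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    b2 = beta * beta
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f_beta": f_beta,
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "flagged_pct": 100.0 * (tp + fp) / total if total else 0.0,
    }


# Toy labels and incomes matching the quick-start data in this README.
y_true = [0, 1, 0, 1, 0, 1, 1, 0]
income = [30000, 80000, 50000, 90000, 40000, 95000, 70000, 35000]
y_pred = [value >= 70000 for value in income]  # hypothetical rule
metrics = rule_metrics(y_true, y_pred)
print(metrics)
```

Setting `beta` above 1 weights recall more heavily than precision, and below 1 does the opposite.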
Evaluate rules on data and filter by performance:
- apply_rules - Evaluate rule expressions on a DataFrame and return a boolean prediction matrix
- apply_and_filter_by_performance - Evaluate rules and filter by user-defined metric thresholds
- select_diverse_top_rules - Select top-performing rules while removing highly correlated duplicates
- apply_filter_and_deduplicate_rules - Complete end-to-end pipeline: evaluate → filter → deduplicate
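Threshold filtering can be pictured as follows. This plain-Python sketch reuses the threshold dictionary shape (`name`, `operator`, `value`) that appears in the quick-start call in this README; the helpers `passes_thresholds` and `filter_rules`, the set of supported operators, and the metric values are assumptions for illustration, not library behaviour.

```python
import operator

# Map the operator strings used in the threshold dicts to Python functions.
OPERATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
}


def passes_thresholds(metrics, metric_thresholds):
    """Return True if a rule's metrics satisfy every threshold."""
    return all(
        OPERATORS[t["operator"]](metrics[t["name"]], t["value"])
        for t in metric_thresholds
    )


def filter_rules(metrics_by_rule, metric_thresholds):
    """Keep the names of rules whose metrics pass all thresholds."""
    return [
        name
        for name, metrics in metrics_by_rule.items()
        if passes_thresholds(metrics, metric_thresholds)
    ]


# Hypothetical, hand-written metrics for three rules (not real Iguanas output).
metrics_by_rule = {
    "rule_a": {"precision": 0.90, "recall": 0.55},
    "rule_b": {"precision": 0.50, "recall": 0.95},
    "rule_c": {"precision": 0.75, "recall": 0.40},
}
metric_thresholds = [
    {"name": "precision", "operator": ">=", "value": 0.6},
    {"name": "recall", "operator": ">=", "value": 0.5},
]

kept = filter_rules(metrics_by_rule, metric_thresholds)
print(kept)
```

A rule is kept only when it satisfies every threshold, so adding thresholds can only shrink the selected set.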
Combine individual rules into compound rules to improve performance:
- combine_rules_full_search - Exhaustive search over all rule pairs
- combine_rules_cumulative - Incrementally combine rules with a running candidate
- combine_rules_greedy - Greedy combination selecting the best pair at each step
- combine_rules_beam_search - Beam search combination balancing quality and efficiency
- combine_rules_a_star - A* search combination using a heuristic cost function
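As a rough picture of greedy combination, the sketch below ORs boolean prediction vectors together for as long as the F1 score improves. It is a simplified stand-in, not the algorithm behind `combine_rules_greedy`: the OR-only combination, the F1 objective, the stopping rule, and the names `f1_score` and `greedy_or_combination` are all assumptions.

```python
def f1_score(y_true, y_pred):
    """F1 score from binary labels and boolean predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def greedy_or_combination(y_true, predictions):
    """Greedily OR rules together while the F1 score keeps improving."""
    remaining = dict(predictions)
    # Start from the single best rule.
    best_name = max(remaining, key=lambda n: f1_score(y_true, remaining[n]))
    combined = remaining.pop(best_name)
    chosen = [best_name]
    best_score = f1_score(y_true, combined)
    while remaining:
        # Score every candidate extension of the current compound rule.
        candidates = {
            name: [a or b for a, b in zip(combined, preds)]
            for name, preds in remaining.items()
        }
        name = max(candidates, key=lambda n: f1_score(y_true, candidates[n]))
        score = f1_score(y_true, candidates[name])
        if score <= best_score:
            break  # no remaining rule improves the compound rule
        combined, best_score = candidates[name], score
        chosen.append(name)
        del remaining[name]
    return chosen, best_score


# Hypothetical prediction vectors for three rules.
y_true = [0, 1, 0, 1, 0, 1, 1, 0]
predictions = {
    "r1": [False, True, False, True, False, False, False, False],
    "r2": [False, False, False, False, False, True, True, False],
    "r3": [True, True, False, False, False, False, False, True],
}

chosen, best_score = greedy_or_combination(y_true, predictions)
print(chosen, best_score)
```

Here two rules that each catch half of the positives combine into a compound rule that catches all of them, while the noisy third rule is left out because adding it would lower the score.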
Deduplicate and prune rule sets:
- filter_rules_by_feature_overlap - Remove rules that share too many features with higher-importance rules
- filter_correlated_rules - Remove rules whose predictions are highly correlated
- select_best_rule_per_column_combination - Keep only the best-performing rule for each unique column combination
- extract_feature_names_from_rule - Parse a rule string and return the feature names it references
Inspect and report on rule sets:
- generate_rule_performance_report - Generate a combined performance and structure report for a rule set
- parse_conditions - Parse a rule expression into its constituent conditions
- parse_levels - Parse a rule expression into a structured level-by-level representation
- rebuild_from_levels - Reconstruct a rule string from a level representation
Clean up rule expressions for display or logging:
- simplify_rule - Simplify a rule expression by removing redundant conditions
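One common kind of redundancy is a repeated bound on the same feature, such as `age > 30` and `age > 40` in the same conjunction. The sketch below removes it by keeping only the tightest bound. It works on hypothetical `(feature, operator, threshold)` tuples rather than Iguanas rule strings, and the helper `simplify_conjunction` is an illustration, not the logic of `simplify_rule`.

```python
def simplify_conjunction(conditions):
    """Drop redundant bounds from an AND of (feature, operator, threshold) tuples.

    For each (feature, operator) pair only the tightest threshold is kept.
    Bounds with different operators on the same feature (for example ">"
    and ">=") are not merged in this simplified version.
    """
    tightest = {}
    for feature, op, threshold in conditions:
        if op not in (">", ">=", "<", "<="):
            raise ValueError(f"unsupported operator: {op}")
        key = (feature, op)
        if key not in tightest:
            tightest[key] = threshold
        elif op in (">", ">="):
            tightest[key] = max(tightest[key], threshold)  # higher lower bound wins
        else:
            tightest[key] = min(tightest[key], threshold)  # lower upper bound wins
    return [(feature, op, value) for (feature, op), value in tightest.items()]


conditions = [
    ("age", ">", 30),
    ("income", ">=", 50000),
    ("age", ">", 40),       # makes "age > 30" redundant
    ("income", "<", 90000),
]

simplified = simplify_conjunction(conditions)
print(simplified)
```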
Infer feature directionality to guide rule generation:
- infer_monotone_constraints_from_correlations - Infer monotone constraints (±1) from feature–target correlations
- infer_monotone_constraints_from_stumps - Infer monotone constraints (±1) from decision stumps
Generate sample weight schedules to steer rule learning:
- generate_increasing_weights - Weights that increase with feature value (power, log families)
- generate_decreasing_weights - Weights that decrease with feature value (reciprocal families)
- generate_weights - Generate both increasing and decreasing weight schedules in one call
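The weight families mentioned above can be pictured with a few one-line transformations. This is a sketch under stated assumptions: the function names, the specific formulas, and the normalisation step are illustrative and are not taken from the Iguanas source. It also assumes strictly positive feature values.

```python
import math


def power_weights(values, power=1.0):
    """Power family: larger feature values get larger weights."""
    return [v ** power for v in values]


def log_weights(values):
    """Log family: increasing, but growth slows for large values."""
    return [math.log1p(v) for v in values]


def reciprocal_weights(values, power=1.0):
    """Reciprocal family: larger feature values get smaller weights."""
    return [1.0 / (v ** power) for v in values]


def normalise(weights):
    """Rescale weights to average 1, so total weight equals the row count."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]


income = [30000, 80000, 50000, 90000, 40000, 95000, 70000, 35000]

increasing = normalise(power_weights(income, power=2.0))
gentle = normalise(log_weights(income))
decreasing = normalise(reciprocal_weights(income))
print(increasing[0], increasing[5])  # lowest income vs highest income
```

Passing such a vector as sample weights when fitting a model makes errors on heavily weighted rows more costly, which is how a weight schedule steers the rules toward high-value or low-value cases.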
import polars as pl
import numpy as np
from xgboost import XGBClassifier

from iguanas.weight_transformations import generate_weights
from iguanas.rule_generation import rule_grid_search_parallel_weights
from iguanas.rule_evaluation import apply_filter_and_deduplicate_rules

# 1. Load your data
X_train = pl.DataFrame({
    "age": [25, 45, 35, 50, 30, 55, 40, 28],
    "income": [30000, 80000, 50000, 90000, 40000, 95000, 70000, 35000],
})
y_train = pl.Series([0, 1, 0, 1, 0, 1, 1, 0])

# 2. Generate sample weight transformations
weights = generate_weights(X_train["income"])

# 3. Run a parallel grid search to extract rules
estimator = XGBClassifier(max_depth=2, n_estimators=5, random_state=42)
scale_pos_weights = np.logspace(0, 1, 5)
rules_df = rule_grid_search_parallel_weights(
    estimator, X_train, y_train,
    scale_pos_weights=scale_pos_weights,
    weights_train_vec=weights,
    n_jobs=-1,
)

# 4. Evaluate, filter, and deduplicate rules
R, metrics, selected_rules = apply_filter_and_deduplicate_rules(
    X_train, y_train, rules_df,
    metric_thresholds=[
        {"name": "precision", "operator": ">=", "value": 0.6},
        {"name": "recall", "operator": ">=", "value": 0.5},
    ],
    max_corr=0.8,
)
print(selected_rules)

Requires Python 3.10 or higher.
pip install iguanas

Or install from source:
git clone https://github.com/paypal/iguanas.git
cd iguanas
pip install -e .  # Install in editable/development mode

For detailed documentation, tutorials, and API reference, visit:
https://paypal.github.io/iguanas/
Iguanas is well suited to:
- Fraud Detection - Generate high-precision rules to flag suspicious transactions
- Risk Scoring - Build interpretable rule sets for credit or operational risk
- Compliance & Policy - Encode business policies as auditable rule expressions
- Anomaly Detection - Surface rare but meaningful patterns in labelled data
- Model Explainability - Extract human-readable rules from gradient boosted models
Iguanas powers rule-based systems at:
- PayPal (internal use)
We welcome contributions! Please check out our contributing guidelines.
Iguanas is licensed under the Apache License 2.0. See the LICENSE file for details.
Developed by the PSP Data Team at PayPal.
Built by data scientists, for data scientists