Neer Match Utilities

Installation

  • Installation
    • PyPI
    • From Source

Examples

  • A Minimal Training Pipeline
    • Loading the Data
    • Defining Features and Similarity Concepts
    • Harmonizing the Data
      • Left and Right
    • Restructuring the Matches DataFrame
    • Splitting Data
    • Training and Exporting the Model
  • Creating a Common Identifier (ID)
    • Between Two Sources
      • 1. Load Data & Model
      • 2. Harmonize Format
      • 3. Generate a Common ID
      • 4. Merge Results
    • Repeated Cross-Sections (Panel ID)
  • Alternative Classification Models
    • Baseline Models
      • Logit Model
      • Probit Model
      • Gradient Boosting Model
    • Deep Learning Model (ANN)
      • Key Parameters Explained
      • Single-Stage Training
    • Model Comparison
  • Additional Functionalities
    • Name Commonness
    • Stop Word Removal with spaCy
    • Feature Selection
    • Handling Many-to-Many Matches
      • Loading the Data
      • Inspecting the Data
      • Simulating a Many-to-Many Relationship
      • Understanding the Matching Issue
      • Correcting the Relationships
      • Verifying the Adjustments

Documentation

  • Base
  • Model
    • EpochEndSaver
      • EpochEndSaver.__init__()
      • EpochEndSaver.on_epoch_end()
    • Model
      • Model.load()
      • Model.save()
  • Baseline Models
    • GradientBoostingModel
      • GradientBoostingModel.__init__()
      • GradientBoostingModel.best_threshold()
      • GradientBoostingModel.evaluate()
      • GradientBoostingModel.fit()
      • GradientBoostingModel.predict_proba()
      • GradientBoostingModel.summary()
    • LogitMatchingModel
      • LogitMatchingModel.__init__()
      • LogitMatchingModel.fit()
    • ProbitMatchingModel
      • ProbitMatchingModel.__init__()
      • ProbitMatchingModel.fit()
    • SuggestMixin
      • SuggestMixin.suggest()
  • Baseline Training
    • BaselineTrainingPipe
      • BaselineTrainingPipe.__init__()
  • Baseline IO
    • ModelBaseline
      • ModelBaseline.save()
  • Panel
    • GenerateID
      • GenerateID.__init__()
      • GenerateID.assign_ids()
      • GenerateID.execute()
      • GenerateID.generate_suggestions()
      • GenerateID.group_by_subgroups()
      • GenerateID.harmonize_ids()
      • GenerateID.relations_left_right()
    • SetupData
      • SetupData.matches
      • SetupData.__init__()
      • SetupData.adjust_overlap()
      • SetupData.create_connected_groups()
      • SetupData.data_preparation_panel()
      • SetupData.drop_repetitions()
      • SetupData.panel_preparation()
  • Prepare
    • Prepare
      • Prepare.__init__()
      • Prepare.do_remove_stop_words()
      • Prepare.format()
    • similarity_map_to_dict()
    • synth_mismatches()
  • Split
    • SplitError
    • split_test_train()
  • Training
    • Training
      • Training.evaluate_dataframe()
      • Training.matches_reorder()
      • Training.performance_statistics_export()
    • TrainingPipe
      • TrainingPipe.WarmupCosine
        • TrainingPipe.WarmupCosine.__init__()
        • TrainingPipe.WarmupCosine.from_config()
      • TrainingPipe.__init__()
    • alpha_balanced()
    • combined_loss()
    • focal_loss()
    • soft_f1_loss()
  • Feature Selection
    • FeatureSelectionResult
      • FeatureSelectionResult.updated_similarity_map
      • FeatureSelectionResult.selected_feature_columns
      • FeatureSelectionResult.selected_pairs
      • FeatureSelectionResult.coef_by_feature
      • FeatureSelectionResult.meta
      • FeatureSelectionResult.__init__()
    • FeatureSelector
      • FeatureSelector.similarity_map
      • FeatureSelector.__init__()
      • FeatureSelector.execute()
    • tqdm_joblib()
  • Similarity Features
    • SimilarityFeatures
      • SimilarityFeatures.__init__()
      • SimilarityFeatures.pairwise_similarity_dataframe()
    • subsample_non_matches()
    • to_X_y()
  • Custom Similarities
    • CustomSimilarities
      • CustomSimilarities.__init__()
      • CustomSimilarities.notmissing()
      • CustomSimilarities.notzero()

License

  • License

Base

  • Index

  • Module Index

© Copyright 2026, Pantelis Karapanagiotis, Marius Liebald.
