Loading a Model

Load Data & Model

Assume you have trained a model and saved it under the name demonstration_model. You can now load that model and use it to make predictions for a left and a right DataFrame. In the first step, both the data and the model are loaded into memory.

import pandas as pd

from neer_match_utilities.model import Model
from neer_match_utilities.prepare import Prepare, similarity_map_to_dict
from pathlib import Path
from neer_match_utilities.custom_similarities import CustomSimilarities

# Load files

left = pd.read_csv('left.csv')
right = pd.read_csv('right.csv')

# Load custom similarity functions

CustomSimilarities()

# Load model

loaded_model = Model.load(
    'demonstration_model'
)

Harmonize Format

Next, we must ensure that the formatting logic remains consistent with that applied before training. Note that it is not necessary to redefine the similarity map, as it was stored and is loaded along with the model.

prepare = Prepare(
    similarity_map=similarity_map_to_dict(
        loaded_model.similarity_map
    ), 
    df_left=left, 
    df_right=right, 
    id_left='company_id', 
    id_right='company_id'
)

left, right = prepare.format(
    fill_numeric_na=False,
    to_numeric=['found_year'],
    fill_string_na=True, 
    capitalize=True
)
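To make the idea of a consistent format concrete, here is a minimal pandas-only sketch of this kind of harmonization. It is not the implementation behind Prepare.format; the helper harmonize and the toy columns are hypothetical, and it assumes that formatting means converting selected columns to numeric, filling missing strings with an empty string, and upper-casing strings.

```python
import pandas as pd


def harmonize(df, numeric_cols, string_cols):
    """Hypothetical helper: apply the same cleaning rules to any frame."""
    out = df.copy()
    for col in numeric_cols:
        # Values that cannot be parsed become NaN instead of raising
        out[col] = pd.to_numeric(out[col], errors="coerce")
    for col in string_cols:
        # Fill missing strings, then upper-case for case-insensitive matching
        out[col] = out[col].fillna("").astype(str).str.upper()
    return out


raw = pd.DataFrame(
    {
        "company_name": ["Acme Works", None],
        "found_year": ["1890", "n/a"],
    }
)

clean = harmonize(raw, numeric_cols=["found_year"], string_cols=["company_name"])
print(clean)
```

Whatever rules are used, the point is that the same rules are applied to the data at training time and at prediction time, so that the similarity functions see comparable values.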

Make Suggestions

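Now we can ask the model for the best candidate matches in right for each record passed from left. The count argument sets how many candidates are returned per left record:

# Make suggestions for the first two observations in left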

suggestions = loaded_model.suggest(
    left[:2], 
    right, 
    count=10, 
    verbose=0
)

suggestions

      left  right  prediction
0        0      0    0.059657
263      0    263    0.002876
530      0    530    0.001645
602      0    602    0.000593
336      0    336    0.000448
633      0    633    0.000424
381      0    381    0.000343
436      0    436    0.000311
169      0    169    0.000300
517      0    517    0.000240
693      1      1    0.999256
861      1    169    0.313228
971      1    279    0.027856
937      1    245    0.024864
1198     1    506    0.015210
936      1    244    0.006719
899      1    207    0.003680
1030     1    338    0.003023
738      1     46    0.002617
1299     1    607    0.002036
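The suggestions can also be reduced to one best candidate per left record. The following is a minimal sketch that assumes the result is a pandas DataFrame with the columns left, right, and prediction, as in the output above. The toy frame copies a few rows from that output, and the cut-off threshold is a hypothetical value, not a recommendation.

```python
import pandas as pd

# Toy frame mirroring the columns of the suggestions output;
# the values are copied from a few of its rows.
suggestions = pd.DataFrame(
    {
        "left": [0, 0, 0, 1, 1, 1],
        "right": [0, 263, 530, 1, 169, 279],
        "prediction": [0.059657, 0.002876, 0.001645, 0.999256, 0.313228, 0.027856],
    },
    index=[0, 263, 530, 693, 861, 971],
)

# Keep the highest-scoring candidate for each left record
best = suggestions.loc[suggestions.groupby("left")["prediction"].idxmax()]

# Hypothetical cut-off; pick a value that suits your own data
threshold = 0.5
accepted = best[best["prediction"] >= threshold]

print(best)
print(accepted)
```

With these rows, the best candidate for left record 0 scores well below the cut-off and is dropped, while the candidate for left record 1 is kept.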

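Based on this output, we can assess whether a suggestion is correct. For the second record in left (left = 1), the top candidate is right = 1, with a prediction of 0.999256. Inspecting both records shows whether they describe the same company: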

left.iloc[1]
company_id                                                    810c9c3435
company_name             DEUTSCH-OESTERREICHISCHE MANNESMANNRÖHREN-WERKE
city                                                              BERLIN
industry                            BERGWERKE, HÜTTEN- UND SALINENWESEN.
purpose                BETRIEB DER MANNESMANNRÖHREN-WALZWERKE IN REMS...
bs_text                GENERALDIREKTION D SSELDORF MOBILIAR U UTENSIL...
found_year                                                        1890.0
found_date_modified                                           1890-07-16
Name: 1, dtype: object

right.iloc[1]
company_id                                                    8bf51ba8a0
company_name            DEUTSCH-OESTERREICHISCHE MANNESMANNRÖHREN-WERKE.
city                                                              BERLIN
industry                            BERGWERKE, HÜTTEN- UND SALINENWESEN.
purpose                BETRIEB DER MANNESMANNRÖHREN-WALZWERKE IN REMS...
bs_text                GENERALDIREKTION GRUNDST CKSKONTO M MOBILIEN U...
found_year                                                        1890.0
found_date_modified                                           1890-07-16
Name: 1, dtype: object
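Comparing two records by eye becomes tedious with many fields. A minimal sketch of a side-by-side view, using only pandas, is shown below. The two toy Series carry a subset of the fields printed above, and the names left_record and right_record are hypothetical stand-ins for left.iloc[1] and right.iloc[1].

```python
import pandas as pd

# Toy records with a subset of the fields printed above
left_record = pd.Series(
    {
        "company_name": "DEUTSCH-OESTERREICHISCHE MANNESMANNRÖHREN-WERKE",
        "city": "BERLIN",
        "found_year": 1890.0,
    }
)
right_record = pd.Series(
    {
        "company_name": "DEUTSCH-OESTERREICHISCHE MANNESMANNRÖHREN-WERKE.",
        "city": "BERLIN",
        "found_year": 1890.0,
    }
)

# Put both records next to each other, one row per field
comparison = pd.concat([left_record, right_record], axis=1, keys=["left", "right"])

# Flag the fields that agree exactly
comparison["equal"] = comparison["left"] == comparison["right"]

print(comparison)
```

Here the company names differ only by a trailing period, so the exact comparison flags them as unequal even though the records clearly refer to the same company. This is the kind of near-match that similarity-based matching is meant to capture.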