RAP® Data Science Blog/Portfolio Insights
Alt

RAP Use Cases

Portfolio Insights

By Becca Smith
September 4, 2024 | 18 min read

The RAP® Portfolio Insights Suite uses advanced analytics to improve customer retention and increase growth for innovators in the origination and servicing space.

Retention here refers to the ability of a mortgage lender to keep existing mortgage customers and loans within its portfolio.

Retention Strategies often include:

  • Offering competitive interest rates and loan terms
  • Providing timely and attractive refinance options to prevent customers from moving to other lenders.
  • Implementing loyalty programs or incentives for long-term customers

Portfolio Growth is defined as the expansion of a lender’s portfolio by acquiring new mortgage loans.

Strategies often include:

  • Developing new mortgage products to attract a broader range of customers.
  • Enhancing marketing efforts to reach potential new customers.
  • Expanding into new geographic markets
  • Forming partnerships with real estate agents, builders, and other relevant entities
  • Offering competitive loan terms and innovative mortgage solutions

Both retention and growth are crucial for a mortgage lender’s financial health and overall market competitiveness. Retention helps ensure a stable revenue stream from interest payments on existing loans, while growth helps increase market share and overall revenue potential.

The RAP® Portfolio Insights Suite takes a data-driven approach to portfolio retention and growth strategies.

Marketshare Analysis

  1. ICE Public records deed and mortgage records are combined, and lender, broker and agent names are normalized and standardized.
  2. The Mortgage data is then combined with ICE Value Range Express MLS Data to offer overall origination footprint and benchmarking analysis with our Marketshare Analysis.
  3. The analysis in RAP looks back for the last two years is refreshed every month. KPIs and are calculated for Marketshare % by volume and loan amount and are visualized over time in Tableau.
  4. Lenders can filter in to see how they compare against the top 80% of the market or even on specific competitors of interest. Loan officers and Agent trends are also calculated for each lender so users can see where Los and Agents are doing their business in each market and what type of loans they originate the most of. This enables RAP customers to understand how they compare against their peers on origination volume + broker, real estate, loan officer trends by geography.
rap-portfolio-insights-1-01.jpg

1. Define Lookback with IPy Widget Variables and pass to spark for later use

rap-portfolio-insights-1-02.jpg
%%local 

from ipywidgets 
import widgets,Button,Layout,Label 
import IPython 
import time 

layout= Layout(width='500px',display='left',Positioning='left',justify_content='space-between') 
style = {'description_width': '400px'} 

start_date= widgets.DatePicker(description='Start Date', layout=layout) 
end_date= widgets.DatePicker(description='End Date', layout=layout) 
label1 = widgets.HTML(value='Database to save results to (No special characters or spaces)') 
db_for_results= widgets.Text(value="database_name",style =style,layout=layout) 
label2 = widgets.HTML(value='Table to save results to (No special characters or spaces)') 
table_name_for_results= widgets.Text(value="table_name",style =style,layout=layout) 
label3 = widgets.Label(value='Notebook File Path') 
file_path= widgets.Text(value= "Paste the file path here",style =style,layout=layout) 
run_button = widgets.Button(description="Run", style=dict(font_weight='bold', text_color='white',button_color='blue')) 
display(start_date,end_date,run_button) 

def on_button_clicked(b): 
  global localString,start_date_new,end_date_new 
  start_date_new=str(start_date.value).replace(",)",")") 
  end_date_new=str(end_date.value).replace(",)",")") 
  db_for_results_new=str(db_for_results.value).replace(",)",")") 
  table_name_for_results_new=str(table_name_for_results.value).replace(",)",")") 
  file_path_new=str(file_path.value).replace(",)",")") 
  local_list=[start_date_new,end_date_new] 
  localString = ' | '.join(local_list) 
  run_button.style.button_color = 'gray' 
  time.sleep(1) 
  run_button.style.button_color = 'blue' 
run_button.on_click(on_button_clicked)

rap-portfolio-insights-1-03.jpg
%%send_to_spark -i localString -t str -n sparkString

2. Parse User input variables

rap-portfolio-insights-1-04.jpg
start_date,end_date= sparkString.split(' | ')
save_database = "market_share"
spark.sql("CREATE DATABASE IF NOT EXISTS {}".format(save_database))
date = spark.sql("select string(current_date())").collect()[0][0].replace("-","_")
save_table = "analysis_"+date
a,b,c = end_date.split("-",3)
origination_month = a+"-"+b

3. Query Deed records

rap-portfolio-insights-1-05.jpg
## QUERY DEED MORTGAGE RECORDS

query = """

create or replace temp view origination_records_deed as

select

row_number() over (partition by d.fipscode, d.assessors_parcel_number, d.bkfsinternal_pid order by d.fipscode, d.assessors_parcel_number, d.bkfsinternal_pid) as instance,

'DEED' as from_table,

d.bkfsinternal_pid,

-- ADDRESS DETAIL

d.fipscode, d.assessors_parcel_number, d.property_full_street_address, d.property_city_name, d.property_state, d.property_zip_code, d.assessors_land_use, d.residential_indicator,

a.property_address_latitude, a.property_address_longitude,

-- LOAN DETAIL

d.original_date_of_contract, d.loan_1_transaction_type, d.concurrent_tdloan_type, d.concurrent_tdloan_amount, d.concurrent_tdtype_financing,

d.concurrent_tdinterest_rate/100 as interest_rate,

ir.interestrate as interest_rate_inferred,

case when ifnull(d.concurrent_tdinterest_rate,'') in ('',0) then ir.interestrate else d.concurrent_tdinterest_rate/100 end as interest_rate_used,

d.stand_alone_refi, d.equity_credit_line,

case when d.concurrent_tdloan_type = 'Z' then 'Reverse Mortgage'

else 'Purchase' end

as loan_type_derived,

-- LENDER DETAIL
d.concurrent_tdlender_name, d.concurrent_tdlender_type, d.loan_organization_nmls_id, d.loan_organization_name, d.mortgage_broker_nmls_id, d.mortgage_broker, d.loan_officer_nmls_id, d.loan_officer_name,

-- TITLE COMPANY
d.title_company_name,

-- BORROWER DETAIL
d.number1_buyer_first_name_and_middle_name, d.number1_buyer_last_name_or_corporation_name

from public_records.deed_nationwide d

left outer join public_records.assessment_nationwide a on

d.fipscode = a.fipscode and d.assessors_parcel_number = a.assessors_parcel_number

left outer join public_records.ml_interestrate ir on

SUBSTRING(cast(d.original_date_of_contract as string),1,4) = cast(ir.year as string) and

case when SUBSTRING(cast(d.original_date_of_contract as string),5,1)='0' then

SUBSTRING(cast(d.original_date_of_contract as string),6,1)

else SUBSTRING(cast(d.original_date_of_contract as string),5,2) end

=cast(ir.month as string)

where
to_date(d.original_date_of_contract,'yyyyMMdd') >= '{}'
and to_date(d.original_date_of_contract,'yyyyMMdd') <= '{}'

and d.concurrent_tdloan_amount > 0
and d.concurrent_tdloan_amount < 10000000

order by d.original_date_of_contract

""".format(start_date, end_date)

spark.sql(query)

4. Query Mortgage Records

rap-portfolio-insights-1-06.jpg
## QUERY SAM MORTGAGE RECORDS

query = """

create or replace temp view origination_records_sam as

select

row_number() over (partition by s.fipscode, s.assessors_parcel_number, s.bkfsinternal_pid order by s.fipscode, s.assessors_parcel_number, s.bkfsinternal_pid) as instance,

'SAM' as from_table,

s.bkfsinternal_pid,

-- ADDRESS DETAIL

s.fipscode, s.assessors_parcel_number, s.property_full_street_address, s.property_city_name, s.property_state, s.property_zip_code, s.assessors_land_use, s.residential_indicator,

a.property_address_latitude, a.property_address_longitude,

-- LOAN DETAIL

s.original_date_of_contract, s.loan_transaction_type, s.loan_type, s.loan_amount, s.type_financing,

s.interest_rate/100 as interest_rate,

ir.interestrate as interest_rate_inferred,

case when ifnull(s.interest_rate,'') in ('',0) then ir.interestrate else s.interest_rate/100 end as interest_rate_used,

s.stand_alone_refi, s.equity_credit_line,

case when s.loan_type = 'Z' then 'Reverse Mortgage'

when s.equity_credit_line = 1 then 'Home Equity'

when s.loan_type in ('2','D','E') then 'Home Equity'

when s.loan_transaction_type = 4 then 'Home Equity'

else 'Refinance' end

as loan_type_derived,

-- LENDER DETAIL

s.lender_name, s.lender_type, s.loan_organization_nmls_id, s.loan_organization_name, s.mortgage_broker_nmls_id, s.mortgage_broker, s.loan_officer_nmls_id, s.loan_officer_name,

-- TITLE COMPANY

s.title_company_name,

-- BORROWER DETAIL

s.number1_borrower_first_name_and_middle_name, s.number1_borrower_last_name_or_corporation_name

from public_records.sam_nationwide s

left outer join public_records.assessment_nationwide a on

s.fipscode = a.fipscode and s.assessors_parcel_number = a.assessors_parcel_number

left outer join public_records.ml_interestrate ir on

SUBSTRING(cast(s.original_date_of_contract as string),1,4) = cast(ir.year as string) and

case when SUBSTRING(cast(s.original_date_of_contract as string),5,1)='0' then

SUBSTRING(cast(s.original_date_of_contract as string),6,1)

else SUBSTRING(cast(s.original_date_of_contract as string),5,2) end

=cast(ir.month as string)

where

to_date(s.original_date_of_contract,'yyyyMMdd') >= '{}'

and to_date(s.original_date_of_contract,'yyyyMMdd') <= '{}'

and s.loan_amount > 0

and s.loan_amount < 10000000

order by s.original_date_of_contract

""".format(start_date, end_date)

spark.sql(query)

5. Combine Deed and Mortgage records into single loan-level table

rap-portfolio-insights-1-07.jpg
## COMBINE ALL MORTGAGE RECORDS

query = """

create or replace temp view origination_records_staging as

select * from origination_records_sam where instance = 1

union

select * from origination_records_deed where instance = 1

"""

spark.sql(query)

rap-portfolio-insights-1-08.jpg

6. Normalize Lender, Broker and Agent Names

rap-portfolio-insights-1-09.jpg
## STANDARDIZE LENDER NAMES

## Get the majority NMLS ID per lender name

query = """

create or replace temp view origination_records_lender1 as

select row_number() over (partition by upper(trim(lender_name)) order by count(*) desc) as lender_rank,

loan_organization_nmls_id, upper(trim(lender_name)) as lender_name_2,

count(*) as loan_count

from origination_records_staging

where

loan_organization_nmls_id <> ''

and lender_name <> ''

group by

loan_organization_nmls_id, lender_name_2

order by

lender_name_2, loan_count desc

"""

spark.sql(query)

query = """

create or replace temp view origination_records_lender2 as

select *

from origination_records_lender1

where lender_rank = 1

"""

spark.sql(query)

7. Add MLS Data

rap-portfolio-insights-1-10.jpg
query = """

create or replace temp view vre_records_staging as

select

o.fipscode, o.assessors_parcel_number, o.from_table, o.bkfsinternal_pid,

vh.ds_internal_pid,

row_number() over (partition by o.fipscode, o.assessors_parcel_number order by vh.mls_sold_date desc) as vre_instance,

vh.mls_list_date, vh.mls_list_date_std, vh.mls_offmarket_date, vh.mls_offmarket_date_std, vh.mls_sold_date, vh.mls_sold_price,

vh.mls_status, vh.mls_status_std, vh.mls_status_date,

vh.agt_sold_agt_id, vh.agt_sold_agt_name, vh.agt_sold_off_id, vh.agt_sold_off_name

from {}.{}_pre o

inner join public_records.vre_history vh on

o.fipscode = vh.ds_fipscode and o.assessors_parcel_number = vh.ds_apn

and vh.mls_list_date < o.original_date_of_contract

where

o.from_table = 'DEED'

and vh.mls_sold_date is not null and vh.mls_sold_date <> ''

""".format(save_database, save_table)

spark.sql(query)

8. Visualize Aggregate Results in Plotly

rap-portfolio-insights-1-11.jpg
%%local

# Plotly visualization

from plotly.offline import iplot, init_notebook_mode

init_notebook_mode(connected = True)

import plotly.graph_objs as go

from datetime import datetime

#import pandas as pd

#query["Origination_Month"] = pd.to_datetime(query["Origination_Month"], format="%Y-%m")

query_agg = query.groupby(['Origination_Month', 'Loan_Type'])['Cnt'].sum().unstack().fillna(0)

#fig = px.bar(query, x="Next_Transaction_Date", color="Estimated_Payoff_Type", y='Cnt', title="A Grouped Bar Chart With Plotly Express in Python", barmode='relative')

fig = go.Figure(

data = [

go.Bar(name="Home Equity", x = [date for date in query_agg.index], y=query_agg["Home Equity"].values),

go.Bar(name="Purchase", x= [date for date in query_agg.index], y=query_agg["Purchase"].values),

go.Bar(name="Refinance", x= [date for date in query_agg.index], y=query_agg["Refinance"].values),

go.Bar(name="Reverse Mortgage", x= [date for date in query_agg.index], y=query_agg["Reverse Mortgage"].values)

],

layout = go.Layout(title="Origination Type Volumes by Date", title_x=0.5, width=2000, height=1000, xaxis=dict(type='category', dtick=1, fixedrange=True, tickangle=-45), plot_bgcolor='black', legend={'title':{'text': 'Loan Type Derived'}})

)

fig.update_layout(xaxis_title= "Origination Month", yaxis_title= "Loan Count", barmode='group')

iplot(fig)

rap-portfolio-insights-1-12.jpg

9. Optional - Summary Dashboard in tableau

rap-portfolio-insights-1-13.jpg
rap-portfolio-insights-1-14.jpg
rap-portfolio-insights-1-15.jpg
rap-portfolio-insights-1-16.jpg

Runoff Analysis

Categorize mortgage transactions (sales, refis, reverse mortgages, natural payoffs) and classify paid in full loans as retained or lost. Get a competitive view of which lenders and loan officers you are losing out to. Understand which types of loan products you retain the most of and which regions you perform the best in.

rap-portfolio-insights-1-17.jpg

1. Portfolio Match & Append. Match your monthly portfolio paid in full snapshot with ICE data to get a unique property ID assigned. This allows you to utilize the ICE Property, Real Estate and Mortgage datasets and tie them back to your own portfolio.

rap-portfolio-insights-1-18.jpg
%%local 

 

from ipywidgets import widgets,Button,Layout,Label 

import IPython 

import time 

 

layout= Layout(width='500px',display='left',Positioning='left',justify_content='space-between') 

style = {'description_width': '400px'} 

 

start_date= widgets.DatePicker(description='Start Date', layout=layout) 

end_date= widgets.DatePicker(description='End Date', layout=layout) 

label1 = widgets.HTML(value='Database to save results to (No special characters or spaces)') 

db_for_results= widgets.Text(value="database_name",style =style,layout=layout) 

label2 = widgets.HTML(value='Table to save results to (No special characters or spaces)') 

table_name_for_results= widgets.Text(value="table_name",style =style,layout=layout) 

label3 = widgets.Label(value='Notebook File Path') 

file_path= widgets.Text(value= "Paste the file path here",style =style,layout=layout) 

run_button = widgets.Button(description="Run", style=dict(font_weight='bold', text_color='white',button_color='blue')) 

display(start_date,end_date,run_button) 

 

def on_button_clicked(b): 

  global localString,start_date_new,end_date_new 
  start_date_new=str(start_date.value).replace(",)",")") 
  end_date_new=str(end_date.value).replace(",)",")") 

   db_for_results_new=str(db_for_results.value).replace(",)",")") 

   table_name_for_results_new=str(table_name_for_results.value).replace(",)",")") 

  file_path_new=str(file_path.value).replace(",)",")") 
  local_list=[start_date_new,end_date_new]    
  localString = ' | '.join(local_list) 
  run_button.style.button_color = 'gray' 
  time.sleep(1) 
  run_button.style.button_color = 'blue' 
run_button.on_click(on_button_clicked) 

2. Check public records data for updates in the PIF time range for portfolio properties. Assign “No county updates” to any loans that have not been updated yet by the county. Assign “no match” to any records that were not matched based on the Portfolio input property address.

3. Combine deed and mortgage records falling into the given input portfolio payoff time range and match the transactions to the corresponding input portfolio property using the ICE distinct property id, transaction id and subject lien information.

query = """ 

create or replace temp view ma_runoff_deed{} as 

select  
row_number() over (partition by p.input_record_number, d.fipscode, d.assessors_parcel_number order by p.input_record_number, d.fipscode, d.assessors_parcel_number, d.bkfsinternal_pid) as instance, 

p.input_record_number, 

-- ADDRESS DETAIL 
d.fipscode, d.assessors_parcel_number, d.property_full_street_address, d.property_city_name, d.property_state, d.property_zip_code, 

p.{}, 

-- DEED DETAIL 
d.original_date_of_contract, d.recording_date, d.document_type_code, 

-- LOAN DETAIL 
d.concurrent_tdloan_type, d.concurrent_tdloan_amount, d.concurrent_tdtype_financing, d.concurrent_tdinterest_rate, d.concurrent_tddue_date, d.adjustable_rate_rider, d.adjustable_rate_index, d.interest_only_period, d.fixed_step_rate_rider, d.stand_alone_refi, d.equity_credit_line,  

-- LENDER DETAIL 
d.concurrent_tdlender_name, d.concurrent_tdlender_type, d.loan_organization_nmls_id, d.loan_organization_name, d.mortgage_broker_nmls_id, d.mortgage_broker, d.loan_officer_nmls_id, d.loan_officer_name, 

-- BORROWER DETAIL 
d.number1_buyer_first_name_and_middle_name, d.number1_buyer_last_name_or_corporation_name, 

-- PROPERTY DETAIL 
d.assessors_land_use, d.residential_indicator, 

-- OTHER 

d.deed_transaction_type, d.bkfsinternal_pid 

from ma_runoff_deedpid{} p 
  inner join public_records.deed_nationwide d on 
    p.fipscode = d.fipscode and p.assessors_parcel_number = d.assessors_parcel_number and p.bkfsinternal_pid = d.bkfsinternal_pid 

where residential_indicator = 1 

""".format(table_ext_for_results, mapping_payoff_date, table_ext_for_results) 

spark.sql(query) 

rap-portfolio-insights-1-19.jpg
rap-portfolio-insights-1-21.jpg

4. Add additional data from the active lien table and release tables that enable mortgage transaction classification into sales, refis, forgiveness programs, reverse mortgages and natural payoffs.

5. Organize transactions into recaptured and lost categories. Add additional detail on the next transaction that has been classified in 3. when it is available

6. Visualize results in Plotly and/or Tableau. Track monthly changes in loss%.

rap-portfolio-insights-1-21.jpg
rap-portfolio-insights-1-22.jpg
rap-portfolio-insights-1-23.jpg
rap-portfolio-insights-1-24.jpg
rap-portfolio-insights-1-25.jpg

Portfolio Surveillance

Track current LTV and borrower equity, inclusive of subordinate lien monitoring for your active portfolio. See how many properties in your active portfolio are listed and how many have notice of defaults filed. Get natural catastrophe alerts and risk scores associated with your active portfolio. Visualize regional trends.

rap-portfolio-insights-1-26.jpg
1. Portfolio Match & Append. Match your monthly portfolio paid in full snapshot with ICE data to get a unique property ID assigned. This allows you to utilize the ICE Property, Real Estate and Mortgage datasets and tie them back to your own portfolio.
rap-portfolio-insights-1-27.jpg
2. Match Active portfolio that is now mapped to the ICE distinct property ID to the active lien table to get all open primary and subordinate lien information.
3. Query results that match active portfolio properties for pre-foreclosure detail , VRE real estate listing detail and avm values.
rap-portfolio-insights-1-28.jpg
%%pretty 

query = """ 
create or replace temp view open_liens as  
select i.loannumber, i.fipscode as fipscode, i.assessors_parcel_number,  
max(a.loan_sequence_no) as open_lien_count, 
sum(a.loan_amount) as open_lien_amount 
from ma_input i 
inner join public_records.activeloan a on 
i.fipscode = a.fipscode and i.assessors_parcel_number = a.assessors_parcel_number_apn_pin 
group by 1, 2, 3 
""" 

spark.sql(query) 

%%pretty 

query = """ 
create or replace temp view prefc as  
select i.loannumber, i.fipscode as fipscode, i.assessors_parcel_number,  
max(n.lpsinternal_pid) as lpsinternal_pid 
from ma_input i 
inner join public_records.nod_nationwide n on 
i.fipscode = n.fipscode and i.assessors_parcel_number = n.assessors_parcel_number     
group by 1, 2, 3 
""" 

spark.sql(query) 

%%pretty 

query = """ 
create or replace temp view vre as  
select i.loannumber, i.fipscode, i.assessors_parcel_number,  
max(v.ds_internal_pid) as ds_internal_pid 
from ma_input i 
inner join public_records.vre_active v on 
i.fipscode = v.ds_fipscode and i.assessors_parcel_number = v.ds_apn     
group by 1, 2, 3 
""" 

spark.sql(query) 

4. Calculate LTV, inferred IR, equity and assign prepay score (1 = lowest prepay propensity; 10 = highest) to each active loan based on LTV, Equity, IR, market conditions and active lien detail.
rap-portfolio-insights-1-29.jpg
5. Visualize in Plotly and/or Tableau
rap-portfolio-insights-1-30.jpg
rap-portfolio-insights-1-31.jpg
headshot-becca-smith.jpg

Becca Smith, Director, Mortgage Data & Analytics

Becca has over 10 years of experience as a Data Scientist implementing Machine Learning and AI solutions within the life sciences, healthcare, manufacturing and financial services industries and 4 years of experience as a product leader. She holds a Bachelor of Science in Computational Biochemistry from the University of Texas at Austin and a Master of Science in Analytics from the Georgia Institute of Technology.

Related resources
icon

Follow us on Linkedin

icon

Access Mortgage Monitor reports

icon

2025 Borrower Insights Survey report

Read more from the Connection Point

Ready to get started? Get in touch.

Get Started