Data as a service to enable compliance reporting
Girish Juneja, CTO 
October 7, 2014 
© 2014 Altisource Labs. All Rights Reserved.
Altisource Overview
Chairman: William C. Erbey
CEO: William B. Shepro
Employees: ~8,000
NASDAQ: ASPS
Market Cap: $2.2 Billion (Sept. 15, 2014)
Performance since August 2009 separation from Ocwen®:
– CAGR Share Price: 47% (through Sept. 15, 2014)
– CAGR Service Revenue: 39% (through Sept. 30, 2013)
– Separated from Ocwen in August 2009
– Created and separated RESI and AAMC in December 2012
– Strong free cash flow
– Strong growth prospects in very large markets
Altisource Vision 
Vision 
To be the premier real estate and mortgage marketplace offering both 
content and distribution to the marketplace participants 
Mission 
To offer homeowners, buyers, sellers, agents, mortgage originators and 
servicers trusted and efficient marketplaces to conduct real estate and 
mortgage transactions, and improve outcomes for market participants 
Real Estate Marketplace
– Home Sales
– Home Rentals
– Home Maintenance
Mortgage Marketplace
– Mortgage Originations
– Mortgage Servicing
State of the Business: Servicing 
COMPLEXITY
- Meeting borrower/customer expectations
- Elevated scrutiny of borrower interactions
- Proliferation of servicer products
- Reporting requirements
Increased risk
Increased costs
Increased penalties and fines
Decreased customer satisfaction
COMPLIANCE
- Velocity of new and changing rules
- Magnitude of financial exposure
- Existing technology limits compliance capabilities
CHANGE
- Lack of end-to-end visibility
- Rigid and inflexible systems
- Volume and nature of data interoperability between data silos
Compliance 
Future of Servicing 
For servicers’ businesses to grow, a modern servicing platform must be: 
Flexible and Adaptable: to easily and cost-effectively respond to evolving market and business dynamics
Scalable and Automated: to enable cost-effective business growth
Interoperable: to seamlessly interface with third-party apps and other software platforms
Compliance Centric: to meet ever-changing regulatory mandates
Analytical: to drive continuous improvement and manage risk
Common Foundational Layer 
Customer Experience: Menus & Navigation; API Management; Caching; DMZ Gateway
Security Framework (Compliance & Entitlements): Identity Mgmt; Single Sign-on; Multi-tenant; Authorization & RBAC; Encryption; MFA Authentication; Access Governance; User Profile
Rules, Messaging, Integrations (Workflows, Business & Compliance): Rules Mgmt; Workflow Mgmt; Messaging; Notification & Subscription; Search; 3rd Party Integrations
Data Management (Transactional, Reporting & Analytics): Metadata Management; Master Data Management; Reporting; Compliant Auditing; Data Archival; Warehousing; Data as a Service
Multi-tenant Operational Framework: Provisioning; Monitoring/Alerting; Backup/Restore; Configuration & Customization; Elastic Performance; Availability/DR; Metering
High performance Scala-based framework
App Provisioning, isolation, multi-tenancy & Life-cycle 
Service Registry 
Multi-tenant Cloud Provider Independent Rapid Deployment PaaS 
Deployment & Test Automation 
Cloud Abstraction layer 
Environment Overview 
The financial industry faces:
– Increasing regulatory requirements
– Increasing customer compliance requirements
– Changing regulatory and customer requirements that call for correlating data across sources
– Risk analysis that requires correlating internal data with non-conforming structured external data sets
– Existing data stores that are unable to respond rapidly
Organizations need solutions that:
– Enable financial institutions to address changing regulations
– Enable automated compliance monitoring processes and systems
– Improve internal controls by retaining data lineage to unmodified source datasets
– Provide actionable and timely information from data
– Enable on-demand reporting on massive datasets, based on a schema defined with the reporting request
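The last of these needs, together with the lineage requirement, can be illustrated with a minimal sketch. Everything in it is hypothetical: the dataset name, the sample records, the column names, and the `run_report` helper are assumptions made for illustration and are not taken from the talk. The idea shown is that the raw extract is never modified, the schema arrives with the report request, and every output row points back to the source record it came from.

```python
import csv
import io

# Hypothetical raw extract, stored exactly as received (never modified).
RAW_SOURCE_NAME = "servicing_extract.csv"  # assumed dataset name
RAW_DATA = """loan_id,borrower,balance,status
1001,A. Smith,250000.50,current
1002,B. Jones,99000,default
1003,C. Wu,not-available,current
"""

# Converters that a report request may refer to by name.
CONVERTERS = {"int": int, "float": float, "str": str}


def run_report(raw_text, source_name, schema):
    """Apply a request-supplied schema to raw text at read time.

    `schema` maps column name -> type name. Each output row keeps a
    `_lineage` entry pointing back at the unmodified source record.
    Rows that fail conversion are returned separately, not dropped silently.
    """
    rows, rejects = [], []
    reader = csv.DictReader(io.StringIO(raw_text))
    for record_no, raw in enumerate(reader, start=1):
        lineage = {"source": source_name, "record": record_no}
        try:
            row = {col: CONVERTERS[typ](raw[col]) for col, typ in schema.items()}
        except (KeyError, ValueError) as exc:
            rejects.append({"_lineage": lineage, "error": str(exc)})
            continue
        row["_lineage"] = lineage
        rows.append(row)
    return rows, rejects


# The schema travels with the request; a different report can ask for
# different columns or types without reloading the data.
request_schema = {"loan_id": "int", "balance": "float", "status": "str"}
report_rows, report_rejects = run_report(RAW_DATA, RAW_SOURCE_NAME, request_schema)
```

Here the third record fails the `float` conversion, so it lands in `report_rejects` with its lineage intact instead of disappearing from the report.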
The Traditional Enterprise Data Warehouse approach
– Maintain data integrity by storing the organizational data with regulatory information in the respective data dimensions.
– Data modeling and the design of facts and dimensions are critical to the success of a compliance data warehouse.
– Regulatory sufficiency needs to be maintained in the regulation dimension of the compliance data warehouse.
– Data is transmitted from the data warehouse into a regulatory data mart.
– Transformations of base data elements and regulation rules are maintained in the metadata.
Metadata, ETL & Core Model 
Metadata
– Reflects the business and business processes
– In EDW, all functionality is metadata driven
– Data definitions
• Source and Core model from technical perspective
• Business perspective
– Transformations & Aggregations
• Transformations to derive cleansed Core data from source data
• Aggregations and de-normalizations to create Access model data
Loading Service
– Reflects the business and business processes
– In EDW, all functionality is metadata driven
– Data definitions
• Source and Core model from technical perspective
• Business perspective
– Transformations & Aggregations
• Transformations to derive cleansed Core data from source data
• Aggregations and de-normalizations to create Access model data
Access Model
– Data in the Access model is directly traceable to the Core model
• De-normalized
• Aggregated
• Designed for query and access performance
• End user requirements
• Access restrictions and controls
– As a design decision, Access model objects can either be physical structures, or structures materialized on access.
Core Model
– Control
• Only approved, tested, and validated processes can update data in the Core model.
– Data Model
• Highly normalized
• No redundancy
• Subject areas, loans, investors...
• Not optimized for apps
• Target data formats post-cleansing
• Optimized for efficiency and correctness
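The relationship between the Core and Access models can be sketched in a few lines. This is an illustration only: the loan and investor records, the field names, and the `build_access_model` helper are assumptions, not part of the original deck. It shows a normalized Core model being de-normalized and aggregated into an Access view, with each Access row keeping the Core keys it was built from so that it stays traceable.

```python
from collections import defaultdict

# Hypothetical Core model: normalized, one fact per row, no redundancy.
core_investors = {10: "Investor A", 20: "Investor B"}
core_loans = [
    {"loan_id": 1, "investor_id": 10, "balance": 100_000},
    {"loan_id": 2, "investor_id": 10, "balance": 250_000},
    {"loan_id": 3, "investor_id": 20, "balance": 75_000},
]


def build_access_model(loans, investors):
    """De-normalize and aggregate Core rows into a query-friendly view.

    Each Access row lists the Core loan ids it was built from, so it stays
    traceable to the Core model.
    """
    grouped = defaultdict(list)
    for loan in loans:
        grouped[loan["investor_id"]].append(loan)

    access_rows = []
    for investor_id, group in sorted(grouped.items()):
        access_rows.append({
            "investor_name": investors[investor_id],  # de-normalized lookup
            "loan_count": len(group),                  # aggregate
            "total_balance": sum(loan["balance"] for loan in group),
            "core_loan_ids": [loan["loan_id"] for loan in group],  # traceability
        })
    return access_rows


access_model = build_access_model(core_loans, core_investors)
```

Whether such a view is stored physically or computed when it is read is the design decision the Access Model bullets mention; the function above could serve either choice.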
The Change/Update Process 
Each change involves the following steps:
– Update the extraction module
– Update the staging module
– Update the transformation module
– Update the metadata
– Update the data repository
This process is tedious, involved, and brittle.
Enhancing with Big Data Technologies 
We were driven to adopt big data technology for many reasons: 
– Demand to analyze new data sources in an ever shorter timeframe 
– Growth in data complexity 
– Variety of data types 
– Volume of data and inability to move it around due to time constraints 
– Velocity of data generation, internal and external 
– Veracity of data from multiple sources 
– Growth in analytical complexity 
– Increasing availability of cost-effective computing and data storage 
The biggest reason for us was the frequent change in requirements, driven by changing business and regulatory needs.
Spark is a more flexible platform.
Data as a Service 
Data Mobilization View for Data Lake 
Borrower Data Service 
Request Details: 
Service Name: Borrower Data 
Context: Current/Cleansed & Conformed/History 
Request Filter: <Borrower Name>, <Date Range> 
Response Details: 
Borrower Data Service 
Mortgage Data Service 
Request Details: 
Service Name: Mortgage Data 
Context: Current/Cleansed & Conformed/History 
Request Filter: <Loan Number>, <Date Range> 
Response Details: 
Mortgage Data Service 
Loan Default Event 
Event Details: 
Event Name: Loan Default 
Context: Current 
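The request and event descriptions on these slides share a small common shape: a name, a context, and (for requests) a filter. A minimal sketch of that shape follows. The class names, the `describe` helper, and the filter keys and values are all hypothetical; the slides do not show the actual service contract.

```python
from dataclasses import dataclass, field
from enum import Enum


class Context(Enum):
    """The three data contexts named on the slides."""
    CURRENT = "Current"
    CLEANSED_CONFORMED = "Cleansed & Conformed"
    HISTORY = "History"


@dataclass(frozen=True)
class DataServiceRequest:
    service_name: str
    context: Context
    filters: dict = field(default_factory=dict)


@dataclass(frozen=True)
class DataEvent:
    event_name: str
    context: Context


def describe(request):
    """Render a request as a single log-friendly line."""
    filter_text = ", ".join(f"{k}={v}" for k, v in sorted(request.filters.items()))
    return f"{request.service_name} [{request.context.value}] {filter_text}"


# Filter keys and values below are made up for illustration.
mortgage_request = DataServiceRequest(
    service_name="Mortgage Data",
    context=Context.HISTORY,
    filters={"loan_number": "0000000", "date_range": "2014-01-01..2014-06-30"},
)
default_event = DataEvent(event_name="Loan Default", context=Context.CURRENT)
```

Modeling the context as an enumeration keeps callers from inventing a fourth context by typo, which matters when the same request type is reused for borrower and mortgage data.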
Loan Default Event 
Data Lake vs. Data Warehouse 
Feature | Data Lake | Data Warehouse
Data Volume | Extremely large (Petabytes) | Large (Terabytes)
Access Methods | NoSQL | SQL
Schema | Schema on read | Schema on write
Scalability | Scales horizontally | Scales vertically
Hardware | Commodity hardware | Specialized hardware/appliances
Data Structure | Structured and unstructured | Structured
Data | Raw | Cleansed/Aggregated
Data Lake Technology Stack 
GraphX
Services/Application Portals
Spark (DAG construct and execute engine)
RDD Instances/Schemas
3rd Party Drivers
Analytics Portals
Data Storage: Cassandra/HDFS/Parquet
BI/ETL tools
ODBC/JDBC
Spark Streaming
External Data Stores
In-house API
Interactive
MLlib/SparkR
Hive/HQL
Spark SQL
In-house Drivers
YARN
Benefits of Apache Spark based Data Lake 
– Load data as it is stored in the source system; no transformation needed
– Build structure on it: apply Hive external tables to this raw data
– Data sets are built with our business logic
– The intermediate and final results are saved back to data storage
– Working data sets are saved as Parquet files
– Distinction between data view and update view
– When a data file changes in Hadoop or Cassandra, we only have to update the Hive tables or SchemaRDDs, and then we are done.
Data Storage Access Layer 
– Abstract the details of data access through contexts/drivers.
• Hive/HQL
• Spark SQL
• Cassandra driver for Spark
– Unify the data into RDD interfaces.
• SchemaRDD
• HadoopRDD
• CassandraRDD
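The two ideas on this slide (hide each store behind a driver, then expose one common interface) can be sketched without Spark at all. The example below is an analogy in plain Python, not the RDD API: `RowSource`, `CsvTextSource`, `JsonLinesSource`, and `union` are hypothetical names, and the in-memory strings merely stand in for HDFS files and a Cassandra table.

```python
import csv
import io
import json
from typing import Dict, Iterator, Protocol


class RowSource(Protocol):
    """Common interface every driver exposes: an iterator of dict rows."""
    def rows(self) -> Iterator[Dict[str, str]]: ...


class CsvTextSource:
    """Stand-in for a file-backed store (for example, files on HDFS)."""
    def __init__(self, text: str) -> None:
        self._text = text

    def rows(self) -> Iterator[Dict[str, str]]:
        yield from csv.DictReader(io.StringIO(self._text))


class JsonLinesSource:
    """Stand-in for a key-value or document store (for example, Cassandra)."""
    def __init__(self, text: str) -> None:
        self._text = text

    def rows(self) -> Iterator[Dict[str, str]]:
        for line in self._text.splitlines():
            if line.strip():
                yield {k: str(v) for k, v in json.loads(line).items()}


def union(*sources: RowSource) -> Iterator[Dict[str, str]]:
    """Callers see one stream of rows, whatever the backing store is."""
    for source in sources:
        yield from source.rows()


file_store = CsvTextSource("loan_id,status\n1,current\n")
kv_store = JsonLinesSource('{"loan_id": 2, "status": "default"}\n')
all_rows = list(union(file_store, kv_store))
```

Code written against `RowSource` does not change when a new store is added; only a new driver class is needed, which is the property the access layer is after.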
Code Samples - Apply Hive Schema to Raw Data 
Pour data into HDFS → Create Hive schema → Use HQL inside Spark SQL → Save result in Parquet format
Data sources: RDBMSs, Excel files, documents, external sources
Cluster details: 16 VMs, 128 GB memory, 126 GB disk
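The code from the original slide did not survive in this transcript, so what follows is only a sketch of the same four-step flow, under stated assumptions. The table name, columns, query, and HDFS paths are invented for illustration. The sketch also uses the later `SparkSession` API (`spark.sql`, `DataFrame.write.parquet`) rather than the `HiveContext` and SchemaRDD classes that were current when the talk was given. The helper that talks to Spark takes an existing session as a parameter, so the block itself runs without Spark installed.

```python
def hive_external_table_ddl(table, columns, location, delimiter=","):
    """Build a HiveQL statement that lays a schema over raw delimited files.

    The files at `location` are not moved or rewritten; dropping the table
    later leaves them in place.
    """
    column_sql = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({column_sql}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        f"STORED AS TEXTFILE LOCATION '{location}'"
    )


def apply_schema_and_save(spark, ddl, query, output_path):
    """Run the last three steps of the flow on an existing session.

    `spark` is assumed to be a pyspark.sql.SparkSession created with
    enableHiveSupport(); nothing here imports or starts Spark itself.
    """
    spark.sql(ddl)                                        # create Hive schema over raw data
    result = spark.sql(query)                             # use HQL inside Spark SQL
    result.write.mode("overwrite").parquet(output_path)   # save result as Parquet
    return result


# Hypothetical table, columns, and HDFS paths, for illustration only.
loans_ddl = hive_external_table_ddl(
    table="raw_loans",
    columns=[("loan_id", "BIGINT"), ("status", "STRING"), ("balance", "DOUBLE")],
    location="hdfs:///data/raw/loans",
)
default_query = "SELECT loan_id, balance FROM raw_loans WHERE status = 'default'"
# apply_schema_and_save(spark, loans_ddl, default_query, "hdfs:///data/work/defaults")
```

Because the table is external, a change in the raw files only requires re-issuing the DDL with the new column list; the data itself is not reloaded.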
The Spark Cluster 
[Diagram: apps, services, and tools at the top; a Spark driver below them; rows of workers paired with data and storage at the bottom.]
Performance observations 10 18 Rows 
4.5 hrs 
48 minutes 
1 min 
Engineered Solutions: Cores 128, Memory 2048 GB, Disk 12 TB
In-memory Databases: Cores 160, Memory 2048 GB, Disk 12 TB
Spark Cluster VMs: Cores 128, Memory 2048 GB, Disk 12 TB
Challenges 
– Open source Apache Spark, while very promising, has to mature 
– Spark production deployment is complicated 
– Security of data is not enterprise class, needs additional layers 
– Tool ecosystem is still developing; BI tools are still in development
But...
– Done right, it has a lot of business value
– We are hiring engineers! 
Q & A 

More Related Content

PPTX
Traditional data warehouse vs data lake
BHASKAR CHAUDHURY
 
PPTX
Developing a Strategy for Data Lake Governance
Tony Baer
 
PDF
Data Lakes - The Key to a Scalable Data Architecture
Zaloni
 
PDF
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Denodo
 
PDF
Designing the Next Generation Data Lake
Robert Chong
 
PDF
Datalake Architecture
TechYugadi IT Solutions & Consulting
 
PPTX
Data Governance, Compliance and Security in Hadoop with Cloudera
Caserta
 
PDF
The Emerging Data Lake IT Strategy
Thomas Kelly, PMP
 
Traditional data warehouse vs data lake
BHASKAR CHAUDHURY
 
Developing a Strategy for Data Lake Governance
Tony Baer
 
Data Lakes - The Key to a Scalable Data Architecture
Zaloni
 
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Denodo
 
Designing the Next Generation Data Lake
Robert Chong
 
Data Governance, Compliance and Security in Hadoop with Cloudera
Caserta
 
The Emerging Data Lake IT Strategy
Thomas Kelly, PMP
 

What's hot (20)

PDF
The Future of Data Management: The Enterprise Data Hub
Cloudera, Inc.
 
PDF
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Denodo
 
PDF
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Denodo
 
PDF
Modern Data Management for Federal Modernization
Denodo
 
PPTX
Hadoop: Extending your Data Warehouse
Cloudera, Inc.
 
PPTX
Deploying a Governed Data Lake
WaterlineData
 
PDF
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Denodo
 
PDF
Big Data and Data Virtualization
Kenneth Peeples
 
PPTX
How to build a successful Data Lake
DataWorks Summit/Hadoop Summit
 
PPT
DW 101
jeffd00
 
PPTX
Hadoop and Your Data Warehouse
Caserta
 
PPTX
Better Together: The New Data Management Orchestra
Cloudera, Inc.
 
PDF
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
Eric Javier Espino Man
 
PDF
A beginners guide to Cloudera Hadoop
David Yahalom
 
PDF
Enterprise Data Lake - Scalable Digital
sambiswal
 
PDF
Flash session -streaming--ses1243-lon
Jeffrey T. Pollock
 
PDF
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Data Con LA
 
PPTX
Data Lakehouse, Data Mesh, and Data Fabric (r1)
James Serra
 
PPT
Why Data Virtualization? An Introduction by Denodo
Justo Hidalgo
 
PDF
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
Moacyr Passador
 
The Future of Data Management: The Enterprise Data Hub
Cloudera, Inc.
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Denodo
 
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Denodo
 
Modern Data Management for Federal Modernization
Denodo
 
Hadoop: Extending your Data Warehouse
Cloudera, Inc.
 
Deploying a Governed Data Lake
WaterlineData
 
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Denodo
 
Big Data and Data Virtualization
Kenneth Peeples
 
How to build a successful Data Lake
DataWorks Summit/Hadoop Summit
 
DW 101
jeffd00
 
Hadoop and Your Data Warehouse
Caserta
 
Better Together: The New Data Management Orchestra
Cloudera, Inc.
 
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
Eric Javier Espino Man
 
A beginners guide to Cloudera Hadoop
David Yahalom
 
Enterprise Data Lake - Scalable Digital
sambiswal
 
Flash session -streaming--ses1243-lon
Jeffrey T. Pollock
 
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Data Con LA
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
James Serra
 
Why Data Virtualization? An Introduction by Denodo
Justo Hidalgo
 
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
Moacyr Passador
 
Ad

Viewers also liked (8)

PDF
Security & Compliance for Startups
Symosis Security (Previously C-Level Security)
 
PDF
On Analyzing and Specifying Concerns for Data as a Service
Hong-Linh Truong
 
PPTX
Data Lake vs. Data Warehouse: Which is Right for Healthcare?
Health Catalyst
 
PPTX
UberTest Quick Guide
Amira Elsayed Ismail
 
PDF
ML and Data Science at Uber - GITPro talk 2017
Sudhir Tonse
 
PDF
Stream Computing & Analytics at Uber
Sudhir Tonse
 
PPTX
Uber Analytics Test
Coursetake
 
PDF
Uber Real Time Data Analytics
Ankur Bansal
 
Security & Compliance for Startups
Symosis Security (Previously C-Level Security)
 
On Analyzing and Specifying Concerns for Data as a Service
Hong-Linh Truong
 
Data Lake vs. Data Warehouse: Which is Right for Healthcare?
Health Catalyst
 
UberTest Quick Guide
Amira Elsayed Ismail
 
ML and Data Science at Uber - GITPro talk 2017
Sudhir Tonse
 
Stream Computing & Analytics at Uber
Sudhir Tonse
 
Uber Analytics Test
Coursetake
 
Uber Real Time Data Analytics
Ankur Bansal
 
Ad

Similar to Data-As-A-Service to enable compliance reporting (20)

PPTX
Big Data Case study - caixa bank
Chungsik Yun
 
PDF
AppSphere 15 - Mining the World’s Largest Healthcare Data Warehouse while Ens...
AppDynamics
 
PPTX
Fast Data Overview
C. Scyphers
 
PPTX
Best Practices for Monitoring Cloud Networks
ThousandEyes
 
PPTX
Approach to Data Management v0.2
Simon Baig, FCCA, CGEIT, MSc
 
PDF
DAS Slides: Metadata Management From Technical Architecture & Business Techni...
DATAVERSITY
 
PDF
Contexti / Oracle - Big Data : From Pilot to Production
Contexti
 
PDF
Ensure a Successful SAP Hybris Implementation – Part 2: Architecture and Buil...
Kellton Tech Solutions Ltd
 
PPTX
Cloud Services Brokerage Demystified
Zach Gardner
 
PDF
Data Virtualization Journey: How to Grow from Single Project and to Enterpris...
Denodo
 
PDF
Redefining End-to-End Monitoring: The Foundation - High-Performance Architect...
SL Corporation
 
PPTX
Oracle Big Data Appliance and Big Data SQL for advanced analytics
jdijcks
 
PPTX
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Cloudera, Inc.
 
PDF
Oracle communications data model product overview
GreenHamster
 
PDF
First Friday Forum December 5th Featuring Pentaho
ArchipelagoIS
 
PDF
The Shifting Landscape of Data Integration
DATAVERSITY
 
PDF
Realign Process & Data To Improve Your Customer-Centricity
Bizagi
 
PDF
CSC - Presentation at Hortonworks Booth - Strata 2014
Hortonworks
 
PPT
Data Management Strategy
Nandeep Nagarkar
 
PPTX
Big Data and Analytics
Cameron. A. Bradbury
 
Big Data Case study - caixa bank
Chungsik Yun
 
AppSphere 15 - Mining the World’s Largest Healthcare Data Warehouse while Ens...
AppDynamics
 
Fast Data Overview
C. Scyphers
 
Best Practices for Monitoring Cloud Networks
ThousandEyes
 
Approach to Data Management v0.2
Simon Baig, FCCA, CGEIT, MSc
 
DAS Slides: Metadata Management From Technical Architecture & Business Techni...
DATAVERSITY
 
Contexti / Oracle - Big Data : From Pilot to Production
Contexti
 
Ensure a Successful SAP Hybris Implementation – Part 2: Architecture and Buil...
Kellton Tech Solutions Ltd
 
Cloud Services Brokerage Demystified
Zach Gardner
 
Data Virtualization Journey: How to Grow from Single Project and to Enterpris...
Denodo
 
Redefining End-to-End Monitoring: The Foundation - High-Performance Architect...
SL Corporation
 
Oracle Big Data Appliance and Big Data SQL for advanced analytics
jdijcks
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Cloudera, Inc.
 
Oracle communications data model product overview
GreenHamster
 
First Friday Forum December 5th Featuring Pentaho
ArchipelagoIS
 
The Shifting Landscape of Data Integration
DATAVERSITY
 
Realign Process & Data To Improve Your Customer-Centricity
Bizagi
 
CSC - Presentation at Hortonworks Booth - Strata 2014
Hortonworks
 
Data Management Strategy
Nandeep Nagarkar
 
Big Data and Analytics
Cameron. A. Bradbury
 

More from AnalyticsWeek (8)

PDF
Understanding Customer Buying Journey with Big Data
AnalyticsWeek
 
PPTX
Making sense of unstructured data by turning strings into things
AnalyticsWeek
 
PPTX
Reimagining the role of data in government
AnalyticsWeek
 
PDF
The History and Use of R
AnalyticsWeek
 
PDF
Advanced Analytics in Hadoop
AnalyticsWeek
 
PDF
Rethinking classical approaches to analysis and predictive modeling
AnalyticsWeek
 
PDF
Using Topological Data Analysis on your BigData
AnalyticsWeek
 
PDF
Big Data Introduction to D3
AnalyticsWeek
 
Understanding Customer Buying Journey with Big Data
AnalyticsWeek
 
Making sense of unstructured data by turning strings into things
AnalyticsWeek
 
Reimagining the role of data in government
AnalyticsWeek
 
The History and Use of R
AnalyticsWeek
 
Advanced Analytics in Hadoop
AnalyticsWeek
 
Rethinking classical approaches to analysis and predictive modeling
AnalyticsWeek
 
Using Topological Data Analysis on your BigData
AnalyticsWeek
 
Big Data Introduction to D3
AnalyticsWeek
 

Recently uploaded (20)

PDF
Using Innovative Solar Manufacturing to Drive India's Renewable Energy Revolu...
Insolation Energy
 
PDF
Unveiling the Latest Threat Intelligence Practical Strategies for Strengtheni...
Auxis Consulting & Outsourcing
 
PPTX
Integrative Negotiation: Expanding the Pie
badranomar1990
 
PPTX
Business Plan Presentation: Vision, Strategy, Services, Growth Goals & Future...
neelsoni2108
 
PDF
NewBase 26 July 2025 Energy News issue - 1806 by Khaled Al Awadi_compressed.pdf
Khaled Al Awadi
 
PPTX
Appreciations - July 25.pptxffsdjjjjjjjjjjjj
anushavnayak
 
PPTX
Appreciations - July 25.pptxdddddddddddss
anushavnayak
 
PDF
askOdin - An Introduction to AI-Powered Investment Judgment
YekSoon LOK
 
PDF
MBA-I-Year-Session-2024-20hzuxutiytidydy
cminati49
 
PDF
India Cold Chain Storage And Logistics Market: From Farm Gate to Consumer – T...
Kumar Satyam
 
PPTX
PUBLIC RELATIONS N6 slides (4).pptx poin
chernae08
 
PDF
William Trowell - A Construction Project Manager
William Trowell
 
PDF
Alan Stalcup - Principal Of GVA Real Estate Investments
Alan Stalcup
 
PPTX
Pakistan’s Leading Manpower Export Agencies for Qatar
Glassrooms Dubai
 
PDF
New Royals Distribution Plan Presentation
ksherwin
 
PDF
Danielle Oliveira New Jersey - A Seasoned Lieutenant
Danielle Oliveira New Jersey
 
DOCX
India's Emerging Global Leadership in Sustainable Energy Production The Rise ...
Insolation Energy
 
PDF
GenAI for Risk Management: Refresher for the Boards and Executives
Alexei Sidorenko, CRMP
 
PDF
Tariff Surcharge and Price Increase Decision
Joshua Gao
 
PDF
A Complete Guide to Data Migration Services for Modern Businesses
Aurnex
 
Using Innovative Solar Manufacturing to Drive India's Renewable Energy Revolu...
Insolation Energy
 
Unveiling the Latest Threat Intelligence Practical Strategies for Strengtheni...
Auxis Consulting & Outsourcing
 
Integrative Negotiation: Expanding the Pie
badranomar1990
 
Business Plan Presentation: Vision, Strategy, Services, Growth Goals & Future...
neelsoni2108
 
NewBase 26 July 2025 Energy News issue - 1806 by Khaled Al Awadi_compressed.pdf
Khaled Al Awadi
 
Appreciations - July 25.pptxffsdjjjjjjjjjjjj
anushavnayak
 
Appreciations - July 25.pptxdddddddddddss
anushavnayak
 
askOdin - An Introduction to AI-Powered Investment Judgment
YekSoon LOK
 
MBA-I-Year-Session-2024-20hzuxutiytidydy
cminati49
 
India Cold Chain Storage And Logistics Market: From Farm Gate to Consumer – T...
Kumar Satyam
 
PUBLIC RELATIONS N6 slides (4).pptx poin
chernae08
 
William Trowell - A Construction Project Manager
William Trowell
 
Alan Stalcup - Principal Of GVA Real Estate Investments
Alan Stalcup
 
Pakistan’s Leading Manpower Export Agencies for Qatar
Glassrooms Dubai
 
New Royals Distribution Plan Presentation
ksherwin
 
Danielle Oliveira New Jersey - A Seasoned Lieutenant
Danielle Oliveira New Jersey
 
India's Emerging Global Leadership in Sustainable Energy Production The Rise ...
Insolation Energy
 
GenAI for Risk Management: Refresher for the Boards and Executives
Alexei Sidorenko, CRMP
 
Tariff Surcharge and Price Increase Decision
Joshua Gao
 
A Complete Guide to Data Migration Services for Modern Businesses
Aurnex
 

Data-As-A-Service to enable compliance reporting

  • 1. Data as a service to enable compliance reporting Girish Juneja, CTO October 7, 2014 © 22001144 AAllttiissoouurrccee LLaabbss.. AAllll RRiigghhttss RReesseerrvveedd.. Page | 1
  • 2. Chairman: William C. Erbey CEO: William B. Shepro Employees: ~8,000 NASDAQ: ASPS Market Cap: $2.2 Billion (Sept. 15, 2014) Performance since August 2009 Separation from Ocwen® CAGR Share Price: (Through Sept. 15, 2014) 47% CAGR Service Revenue: (Through Sept. 30, 2013) 39% Altisource Overview  Separated from Ocwen in August 2009  Created and separated RESI and AAMC in December 2012  Strong free cash flow  Strong growth prospects in very large markets © 2014 Altisource Labs. All Rights Reserved. Page | 2
  • 3. Altisource Vision Vision To be the premier real estate and mortgage marketplace offering both content and distribution to the marketplace participants Mission To offer homeowners, buyers, sellers, agents, mortgage originators and servicers trusted and efficient marketplaces to conduct real estate and mortgage transactions, and improve outcomes for market participants Real Estate Marketplace Mortgage Marketplace  Home Sales  Home Rentals  Home Maintenance  Mortgage Originations  Mortgage Servicing © 2014 Altisource Labs. All Rights Reserved. Page | 3
  • 4. State of the Business: Servicing COMPLEXITY - Meeting borrower /customer expectations - Elevated scrutiny of borrower interactions - Proliferation of servicer products - Reporting requirements Increased Risk Increased costs Increased penalties and fines Decreased customer satisfaction COMPLIANCE - Velocity of new and changing rules - Magnitude of financial exposure - Existing technology limits compliance capabilities CHANGE - Lack of end-to-end visibility - Rigid and inflexible systems - Volume and nature of data interoperability between data silos Compliance © 2014 Altisource Labs. All Rights Reserved. Page | 4
  • 5. Future of Servicing For servicers’ businesses to grow, a modern servicing platform must be: Flexible and Adaptable to easily and cost effectively respond to evolving market and business dynamics Scalable and Automated to enable cost effective business growth Interoperable to seamlessly interface with third party apps and other software platforms Compliance Centric to meet ever-changing regulatory mandates Analytical to drive continuous improvement and manage risk © 2014 Altisource Labs. All Rights Reserved. Page | 5
  • 6. Common Foundational Layer Customer Experience Menus & Navigation API Management Caching DMZ Gateway Identity Mgmt Single Sign-on Multi-tenant Authorization & RBAC Compliance & Entitlements Security Framework Encryption MFA Authentication Access Governance User Profile Rules Mgmt Workflows, Business & Compliance Workflow Mgmt Messaging Notification & Subscription Rules, Messaging, Integrations Search 3rd Party Integrations Metadata Management Master Data Management Reporting Compliant Auditing Data Management Transactional, Reporting & Analytics Data Archival Warehousing Data as a Service Provisioning Monitoring/Alerting Backup/Restore Configuration & Customization Elastic Performance Multi-tenant Operational Framework Availability/DR Metering High performance scala based framework App Provisioning, isolation, multi-tenancy & Life-cycle Service Registry Multi-tenant Cloud Provider Independent Rapid Deployment PaaS Deployment & Test Automation Cloud Abstraction layer © 2014 Altisource Labs. All Rights Reserved. Page | 6
  • 7. Environment Overview Financial Industry faces: – Increasing regulatory requirements – Increasing customer compliance requirements – Regulatory & customers’ changing requirements need correlating data across sources – Risk Analysis requires correlating internal and non-conforming structured external data sets – Existing data stores unable to respond rapidly Organizations need solutions that: – Enable financial institutions to address changing regulations – Enable automated compliance monitoring processes and systems – Improve internal controls by retaining data lineage to unmodified source datasets – Provide actionable & timely information from data – Enable on-demand reporting on massive datasets based on schema defined with the reporting request © 2014 Altisource Labs. All Rights Reserved. Page | 7
  • 8. The Traditional Enterprise Data Warehouse approach – Maintain data integrity by storing the organizational data with regulatory information in respective data dimensions. – Data modeling and the design of facts and dimensions are very critical to the success of compliance data warehouse. – The regulatory sufficiency needs to be maintained in the regulation dimension of the compliance data warehouse. – From data warehouse data is transmitted into a regulatory data mart. – Transformation of base data elements and regulation rules are maintained in the meta data. © 2014 Altisource Labs. All Rights Reserved. Page | 8
  • 9. Metadata, ETL & Core Model Metadata – Reflects the business and business processes – In EDW, all functionality is metadata driven Data definitions • Source and Core model from technical perspective • business perspective – Transformations & Aggregations • Transformations to derive cleansed Core data from source data • Aggregations and de-normalizations to create Access model data Loading Service – Reflects the business and business processes – In EDW, all functionality is metadata driven Data definitions • Source and Core model from technical perspective • business perspective – Transformations & Aggregations • Transformations to derive cleansed Core data from source data • Aggregations and de-normalizations to create Access model data Access Model – Data in the Access model is directly traceable to the Core model • De-normalized • Aggregated • Designed for query and access performance. • End user requirements • Access restrictions and controls – As a design decision, access model objects can either be physical structures, or structures materialized on access. Core Model – Control • Only approved, tested, and validated processes can update data in the Core model. – Data Model • Highly Normalized • No redundancy • Subject areas, loans, investors.. • Not optimized for apps • target data formats post-cleansing • It is optimized for efficiency and correctness. © 2014 Altisource Labs. All Rights Reserved. Page | 9
  • 10. The Change/Update Process Each Change involves the following steps: – Update the extraction module – Update the staging module – Update the Transformation module – Update the metadata – Update the data repository This process is tedious, involved, and brittle. © 2014 Altisource Labs. All Rights Reserved. Page | 10
  • 11. Enhancing with Big Data Technologies We were driven to adopt big data technology for many reasons: – Demand to analyze new data sources in an ever shorter timeframe – Growth in data complexity – Variety of data types – Volume of data and inability to move it around due to time constraints – Velocity of data generation, internal and external – Veracity of data from multiple sources – Growth in analytical complexity – Increasing availability of cost-effective computing and data storage The big reason for us was the frequent change of requirements due to changing business & regulatory changes. Spark is a more flexible platform © 2014 Altisource Labs. All Rights Reserved. Page | 11
  • 12. Data as a Service © 22001144 AAllttiissoouurrccee LLaabbss.. AAllll RRiigghhttss RReesseerrvveedd.. Page | 12
  • 13. Data Mobilization View for Data Lake © 2014 Altisource Labs. All Rights Reserved. Page | 13
  • 14. Borrower Data Service Request Details: Service Name: Borrower Data Context: Current/Cleansed & Conformed/History Request Filter: <Borrower Name>, <Date Range> Response Details: © 2014 Altisource Labs. All Rights Reserved. Page | 14
  • 15. Borrower Data Service © 2014 Altisource Labs. All Rights Reserved. Page | 15
  • 16. Mortgage Data Service Request Details: Service Name: Mortgage Data Context: Current/Cleansed & Conformed/History Request Filter: <Loan Number>, <Date Range> Response Details: © 2014 Altisource Labs. All Rights Reserved. Page | 16
  • 17. Mortgage Data Service © 2014 Altisource Labs. All Rights Reserved. Page | 17
  • 18. Loan Default Event Event Details: Event Name: Loan Default Context: Current © 2014 Altisource Labs. All Rights Reserved. Page | 18
  • 19. Loan Default Event
  • 20. Data Lake vs. Data Warehouse
      Feature | Data Lake | Data Warehouse
      Data volume | Extremely large (petabytes) | Large (terabytes)
      Access methods | NoSQL | SQL
      Schema | Schema on read | Schema on write
      Scalability | Scales horizontally | Scales vertically
      Hardware | Commodity hardware | Specialized hardware/appliances
      Data structure | Structured and unstructured | Structured
      Data | Raw | Cleansed/aggregated
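The "schema on read" versus "schema on write" row of slide 20 can be illustrated with a small, standard-library-only sketch. It is not the deck's implementation: the column names, sample values, and both function names are assumptions made for the example.

```python
import csv
import io

# Raw text kept exactly as received; the second row has a bad balance.
RAW = "L001,250000.00,2014-03-01\nL002,not-a-number,2014-04-15\n"

# Column names and types are hypothetical.
SCHEMA = [("loan_number", str), ("balance", float), ("as_of", str)]


def read_with_schema(raw_text, schema):
    """Schema on read: raw text is typed only when someone queries it."""
    rows = []
    for fields in csv.reader(io.StringIO(raw_text)):
        record = {}
        for (name, cast), value in zip(schema, fields):
            try:
                record[name] = cast(value)
            except ValueError:
                record[name] = None  # bad values surface at read time
        rows.append(record)
    return rows


def write_with_schema(store, fields, schema):
    """Schema on write: a row is validated and cast before it is stored."""
    store.append({name: cast(value) for (name, cast), value in zip(schema, fields)})


lake_rows = read_with_schema(RAW, SCHEMA)

warehouse = []
write_with_schema(warehouse, ["L001", "250000.00", "2014-03-01"], SCHEMA)
```

In the first style the malformed row is still stored and becomes a missing value when read; in the second, the same row would be rejected with a `ValueError` before it reached the store.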
  • 21. Data Lake Technology Stack [Diagram. Components shown: Services/Application Portals; Analytics Portals; BI/ETL tools; ODBC/JDBC; In-house API; In-house Drivers; 3rd Party Drivers; External Data Stores; Spark (DAG construct and execute engine); RDD Instances/Schemas; Spark SQL; Hive/HQL; Spark Streaming; MLlib/SparkR; GraphX; Interactive; YARN; Data Storage (Cassandra/HDFS/Parquet)]
  • 22. Benefits of an Apache Spark-Based Data Lake – Load data as it is stored in the source system; no transformation needed – Build structure on it by applying Hive external tables to the raw data – Data sets are built with our business logic – Intermediate and final results are saved back to data storage – Working data sets are saved as Parquet files – Distinction between data view and update view – When a data file changes in Hadoop or Cassandra, we only have to update the Hive or SchemaRDD definitions, and then we are done.
  • 23. Data Storage Access Layer – Abstract the details of data access through contexts/drivers: Hive/HQL, Spark SQL, Cassandra driver for Spark – Unify the data into RDD interfaces: SchemaRDD, HadoopRDD, CassandraRDD
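The idea on slide 23, hiding each store behind a driver and exposing one common interface, can be sketched without Spark. The classes below are in-memory stand-ins, not Spark, Hive, or Cassandra APIs; every name in the block is hypothetical and the sample data is invented.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List

Row = Dict[str, object]


class DataSource(ABC):
    """Common interface; callers never see which store is behind it."""

    @abstractmethod
    def rows(self) -> Iterator[Row]:
        """Yield records as plain dictionaries."""


class DelimitedTextSource(DataSource):
    """Stand-in for delimited text files, such as raw files read through Hive."""

    def __init__(self, lines: List[str], columns: List[str]) -> None:
        self._lines = lines
        self._columns = columns

    def rows(self) -> Iterator[Row]:
        for line in self._lines:
            yield dict(zip(self._columns, line.split(",")))


class KeyValueSource(DataSource):
    """Stand-in for a keyed store, such as a Cassandra table."""

    def __init__(self, table: Dict[str, Row]) -> None:
        self._table = table

    def rows(self) -> Iterator[Row]:
        for key, value in self._table.items():
            yield {"loan_number": key, **value}


def loan_numbers(source: DataSource) -> List[str]:
    """Works unchanged against any DataSource implementation."""
    return sorted(str(row["loan_number"]) for row in source.rows())


text_source = DelimitedTextSource(
    ["L002,180000", "L001,250000"], ["loan_number", "balance"]
)
kv_source = KeyValueSource(
    {"L001": {"balance": 250000}, "L002": {"balance": 180000}}
)
```

Because `loan_numbers` depends only on the shared interface, swapping one store for another does not touch the calling code, which is the benefit the slide attributes to unifying data into RDD interfaces.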
  • 24. Code Samples - Apply Hive Schema to Raw Data Steps: (1) Pour data into HDFS; (2) Create Hive schema; (3) Use HQL inside Spark SQL; (4) Save result in Parquet format. Data sources: RDBMSs, Excel files, documents, external sources. Cluster details: 16 VMs, 128 GB memory, 126 GB disk.
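The code shown on slide 24 did not survive in this transcript, so the following is only a sketch of the four named steps, not the presenters' code. It assumes the later PySpark `SparkSession` API (`sql`, `DataFrame.write.mode(...).parquet(...)`) rather than the SchemaRDD-era API a 2014 deployment would have used. The helper names, table layout, and comma delimiter are assumptions.

```python
def external_table_ddl(table, columns, location):
    """Build Hive DDL that lays a schema over raw delimited files in place."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"STORED AS TEXTFILE LOCATION '{location}'"
    )


def run_pipeline(spark, table, columns, raw_location, query, parquet_path):
    """Apply a Hive schema to raw data, query it, and save the result.

    Step 1 (pouring data into HDFS) happens outside this function,
    for example with `hdfs dfs -put`.
    `spark` is expected to behave like a Hive-enabled SparkSession.
    """
    # Step 2: create the Hive schema over the raw files.
    spark.sql(external_table_ddl(table, columns, raw_location))
    # Step 3: run HQL through Spark SQL.
    result = spark.sql(query)
    # Step 4: save the result in Parquet format.
    result.write.mode("overwrite").parquet(parquet_path)
    return result
```

With PySpark installed, a session from `SparkSession.builder.enableHiveSupport().getOrCreate()` could be passed as `spark`; the function imports nothing from Spark itself, so it can also be exercised with a stub session.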
  • 25. The Spark Cluster [Diagram. Elements shown: Apps, Services, and Tools; Spark Driver; Workers; Data; Storage]
  • 26. Performance Observations
      10 18 Rows
      Times shown, in slide order: 4.5 hrs, 48 minutes, 1 min
      Platform | Cores | Memory | Disk
      Engineered Solutions | 128 | 2048 GB | 12 TB
      In-memory Databases | 160 | 2048 GB | 12 TB
      Spark Cluster (VMs) | 128 | 2048 GB | 12 TB
  • 27. Challenges – Open source Apache Spark, while very promising, has to mature – Spark production deployment is complicated – Security of data is not enterprise class and needs additional layers – Tools ecosystem is still developing – BI tools are still in development But: – Done right, it has a lot of business value – We are hiring engineers!
  • 28. Q & A

Editor's Notes

  • #3: Thank you and good morning. Altisource separated from Ocwen in August 2009. As you can see from slide 7, our market capitalization has grown to $2.6 billion since separation. Ocwen is the largest independent residential mortgage servicer in the U.S. Altisource is a provider of services to Ocwen. Altisource has fundamentally different investment characteristics from Ocwen. Altisource is capital light. Because of our high margins, we are unusual in that the faster we grow our revenue, the faster our net free cash flow grows. Even with our strong cash flow, we are seeing such significant opportunities that we are turning to the senior secured term loan market. We view senior secured term loans as a short- to medium-term capital extender to take advantage of attractive opportunities. Given our strong cash generating capability, we hope that your greatest complaint will be that we paid the loan back too quickly.