The New Approach to Data Integration
True Data Federation
Data stays in place – no copies, no heavy ETL processing, no point-to-point integration.
- Data Sources stay in place with LA – no heavy ETL or data migration needed & no copies of data are created – with LA you “Embrace Your Data Silos”
- LA is composed of data connectors designed for specific data types (via direct connection or APIs)
- LA queries each data source independently in its native language or format (SQL, CSV, etc.) – filters and other functions are executed within the data source directly, then results are passed back to LA (see the pushdown sketch after this list)
- LA supports numerous query languages and protocols, allowing connection to a wide range of data sources
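A minimal Python sketch of the pushdown idea above, assuming hypothetical connector classes (this is not LA's actual connector API): each connector translates the same filter into its source's native form – a SQL WHERE clause for a database, a streamed filter for a CSV file – so only matching rows ever leave the source.

```python
# Illustrative sketch of federated query pushdown; class names and the
# query() interface are assumptions, not LA's real connector API.
import csv
import io
import sqlite3


class SqlConnector:
    """Pushes the filter down as a WHERE clause; only matching rows leave the database."""

    def __init__(self, conn):
        self.conn = conn

    def query(self, table, column, value):
        sql = f"SELECT * FROM {table} WHERE {column} = ?"
        return self.conn.execute(sql, (value,)).fetchall()


class CsvConnector:
    """Applies the filter while streaming the file; no full copy is materialized."""

    def __init__(self, text):
        self.text = text

    def query(self, table, column, value):
        rows = csv.DictReader(io.StringIO(self.text))
        return [row for row in rows if row[column] == value]


# Each source is queried in place, in its native form; only results are federated.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, region TEXT)")
db.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "EU"), (2, "US")])

csv_text = "id,region\n3,EU\n4,US\n"

connectors = [SqlConnector(db), CsvConnector(csv_text)]
federated = [row for c in connectors for row in c.query("orders", "region", "EU")]
print(federated)  # rows from both sources, filtered at the source
```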
Virtualization
Search and analyze data from a single unified layer that virtually spans across multiple data sources & services.
- LA acts like a virtualized data lake, so data can remain federated while queries and analyses exist in an abstraction layer
- Virtualization provides a means to spin up new queries & analytics on-demand
- Reduces time-consuming and costly integration phases – new sources can be spun up & used in a fraction of the time required in Data Lakes or Data Warehouses
- Users can drive connectivity directly without the need for IT to migrate or consolidate data (see the sketch below)
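A rough sketch of how a virtual layer can behave, using a hypothetical catalog object rather than anything LA actually exposes: a "view" is only a stored definition, evaluated against the live sources at query time, so spinning up a new one costs metadata rather than an integration project.

```python
# Sketch of a virtualization layer: views are stored definitions that are
# evaluated against live sources on demand. Names here are illustrative only.
class VirtualCatalog:
    def __init__(self):
        self.sources = {}   # name -> callable returning rows from the live source
        self.views = {}     # name -> (source names, row-level transform)

    def register_source(self, name, fetch):
        self.sources[name] = fetch

    def define_view(self, name, source_names, transform):
        # Defining a new "view" stores only metadata; no data is moved or copied.
        self.views[name] = (source_names, transform)

    def query(self, view_name):
        source_names, transform = self.views[view_name]
        for src in source_names:
            for row in self.sources[src]():   # pulled from the source at query time
                yield transform(row)


catalog = VirtualCatalog()
catalog.register_source("crm", lambda: [{"customer": "Acme", "region": "EU"}])
catalog.register_source("billing", lambda: [{"customer": "Acme", "amount": 120}])
catalog.define_view("all_activity", ["crm", "billing"], lambda row: row)
print(list(catalog.query("all_activity")))
```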
Semantics
Meaning & context of data is captured in powerful knowledge graphs that provide multiple perspectives for users.
- LA employs powerful semantic models, which exist in multiple layers of complexity
- Layered semantics allows for a base Knowledge Graph (KG) plus limitless perspective-driven models to be layered on top of the KG
- LA’s semantics capture a data source’s schema as metadata and align that metadata to the Knowledge Graph
- Business rules and perspectives can change over time and need to be updated only once within the Knowledge Graph (see the layered-semantics sketch below)
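A small sketch of layered semantics under stated assumptions: a base Knowledge Graph held as triples, a source schema captured as metadata only, an alignment of columns to graph concepts, and a perspective layer added on top without touching the base. The concept names and the alignment format are invented for illustration.

```python
# Illustrative layered semantics: base knowledge graph + perspective layer,
# with a source schema aligned to graph concepts. All names are made up.

# Base knowledge graph as subject-predicate-object triples
base_kg = {
    ("Customer", "hasProperty", "name"),
    ("Customer", "hasProperty", "region"),
    ("Order", "placedBy", "Customer"),
}

# Schema of a federated source, captured as metadata (no instance data)
crm_schema = {"table": "crm_accounts", "columns": ["acct_name", "acct_region"]}

# Alignment of source columns to knowledge-graph concepts
alignment = {
    ("crm_accounts", "acct_name"): ("Customer", "name"),
    ("crm_accounts", "acct_region"): ("Customer", "region"),
}

# A perspective layer adds business meaning on top of the base graph
# without modifying it; other perspectives can coexist alongside it.
sales_perspective = {
    ("Customer", "region"): "Sales Territory",
}

def label_for(table, column):
    concept, prop = alignment[(table, column)]
    return sales_perspective.get((concept, prop), prop)

for column in crm_schema["columns"]:
    print(column, "->", label_for(crm_schema["table"], column))
```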
User-Defined Functions
Advanced analytics where users can enter custom code to automate processing of data.
- LA not only allows user-driven queries and analysis; it also provides a means to insert code directly into the engine
- Algorithms, Python scripts, R, etc., can all be employed within LA and run automatically against connected data sources (sketched below)
- This provides a powerful machine-to-machine form of automation for complex applications
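A sketch of the user-defined-function idea with a hypothetical registration decorator (not LA's actual UDF interface): user code is registered once, then applied automatically to rows as they come back from connected sources.

```python
# Illustrative UDF registration; the decorator and registry are assumptions.
registry = {}

def udf(name):
    """Register a user-supplied function under a name the engine can call."""
    def wrap(fn):
        registry[name] = fn
        return fn
    return wrap

@udf("anomaly_flag")
def anomaly_flag(row):
    # Custom analytic logic supplied by the user, e.g. a simple threshold rule
    return {**row, "anomalous": row["amount"] > 1000}

# The engine applies the registered UDF to rows as they stream back from sources.
rows = [{"id": 1, "amount": 250}, {"id": 2, "amount": 4800}]
processed = [registry["anomaly_flag"](r) for r in rows]
print(processed)
```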
Minimal IT Footprint
No storage of instance data is needed – only metadata models and alignments are stored.
- LA stores minimal data within it – instance-level data remains in the data sources themselves
- LA stores the base Knowledge Graph and relationships (i.e., joins) across federated sources
- This reduces the amount of space required within LA – keeping it nimble
- Queries and analytics can be reconstituted on the fly using the metadata in the Knowledge Graph combined with the joins across data sources (see the sketch below)
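A sketch of the metadata-only idea: the catalog below holds schemas and cross-source join definitions, never instance data, and a federated query plan is rebuilt from that metadata on demand. The table and column names are invented for the example.

```python
# Metadata-only catalog: schemas and joins are stored, instance data is not.
catalog = {
    "schemas": {
        "crm.accounts": ["acct_id", "acct_name"],
        "billing.invoices": ["invoice_id", "acct_id", "amount"],
    },
    "joins": [
        ("crm.accounts", "acct_id", "billing.invoices", "acct_id"),
    ],
}

def build_plan(left, right):
    """Reconstitute a federated join plan from stored metadata alone."""
    for l_table, l_col, r_table, r_col in catalog["joins"]:
        if {l_table, r_table} == {left, right}:
            return {
                "fetch": {
                    l_table: f"SELECT {', '.join(catalog['schemas'][l_table])} FROM {l_table}",
                    r_table: f"SELECT {', '.join(catalog['schemas'][r_table])} FROM {r_table}",
                },
                "join_on": (l_col, r_col),
            }
    raise ValueError("no join metadata for these tables")

print(build_plan("crm.accounts", "billing.invoices"))
```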
Data Governance & Stewardship
Control of data stays at the source and reduces risk – no added governance and stewardship is needed.
- LA doesn’t require new governance or stewardship policies to be created, since they should already exist within the data sources themselves
- Data owners can maintain localized control of their systems with ease
- Data sources can be presented to LA in their original formats or as views (a view-based example follows this list)
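One way a data owner might present a source as a view rather than a raw table, keeping stewardship decisions at the source. The table, columns, and view name are hypothetical; the view is ordinary SQL, created here via sqlite3 for the sake of a runnable example.

```python
# A governed view: the owner decides which columns the federation layer may see.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE patients (id INTEGER, name TEXT, ssn TEXT, diagnosis TEXT)")
db.execute("INSERT INTO patients VALUES (1, 'A. Example', '000-00-0000', 'flu')")

# Only the approved columns are exposed; sensitive fields never leave the source.
db.execute("CREATE VIEW patients_shared AS SELECT id, diagnosis FROM patients")

print(db.execute("SELECT * FROM patients_shared").fetchall())
```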
Data Security
No storage of instance data is needed – only metadata models and alignments are stored.
- LA allows IT organizations to maintain security protocols directly on data sources (via federation)
- This reduces liability and risk for organizations, since data does not need to be copied and maintained separately – especially helpful with 3rd-party data
- Internally, LA has role-based security where users are recognized via single sign-on & only the data (or metadata) they have permission to see is shown (see the sketch below)
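A sketch of role-based visibility, assuming roles arrive from a single sign-on token: only the datasets and fields a user's roles permit are exposed. The role names and permission map are illustrative, not LA's actual security model.

```python
# Illustrative role-based filtering of what a signed-in user may see.
permissions = {
    "analyst": {"sales.orders": ["order_id", "region", "amount"]},
    "admin": {"sales.orders": ["order_id", "region", "amount", "customer_email"]},
}

def visible_fields(user_roles, dataset):
    """Union of the fields every role grants on the requested dataset."""
    fields = set()
    for role in user_roles:
        fields.update(permissions.get(role, {}).get(dataset, []))
    return sorted(fields)

# Roles would normally come from the SSO token; hard-coded here for the sketch.
print(visible_fields(["analyst"], "sales.orders"))
```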