Cadenz Diffen: Governed Ingestion for Data Lakes
TheDataTeam
This solution challenges today's "dump the data first, figure it out later" approach to data lakes. Inspired by data warehouse (DWH) best practices, it provides data lineage and governance, allowing engineers to take control of the data pipeline.
Modern data platforms, especially data lakes, are complex. Many projects fail because of the "wild west" practice of “first dump and then figure out”. Falling storage costs have only exacerbated the problem. Established businesses that are subject to regulation usually rely on best practices honed over years of managing data, but these practices are not readily available on these platforms. Many data sources were designed for legacy applications in ways that are not easily compatible with today’s technologies. With the explosion in digitization, traditional data ingestion approaches no longer keep up.
The need of the hour is an efficient, scalable mechanism for ingesting and managing massive volumes of data, with a key emphasis on operational ease. Cadenz Diffen uses configuration to drive common patterns of data ingestion and management, and extends its out-of-the-box functionality with custom scripts that are modular and plug-and-play. It relies on best practices from data warehousing to deliver usable business data.
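To make the configuration-driven idea concrete, here is a minimal sketch in plain Python. It is not Cadenz Diffen's actual API or configuration format: the registry, the step names (`trim_strings`, `uppercase_country`), and the config keys are all hypothetical, chosen only to illustrate how a config can select standard steps while custom scripts plug in alongside them.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[Row], Row]

# Hypothetical registry: named steps that a configuration can refer to.
STEP_REGISTRY: Dict[str, Step] = {}


def register_step(name: str) -> Callable[[Step], Step]:
    """Register a function under a name so configs can reference it."""
    def decorator(fn: Step) -> Step:
        STEP_REGISTRY[name] = fn
        return fn
    return decorator


@register_step("trim_strings")
def trim_strings(row: Row) -> Row:
    # A "standard" step: strip whitespace from every string value.
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


@register_step("uppercase_country")
def uppercase_country(row: Row) -> Row:
    # A "custom script" plugged in the same way as a standard step.
    out = dict(row)
    if isinstance(out.get("country"), str):
        out["country"] = out["country"].upper()
    return out


def run_pipeline(config: Dict[str, object], rows: List[Row]) -> List[Row]:
    """Apply the steps named in the config, in order, to every row."""
    steps = [STEP_REGISTRY[name] for name in config["steps"]]
    result = []
    for row in rows:
        for step in steps:
            row = step(row)
        result.append(row)
    return result


# Hypothetical configuration: the pipeline is described, not hand-coded.
EXAMPLE_CONFIG = {
    "source": "customers",
    "steps": ["trim_strings", "uppercase_country"],
}
```

In this sketch, changing the pipeline means editing the `steps` list in the configuration; adding new behavior means registering one more function, without touching `run_pipeline`.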
Salient Solution Features
Ease of use
- Data ops enabled via dashboard and alerts
- Supports schema changes and data drift
- Built using Spark to run on modern platforms
- Decoupled compute from storage for horizontal scalability
Fast and Flexible
- Handles data at scale
- Technical metadata quality checks
- Standard processing patterns for dealing with all paradigms of extraction
- Guarantees valid snapshots of source systems at a defined periodicity
- Stands up a usable data lake for all downstream purposes in as little as 6 weeks
Governance and Security
- Exception logging and alerting
- Encrypted data with access controls
- Supports business closure processing for critical fields
- Handling rejects and default substitution
- Native, extensible support for standard column transformations
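As an illustration of what reject handling and default substitution can look like, the sketch below validates rows against a small schema in plain Python. It is an assumption-laden example, not the product's implementation: the `SCHEMA` layout, the `validate_rows` function, and the reject record format are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]

# Hypothetical schema: expected type, whether the column is required,
# and the default to substitute when an optional value is missing.
SCHEMA = {
    "id": {"type": int, "required": True},
    "amount": {"type": float, "required": False, "default": 0.0},
}


def validate_rows(rows: List[Row], schema: Dict[str, dict]) -> Tuple[List[Row], List[dict]]:
    """Split rows into accepted rows and rejects with a logged reason."""
    accepted: List[Row] = []
    rejects: List[dict] = []
    for row in rows:
        clean: Row = {}
        reason = None
        for col, rule in schema.items():
            value = row.get(col)
            if value is None:
                if rule.get("required"):
                    reason = f"missing required column: {col}"
                    break
                # Default substitution for optional columns.
                value = rule.get("default")
            elif not isinstance(value, rule["type"]):
                reason = f"bad type for {col}: {type(value).__name__}"
                break
            clean[col] = value
        if reason is None:
            accepted.append(clean)
        else:
            # Keep the original row and the reason for exception logging.
            rejects.append({"row": row, "reason": reason})
    return accepted, rejects
```

Keeping rejected rows together with a reason, rather than dropping them silently, is what makes exception logging and alerting possible downstream.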