
The Enterprise Semi-Structured Data Assurance Layer.

Deterministic, recursive structural refinement that preserves every type variation, governs by default, and stays complete as schemas evolve.

"We looked everywhere... Nothing comes close to what DataPancake delivers."

Most Snowflake customers struggle with this issue, and almost all of our SEs have had to deal with it. DataPancake saves data engineers weeks of time, money, and pain.

Cameron Wasilewsky, Technical Lead

Snowflake Startup Accelerator

Accelerate Your Data Pipeline with DataPancake


10 Minutes

Average time to scan 1 billion rows


10x ROI

Demonstrated across compute, engineering, and data quality outcomes


10+ Industries

Financial services, healthcare/life science, public sector, manufacturing, retail, and more

Structural assurance is only meaningful if it's verifiable, configurable, and maintainable at enterprise scale.

Every capability below is designed with that constraint in mind, not as a feature added to a tool, but as a component of a complete assurance architecture.

Are You Ready To Pancake Your Data™?

DataPancake is the only Snowflake Native App designed to take you from raw, complex, and deeply nested semi-structured data to accurately flattened, normalized, enriched, secure, documented, and GenAI-ready relational Dynamic Tables and Views, with zero technical debt, in minutes, not months.


Schema Discovery

Key Features:

Recursively scans 100% of your semi-structured data (JSON, XML, Avro, Parquet, and ORC) to discover all attributes, including nested arrays, objects, and every polymorphic version of each attribute

Detects all 7 polymorphic data type variations (4 primitives, 2 types of arrays, and objects)

Identifies escaped JSON within string fields

Scanning and discovery benefit from Snowflake's vertical scaling

Infers Snowflake destination data types including correct datetime formats for accurate type conversion
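
To make the idea of recursive discovery concrete, here is a minimal Python sketch, not DataPancake's implementation. It walks parsed JSON documents and records every type observed at each attribute path, so an attribute that appears as a number in one record and a string in another shows up with both types. The function names, the path notation, and the type labels (including the split between arrays of objects and arrays of primitives) are assumptions made for illustration.

```python
from collections import defaultdict


def type_label(value):
    """Return a coarse type label for one JSON value (labels are illustrative)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Assumed split: arrays containing objects vs. arrays of primitives.
        has_object = any(isinstance(item, dict) for item in value)
        return "array<object>" if has_object else "array<primitive>"
    raise TypeError(f"unsupported value: {value!r}")


def discover(value, path, schema):
    """Recursively record the type of `value` and of everything nested in it."""
    schema[path].add(type_label(value))
    if isinstance(value, dict):
        for key, child in value.items():
            discover(child, f"{path}.{key}", schema)
    elif isinstance(value, list):
        for child in value:
            discover(child, f"{path}[]", schema)


def discover_all(documents):
    """Scan every document and return {attribute path: set of observed types}."""
    schema = defaultdict(set)
    for document in documents:
        discover(document, "$", schema)
    return schema
```

Scanning `[{"id": 1}, {"id": "1"}]` with `discover_all` reports the path `$.id` with both `number` and `string`, which is the kind of polymorphism that sampling only a few records can miss.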

Pipeline Design

Key Features:
Lets you customize how each pipeline's SQL DDL is generated:
Configure foreign key relationships for nested arrays
Apply column-level transformation logic during the materialization process
Create virtual attributes for derived fields or semantic model metrics and filters
Configure integration with row access and column masking policies
Configure a semantic layer of views, including additional column-level transformations

SQL Code Generation

Key Features:
Generates the SQL DDL needed to create relational dynamic tables and policy-infused views in Snowflake, based on your configured attribute metadata
Generates Snowflake Dynamic Table SQL DDL using DataPancake ITDCs™ (Immutably Typed Derived Columns™) to create technical-debt-free pipelines
Reflects configured transformations, foreign keys, and virtual attributes, allowing the normalized tables to be joined after normalization
Generates views that select from the normalized dynamic tables and incorporate row access and column masking policies, plus additional column-level transformations
Generates streams, tasks, and tables to track Dynamic Table metadata, including insert and last-updated datetimes

Schema Drift Monitoring

Key Features:
Continuously monitors and alerts you when your semi-structured data source schema changes
Detects schema drift in semi-structured data sources like JSON and XML
Flags changes in data types, structure, and new attributes
Alerts users to configure newly discovered attributes and regenerate pipeline code
Optionally generates updated pipeline SQL DDL upon schema change detection
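
As a rough illustration of what a drift check involves, the Python sketch below compares two schema snapshots, each a mapping from attribute path to the set of types observed at that path. It is a simplified stand-in rather than DataPancake's detection logic; the snapshot format, the function name, and the decision to also report attributes that disappeared are assumptions.

```python
def detect_drift(baseline, current):
    """Compare two {path: set_of_types} snapshots and report what changed."""
    baseline_paths = set(baseline)
    current_paths = set(current)

    # Types that appear now but were never seen in the baseline scan.
    type_changes = {}
    for path in baseline_paths & current_paths:
        added_types = current[path] - baseline[path]
        if added_types:
            type_changes[path] = sorted(added_types)

    return {
        "new_attributes": sorted(current_paths - baseline_paths),
        # Assumption: attributes missing from the latest scan are reported too.
        "removed_attributes": sorted(baseline_paths - current_paths),
        "type_changes": type_changes,
    }
```

A non-empty `new_attributes` or `type_changes` entry corresponds to the point at which a user would be prompted to configure the new attributes and regenerate the pipeline code.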

Data Dictionary Builder

Key Features:
Creates a comprehensive data dictionary that includes definitions, synonyms, and sample values for every attribute, and integrates with our Semantic Model Generator for Cortex Analyst
Use your preferred LLM to generate definitions, synonyms, and sample values.
Extend DataPancake's system prompt with your own custom context for greater clarity and improved responses
Generate descriptions for the datasource, nested arrays, and attributes

Semantic Model Generator

Key Features:
Generates Cortex Analyst-ready YAML files that define your complete semantic model
Automatically includes relationship metadata based on selected columns
Integrated with the Pipeline Designer and Data Dictionary Builder
Configure custom metrics, facts, and filters through virtual attributes, then add verified queries and custom instructions.

Pancake Your Streaming Data With Ease


DataPancake includes native support for Kafka topics streamed into Snowflake, giving you full control over complex, high-volume data in motion, without the overhead.


Discovery of Stringified JSON

Accurate schema discovery of complex escaped message strings
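
Stringified JSON is a payload that arrives as an ordinary string but contains a serialized object or array. A minimal Python sketch of how such a field could be recognized is shown below; the function name and the rule "starts with `{` or `[` and parses cleanly" are illustrative assumptions, not a description of DataPancake's internals.

```python
import json


def parse_embedded_json(text):
    """Return the parsed object or array if `text` holds stringified JSON, else None."""
    candidate = text.strip()
    # Only strings that look like an object or array are worth parsing.
    if not candidate or candidate[0] not in "{[":
        return None
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None
```

Once a string field is recognized this way, the parsed value can be scanned like any other nested structure instead of being treated as an opaque string.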


Deduplication Logic

Includes configurable logic to materialize the most recent message per message key
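
The "most recent message per key" rule can be sketched in a few lines of Python. This is an illustration of the general technique only; the message fields (`key`, `timestamp`, `offset`) and the use of the offset as a tie-breaker are assumptions, and the product's configurable logic may differ.

```python
def latest_per_key(messages):
    """Keep only the most recent message for each message key."""
    latest = {}
    for message in messages:
        kept = latest.get(message["key"])
        # Assumed ordering: newer timestamp wins, higher offset breaks ties.
        rank = (message["timestamp"], message["offset"])
        if kept is None or rank > (kept["timestamp"], kept["offset"]):
            latest[message["key"]] = message
    return latest
```

Because the comparison uses the message's own timestamp rather than arrival order, a late-arriving older message does not overwrite a newer one.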


Flattened Message Metadata

Normalized dynamic tables and views can be configured to include flattened Kafka-specific metadata (e.g. topic, partition, offset, timestamp)


Incremental Scanning

Recurring scans use the last processed message timestamp for incremental data scanning, reducing compute cost 
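
The cost saving comes from a watermark: each scan reads only messages newer than the last timestamp it processed, then advances that timestamp. The Python sketch below shows the pattern under simple assumptions (an in-memory list of messages and a hypothetical `timestamp` field); it is not DataPancake's scanner.

```python
def incremental_scan(messages, last_processed_ts):
    """Return the messages newer than the watermark, plus the advanced watermark."""
    batch = [m for m in messages if m["timestamp"] > last_processed_ts]
    # If nothing new arrived, the watermark stays where it was.
    new_watermark = max((m["timestamp"] for m in batch), default=last_processed_ts)
    return batch, new_watermark
```

A scheduler would persist `new_watermark` after each run and pass it back in on the next one, so repeated scans over an unchanged topic process nothing.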

"DataPancake has proven to be a superior tool. Its adoption has led to substantial savings in engineering resources, cloud compute costs, and many other areas and has accelerated our adoption of GenAI."

Scott Hamilton

Data Engineering Manager


Blog

The art and science of nested, polymorphic JSON transformation in Snowflake

Read how we leveraged the Snowflake Native App framework to help you dynamically transform JSON data into actionable insights in a fraction of the usual time.

Semi-Structured Data Pipelines: How to Mitigate Risk and Accelerate Innovation with Snowflake

Explore the unknown challenges that contribute to downstream impact and risk, and how to identify and solve them with DataPancake® from TDAA inside of Snowflake.

Normalize complex semi-structured data in minutes instead of months
