Data Automation Platform

DataCanva

DataCanva integrates effortlessly with a variety of data sources and offers sophisticated data preparation features, enabling businesses to automate their data workflows and improve both accuracy and efficiency. By leveraging its diverse functionality, DataCanva helps you optimize data processes through automation and AI, creating unified datasets that drive analytics and support data-driven decision-making throughout the organization. DataCanva focuses on solving the following business pain points:
Disconnected Data Sources

Customers generate data across various touchpoints, often in different formats and from multiple sources. Traditionally, there hasn't been a unified repository for holistic access and analysis of this data.

Customer Query and Segmentation

Blending and transforming data, as well as executing queries, typically require technical expertise. Even simple queries can be delayed, as they often rely on IT specialists to create hard-coded scripts for business users.

Handling Real-Time Updates

Data is generated continuously, and faster processing can enhance customer engagement. Traditionally, customer data has been processed in batches, leading to unnecessary delays in certain situations.

Incorporating Custom Programs

It is increasingly common to integrate data from affiliate sources or the internet to inform better business decisions. Traditionally, only first-party data has been manageable through standardized models.

Core Functions

Dashboard

An integrated place to store all your data: The Dashboard provides a quick snapshot of essential metrics, showcasing totals and trends, along with an overall summary of the healthy and problematic data entities and pipelines you maintain.

Entity Metrics

Provides a comprehensive overview of data entities, including total counts, active entities, and those with errors. This metric helps to assess the overall health and performance of the data landscape, enabling quick identification of potential issues.

Pipeline Metrics

Offers valuable insights into data pipelines by detailing total counts, active pipelines, and error counts. This information is crucial for monitoring the efficiency and reliability of data flow within the system, allowing teams to optimize performance and address bottlenecks.

Execution Timeline

Tracks and visualizes recent modifications to both data and pipeline configurations. This timeline feature enables users to understand the sequence of changes, facilitating better management and troubleshooting of data processes.

Recent Changes

Displays the latest changes in descending chronological order for data pipelines and data entities within the data warehouse and support systems. This metric provides a historical view of all modifications, including relevant function sections, ensuring transparency and accountability in data management practices.

Data Warehouse

In the Data Warehouse, users can view the list of data entities, create any number of new data entities, and define their properties (schema, ingestion method). The lifecycle of data entities is strictly controlled.

Connect with Various Data Sources

Seamless connectivity with diverse data sources ensures comprehensive data coverage; a variety of connectors is supported to synchronize data sources to the platform.

Support Various Connectors for Data Ingestion

Real Time API
Enables users to update data entities directly through the API (see the sketch after this list).
Tracker API
Enables users to integrate the tracker into their web applications or apps and send event data to the platform.
Dataset API
Allows users to define data sources such as FTP connections, API Inbound, Google Sheets, HTTP requests, and database connectors (MySQL, Redshift, etc.) to send data to the platform.
File Upload
Supports direct file upload in a few simple clicks.
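
As a minimal sketch of pushing records through the Real Time API, assuming a hypothetical endpoint path, payload shape, and bearer-token header (the actual contract depends on your DataCanva deployment):

    import requests

    # Hypothetical endpoint and credentials -- replace with your deployment's values.
    BASE_URL = "https://your-datacanva-host/api/v1"
    API_TOKEN = "your-api-token"

    def update_entity(entity_name, records):
        """Send a batch of records to a data entity via the Real Time API."""
        response = requests.post(
            f"{BASE_URL}/entities/{entity_name}/records",
            headers={"Authorization": f"Bearer {API_TOKEN}"},
            json={"records": records},
            timeout=30,
        )
        response.raise_for_status()  # surface ingestion errors early
        return response.json()

    update_entity("customer_profile", [{"id": "c-1001", "email": "a@example.com"}])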

Creation and Maintenance of Data Entities

Supports simplified processes for creating and maintaining schema structures for data organization.

Automatic Data Type Detection
Implements intelligent algorithms that automatically identify and assign appropriate data types based on the content and context of the data entered. This reduces manual input errors and accelerates schema creation (a minimal heuristic sketch follows this list).
Support for Diverse Data Types
Ensures compatibility with a wide range of data types, including but not limited to integers, floats, strings, booleans, dates, and complex types like arrays and objects. This flexibility allows for comprehensive data modeling to meet various organizational needs.
Schema Versioning
Facilitates robust schema versioning to track changes over time. Users can maintain historical records of schema modifications, allowing easy rollbacks and comparisons between versions. This supports better governance and compliance, ensuring data integrity is preserved throughout the lifecycle.
Validation and Error Alerts
Integrates validation mechanisms to ensure data entries conform to the defined schema, with clear error messages and correction suggestions to enhance user experience and data integrity.
Large Volume of Data
The ingestion processes are designed for scalability, easily accommodating growing data volumes.
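
As a minimal illustration of the automatic type detection idea (not the platform's actual algorithm), a heuristic detector might try progressively stricter parsers over sample values:

    from datetime import datetime

    def detect_type(values):
        """Guess a column type from sample string values (illustrative heuristic only)."""
        def all_parse(parser):
            try:
                for v in values:
                    parser(v)
                return True
            except (ValueError, TypeError):
                return False

        if all_parse(int):
            return "integer"
        if all_parse(float):
            return "float"
        if all(v.lower() in ("true", "false") for v in values):
            return "boolean"
        if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
            return "date"
        return "string"

    print(detect_type(["1", "2", "3"]))               # integer
    print(detect_type(["2024-01-05", "2024-02-11"]))  # date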

Define Linking Relationships between Data Entities

Establishing clear and efficient linking relationships between data entities is crucial for maintaining data integrity and enabling complex data interactions. Multiple relationship types are supported.

Define Access Rights of Data Entities

Establish a robust access control framework that manages permissions for data entities based on user roles: owner, editor, and viewer. This framework includes:

Implement Secure Access Control

Regulate permissions and access based on user roles, including owner, editor, and viewer, as sketched below.
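
A minimal sketch of how such role-based permissions might be modeled, with an assumed mapping from the owner, editor, and viewer roles to allowed actions; the platform's actual permission matrix may differ:

    # Hypothetical role-to-permission mapping for data entities (illustrative only).
    PERMISSIONS = {
        "owner":  {"read", "write", "change_schema", "delete", "share"},
        "editor": {"read", "write"},
        "viewer": {"read"},
    }

    def can(role, action):
        """Return True if the given role is allowed to perform the action."""
        return action in PERMISSIONS.get(role, set())

    assert can("owner", "delete")
    assert can("editor", "write")
    assert not can("viewer", "write")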

Safeguard Data Integrity

Data integrity is safeguarded throughout the entire lifecycle of data entities, from creation to purge, through a meticulously designed data ingestion process.

Robust Data Validation
As data arrives from diverse sources in varied formats, built-in validations detect errors early.
Comprehensive Monitoring
Comprehensive monitoring systems quickly identify and resolve errors or bottlenecks in the ingestion process.

Enable Robust Archiving Mechanism

This feature lets users efficiently manage old records within a data entity by seamlessly moving outdated entries into an archive, keeping the active dataset streamlined and performance-optimized.
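
A minimal sketch of the archiving idea, assuming a hypothetical SQLite schema: records older than a cutoff are moved from the active table to an archive table in a single transaction:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE events (id INTEGER, created_at TEXT);
        CREATE TABLE events_archive (id INTEGER, created_at TEXT);
        INSERT INTO events VALUES (1, '2022-01-01'), (2, '2025-06-01');
    """)

    CUTOFF = "2024-01-01"
    with conn:  # one transaction so the move is all-or-nothing
        conn.execute(
            "INSERT INTO events_archive SELECT * FROM events WHERE created_at < ?",
            (CUTOFF,),
        )
        conn.execute("DELETE FROM events WHERE created_at < ?", (CUTOFF,))

    print(conn.execute("SELECT COUNT(*) FROM events").fetchone())          # (1,)
    print(conn.execute("SELECT COUNT(*) FROM events_archive").fetchone())  # (1,)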

Programmatic Data Access

A robust framework designed to facilitate efficient data retrieval from the system, ensuring users can easily access the information they need (a usage sketch follows the API list below).

Simple Query API
  • Enables users to perform queries on data entities using specific criteria.
  • Supports a variety of query parameters, allowing for flexible filtering and sorting of results.
  • Provides comprehensive documentation and examples for ease of use, ensuring that both novice and experienced users can effectively construct queries.
Data Fetching API
  • Allows for efficient reading of data entities either in batches or by specific keys.
  • Optimized for high performance, enabling quick retrieval of large datasets while minimizing resource consumption.
  • Supports pagination and limits to enhance the user experience, ensuring that users can manage and process data in manageable chunks.
  • Includes error handling and response validation to ensure robust and reliable data access.
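
A minimal sketch of how a paginated query against these APIs might look, assuming hypothetical endpoint paths, parameter names (filter, sort, page, limit), and a bearer-token header; consult your deployment's API documentation for the actual contract:

    import requests

    BASE_URL = "https://your-datacanva-host/api/v1"  # hypothetical endpoint
    HEADERS = {"Authorization": "Bearer your-api-token"}

    def query_entity(entity_name, filters, sort=None, page_size=100):
        """Iterate over query results page by page (hypothetical parameter names)."""
        page = 1
        while True:
            response = requests.get(
                f"{BASE_URL}/entities/{entity_name}/query",
                headers=HEADERS,
                params={"filter": filters, "sort": sort, "page": page, "limit": page_size},
                timeout=30,
            )
            response.raise_for_status()
            rows = response.json().get("rows", [])
            if not rows:
                break  # no more pages
            yield from rows
            page += 1

    for row in query_entity("orders", "status=shipped", sort="-created_at"):
        print(row)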

Data Pipeline

Transform your data with the intuitive interface: The Data Pipeline feature offers a graphical representation of workflows, making it easier to design and manage complex pipelines with over 20 components, including various data transformations and efficient input-output management.

Workflow Visualization

  • WYSIWYG: The Data Pipeline setup features a WYSIWYG graphical interface that enables users to create and merge data entities from both batch and streaming sources for various downstream transformations. The graphical representation of data workflows makes it easier for users to understand and manage their processes.
  • Additionally, users can build ad-hoc automated data pipelines to transform multiple input data entities into ready formats using a simple drag-and-drop interface.

Creation and Maintenance of Data Pipelines

  • Design and manage complex data pipelines with robust transformation and extensible data modeling components to create flexible workflows.
  • Support for 20+ transformation components (filter, join, intersect, union, except, aggregate, split by group, enrich, sort, etc.) and custom-built modules (e.g., prediction, scripts) lets users create pipelines without hard coding or additional scripting (a hypothetical declarative sketch follows).
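
Purely as an illustration of how such a component chain fits together, here is a hypothetical declarative form of a pipeline; the component names follow the list above, but the serialization format and field names are invented for this sketch:

    # Hypothetical declarative form of a pipeline that the drag-and-drop canvas
    # might produce behind the scenes (not DataCanva's documented format).
    pipeline = {
        "name": "high_value_customers",
        "inputs": ["orders", "customer_profile"],
        "steps": [
            {"component": "filter",    "on": "orders", "condition": "status == 'completed'"},
            {"component": "join",      "left": "orders", "right": "customer_profile", "key": "customer_id"},
            {"component": "aggregate", "group_by": "customer_id", "metrics": {"total_spend": "sum(amount)"}},
            {"component": "sort",      "by": "total_spend", "descending": True},
        ],
        "output": "high_value_customers",
    }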

Input and Output Management

  • Versatile Data Inputs: Support various methods of data import, including manual uploads, APIs, file servers, databases, and web clicks. The Data Pipeline facilitates the transformation process by merging diverse data sources and generating formatted data.
  • Versatile Data Outputs: Enable outputs to a range of applications, such as marketing automation, data visualization, AI models, and peripheral systems via APIs. The resulting data can be directed to multiple activation channels, including marketing automation, file exports, and email communications.

Batch Data Processing

  • Batch and Real-Time Processing: Support both traditional batch processing and emerging real-time processing to enable immediate and continuous communication with customers.
  • Batch and Streaming Data Pipelines: Facilitate the management of both batch data pipelines and streaming data pipelines, ensuring efficient handling of real-time data.
  • Actionable Customer Segments: Utilize a canvas area to define unlimited actionable customer segments for marketing automation using drag-and-drop components. This allows for the identification of potential customers and those with recommended products.

Customizable Script Module

  • Offers flexibility for tailored scripts to meet specific data processing needs, empowering users to create bespoke solutions aligned with unique workflows and requirements (an illustrative example follows this list).
  • Supports customization for advanced data transformations, automation of repetitive tasks, and integration with other systems.
  • Allows seamless customization using various scripting languages, keeping data management efficient and effective as needs evolve.
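
A minimal sketch of a custom script module, assuming a hypothetical contract in which the pipeline passes each step a pandas DataFrame and expects a transformed DataFrame back (the platform's actual scripting hook may differ):

    import pandas as pd

    def transform(df: pd.DataFrame) -> pd.DataFrame:
        """Derive a loyalty tier from total spend -- an example bespoke rule."""
        df = df.copy()
        df["tier"] = pd.cut(
            df["total_spend"],
            bins=[0, 100, 1000, float("inf")],
            labels=["bronze", "silver", "gold"],
        )
        return df

    sample = pd.DataFrame({"customer_id": [1, 2], "total_spend": [50, 2500]})
    print(transform(sample))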

Execution and Monitoring

The Execution and Monitoring feature provides robust tools for executing data pipelines and closely monitoring their performance. This functionality allows users to initiate and manage pipeline operations with ease, ensuring that data flows smoothly from source to destination.

Smart Query

Perform flexible queries over your data entities.

Data Comparator

Compare two data entities and export their commonalities and differences.

Intuitive Interface

This feature is a user-friendly, spreadsheet-like interface designed for effortless data exploration and analysis. Users can easily import data from a variety of sources, such as databases, spreadsheets, and cloud services, and view them side-by-side in a familiar tabular format. This intuitive layout enhances usability and allows for quick comprehension of complex datasets.

Easy Selection

With a few simple drop-down selections, the platform intelligently joins related information from different datasets, enabling users to seamlessly navigate through the data. This functionality not only facilitates a deeper understanding of the relationships between data points but also empowers users to uncover valuable insights across multiple data sources.

Entity Comparison

The Entity Comparison feature allows users to systematically compare two data entities, highlighting both commonalities and differences. This is particularly useful for identifying trends, discrepancies, and unique attributes that may not be immediately apparent in isolated datasets. Users can quickly discern how entities relate to one another, whether for data validation, anomaly detection, or strategic decision-making.

Export Functionality

The platform also includes a robust Export Function, allowing users to easily generate and export comparison results in various formats, such as CSV, PDF, or Excel. This capability is essential for reporting and further analysis, enabling users to share their findings with stakeholders or incorporate them into presentations and documentation. The streamlined export process ensures that critical insights are readily accessible, supporting informed decision-making and strategic initiatives.
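
As an illustration of what an entity comparison with export produces, here is a generic pandas sketch that splits two datasets into common and unique records on a shared key; it mirrors the feature's output rather than its internal implementation:

    import pandas as pd

    a = pd.DataFrame({"id": [1, 2, 3], "email": ["a@x.com", "b@x.com", "c@x.com"]})
    b = pd.DataFrame({"id": [2, 3, 4], "email": ["b@x.com", "c@x.com", "d@x.com"]})

    # The _merge indicator marks each row as common or unique to one side.
    merged = a.merge(b, on=["id", "email"], how="outer", indicator=True)
    common = merged[merged["_merge"] == "both"]
    only_a = merged[merged["_merge"] == "left_only"]
    only_b = merged[merged["_merge"] == "right_only"]

    # Export the results for reporting, mirroring the CSV export option.
    common.to_csv("common_records.csv", index=False)
    only_a.to_csv("only_in_entity_a.csv", index=False)
    only_b.to_csv("only_in_entity_b.csv", index=False)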

Website Crawler

Extract website content in an automated fashion.
Website Crawler is an automated content extraction tool designed to efficiently gather and consolidate information from websites without manual intervention. It supports automated content extraction with one-level to three-level crawling of complex website structures.

Automated Content Extraction

Users can set up crawling schedules that streamline the process of extracting and aggregating content from websites, eliminating the need for manual operation.

Flexible Scraping Field

  • One-layer crawling: Ideal for scenarios involving a single-page website, allowing for straightforward extraction of content.
  • Two-layer crawling: Suitable for websites with a list on the first level, enabling users to click through items to access detailed pages (a generic sketch of this pattern follows the list).
  • Three-layer crawling: Designed for more complex structures, this feature supports a hierarchical approach where a category serves as the first level, followed by a list on the second level, and detailed pages at the third level.
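
A generic sketch of the two-layer pattern using requests and BeautifulSoup, with placeholder URL and CSS selectors standing in for a real target site; it illustrates the crawl structure, not the platform's crawler:

    import requests
    from bs4 import BeautifulSoup

    # Layer 1: collect detail-page links from a listing page.
    LIST_URL = "https://example.com/articles"  # placeholder URL
    list_page = BeautifulSoup(requests.get(LIST_URL, timeout=30).text, "html.parser")
    detail_links = [a["href"] for a in list_page.select("a.article-link")]  # placeholder selector

    # Layer 2: fetch each detail page and extract structured fields.
    records = []
    for url in detail_links:
        detail = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
        records.append({
            "url": url,
            "title": detail.select_one("h1").get_text(strip=True),
            "body": detail.select_one("div.content").get_text(strip=True),  # placeholder selector
        })
    # `records` now holds structured data ready for viewing or further analysis.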

Clear View of Crawling Process

The full status of the crawling process is indicated next to each crawler.

Structured Extracted Data

The content crawled will be structured for immediate viewing and further analysis.

Integration with Content Hub

The structured extracted data is automatically integrated into the Content Hub, allowing further use in chatbot training.

AI-powered Chatbot

Perform AI chat with your data
The AI-powered Chatbot is a GPT-based, context-aware search engine designed to enhance information retrieval within organizations. It understands the nuances of user queries and provides accurate, context-rich responses, overcoming the limitations of keyword-based search and improving the efficiency of information access within your own database.

Content Hub

Content repository for your custom data to build your chatbot
  • Supports creating your own database in Content Hub by uploading files. These databases can be used to train your own customized chatbot.
  • Supports managing your Content Hub: easily view, edit, and delete content.

Chatbot Customization

Customize your chatbot with your data
  • Link to Content Hub Database: allows users to tailor the chatbot's responses and knowledge base using their own datasets uploaded to the Content Hub.
  • Various LLMs for Selection: allows users to select specific LLM to train customized chatbot.
  • Status Update: allows users to know the readiness of the chatbot through various statuses.

AI-Powered Conversation

Engage in intelligent conversations with your data through an intuitive interface.
  • Chatbot selection: users can access any defined chatbot; responses are context-aware, drawing on the trained databases (a generic sketch of this retrieval pattern follows the list).
  • Chat history: users can view the chat history and copy or delete individual responses.
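
The following is a generic sketch of the retrieval-augmented pattern that typically underlies such context-aware answers, using naive word-overlap ranking as a stand-in for real retrieval; DataCanva's internal pipeline is not documented here:

    def retrieve(question, documents, top_k=2):
        """Rank Content Hub documents by naive word overlap with the question."""
        q_words = set(question.lower().split())
        scored = sorted(
            documents,
            key=lambda doc: len(q_words & set(doc.lower().split())),
            reverse=True,
        )
        return scored[:top_k]

    docs = [
        "Our refund policy allows returns within 30 days of purchase.",
        "Shipping is free for orders above 50 USD.",
    ]
    question = "How many days do I have to return a purchase?"
    context = "\n".join(retrieve(question, docs))

    # The assembled prompt would be sent to the LLM selected for this chatbot.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    print(prompt)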

Account Management and Admin Setting

Manage users, access rights, and advanced system settings in one place.

User Management

Our platform offers robust user management capabilities, allowing administrators to effectively control user access and permissions. Users can be assigned specific roles tailored to their responsibilities, ensuring that everyone has the appropriate level of access for their tasks.

Group Management

With our group management feature, users can be organized into distinct groups, facilitating a more streamlined management process. This organization enhances collaboration and communication within teams.

Access Control

The access control functionality enables precise management of access rights for different system roles. Administrators can create user accounts, assign roles, and categorize users into groups. Additionally, access control settings can be established to define which functions are accessible to various user roles, ensuring security and compliance.

My Profile

Users have the ability to manage their own accounts through the "My Profile" feature. This allows them to view and update their profile information, change or reset passwords, and customize notification settings to suit their preferences.

Admin Settings

The admin settings provide users with the tools needed to manage advanced organizational settings and notifications. This feature ensures that administrators have full control over the operational aspects of the platform, enhancing overall efficiency.

Benefits of DataCanva

What makes DataCanva stand out in the market?

No-Code Data Pipelines

Effortlessly create and manage data pipelines using an intuitive drag-and-drop interface, minimizing the need for technical expertise.

Centralized Data Hub

Integrate data from various sources, breaking down information silos and providing a comprehensive view for better analysis.

Interactive Data Exploration

Analyze multiple datasets side by side in a user-friendly, spreadsheet-like format, facilitating easier comparisons and insights.

Custom LLM Training

Train specialized large language models with your organization's unique data, ensuring tailored insights that reflect specific industry needs.

Tailored AI Responses

Enhance customer interactions with a chatbot that provides accurate and relevant responses based on your proprietary data and terminology.

Robust Data Security

Maintain complete control over sensitive information with on-premise deployment, ensuring compliance with regulations and protection against unauthorized access.

Holistic Data Integration

Utilize flexible ingestion options, including automated web crawling, to continuously enrich your datasets for improved performance.

High Customizability

Tailor the chatbot’s personality and functionalities to align with your brand identity, creating a unique user experience that meets specific operational requirements.

Use Scenarios

How is DataCanva applied across different industries?
The integration capabilities of DataCanva include a comprehensive list of compatible tools and platforms, such as CDP, IoT, Custom AI, Web Data Extraction, Data Visualization and AI-powered Chatbot, along with detailed information on APIs and customization options, ensuring seamless integration with organizations’ existing systems.
Here are the enhancements that DataCanva has delivered across various use cases.