Data Curator Overview
The FAI Data Curator is a smart data-quality engine that automatically detects, flags, and optionally fixes data issues in sensor data streams. It is designed to help organizations maintain clean, reliable datasets that support monitoring, analysis, and operational decisions.
Beta Notice:
► The AI Data Curator is currently available as a Proof-of-Concept (PoC) beta feature.
► It is not enabled by default. Activation requires a request via your Ayyeka account representative.
► Intended for early adopters exploring advanced data quality and automation capabilities.
Contents
Introduction
Key Capabilities
Profiling: Detecting Data Issues
Fixing: The Imputer Engine
Auxiliary Streams
Navigating the Dashboard
Setup and Configuration
Workflow Summary
Known Limitations
Contact and Access
Introduction
The AI Data Curator applies AI-driven profiling and automated imputation to enhance data integrity. The system operates on a non-destructive model, preserving original raw data and applying fixes to parallel ‘curated’ streams.
All fixes are tracked with full audit trails and can be applied manually or automatically based on user-configured templates.
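As a mental model, the non-destructive design can be sketched as a raw series that is never modified, a parallel curated series, and an audit log of fixes. The class and field names below are illustrative only, not the product's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class Fix:
    """One corrective action in the audit trail (illustrative fields)."""
    timestamp: datetime        # sample the fix applies to
    issue: str                 # e.g. "missing", "flatline", "out_of_range"
    method: str                # e.g. "DTW", "interpolation", "math_expression"
    original: Optional[float]  # value before the fix (None for a gap)
    corrected: float           # value written to the curated stream

@dataclass
class CuratedStream:
    """Raw samples stay untouched; fixes only affect the parallel curated copy."""
    raw: dict[datetime, float]
    curated: dict[datetime, float] = field(default_factory=dict)
    audit_log: list[Fix] = field(default_factory=list)

    def apply_fix(self, fix: Fix) -> None:
        # Raw data is never overwritten; only the curated view and the log change.
        self.curated[fix.timestamp] = fix.corrected
        self.audit_log.append(fix)
```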
Key Capabilities
Profiling
Automatically identifies missing, unstable, or irregular values
Fixing (Imputer)
Suggests or applies fixes using replacement or interpolation
Custom Templates
Tailor detection logic per stream with adjustable sensitivity
Math Expressions
Create conditional logic using auxiliary streams
Audit & Control
Logs all detected issues and corrective actions
Dashboard Filters
Streamline review by timeframe, site, issue type, and more
Profiling: Detecting Data Issues
The Profiling Engine scans enabled streams to identify the following:
Missing Data
Gaps longer than 2× the daily mode of the sampling interval
Irregular Sampling
Deviations from the expected interval using standard deviation
Flatline
Constant readings beyond configured duration
Out-of-Range
Values outside user-defined min/max limits
Unstable (Erratic)
Rapid changes (spikes, saw-tooth) based on slope threshold
Detection logic is configured via Profiling Templates for per-stream customization.
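As an illustration of the checks listed above, the sketch below applies simplified versions of each rule to numeric timestamps (in seconds) and values. The thresholds, defaults, and the median stand-in for the daily mode are assumptions for the example, not the engine's actual logic:

```python
import numpy as np

def profile(timestamps, values, expected_interval_s, min_val, max_val,
            flatline_s=3600.0, slope_limit=5.0):
    """Flag simplified versions of the issue types above (illustrative thresholds)."""
    issues = []
    dt = np.diff(timestamps)                 # seconds between consecutive samples
    vals = np.asarray(values, dtype=float)

    # Missing data: gap longer than 2x the typical interval (median as a stand-in
    # for the daily mode).
    typical = np.median(dt)
    issues += [("missing", i) for i, gap in enumerate(dt) if gap > 2 * typical]

    # Irregular sampling: interval deviates strongly from the expected one.
    tol = 2 * np.std(dt)
    issues += [("irregular", i) for i, gap in enumerate(dt)
               if abs(gap - expected_interval_s) > tol]

    # Flatline: constant readings lasting longer than the configured duration.
    run_start = 0
    for i in range(1, len(vals)):
        if vals[i] != vals[run_start]:
            run_start = i
        elif timestamps[i] - timestamps[run_start] > flatline_s:
            issues.append(("flatline", i))

    # Out-of-range: outside user-defined min/max limits.
    issues += [("out_of_range", i) for i, v in enumerate(vals)
               if v < min_val or v > max_val]

    # Unstable/erratic: slope between consecutive samples above a threshold.
    slopes = np.abs(np.diff(vals) / np.maximum(dt, 1e-9))
    issues += [("unstable", i + 1) for i, s in enumerate(slopes) if s > slope_limit]

    return issues
```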
Fixing: The Imputer Engine
Fix Categories
Replacement
ML-based pattern matching to restore realistic data
Interpolation
Statistical filling using regression and seasonal models
Replacement Methods
DTW (Dynamic Time Warping)
Finds the best match even when patterns are time-shifted
MP (Matrix Profile)
Z-normalized segment matching
CC (Cross-Correlation)
Pattern alignment using correlation scoring
A screening phase selects the best match before fix application.
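The screening step can be pictured with a simple cross-correlation scorer: z-normalize the query pattern (as Matrix Profile does), slide it over the historical data, and rank candidate segments by correlation. This is a minimal sketch of the idea, not the Imputer's implementation; DTW would additionally tolerate time-shifted patterns:

```python
import numpy as np

def screen_candidates(history, query, top_k=1):
    """Rank candidate segments of `history` by normalized correlation with `query`."""
    q = (query - query.mean()) / (query.std() + 1e-9)      # z-normalize the query
    n = len(q)
    scores = []
    for start in range(len(history) - n + 1):
        seg = history[start:start + n]
        s = (seg - seg.mean()) / (seg.std() + 1e-9)
        scores.append((float(np.dot(q, s)) / n, start))    # correlation score
    scores.sort(reverse=True)                              # best match first
    return scores[:top_k]                                  # [(score, start_index), ...]

# Usage sketch: find the past segment that best matches the pattern near a gap.
rng = np.random.default_rng(0)
history = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.05 * rng.standard_normal(2000)
query = history[1500:1550]
print(screen_candidates(history, query))
```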
Adjustment Logic
If matching and query slopes differ by <20% → apply mean adjustment.
If >20% → align trends using regression.
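A minimal sketch of that rule, assuming equal-length 1-D NumPy arrays for the matched segment and the query; the 20% threshold is taken relative to the query slope, and the regression step is a simple linear fit, both of which are assumptions for the example:

```python
import numpy as np

def adjust(match, query):
    """Mean adjustment when slopes are similar, trend alignment otherwise."""
    x = np.arange(len(query))
    slope_q = np.polyfit(x, query, 1)[0]
    slope_m = np.polyfit(x, match, 1)[0]

    if abs(slope_m - slope_q) < 0.2 * abs(slope_q) + 1e-12:
        # Slopes differ by <20%: shift the match to the query's mean level.
        return match + (query.mean() - match.mean())

    # Slopes differ by >20%: strip the match's linear trend and impose the query's.
    trend_m = np.polyval(np.polyfit(x, match, 1), x)
    trend_q = np.polyval(np.polyfit(x, query, 1), x)
    return (match - trend_m) + trend_q
```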
Interpolation Methods
Curve Fitting
Linear/quadratic/cubic regression
Seasonal
Extracts repeating patterns (e.g., daily or weekly)
Math Expression
Custom logic based on auxiliary data (e.g., StreamA = StreamB * 1.5 + 3)
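A brief sketch of the curve-fitting and math-expression options (the seasonal model is omitted); the polynomial degree and the sample data are illustrative:

```python
import numpy as np

def curve_fit_fill(t_known, y_known, t_missing, degree=2):
    """Curve-fitting interpolation: fit a low-order polynomial to surrounding
    samples and evaluate it at the missing timestamps."""
    coeffs = np.polyfit(t_known, y_known, degree)   # linear/quadratic/cubic
    return np.polyval(coeffs, t_missing)

def math_expression_fill(stream_b):
    """Math-expression fix from the example above: StreamA = StreamB * 1.5 + 3."""
    return stream_b * 1.5 + 3

# Usage sketch: fill three missing points in the middle of a short series.
t_known = np.array([0, 1, 2, 6, 7, 8], dtype=float)
y_known = np.array([1.0, 1.2, 1.5, 2.9, 3.4, 4.0])
print(curve_fit_fill(t_known, y_known, np.array([3.0, 4.0, 5.0])))
```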
Auxiliary Streams
Auxiliary sensor streams can enhance:
Detection Logic: IF StreamA > 100 AND StreamB < 10 THEN flag issue
Fixing Calculations: e.g., Level = Velocity × Coefficient
Match Screening: Filters candidate patterns by similarity across multiple dimensions
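The two example rules above translate directly into array operations; the sketch below assumes time-aligned NumPy arrays and is illustrative only:

```python
import numpy as np

def flag_with_auxiliary(stream_a, stream_b):
    """Detection rule from the example above: flag samples where
    StreamA > 100 AND StreamB < 10."""
    return (stream_a > 100) & (stream_b < 10)

def level_from_velocity(velocity, coefficient):
    """Fixing calculation of the form Level = Velocity × Coefficient."""
    return velocity * coefficient

stream_a = np.array([90.0, 120.0, 150.0])
stream_b = np.array([12.0, 8.0, 15.0])
print(flag_with_auxiliary(stream_a, stream_b))   # [False  True False]
```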
Navigating the Dashboard
Accessible from the main FAI menu, the Data Curator Dashboard displays:
Sites with active issues
Summary of issue types
Filtered time views (e.g., 24h, last 7 days)
Curated streams with fix previews and manual override options
Import/export of corrections via CSV
Setup and Configuration
Requirements
Not enabled by default
Requires an API key generated by an Account Owner
Manual selection of monitored Wavelets and Streams
Settings Options
Enable/disable detection
Toggle profiling for each stream
Enable fixes
Choose between automatic and manual fix application
Apply templates
Assign different logic per stream
Exclude streams
Filter out technical values (battery, signal, etc.)
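One way to picture these options is as a per-deployment settings object. The structure and field names below are hypothetical, used purely to summarize the choices above; they are not the FAI API or configuration schema:

```python
# Illustrative only: field names are hypothetical, not the FAI configuration schema.
curator_settings = {
    "profiling_enabled": True,                                 # enable/disable detection
    "streams": {
        "flow_rate": {"profiling": True, "fixes": "manual"},   # fixes held for review
        "water_level": {"profiling": True, "fixes": "automatic"},
        "battery_voltage": {"profiling": False},               # excluded technical value
    },
    "template": "default_high_sensitivity",                    # per-stream template
}
```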
Templates & Tuning
Manage Profiling Templates to control:
Detection sensitivity
Flatline duration threshold
Out-of-range limits
Irregular sampling tolerance
Imputation model version
Templates can also be retrospectively applied to analyze historical data quality ("rewind" capability).
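A hypothetical template object, just to make the tunable parameters above concrete; the names, defaults, and units are assumptions, not the product's schema:

```python
from dataclasses import dataclass

@dataclass
class ProfilingTemplate:
    """Hypothetical shape of a profiling template; parameters mirror the
    settings listed above."""
    detection_sensitivity: float = 0.8         # 0..1, higher = more aggressive flagging
    flatline_duration_s: int = 3600            # constant readings beyond this are flagged
    out_of_range_min: float = 0.0
    out_of_range_max: float = 250.0
    irregular_sampling_tolerance: float = 0.25 # allowed fractional interval drift
    imputation_model_version: str = "v1"

# The same template can be applied retrospectively ("rewind") by re-running the
# profiling engine over a historical window with these parameters.
template = ProfilingTemplate(flatline_duration_s=7200)
```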
Workflow Summary
1. Stream Activation
Select Wavelets and Streams
2. Data Profiling
Engine scans each point for anomalies
3. Flag or Fix
Issues logged or corrected
4. Review
Users assess and accept/reject proposed fixes
5. Export
Corrections tracked and available via dashboard or CSV
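Steps 4 and 5 can be pictured as filtering proposed fixes by the reviewer's decision and writing the accepted ones to a corrections file. The record fields and CSV columns below are hypothetical, intended only to illustrate the flow:

```python
import csv

def export_corrections(stream_id, fixes, out_path="corrections.csv"):
    """Keep only user-accepted fixes and write them to CSV (illustrative fields)."""
    accepted = [f for f in fixes if f["accepted"]]
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["stream_id", "timestamp", "issue",
                         "original", "corrected", "method"])
        for f in accepted:
            writer.writerow([stream_id, f["timestamp"], f["issue"],
                             f["original"], f["corrected"], f["method"]])

# Usage sketch: one accepted and one rejected fix.
fixes = [
    {"timestamp": "2024-01-01T00:10:00Z", "issue": "missing", "original": None,
     "corrected": 4.2, "method": "DTW", "accepted": True},
    {"timestamp": "2024-01-01T00:15:00Z", "issue": "out_of_range", "original": 999.0,
     "corrected": 5.1, "method": "interpolation", "accepted": False},
]
export_corrections("flow_rate", fixes)
```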
Known Limitations (Beta)
UI bugs when saving/renaming templates
Some visibility issues in fix logs
Error when adding newly ingested streams
Performance bottlenecks on deployments >100 devices
Contact and Access
To request access to the FAI Data Curator, contact support or your account manager.