Data Curator Overview

The FAI Data Curator is a smart data-quality engine that automatically detects, flags, and optionally fixes data issues in sensor data streams. It is designed to help organizations maintain clean, reliable datasets that support monitoring, analysis, and operational decisions.

Beta Notice:

► The FAI Data Curator is currently available as a Proof-of-Concept (PoC) beta feature.

► It is not enabled by default. Activation requires a request via your Ayyeka account representative.

► It is intended for early adopters exploring advanced data quality and automation capabilities.

Contents

  • Introduction

  • Key Capabilities

  • Profiling: Detecting Data Issues

  • Fixing: The Imputer Engine

  • Auxiliary Streams

  • Navigating the Dashboard

  • Setup and Configuration

  • Workflow Summary

  • Known Limitations

  • Contact and Access

Introduction

The FAI Data Curator applies AI-driven profiling and automated imputation to enhance data integrity. The system operates on a non-destructive model: original raw data is preserved, and fixes are applied to parallel "curated" streams.

All fixes are tracked with full audit trails and can be applied manually or automatically based on user-configured templates.

Key Capabilities

Feature             Description
Profiling           Automatically identifies missing, unstable, or irregular values
Fixing (Imputer)    Suggests or applies fixes using replacement or interpolation
Custom Templates    Tailor detection logic per stream with adjustable sensitivity
Math Expressions    Create conditional logic using auxiliary streams
Audit & Control     Logs all detected issues and corrective actions
Dashboard Filters   Streamline review by timeframe, site, issue type, and more

Profiling: Detecting Data Issues

The Profiling Engine scans enabled streams to identify the following:

Issue Type           Description
Missing Data         Gaps longer than 2× the daily mode of the sampling interval
Irregular Sampling   Deviations from the expected sampling interval, measured using standard deviation
Flatline             Constant readings that persist beyond the configured duration
Out-of-Range         Values outside the user-defined min/max limits
Unstable (Erratic)   Rapid changes (spikes, saw-tooth patterns) detected using a slope threshold

Detection logic is configured via Profiling Templates for per-stream customization.
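
As a rough illustration of three of the checks in the table above, the following Python sketch flags gaps, out-of-range values, and flatlines in a small series. It is not the product's implementation. The function names, the use of plain timestamps in seconds, and the choice to compute the mode over the whole input (rather than per day) are assumptions made for this example.

```python
from collections import Counter


def find_gaps(timestamps, factor=2.0):
    """Return (start, end) pairs whose spacing exceeds factor x the modal interval."""
    pairs = list(zip(timestamps, timestamps[1:]))
    if not pairs:
        return []
    # Assumption: the mode is taken over all intervals passed in.
    mode_interval = Counter(b - a for a, b in pairs).most_common(1)[0][0]
    return [(a, b) for a, b in pairs if (b - a) > factor * mode_interval]


def find_out_of_range(values, low, high):
    """Return the indices of readings outside the [low, high] limits."""
    return [i for i, v in enumerate(values) if v < low or v > high]


def find_flatlines(timestamps, values, max_duration):
    """Return (start, end) timestamps of constant runs lasting longer than max_duration."""
    runs = []
    start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the data or when the value changes.
        if i == len(values) or values[i] != values[start]:
            if timestamps[i - 1] - timestamps[start] > max_duration:
                runs.append((timestamps[start], timestamps[i - 1]))
            start = i
    return runs


# Hypothetical sample: one reading per minute with a five-minute hole.
times = [0, 60, 120, 180, 480, 540, 600]
readings = [5.0, 5.1, 7.0, 7.0, 7.0, 7.0, 12.5]

print(find_gaps(times))                      # [(180, 480)]
print(find_out_of_range(readings, 0, 10))    # [6]
print(find_flatlines(times, readings, 300))  # [(120, 540)]
```

In this sketch the thresholds are function arguments; in the product they would come from the Profiling Template assigned to the stream.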

Fixing: The Imputer Engine

Fix Categories

Category        Description
Replacement     ML-based pattern matching to restore realistic data
Interpolation   Statistical filling using regression and seasonal models

Replacement Methods

Method                       Description
DTW (Dynamic Time Warping)   Finds best match even with time-shifted patterns
MP (Matrix Profile)          Z-normalized segment matching
CC (Cross-Correlation)       Pattern alignment using correlation scoring

A screening phase selects the best match before fix application.
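
To give a feel for z-normalized segment matching, here is a minimal brute-force sketch in Python. It slides a window over historical data and returns the window whose shape is closest to a query segment. It is not the Matrix Profile algorithm itself and not the Imputer's code; the function names and the sample data are assumptions for illustration only.

```python
import math


def z_normalize(segment):
    """Scale a segment to zero mean and unit standard deviation."""
    mean = sum(segment) / len(segment)
    std = math.sqrt(sum((x - mean) ** 2 for x in segment) / len(segment))
    if std == 0:
        # A constant segment has no shape; map it to all zeros.
        return [0.0] * len(segment)
    return [(x - mean) / std for x in segment]


def best_match(history, query):
    """Return (start_index, distance) of the history window closest to the query."""
    m = len(query)
    q = z_normalize(query)
    best_index, best_dist = None, math.inf
    for start in range(len(history) - m + 1):
        window = z_normalize(history[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, q)))
        if dist < best_dist:
            best_index, best_dist = start, dist
    return best_index, best_dist


# Hypothetical data: the query has the same shape as history[5:10], at a third of the scale.
history = [0, 0, 1, 0, 0, 3, 6, 9, 6, 3, 0, 0]
query = [1, 2, 3, 2, 1]
index, distance = best_match(history, query)
print(index)  # 5
```

Because both segments are z-normalized before comparison, the match depends on shape rather than on absolute level or amplitude, which is why a later adjustment step is needed before the matched values are used as a fix.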

Adjustment Logic

  • If the slopes of the matching and query segments differ by less than 20% → apply a mean adjustment.

  • If they differ by more than 20% → align the trends using regression.
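
The two branches above might look like the following Python sketch. Several details are assumptions, since the source does not define them: the slope is taken from a least-squares line through each segment, the 20% is measured relative to the query slope, "mean adjustment" shifts the match to the query's mean, and "regression alignment" swaps the match's linear trend for the query's. Both segments are assumed to have the same length (at least two points).

```python
def linear_fit(values):
    """Least-squares slope and intercept of values against their index."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def adjust_match(match, query, threshold=0.20):
    """Return (adjusted_values, method) for a matched segment."""
    m_slope, m_icpt = linear_fit(match)
    q_slope, q_icpt = linear_fit(query)
    # Assumption: relative difference is measured against the query slope.
    rel_diff = abs(m_slope - q_slope) / max(abs(q_slope), 1e-12)
    if rel_diff < threshold:
        # Mean adjustment: shift the match so its mean equals the query's mean.
        offset = sum(query) / len(query) - sum(match) / len(match)
        return [v + offset for v in match], "mean"
    # Regression alignment: keep the match's residuals, use the query's trend line.
    adjusted = [
        v - (m_slope * i + m_icpt) + (q_slope * i + q_icpt)
        for i, v in enumerate(match)
    ]
    return adjusted, "regression"


print(adjust_match([10, 11, 12, 13], [20, 21.1, 22.2, 23.3])[1])  # mean
print(adjust_match([10, 11, 12, 13], [20, 22, 24, 26])[1])        # regression
```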

Interpolation Methods

Method            Description
Curve Fitting     Linear/quadratic/cubic regression
Seasonal          Extracts repeating patterns (e.g., daily or weekly)
Math Expression   Custom logic based on auxiliary data (e.g., StreamA = StreamB * 1.5 + 3)
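
Two of these methods are simple enough to sketch in a few lines of Python. The first function fits a straight line to the known points around a gap and evaluates it at the missing timestamps (the linear case of curve fitting). The second applies the example expression from the table, StreamA = StreamB * 1.5 + 3. The function names and sample values are hypothetical, and the product's own fitting procedure may differ.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the points (xs, ys)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def fill_gap_linear(known, missing_times):
    """Estimate values at missing_times from (timestamp, value) pairs around the gap."""
    slope, intercept = fit_line([t for t, _ in known], [v for _, v in known])
    return [(t, slope * t + intercept) for t in missing_times]


def fill_from_expression(stream_b_values):
    """Apply the example expression StreamA = StreamB * 1.5 + 3."""
    return [b * 1.5 + 3 for b in stream_b_values]


# Hypothetical readings on either side of a gap at t=120 and t=180.
known_points = [(0, 1.0), (60, 2.0), (240, 5.0), (300, 6.0)]
print(fill_gap_linear(known_points, [120, 180]))
print(fill_from_expression([2.0, 4.0]))  # [6.0, 9.0]
```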

Auxiliary Streams

Auxiliary sensor streams can enhance:

  • Detection Logic: IF StreamA > 100 AND StreamB < 10 THEN flag issue

  • Fixing Calculations: e.g., Level = Velocity × Coefficient

  • Match Screening: Filters candidate patterns by similarity across multiple dimensions
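
The first two uses above can be pictured with a short Python sketch. Streams are represented here as dictionaries that map a timestamp to a value, which is an assumption for this example, as are the function names and the coefficient value. The rule mirrors the example IF StreamA > 100 AND StreamB < 10.

```python
def flag_points(stream_a, stream_b, a_limit=100, b_limit=10):
    """Return timestamps where StreamA > a_limit and StreamB < b_limit."""
    return [
        t for t, a in stream_a.items()
        # Only timestamps present in both streams can be evaluated.
        if t in stream_b and a > a_limit and stream_b[t] < b_limit
    ]


def derive_level(velocity, coefficient):
    """Compute Level = Velocity x Coefficient for each timestamp."""
    return {t: v * coefficient for t, v in velocity.items()}


# Hypothetical readings keyed by timestamp in seconds.
stream_a = {0: 95, 60: 120, 120: 130}
stream_b = {0: 5, 60: 8, 120: 15}
print(flag_points(stream_a, stream_b))            # [60]
print(derive_level({0: 2.0, 60: 4.0}, 0.5))       # {0: 1.0, 60: 2.0}
```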

Navigating the Dashboard

Accessible from the main FAI menu, the Data Curator Dashboard displays:

  • Sites with active issues

  • Summary of issue types

  • Filtered time views (e.g., last 24 hours, last 7 days)

  • Curated streams with fix previews and manual override options

  • Import/export of corrections via CSV

Setup and Configuration

Requirements

  • Not enabled by default

  • Requires API key generated by an Account Owner

  • Manual selection of monitored Wavelets and Streams

Settings Options

Option                     Description
Enable/disable detection   Toggle profiling for each stream
Enable fixes               Choose between automatic and manual
Apply templates            Assign different logic per stream
Exclude streams            Filter out technical values (battery, signal, etc.)

Templates & Tuning

Manage Profiling Templates to control:

  • Detection sensitivity

  • Flatline duration threshold

  • Out-of-range limits

  • Irregular sampling tolerance

  • Imputation model version

Templates can also be retrospectively applied to analyze historical data quality ("rewind" capability).
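
One way to picture a template is as a named bundle of the settings listed above, assigned per stream with a fallback default. The Python sketch below is purely illustrative: the class, field names, default values, and stream names are all hypothetical and do not reflect the product's actual template schema or API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProfilingTemplate:
    """Hypothetical container for the tunable settings listed above."""
    name: str
    sensitivity: float = 0.5            # detection sensitivity
    flatline_duration_s: int = 3600     # flatline duration threshold
    min_limit: float = float("-inf")    # out-of-range lower limit
    max_limit: float = float("inf")     # out-of-range upper limit
    sampling_tolerance: float = 2.0     # irregular sampling tolerance
    imputer_version: str = "v1"         # imputation model version


DEFAULT_TEMPLATE = ProfilingTemplate(name="default")

# Derive a stricter template for a hypothetical level sensor.
LEVEL_TEMPLATE = replace(
    DEFAULT_TEMPLATE,
    name="level-sensor",
    min_limit=0.0,
    max_limit=10.0,
    flatline_duration_s=7200,
)

# Per-stream assignment; unlisted streams fall back to the default.
ASSIGNMENTS = {"Level": LEVEL_TEMPLATE}


def template_for(stream_name):
    """Return the template assigned to a stream, or the default."""
    return ASSIGNMENTS.get(stream_name, DEFAULT_TEMPLATE)


print(template_for("Level").max_limit)  # 10.0
print(template_for("Flow").name)        # default
```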

Workflow Summary

Step                   Description
1. Stream Activation   Select the Wavelets and Streams to monitor
2. Data Profiling      The engine scans each point for anomalies
3. Flag or Fix         Issues are logged or corrected
4. Review              Users assess and accept or reject proposed fixes
5. Export              Corrections are tracked and available via the dashboard or CSV

Known Limitations (Beta)

  • UI bugs when saving/renaming templates

  • Some visibility issues in fix logs

  • Error when adding newly ingested streams

  • Performance bottlenecks on deployments with more than 100 devices

Contact and Access

To request access to the FAI Data Curator, contact support or your account manager.
