Automated Data Cleansing with Machine Learning (Navy) | Illumination Works LLC

Automated Data Cleansing with Machine Learning

Customer Challenge

Poor data quality is hindering the Department of the Navy’s (DON) ability to gain valuable and accurate insight from its data. Given the volume of errors, manual correction is both ineffective and inefficient.

Innovative Solution

Illumination Works (ILW) data scientists implemented Phase I of our Automated Data Cleansing and Analysis Tool (ADCAT), which applies machine learning (ML) and probabilistic graphical modeling (PGM) to automatically cleanse DON data of errors. In Phase II, ILW improved ADCAT’s healing functionality across domains through algorithm enhancements, optimization, model quality monitoring, and a new user interface, and deployed ADCAT to a DON production environment.

Benefits/Outcomes

Robust natural language processing (NLP) and ML classifier models achieve 96–99.8% accuracy
ADCAT’s PGMs give end users the five most probable corrections for a given error; in 98% of cases, the correct value was among those five
Opens the black box of ML error-correction logic by providing transparent, human-understandable explanations
Scalable processes and automatic discovery methods enable new error-correction models to be built quickly
A human-in-the-loop option enables review and validation of the ML-driven error corrections
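
To make the "five most probable corrections" idea concrete, here is a minimal sketch that is not ADCAT's actual implementation. It assumes a naive Bayes model, one of the simplest Bayesian networks, learned from records already known to be clean. The field names, the sample records, and the functions `fit`, `top_corrections`, and `explain` are all hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical clean records; real DON data and field names would differ.
RECORDS = [
    {"component": "engine", "symptom": "leak", "action": "replace_seal"},
    {"component": "engine", "symptom": "leak", "action": "replace_seal"},
    {"component": "engine", "symptom": "vibration", "action": "balance_rotor"},
    {"component": "gearbox", "symptom": "leak", "action": "replace_seal"},
    {"component": "gearbox", "symptom": "noise", "action": "lubricate"},
    {"component": "radio", "symptom": "static", "action": "replace_antenna"},
    {"component": "engine", "symptom": "vibration", "action": "balance_rotor"},
    {"component": "gearbox", "symptom": "noise", "action": "lubricate"},
]


def fit(records, target):
    """Count target values and their co-occurrence with every other field."""
    prior = Counter()
    cond = defaultdict(Counter)  # cond[(field, target_value)][field_value]
    vocab = defaultdict(set)     # distinct values seen per evidence field
    for rec in records:
        t = rec[target]
        prior[t] += 1
        for field, value in rec.items():
            if field != target:
                cond[(field, t)][value] += 1
                vocab[field].add(value)
    return prior, cond, vocab


def top_corrections(model, evidence, k=5):
    """Rank candidate target values by posterior probability given evidence."""
    prior, cond, vocab = model
    total = sum(prior.values())
    scores = {}
    for t, n_t in prior.items():
        p = n_t / total
        for field, value in evidence.items():
            # Laplace smoothing keeps unseen combinations from scoring zero.
            p *= (cond[(field, t)][value] + 1) / (n_t + len(vocab[field]))
        scores[t] = p
    norm = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(t, p / norm) for t, p in ranked[:k]]


def explain(model, evidence, candidate):
    """Describe, in plain words, the counts behind one candidate's score."""
    prior, cond, _ = model
    parts = [
        f"{field}={value!r} appears in {cond[(field, candidate)][value]} "
        f"of {prior[candidate]} clean records with this value"
        for field, value in evidence.items()
    ]
    return f"{candidate!r}: " + "; ".join(parts)


model = fit(RECORDS, target="action")
evidence = {"component": "engine", "symptom": "vibration"}
for value, prob in top_corrections(model, evidence):
    print(f"{prob:.2f}  {explain(model, evidence, value)}")
```

In a human-in-the-loop workflow, a ranked list like this, together with its count-based explanations, is what a reviewer would see before accepting or rejecting a suggested correction.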

Business Value

Improved analyst productivity: less time correcting data, increased focus on core mission tasks
Higher-quality data: higher-confidence, data-informed decisions and cost savings

Toolbox

Supervised/unsupervised ML
Probabilistic graphical models (Bayesian Networks)
Natural language processing
Open-source Python solution using DoD-compatible libraries
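
As an illustration of how supervised ML and NLP can combine in a pure-Python toolbox, the sketch below trains a small multinomial naive Bayes text classifier on free-text narratives. It is an assumption-laden example rather than ADCAT's model: the training sentences, the category labels, and the helper names `tokenize`, `train`, and `classify` are hypothetical, and a production system would likely rely on established libraries instead.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labeled narratives; real maintenance text and labels would differ.
TRAINING = [
    ("hydraulic fluid leaking from actuator", "hydraulics"),
    ("replaced leaking hydraulic line", "hydraulics"),
    ("radio static on all channels", "avionics"),
    ("display flickers and radio drops out", "avionics"),
    ("engine vibration at high rpm", "propulsion"),
    ("engine oil pressure low", "propulsion"),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log-posterior score for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # skip words never seen during training
                # Add-one smoothing over the training vocabulary.
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(classify(model, "radio static"))
```

Working in log space avoids numeric underflow on longer narratives, and skipping unseen words means a misspelled token simply contributes no evidence rather than distorting the score.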

Domain Expertise

NAVAIR maintenance data
NAVSEA labor data
