Text Analytics of PDF Technical Documents

Customer Challenge

The Air Force needed a logistics data crosswalk to close known gaps between its maintenance and supply data, which were limiting accurate demand planning and forecasting.

Innovative Solution

ILW data scientists used natural language processing (NLP) and unsupervised machine learning (ML) techniques to evaluate and develop an automated method for tying Work Unit Codes (WUCs) to related National Item Identification Numbers (NIINs). They used information extracted from Technical Orders in native PDF format, as well as data captured in maintenance and supply data systems.
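The case study does not say which NLP or ML technique was used. As one plausible illustration of unsupervised text matching, the minimal sketch below scores WUC descriptions against NIIN nomenclature using TF-IDF weights and cosine similarity, written with only the Python standard library. All codes, descriptions, and function names here are hypothetical and invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of token -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed inverse document frequency.
    idf = {tok: math.log((1 + n_docs) / (1 + cnt)) + 1 for tok, cnt in df.items()}
    return [
        {tok: cnt * idf[tok] for tok, cnt in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_wuc_to_niin(wuc_desc, niin_desc):
    """Map each WUC to the NIIN whose description is most similar."""
    wuc_ids, niin_ids = list(wuc_desc), list(niin_desc)
    vectors = tfidf_vectors(
        [wuc_desc[w] for w in wuc_ids] + [niin_desc[n] for n in niin_ids]
    )
    wuc_vecs, niin_vecs = vectors[: len(wuc_ids)], vectors[len(wuc_ids):]
    result = {}
    for wuc, wuc_vec in zip(wuc_ids, wuc_vecs):
        score, niin = max((cosine(wuc_vec, nv), n) for n, nv in zip(niin_ids, niin_vecs))
        result[wuc] = (niin, round(score, 3))
    return result


# Hypothetical sample data, not real WUCs or NIINs.
wuc_descriptions = {
    "13AAA": "main landing gear strut assembly",
    "46B00": "fuel boost pump",
}
niin_descriptions = {
    "000000001": "strut assembly, landing gear",
    "000000002": "pump, fuel, boost",
    "000000003": "valve, hydraulic",
}

matches = match_wuc_to_niin(wuc_descriptions, niin_descriptions)
for wuc, (niin, score) in matches.items():
    print(wuc, "->", niin, score)
```

In practice the descriptions would come from the Technical Order text and the maintenance and supply systems, and a library such as Scikit-learn could replace the hand-written vectorizer. Low-scoring matches would likely still need analyst review.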

Benefits/Outcomes

  • Extracted master parts list (MPL) for two Air Force weapon system programs
  • Developed multiple table extraction techniques that read PDF documents and extract tabular information with a high degree of accuracy; the techniques build on and improve open-source libraries
  • Provided enterprise search capability across Air Force technical documents
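The table extraction techniques themselves are not described in this case study. As a hedged illustration of the general idea, the sketch below rebuilds table rows and cells from positioned words by clustering on vertical position and splitting on horizontal gaps. The word tuples mimic the first five fields, (x0, y0, x1, y1, text), that PyMuPDF (imported as fitz) returns from page.get_text("words"). The sample coordinates, tolerances, and function names are assumptions made up for the example.

```python
def group_rows(words, y_tol=3.0):
    """Cluster words into rows: a word joins the current row if its top edge
    is within y_tol points of the first word in that row."""
    rows = []
    for word in sorted(words, key=lambda w: (w[1], w[0])):
        if rows and abs(word[1] - rows[-1][0][1]) <= y_tol:
            rows[-1].append(word)
        else:
            rows.append([word])
    # Order each row left to right.
    return [sorted(row, key=lambda w: w[0]) for row in rows]


def split_cells(row, x_gap=15.0):
    """Merge neighbouring words into one cell unless the horizontal gap
    between them exceeds x_gap points."""
    cells = [[row[0]]]
    for prev, cur in zip(row, row[1:]):
        if cur[0] - prev[2] > x_gap:
            cells.append([cur])
        else:
            cells[-1].append(cur)
    return [" ".join(w[4] for w in cell) for cell in cells]


def words_to_table(words, y_tol=3.0, x_gap=15.0):
    """Convert positioned words into a list of rows, each a list of cell strings."""
    return [split_cells(row, x_gap) for row in group_rows(words, y_tol)]


# Hypothetical words as (x0, y0, x1, y1, text); values are invented.
sample_words = [
    (50, 100.0, 70, 110, "FIG"),
    (120, 99.6, 150, 110, "PART"),
    (154, 100.8, 200, 110, "NUMBER"),
    (260, 100.2, 340, 110, "DESCRIPTION"),
    (50, 120.5, 56, 130, "1"),
    (120, 120.1, 165, 130, "12345-1"),
    (260, 120.4, 295, 130, "STRUT"),
    (299, 120.9, 355, 130, "ASSEMBLY"),
    (50, 141.0, 56, 151, "2"),
    (120, 140.7, 165, 151, "67890-2"),
    (260, 141.2, 295, 151, "BOOST"),
    (299, 141.1, 330, 151, "PUMP"),
]

table = words_to_table(sample_words)
for row in table:
    print(row)
```

Fixed tolerances like these only work for simple, cleanly typeset tables. Tables with wrapped cells, ruled lines, or scanned pages usually need more robust handling, for example the line detection that a computer vision library such as OpenCV can provide.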

Business Value

  • Improves parts supportability, contract lead times, and integrated repair planning
  • Enables planning for predictable shifts in demands and condemnations, so the right parts are bought in the right quantities and overbuys of other parts are avoided

Toolbox

  • Open-source Python solution using DoD-compatible libraries: Pandas, Tabula, Fitz (PyMuPDF), Scikit-learn, and OpenCV
  • Native PDFs
  • Text Analytics, NLP, Machine Learning, Computer Vision

Related Case Studies You May Like

Digital Transformation for Industrial Modernization (Air Force)

AI-Driven Feature Extraction from Engineering Drawings (Air Force)

Real-Time Predictive Logistics with AI & IIoT (Air Force)

AI Assistant & RAG for Cybersecurity Compliance (Air Force)

Agentic AI & RAG for Cybersecurity (Air Force)

Agentic AI Natural Language Reasoning (Air Force)

Intelligent Data Extraction, Analysis & Content Generation (Air Force)

Time-Series Forecasting Tool (Air Force)

Generative AI for Predictive Logistics (Air Force)

ML/AI Object Tracking Model (Army)

Standard Missile Maintenance Data with AI/ML (Navy)

Automated Part Candidacy Analysis Pipeline (Army)

Automated Data Rights Understanding (Air Force)

Edge Data Management & Analytics (Navy)

Data Science & Architecture Assessment (Marketing)

Text Analytics of PDF Technical Documents (Air Force)

Deep Learning on Raw Google Analytics Data (Retail)

Automated Data Cleansing with Machine Learning (Navy)

Automated Data Capture and Prediction (Air Force)

Automated Data Crosswalks (Air Force SBIR)

Contract Conversion & Analytics (Air Force)

Decision Support for Cyber Hygiene (Air Force)

Cost Allocation Rules Engine Modernization (Insurance)

On-Demand Maintenance Analytics for Logistics (Air Force)

User-Centric Predictive Insights with Machine Learning (Air Force)

Machine Learning & NLP for Decision Support (Healthcare)

Data Science Big Data Ingestion (Energy)
