Big Data Engineering for Improved Analytics
Customer Challenge
A marketing analytics company that houses and analyzes large amounts of data for a top grocery retailer needed to modernize its big data platforms (on-prem and in the cloud) to scale and deliver insights.
Innovative Solution
Illumination Works is migrating legacy data pipelines to modern data platforms like Google Cloud Platform (GCP), Azure Platform, and on-prem Hadoop to shrink the delivery time of long-loop analytics. On the legacy systems, the long-running queries that drive deep learning results on massive datasets took up to a month. When correctly reengineered, modern platforms can produce results in minutes.
Benefits/Outcomes
- Near real-time deep learning and machine learning
- Code written to run on any big data architecture for future growth
- Discover and apply learned results to adjust market strategies
Toolbox
- Thousands of terabytes of data (a couple of petabytes) across multiple clusters
- Google Cloud, Azure, Hadoop, Spark, Python, PySpark, Airflow, Oracle
- Data science, machine learning, deep learning, cloud services, big data
- Agile methodology, big data best practices
- Large amounts of data and hundreds of jobs running daily