Best 10 Free Datasets for Manufacturing

Alexandra Quinn | November 29, 2022

The manufacturing industry can use AI, data and machine learning to improve quality and productivity, minimize waste and reduce costs. With ML, manufacturers can modernize their businesses through use cases like forecasting demand, optimizing scheduling, preventing malfunctions and managing quality. These all contribute significantly to the bottom line. In times of global recession, supply chain disruptions and difficulty meeting consumer demand for materials and products, manufacturing optimization becomes even more important for companies that wish to remain competitive and relevant without impairing their revenue streams.

How can manufacturers develop, grow and optimize their use of data and ML? Open and free datasets for machine learning are an important starting point for data scientists and engineers who are developing and training ML models for manufacturing. But these datasets can be hard to come by, since manufacturers often rely on legacy systems and their data is not always available. Here are 10 excellent open datasets and data sources for machine learning in manufacturing.

1. Eurostat Industrial Production Index

The output and activity of the European industry sector, measured on a monthly basis. The dataset uses 2015 as its base year and reports monthly growth rates.

Get the dataset here.
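To make the relationship between an index and its monthly growth rates concrete, here is a minimal Python sketch. It assumes the index is expressed relative to the base year (2015 = 100) and that a growth rate is the percent change from one month to the next. The function name `monthly_growth_rates` and the sample values are hypothetical and are not taken from Eurostat.

```python
def monthly_growth_rates(index_values):
    """Return the percent change between each pair of consecutive months."""
    rates = []
    for previous, current in zip(index_values, index_values[1:]):
        rates.append((current - previous) / previous * 100.0)
    return rates


# Hypothetical index levels (base year 2015 = 100), not real Eurostat figures.
sample_index = [100.0, 102.0, 99.96]

if __name__ == "__main__":
    for rate in monthly_growth_rates(sample_index):
        print(f"{rate:+.2f}%")
```

A list of n monthly index values yields n - 1 growth rates, because the first month has no predecessor to compare against.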

2. US Manufacturing Trends

Manufacturing trends in the US related to wage rates, profits, employment, production, capacity utilization, productivity, exports and shipments. The dataset provides figures for the current period and year-to-date.

Get the dataset here.

3. Energy Consumption

A dataset providing information about energy consumption at manufacturing sites, homes, commercial buildings and transportation. The data is updated monthly or annually.

Get the dataset here.

4. Personal Protective Equipment Computer Vision Dataset and Model

A dataset and model for identifying the use of protective equipment (such as helmets, shoes, gloves and goggles) in warehouses and manufacturing plants through object detection. The business use of this dataset is to minimize workplace injuries that result from a lack of safety equipment, by automating safety inspections. The dataset provides 19,000 training set images, 3,600 validation set images and 1,900 testing set images.

Get the dataset here.

5. Degradation Measurement of Robot Arm Position Accuracy

A dataset with information to support robot health management and the development of robot health solutions. The dataset contains the examined robot’s high-level tool center position (TCP) health data and controller-level component information: joint positions, velocities, currents and temperatures.

Get the dataset here.

6. On-Site Construction Equipment Computer Vision Project

An object detection data science project for identifying on-site work equipment: excavators, dump trucks and wheel loaders. The business use cases the data supports are inventory management, accident prevention and tracking construction progress. The dataset contains 6,700 training images, 267 validation images and 144 testing images, ready for training.

Get the dataset here.

7. Global Value Chain and Manufacturing Analysis on Geothermal Power Plant Turbines

An analysis of the global supply chain and the cost of manufacturing components of Organic Rankine Cycle (ORC) turboexpanders and steam turbines used in geothermal power plants. The business use case is to help identify manufacturing costs and requirements for equipment, materials, labor and facilities.

Get the dataset here.

8. Materials Discovery: Inorganic Crystals

Crystal structure data to help solve research and application challenges in materials science. Common use cases include materials design, property prediction and compound identification. This dataset includes 210,000 entries of crystal structure data for inorganic compounds: inorganics, ceramics, minerals, pure elements, metals, intermetallic systems and more. The dataset is user-friendly and makes it easy to search the data and analyze results.

Get the dataset here.

9. Radio Frequency Measurements

A dataset based on a PN Code Sounding methodology for understanding how radio waves at 2.4 GHz and 5 GHz propagate in industrial environments. The measurements in the dataset include complex impulse responses and spectrum analysis traces. 

Get the dataset here.

10. NIST Investment Tool

Investment analysis data documented by NIST. The data includes net present value, internal rate of return and payback period. In addition, it provides sensitivity analysis with Monte Carlo techniques. The business use case is to identify investments with the highest ROI.

Get the dataset here.
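The three metrics named above, plus a Monte Carlo sensitivity analysis, can be illustrated with a short Python sketch. This is not the NIST tool's implementation. All function names, the bisection approach to IRR, the uniform perturbation of inflows and the sample cash flows are assumptions chosen for illustration.

```python
import random


def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs at time 0."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def payback_period(cash_flows):
    """First period where cumulative undiscounted cash flow is >= 0, else None."""
    total = 0.0
    for t, cf in enumerate(cash_flows):
        total += cf
        if total >= 0:
            return t
    return None


def irr(cash_flows, low=-0.99, high=1.0, iterations=100):
    """Internal rate of return by bisection.

    Assumes NPV changes sign exactly once between `low` and `high`.
    """
    f_low = npv(low, cash_flows)
    if f_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the given interval")
    mid = low
    for _ in range(iterations):
        mid = (low + high) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_mid == 0:
            return mid
        if f_low * f_mid < 0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return mid


def simulate_npv(rate, base_flows, spread=0.1, trials=1000, seed=0):
    """Monte Carlo sensitivity: scale each inflow by a uniform factor in
    [1 - spread, 1 + spread] and collect the resulting NPVs."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        flows = [base_flows[0]] + [
            cf * rng.uniform(1 - spread, 1 + spread) for cf in base_flows[1:]
        ]
        results.append(npv(rate, flows))
    return results


# Hypothetical investment: 1,000 up front, then 500 per year for three years.
flows = [-1000.0, 500.0, 500.0, 500.0]

if __name__ == "__main__":
    print("NPV at 5%:", round(npv(0.05, flows), 2))
    print("Payback period (years):", payback_period(flows))
    print("IRR:", round(irr(flows), 4))
    outcomes = simulate_npv(0.05, flows)
    print("Simulated NPV range:", round(min(outcomes), 2), "to", round(max(outcomes), 2))
```

Comparing candidate investments on these outputs, and on how widely the simulated NPVs spread, is one way to rank them by expected return and risk.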

The Future of Manufacturing Data for Machine Learning

Manufacturers who recognize the opportunity in digital transformation will be able to use data and ML to optimize manufacturing, increase productivity, reduce waste and improve quality. ML can help plants, factories, suppliers, government organizations and others transform their strategy and bottom line and differentiate themselves for customers. They will be able to weather the storm of global recession while maintaining and improving their market share. Innovation in times like these is key to remaining relevant and cash-flow positive.

Bosch, for example, uses MLOps to ensure simplicity, performance, security and agility of their operations. To learn more about ML and manufacturing, click here.
