I’ve spent the last decade deploying machine learning models where the rubber meets the plant floor: on PLC‑adjacent gateways that need to run reliably 24/7, speak industrial protocols, and survive long maintenance cycles. In this article I’ll walk you through a pragmatic architecture for an AI inference pipeline that meets industrial constraints — deterministic latency, limited compute/memory, zero‑touch updates, and strong safety/security boundaries — and share the patterns and pitfalls I’ve learned on real projects.
Why PLC‑adjacent gateways?
Gateways that sit next to PLCs (rather than inside the PLC) give you the best of both worlds: access to real‑time I/O and deterministic control from the PLC, plus the flexibility to run richer inference workloads, local preprocessing, and connectivity to cloud or MES systems. I prefer this topology when models require more compute than a PLC can provide or when you need rapid model iteration without impacting control logic.
High‑level architecture
Here’s the mental model I use: acquire signals from the PLC/field I/O, preprocess locally, run inference, post‑process the result, then publish it back to the PLC and upstream to cloud or MES systems.
Each of those stages must be deterministic, observable, and fail‑safe. Below I unpack design choices and concrete implementation patterns that make the pipeline production‑ready.
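To make that mental model concrete, here is a minimal sketch of the pipeline as explicit, inspectable stages with a single fail‑safe path. The stage decomposition (acquire → preprocess → infer → publish) and all the stage bodies are my own illustration, not a prescribed standard — in a real gateway each `run` callable wraps protocol and model code.

```python
# Sketch of the gateway pipeline as explicit, inspectable stages.
# Stage names and bodies are illustrative; swap in real protocol/model code.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Stage:
    name: str
    run: Callable[[Any], Any]

@dataclass
class Pipeline:
    stages: list
    fallback: Callable[[Exception], Any]

    def process(self, sample):
        """Run all stages in order; any failure routes to the fail-safe."""
        try:
            for stage in self.stages:
                sample = stage.run(sample)
            return sample
        except Exception as exc:
            return self.fallback(exc)

# Hypothetical stage implementations for illustration only.
pipeline = Pipeline(
    stages=[
        Stage("acquire", lambda s: s),                  # read from PLC/field I/O
        Stage("preprocess", lambda s: [x / 10.0 for x in s]),
        Stage("infer", lambda s: sum(s)),               # stand-in for the model
        Stage("publish", lambda y: {"prediction": y}),  # write back / upstream
    ],
    fallback=lambda exc: {"prediction": None, "fallback": True},
)
```

The key property is that every sample exits through exactly one of two paths — a completed stage chain or the fallback — which makes the "fail‑safe" requirement testable in isolation.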
Hardware & OS: pick for determinism and support
Choice of gateway hardware and OS drives everything else. For industrial deployments I typically select one of these patterns:
Practical tips:
Model format and optimization
Production on constrained gateways is about making the model fit reliably, not about getting the absolute best accuracy in lab conditions. Steps I use every time:
Don’t skip edge validation: run the optimized model on an identical gateway in a lab with recorded production data and measure latency, CPU, memory, and accuracy drift.
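A minimal edge‑validation harness for that step might look like the sketch below: replay recorded samples through an inference callable and report worst‑case‑oriented latency percentiles alongside the raw outputs (for comparing accuracy against the unoptimized model offline). The `infer` argument is a stand‑in for the optimized model; the nearest‑rank percentile is one common choice, not the only one.

```python
# Replay recorded production data through an inference callable and
# measure latency percentiles; `infer` stands in for the optimized model.
import time

def percentile(values, pct):
    """Nearest-rank percentile over a list of latency samples."""
    ordered = sorted(values)
    rank = max(0, min(len(ordered) - 1,
                      int(round(pct / 100.0 * len(ordered))) - 1))
    return ordered[rank]

def profile(infer, samples):
    latencies, outputs = [], []
    for s in samples:
        t0 = time.perf_counter()
        outputs.append(infer(s))
        latencies.append((time.perf_counter() - t0) * 1000.0)  # milliseconds
    return {
        "p50_ms": percentile(latencies, 50),
        "p99_ms": percentile(latencies, 99),
        "max_ms": max(latencies),
        "outputs": outputs,  # diff these against the reference model offline
    }
```

Run it on the identical lab gateway, not your workstation — CPU throttling and memory pressure behave differently on the target.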
Inference runtime choices
Your runtime must be lightweight, stable, and instrumentable. Below is a compact comparison I use when selecting one:
| Runtime | Strengths | Considerations |
|---|---|---|
| ONNX Runtime | Cross‑platform, many accelerators, active ecosystem | Binary size, plugin maturity varies by backend |
| TFLite | Small footprint, excellent for ARM, easy quantization | Strong coverage of standard CNN/DNN ops; custom‑op support is more limited |
| TensorRT | Best GPU performance on NVIDIA Jetson | Vendor lock‑in to NVIDIA |
| OpenVINO | Optimized for Intel CPUs/VPUs | Platform specific |
In many projects I standardize on ONNX Runtime where possible because it provides a stable API and broad backend choices; for Jetson targets I often switch to TensorRT for maximum throughput.
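That selection policy can be captured in a few lines of code so it lives in your provisioning tooling rather than tribal knowledge. The helper below is one plausible encoding of the policy described above (ONNX Runtime as the default, TensorRT on Jetson, OpenVINO on Intel targets, TFLite on plain ARM boards); the target strings are illustrative labels of my own, not values from any vendor API.

```python
# One possible encoding of a runtime-selection policy for gateway images.
# Target strings are illustrative labels, not values from a vendor API.
def pick_runtime(target: str) -> str:
    target = target.lower()
    if "jetson" in target:
        return "TensorRT"        # maximum throughput on NVIDIA Jetson
    if "intel" in target or "vpu" in target:
        return "OpenVINO"        # optimized for Intel CPUs/VPUs
    if "arm" in target:
        return "TFLite"          # small footprint on plain ARM boards
    return "ONNX Runtime"        # stable default with broad backend choice
```

Keeping the rule in one function means a new hardware pattern changes one file instead of several deployment scripts.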
Integration with PLC and industrial protocols
Reliability comes from clean separation of responsibilities. I recommend:
These patterns simplify validation and make it trivial to fail back to manual/operator control if the gateway dies.
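One concrete version of the fail‑back pattern: the gateway attaches a monotonically increasing heartbeat counter to every published result, and the consuming side (PLC logic or a supervisory layer) treats stale or non‑advancing data as the signal to revert to manual/operator control. The sketch below shows the consumer side; message field names and the timeout policy are my assumptions.

```python
# Consumer-side staleness check: accept a prediction only if the gateway's
# heartbeat counter advanced and the previous message was recent enough.
# Field names ("heartbeat", "prediction") are illustrative assumptions.
class InferenceConsumer:
    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_counter = -1
        self.last_seen_at = None

    def accept(self, message: dict, now: float):
        """Return the prediction if fresh; None means: fall back to manual."""
        counter = message["heartbeat"]
        if counter <= self.last_counter:
            return None                      # gateway stalled or replayed data
        if self.last_seen_at is not None and now - self.last_seen_at > self.timeout_s:
            self.last_counter = counter
            self.last_seen_at = now
            return None                      # gap too long; re-sync, fall back
        self.last_counter = counter
        self.last_seen_at = now
        return message["prediction"]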
Latency, batching, and real‑time constraints
Industrial systems often care about worst‑case latency, not average latency. My approach:
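A worst‑case mindset means enforcing a hard per‑sample budget rather than optimizing the mean. The sketch below checks the deadline after the call and substitutes a pre‑agreed safe default when the budget is blown — a cooperative check, which is my simplification; true preemption requires a separate process, thread cancellation, or an RTOS‑level mechanism.

```python
# Deadline guard: in a control context a late answer is a wrong answer,
# so a blown budget yields a pre-agreed safe default instead of the result.
# Cooperative post-hoc check only; real preemption needs process isolation.
import time

SAFE_DEFAULT = {"prediction": None, "degraded": True}

def infer_with_deadline(infer, sample, budget_ms: float):
    t0 = time.perf_counter()
    result = infer(sample)
    elapsed_ms = (time.perf_counter() - t0) * 1000.0
    if elapsed_ms > budget_ms:
        return SAFE_DEFAULT, elapsed_ms      # report the overrun to telemetry
    return {"prediction": result, "degraded": False}, elapsed_ms
```

The elapsed time should also feed the latency histogram, so budget overruns show up as a KPI rather than a silent degradation.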
Observability, drift detection, and logging
Metrics and logs are your first line of defense. Implement:
Ship logs to a centralized telemetry platform (Prometheus + Grafana for metrics, Elastic for logs) or to a cloud diagnostics service, but ensure network outages don’t block inference — logs should buffer locally and forward when available.
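The "buffer locally, forward when available" requirement reduces to a small store‑and‑forward component. This sketch keeps telemetry in a bounded deque (oldest entries drop first under sustained outage, which is my chosen policy — you might prefer spilling to disk) and only flushes when the uplink reports healthy, so recording never blocks the inference path.

```python
# Store-and-forward telemetry buffer: recording never blocks or raises;
# entries flush only when the uplink is healthy. `send` is a stand-in for
# a real transport (Prometheus pushgateway, Elastic bulk API, MQTT, ...).
from collections import deque

class TelemetryBuffer:
    def __init__(self, send, capacity: int = 10_000):
        self.send = send
        self.buffer = deque(maxlen=capacity)  # oldest entries drop first

    def record(self, entry: dict):
        self.buffer.append(entry)             # O(1), never blocks inference

    def flush(self, link_up: bool) -> int:
        """Forward buffered entries when the link is up; return count sent."""
        if not link_up:
            return 0
        sent = 0
        while self.buffer:
            self.send(self.buffer.popleft())
            sent += 1
        return sent
```

Size the capacity from your outage tolerance: entries‑per‑second × the longest outage you must ride out.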
Deployment, versioning, and rollback
OTA updates are non‑negotiable for long lifecycles. Key practices:
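The core of safe OTA model updates is a tiny amount of bookkeeping: a manifest of expected checksums, activation that refuses a mismatched artifact, and the previous version kept hot for instant rollback. The registry below is a minimal in‑memory sketch of that bookkeeping — the on‑disk layout, field names, and single‑level rollback depth are my assumptions.

```python
# Versioned model registry: activation succeeds only when the artifact's
# SHA-256 matches the manifest, and the prior version stays available for
# one-step rollback. In-memory sketch; persist the manifest in practice.
import hashlib

class ModelRegistry:
    def __init__(self):
        self.manifest = {}        # version -> expected sha256 hex digest
        self.active = None
        self.previous = None

    def register(self, version: str, artifact: bytes):
        self.manifest[version] = hashlib.sha256(artifact).hexdigest()

    def activate(self, version: str, artifact: bytes) -> bool:
        """Activate only if the artifact matches its registered checksum."""
        digest = hashlib.sha256(artifact).hexdigest()
        if self.manifest.get(version) != digest:
            return False          # corrupt/partial download: refuse, keep old
        self.previous, self.active = self.active, version
        return True

    def rollback(self):
        """Revert to the previous version; returns the now-active version."""
        self.active, self.previous = self.previous, None
        return self.active
```

The important invariant: a failed `activate` leaves the currently running version untouched, so a bad download can never take the gateway down.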
Security and safety
Security and safety intersect: ensure model compromises don’t create hazards.
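One concrete intersection point is verifying that a model artifact is authentic before it is ever loaded, so a compromised update channel cannot inject arbitrary weights. The sketch below uses HMAC‑SHA256 purely to stay stdlib‑only; in production I would use asymmetric signatures (e.g. Ed25519) with the public key baked into the gateway image, since a shared secret on every gateway is itself a liability.

```python
# Verify model authenticity before loading. HMAC-SHA256 with a shared
# secret keeps this example stdlib-only; prefer asymmetric signatures
# (public key baked into the image) in a real deployment.
import hashlib
import hmac

def sign_model(secret: bytes, artifact: bytes) -> str:
    return hmac.new(secret, artifact, hashlib.sha256).hexdigest()

def verify_model(secret: bytes, artifact: bytes, signature: str) -> bool:
    expected = sign_model(secret, artifact)
    return hmac.compare_digest(expected, signature)  # constant-time compare
```

Refusing to load an unverified artifact is the safety half of the story: the fallback path from the PLC‑integration section is what keeps the line running while the gateway waits for a good model.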
Testing & validation
My testing pyramid for gateway inference:
Shadow mode is the single most valuable step: it exposes data drift, distribution shifts, and edge cases you won’t see in the lab.
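Mechanically, shadow mode is simple: the candidate model sees live traffic, only the incumbent's output is acted on, and disagreements are counted (and logged with their inputs) for offline analysis. A minimal sketch, with scalar outputs and an absolute‑difference tolerance as my illustrative comparison rule:

```python
# Shadow-mode runner: the plant only ever sees the incumbent's output;
# the candidate runs on the same samples and disagreements are tallied.
# Scalar outputs and abs-difference tolerance are illustrative choices.
class ShadowRunner:
    def __init__(self, incumbent, candidate, tolerance: float = 0.0):
        self.incumbent = incumbent
        self.candidate = candidate
        self.tolerance = tolerance
        self.total = 0
        self.disagreements = 0

    def process(self, sample):
        acted = self.incumbent(sample)        # this is what gets actuated
        shadow = self.candidate(sample)       # never actuated
        self.total += 1
        if abs(acted - shadow) > self.tolerance:
            self.disagreements += 1           # log (sample, acted, shadow)
        return acted

    def disagreement_rate(self) -> float:
        return self.disagreements / self.total if self.total else 0.0
```

A disagreement rate that climbs over days of shadow operation is exactly the drift/edge‑case signal the lab can't give you.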
Operational KPIs I track
On every deployment I monitor a small set of KPIs that tell me whether the pipeline is healthy:
Those KPIs are actionable: a rising rate of fallback actions typically points to model drift, sensor degradation, or communication issues.
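Turning that "rising fallback rate" signal into an alert takes only a sliding window over recent decisions. The monitor below fires when the fallback fraction in the window crosses a threshold; the window size and 5% threshold are illustrative numbers to tune per line.

```python
# Sliding-window fallback-rate monitor: fires when the fraction of recent
# decisions that took the fallback path exceeds a threshold. Window size
# and threshold are illustrative; tune them per deployment.
from collections import deque

class FallbackRateMonitor:
    def __init__(self, window: int = 100, threshold: float = 0.05):
        self.events = deque(maxlen=window)    # True = fallback was taken
        self.threshold = threshold

    def observe(self, fell_back: bool) -> bool:
        """Record one decision; return True when the alert should fire."""
        self.events.append(fell_back)
        return self.rate() > self.threshold

    def rate(self) -> float:
        return sum(self.events) / len(self.events)
```

Feed it from the same code path that selects the fallback action, so the KPI can never drift out of sync with what the plant actually did.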
Deploying AI next to PLCs is never purely a data science problem — it’s a systems engineering challenge. If you design the pipeline for determinism, observability, safe failure modes, and maintainable lifecycle, you’ll get models that deliver value on the shop floor rather than just in the lab. If you’d like, I can share a sample folder structure, systemd unit files, or a reference Dockerfile for a gateway image tailored to ONNX Runtime + MQTT/OPC UA connectivity.