I’ve spent years on the factory floor tuning vision systems until they behaved predictably — not just in a demo booth, but under shift changes, variable lighting, and the inevitable belt of parts that aren’t exactly to spec. One lesson keeps surfacing: a false positive from a vision inspection (i.e., a good part flagged as defective) is rarely a harmless nuisance. It can slow lines, add rework, create needless scrap, and distort your KPIs. In this post I’ll walk you through a practical way to measure the real throughput cost of false positives and how to set camera thresholds (or model decision thresholds) to maximize effective yield.
What I mean by “real throughput cost”
False positives (FP) have multiple downstream effects. Quantifying only the direct cost — the value of a scrapped part — is insufficient. You also need to account for:
When I evaluate FP cost, I model the operational impact in terms of time and money per FP event. That gives me a single metric (cost-per-FP) I can use to make threshold decisions that match business priorities.
Step 1 — Map the FP workflow and data sources
Start by sketching what happens after a part is flagged. Common paths include:
For each path, identify measurable inputs:
I make sure to collect data from the vision system (scores), the MES (part routing and timestamps), and the operator workstation (inspection time and disposition). Synchronised timestamps are crucial.
Step 2 — Compute per-event cost components
Transform the raw workflow into a per-FP cost model. Typical components:
Throughput delay is best expressed as lost throughput (in minutes of production) × product margin per minute. For example, if an FP triggers a 30-second manual inspection on a line producing 20 parts/minute, a full stall for those 30 seconds costs 10 parts. That is the worst case, where the line backs up completely; in many cases the impact is fractional and distributed over time. I use queueing-derived approximations when precise simulation isn’t justified.
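As a rough sketch of that arithmetic (not a queueing model), the snippet below scales the worst-case stall by a `blocking_fraction` between 0 and 1. Both the function name and the `blocking_fraction` knob are hypothetical, introduced here only to show how the worst case and a fractional case relate; you would estimate the fraction from your own line data.

```python
def throughput_delay(stop_seconds, parts_per_min, margin_per_min, blocking_fraction=1.0):
    """Crude per-FP throughput-delay estimate.

    blocking_fraction is an assumed knob: the share of the inspection time
    that actually stalls the line (1.0 = the line backs up completely).
    """
    if not 0.0 <= blocking_fraction <= 1.0:
        raise ValueError("blocking_fraction must be between 0 and 1")
    lost_minutes = blocking_fraction * stop_seconds / 60.0
    return {
        "lost_minutes": lost_minutes,
        "lost_parts": lost_minutes * parts_per_min,
        "cost": lost_minutes * margin_per_min,
    }


# Worst case from the text: 30 s stall at 20 parts/minute, margin of 1.00 per minute.
worst_case = throughput_delay(30, 20, 1.00)
# Same event when only a quarter of the inspection time actually stalls the line.
partial = throughput_delay(30, 20, 1.00, blocking_fraction=0.25)
print(worst_case)
print(partial)
```

With `blocking_fraction=1.0` this reproduces the 10-part worst case; anything lower is a placeholder for whatever your buffers and line balance really absorb.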
Simple cost table example
| Component | Assumption | Unit cost (£) |
|---|---|---|
| Scrap (false scrap) | Part cost = £5 | 5.00 |
| Manual inspection | 2 min × £0.40/min | 0.80 |
| Rework (minor) | 5 min × £0.40/min + £0.50 consumables | 2.50 |
| Throughput delay | 0.5 min of lost throughput × margin £1/min | 0.50 |
| Total per-FP (example) | Sum of the components above | 8.80 |
That total (£8.80) is illustrative: on your line the dominant term may be scrap cost or throughput delay. Build a table like the one above with your own numbers.
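If you would rather keep the table as code than as a spreadsheet, a minimal sketch could look like this. The class and field names are made up for illustration; the numbers are the example values from the table.

```python
from dataclasses import dataclass


@dataclass
class FpCostInputs:
    part_cost: float            # value of a good part scrapped by mistake
    labor_rate_per_min: float   # operator cost per minute
    inspection_min: float       # manual inspection time per flagged part
    rework_min: float           # rework time per flagged part
    rework_consumables: float   # consumables used during rework
    lost_throughput_min: float  # minutes of production lost per FP
    margin_per_min: float       # product margin per minute of production


def cost_per_fp(c: FpCostInputs) -> dict:
    """Return each cost component and the per-FP total."""
    components = {
        "scrap": c.part_cost,
        "manual_inspection": c.inspection_min * c.labor_rate_per_min,
        "rework": c.rework_min * c.labor_rate_per_min + c.rework_consumables,
        "throughput_delay": c.lost_throughput_min * c.margin_per_min,
    }
    components["total"] = sum(components.values())
    return components


example = cost_per_fp(FpCostInputs(
    part_cost=5.00, labor_rate_per_min=0.40, inspection_min=2.0,
    rework_min=5.0, rework_consumables=0.50,
    lost_throughput_min=0.5, margin_per_min=1.00,
))
for name, value in example.items():
    print(f"{name:18s} £{value:.2f}")
```

Like the table, this sums every component into one event. If a flagged part on your line is either scrapped or reworked but never both, weight each path by how often it occurs instead.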
Step 3 — Link cost to decision thresholds
If your vision system gives a continuous score (e.g., defect probability), every threshold maps to a particular false positive rate (FPR) and false negative rate (FNR). What I do is:
Where N_good and N_bad are counts of good/bad parts in the production sample. The cost_per_FN (false negative) measures the cost of shipping or letting a defective part pass: warranty, scrap after field failure, recalls, safety risks, inspection downstream, etc. In many cases cost_per_FN >> cost_per_FP, so your optimal threshold prioritizes reducing FNs, but don’t assume that — quantify it.
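The definitions above suggest an expected cost of the form FPR × N_good × cost_per_FP + FNR × N_bad × cost_per_FN at each threshold; that exact form is an assumption, as is the convention that a part is rejected when its score is at or above the threshold. Under those assumptions, a minimal sketch of the sweep over labelled scores might be (the scores and the cost_per_FN of 50.0 are invented for illustration):

```python
def expected_cost_curve(scores_good, scores_bad, thresholds, cost_per_fp, cost_per_fn):
    """Expected cost per threshold; a part is rejected when score >= threshold."""
    n_good, n_bad = len(scores_good), len(scores_bad)
    curve = []
    for t in thresholds:
        fpr = sum(s >= t for s in scores_good) / n_good  # good parts flagged
        fnr = sum(s < t for s in scores_bad) / n_bad     # bad parts passed
        cost = fpr * n_good * cost_per_fp + fnr * n_bad * cost_per_fn
        curve.append((t, fpr, fnr, cost))
    return curve


def best_threshold(curve):
    """Pick the (threshold, FPR, FNR, cost) row with the lowest expected cost."""
    return min(curve, key=lambda row: row[3])


# Made-up defect-probability scores for parts with known ground truth.
scores_good = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55]
scores_bad = [0.45, 0.70, 0.90]
thresholds = [i / 10 for i in range(1, 10)]

curve = expected_cost_curve(scores_good, scores_bad, thresholds,
                            cost_per_fp=8.80, cost_per_fn=50.0)
best = best_threshold(curve)
for t, fpr, fnr, cost in curve:
    print(f"t={t:.1f}  FPR={fpr:.2f}  FNR={fnr:.2f}  cost={cost:.2f}")
print("lowest-cost threshold:", best[0])
```

On real data you would use thousands of labelled parts and a finer threshold grid, and the curve is only as trustworthy as the ground-truth labels behind it.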
Step 4 — Practical threshold selection
Once you have an expected cost curve across thresholds, pick the threshold that minimizes expected cost. I prefer doing this per product-family and per operating condition (night shift vs day, different operators, different lighting). Common practical refinements:
Step 5 — Validate with A/B tests on the line
Any offline optimization needs online validation. I run controlled A/B tests where a subset of lines uses the new threshold policy and the rest stay on the baseline. Monitor:
Run the A/B test long enough to capture shift and supplier variability, typically 1–4 weeks depending on volume. Look for both the expected cost improvement and any unexpected operational side effects (operator fatigue, new error modes).
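One way to check whether a difference in flag rate between the two arms is more than noise is a two-proportion z-test. The sketch below is a swapped-in textbook test, not necessarily what any particular plant uses, and the counts are invented. It also assumes parts are independent, which batch and supplier effects can violate, so treat the p-value as a rough screen.

```python
import math


def two_proportion_z(flagged_a, total_a, flagged_b, total_b):
    """Two-sided two-proportion z-test on flag rates (pooled standard error)."""
    p_a = flagged_a / total_a
    p_b = flagged_b / total_b
    pooled = (flagged_a + flagged_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value


# Hypothetical counts: baseline lines vs. lines on the new threshold policy.
diff, z, p_value = two_proportion_z(400, 10_000, 300, 10_000)
print(f"rate difference={diff:.4f}  z={z:.2f}  p={p_value:.5f}")
```

A significant drop in flag rate says nothing about escapes on its own, so pair it with whatever defect-escape measure you trust downstream.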
Operationalizing thresholds and monitoring
After deployment, thresholds are not “set-and-forget.” I implement guardrails:
For classic machine-vision rules (thresholding on brightness or edge metrics), I log the raw image statistics along with decisions. For ML systems, save both input images and model scores for later retraining and root-cause analysis.
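A minimal sketch of such a decision log, assuming one JSON object per line and a reject-at-or-above-threshold convention. The field names, the example part ID and the image path are all hypothetical; a real system would also need to match whatever identifiers your MES uses.

```python
import json
import time


def build_record(part_id, score, threshold, image_path, timestamp=None):
    """Build one log entry tying a decision to its score and source image."""
    return {
        "part_id": part_id,
        "timestamp": time.time() if timestamp is None else timestamp,
        "score": score,
        "threshold": threshold,
        "rejected": score >= threshold,  # assumed convention: high score = defect
        "image_path": image_path,        # where the raw image was saved
    }


def append_jsonl(path, record):
    """Append the record as one JSON line so the log can be streamed later."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


record = build_record("P-001", 0.82, 0.50, "images/P-001.png", timestamp=1.0)
print(json.dumps(record))
```

Logging the threshold alongside the score means you can replay old decisions under a new threshold without re-running the model.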
Examples from the field
On one automotive wiring harness line I worked on, the vision classifier was tuned to aggressively reject anomalies, producing a high FP rate. The immediate symptom was many small manual inspections causing the line to slow by 8%. After calculating the cost-per-FP (primarily throughput loss and operator time), we loosened the threshold and introduced a manual inspection buffer for the middle band. Net effect: first-pass yield improved by 3% (less unnecessary rework), throughput increased 6%, and actual defect escapes did not meaningfully increase.
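The middle-band idea can be sketched as a three-way routing rule with two thresholds. The function name, the threshold values and the high-score-means-defect convention are illustrative assumptions, not the settings from that line.

```python
def route_part(score, accept_below, reject_at_or_above):
    """Route a part by defect score: accept, reject, or send to manual inspection."""
    if accept_below > reject_at_or_above:
        raise ValueError("accept_below must not exceed reject_at_or_above")
    if score < accept_below:
        return "accept"
    if score >= reject_at_or_above:
        return "reject"
    return "manual_inspection"  # the uncertain middle band


# Hypothetical band: auto-accept below 0.30, auto-reject from 0.70 upward.
for s in (0.10, 0.50, 0.90):
    print(s, route_part(s, accept_below=0.30, reject_at_or_above=0.70))
```

Widening the band sends more parts to operators, so its width is itself a cost trade-off between inspection time and the two error types.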
Another case was an electronics assembly line where FP cost was dominated by scrap (expensive RF filters). Here we tightened the threshold and added a second camera angle and a short reflow re-inspection step. The combination reduced false scrapping without introducing extra manual labor.
Tools and libraries I commonly use
If you want, I can provide a small spreadsheet template that computes expected cost over thresholds from your FPR/TPR curve and your cost inputs — it’s a fast way to see where your business-optimal threshold lies. Tell me the formats and a few numbers (part cost, labor rate, inspection time, bad-part rate) and I’ll prepare it.