Batangas State University · College of Engineering · BS Computer Engineering · 2026

Enhancing steel quality control through computer vision and machine learning.

A multi-sensor defect detection system combining a USB webcam and thermal infrared camera, deployed as a web-based quality management platform.

Eugene Gabriel Narvadez · System architect
Jazther Hebrado · Hardware integration
Ana Marie Ramos · Dataset & ML
Lynzy Anne Revilla · Compliance & QA
CHAPTER 01

Introduction

Why this thesis exists, and what we set out to prove.

1.1 · Background & motivation
1.2 · The five objectives
1.1 · Background
Why automated steel inspection

Steel underpins safety-critical industries. Manual inspection cannot keep up.

Automotive frames, ship hulls, bridge members, aircraft components: defective steel propagates downstream into systems where failure has real consequences. Human inspectors fatigue within minutes, their judgments vary from one inspector to the next, and manual inspection cannot match modern production-line speed.

Scratches

Linear abrasions from foreign objects or interlayer friction during coiling that disrupt protective oxide layers.

Dents

Localized depressions from impact pressure that introduce stress concentrations and weaken load-bearing parts.

Cost of missing one

Premature failure of downstream products, rejected batches, customer-trust erosion, recall liability.

1.2 · Objectives
The five research objectives

Five deliverables. One integrated system.

Each objective maps to a distinct artifact — dataset, model, software, hardware, compliance. We'll revisit each one in Chapter 4 with the evidence of how it was met.

OBJ 1
Dataset
Acquire and verify a dataset for model training against industry standards.
OBJ 2
Detection
Detect scratches and dents through deep learning + optimal camera placement.
OBJ 3
Web application
Develop a web app integrable with a pre-existing manufacturing system.
OBJ 4
Prototype
Build a working prototype demonstrating end-to-end defect detection.
OBJ 5
Compliance
Evaluate against ISO 9001 / 10012 with accuracy meeting industry threshold.
CHAPTER 02

Problem & Solution

What's broken in current inspection, and the system we built to fix it.

2.1 · The problem
2.2 · Architecture (5-step pipeline)
2.3 · Dataset
2.4 · Model selection
2.5 · Visual detection
2.6 · Thermal
2.7 · Soft voting
2.8 · Severity score
2.1 · Problem
What manual inspection misses

The gap: existing CV literature is single-modality, lab-only, and not workflow-integrated.

Most published systems run a single CNN on a benchmark dataset under controlled lighting. None ship with an audit-grade record system; few survive real factory noise. We needed an inspection platform that works on consumer hardware, fuses two sensors for subsurface detection, and meets ISO 9001 / 10012 expectations out of the box.

2.2 · Architecture
What we built

KENSA — five-step inspection pipeline.

YOLOv11 runs on the USB webcam feed. The MLX90640 catches subsurface anomalies. A soft-voting ensemble fuses the two. Steel IRIS records every verdict with an immutable audit trail.

STEP 1
Sheet entry
The sheet is loaded onto the prototype conveyor at 5 cm/s, which triggers the inspection cycle.
STEP 2
Dual capture
USB webcam (RGB) + MLX90640 (32×24 thermal) acquire in parallel.
STEP 3
Per-modality inference
YOLOv11 detects bounding boxes; thermal evaluator flags anomalies.
STEP 4
Soft-voting fusion
Weighted ensemble (0.65 visual, 0.35 thermal) yields a fused probability.
STEP 5
Verdict + audit
Annotated record streamed to Steel IRIS, kanban workflow, immutable scan log.
2.3 · Dataset
Custom-collected

500 images. 682 bounding boxes. Two classes.

Built on the prototype rig, supplemented with a remapped NEU-DET subset. Roboflow handled annotation and augmentation: flips, rotations, contrast, synthetic noise.

500
Total images
682
Bounding boxes
350/100/50
Train / Val / Test split
2
Defect classes (dent, scratch)
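The 350/100/50 partition can be reproduced with a seeded shuffle. The sketch below is only an illustration: the source does not say how the split was drawn, and the function name `split_dataset`, the seed, and the placeholder file names are all assumptions.

```python
import random


def split_dataset(items, n_train=350, n_val=100, n_test=50, seed=42):
    """Shuffle once with a fixed seed, then cut into train/val/test lists."""
    if len(items) != n_train + n_val + n_test:
        raise ValueError("split sizes must sum to the dataset size")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names standing in for the 500 collected images.
image_names = [f"img_{i:03d}.jpg" for i in range(500)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))  # 350 100 50
```

A fixed seed keeps the held-out test set identical across retraining runs, which matters when three model variants are compared on the same data.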
2.4 · Model selection
Three YOLO variants benchmarked

YOLOv11 won. Same data, same hyperparameters.

YOLOv8 · mAP@0.5 0.791 · F1 score 0.776
YOLOv11 · mAP@0.5 0.904 · F1 score 0.889
YOLO26 · mAP@0.5 0.847 · F1 score 0.835
2.5 · Visual detection
YOLOv11 in action

Bounding-box detection across a 5–50 mm defect range.

Per-class F1: Dent 0.877 / Scratch 0.902. Inference 9.3 ms per image on the mini-PC.

Example detections (class · confidence): scratch · 0.91, dent · 0.84, scratch · 0.88
2.6 · Thermal sensing
MLX90640 · 32×24 array

Thermal catches what vision can't.

A subsurface dent may not show a visible boundary under overhead LED lighting, but it shifts local heat conduction. The thermal evaluator keeps a per-pixel EMA baseline and applies a 2.5σ threshold, a 1.0°C threshold, and 3-frame persistence, so it flags real anomalies rather than sensor noise.

α=0.02
EMA smoothing
≥3
Frames of persistence
MLX90640 · 32×24 · 8 Hz
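One way to read the thermal rule is sketched below in pure Python. It is not the thesis implementation. It assumes a pixel counts as anomalous only when its deviation from the EMA baseline exceeds both 2.5σ and 1.0°C, and that the flag is raised after 3 consecutive anomalous frames. The class name, the EMA variance estimate, and the choice to freeze the baseline on anomalous pixels are all assumptions.

```python
import math


class PixelAnomalyDetector:
    """Per-pixel EMA baseline with sigma, absolute, and persistence gates."""

    def __init__(self, n_pixels=32 * 24, alpha=0.02, k_sigma=2.5,
                 min_delta_c=1.0, persistence=3):
        self.alpha = alpha
        self.k_sigma = k_sigma
        self.min_delta_c = min_delta_c
        self.persistence = persistence
        self.mean = None                 # EMA of temperature per pixel
        self.var = [0.0] * n_pixels      # EMA of squared deviation per pixel
        self.streak = [0] * n_pixels     # consecutive anomalous frames

    def update(self, frame):
        """Take one frame (flat list of deg C); return flagged pixel indices."""
        if self.mean is None:            # first frame seeds the baseline
            self.mean = list(frame)
            return []
        flagged = []
        for i, temp in enumerate(frame):
            delta = temp - self.mean[i]
            sigma = math.sqrt(self.var[i])
            anomalous = (abs(delta) > self.k_sigma * sigma
                         and abs(delta) >= self.min_delta_c)
            if anomalous:
                self.streak[i] += 1
            else:
                self.streak[i] = 0
                # Assumption: only normal pixels update the baseline, so a
                # genuine hotspot is not absorbed into it.
                self.mean[i] += self.alpha * delta
                self.var[i] += self.alpha * (delta * delta - self.var[i])
            if self.streak[i] >= self.persistence:
                flagged.append(i)
        return flagged


det = PixelAnomalyDetector()
ambient = [25.0] * (32 * 24)
for _ in range(10):                      # settle the baseline at 25 deg C
    det.update(ambient)
hot = list(ambient)
hot[5] = 28.0                            # one pixel runs 3 deg C warm
results = [det.update(hot) for _ in range(3)]
print(results)  # [[], [], [5]]
```

The warm pixel is reported only on the third consecutive frame, which is the persistence gate doing its job of suppressing single-frame sensor noise.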
2.7 · Soft voting
Equation 9

Two modalities. One probability.

The webcam reaches 90% per-sensor accuracy and the thermal sensor 80%. The weights are calibrated empirically, and the fused probability gates the verdict at τ = 0.50.

Webcam weight 0.65 / thermal weight 0.35
P_webcam = 0.82, P_thermal = 0.41
w·P (webcam) = 0.65 × 0.82 = 0.53
w·P (thermal) = 0.35 × 0.41 = 0.14
P_fused = 0.65 × 0.82 + 0.35 × 0.41 ≈ 0.68
Verdict: NO GOOD (above τ=0.50)
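The weighted fusion above can be sketched in a few lines of Python. The function name `fuse` and the "GOOD" label for the below-threshold case are assumptions, as is treating a probability exactly equal to τ as GOOD. The weights, threshold, and example probabilities come from this section.

```python
def fuse(p_webcam, p_thermal, w_webcam=0.65, w_thermal=0.35, tau=0.50):
    """Weighted soft vote over two per-sensor defect probabilities."""
    if abs(w_webcam + w_thermal - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    p_fused = w_webcam * p_webcam + w_thermal * p_thermal
    # Assumption: the verdict is NO GOOD only strictly above tau.
    verdict = "NO GOOD" if p_fused > tau else "GOOD"
    return p_fused, verdict


p_fused, verdict = fuse(0.82, 0.41)
print(round(p_fused, 2), verdict)  # 0.68 NO GOOD
```

Because the webcam carries 0.65 of the weight, a confident visual detection can cross τ even when the thermal probability is below 0.50, as in this example.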
2.8 · Severity
Equation 10

Each signal capped. No single sensor dominates.

The composite severity blends visual detections, thermal anomaly, hotspot count, and fusion hits into a 0–100 score, which determines the verdict bin.

YOLOv11 detections: count × 18 (max 50)
Thermal anomaly persistent: yes / no
Hotspot count: count × 6 (max 20)
Fusion hits (visual ∩ thermal): count × 8 (max 15)
Verdict bins:
0–14 PASS
15–39 INSPECT
40–69 POSSIBLE
70–100 CONFIRMED
With all inputs at zero: 0 + 0 + 0 + 0 = 0, PASS
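The capped sum and verdict bins listed above can be sketched as follows. The per-signal multipliers, caps, and bin edges are from this section. The points awarded for a persistent thermal anomaly are not stated, so `thermal_points=15` is an assumption, chosen only because the other three caps sum to 85 on a 0–100 scale. The function names are also hypothetical.

```python
def severity_score(detections, thermal_persistent, hotspots, fusion_hits,
                   thermal_points=15):
    """Composite 0-100 severity; every signal is capped before summing."""
    visual = min(detections * 18, 50)
    # Assumption: a persistent thermal anomaly adds a fixed 15 points.
    thermal = thermal_points if thermal_persistent else 0
    hotspot = min(hotspots * 6, 20)
    fusion = min(fusion_hits * 8, 15)
    return min(visual + thermal + hotspot + fusion, 100)


def verdict_bin(score):
    """Map a severity score to the four verdict bins."""
    if score <= 14:
        return "PASS"
    if score <= 39:
        return "INSPECT"
    if score <= 69:
        return "POSSIBLE"
    return "CONFIRMED"


for args in [(0, False, 0, 0), (1, False, 0, 0), (2, True, 1, 1), (3, True, 4, 2)]:
    score = severity_score(*args)
    print(args, score, verdict_bin(score))
# (0, False, 0, 0) 0 PASS
# (1, False, 0, 0) 18 INSPECT
# (2, True, 1, 1) 65 POSSIBLE
# (3, True, 4, 2) 100 CONFIRMED
```

The caps mean that visual detections alone top out at 50, inside the POSSIBLE bin, so a CONFIRMED verdict always needs support from at least one other signal.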
CHAPTER 03

Results

What the system actually achieved on the held-out test set.

3.1 · Headline accuracy
3.2 · Full metrics breakdown
3.1 · Headline
Held-out test set · n=50

Fusion beat both single sensors.

96% overall

Overall classification accuracy, exceeding the 90% literature threshold for prototype-level systems by 6 percentage points.

3.2 · Metrics
All metrics, side-by-side

The complete scorecard.

0.904
mAP@0.5 (YOLOv11)
0.889
Mean F1 score
0
Missed-detection rate (target ≤10%)
12.5%
False-detection rate (target ≤10% · marginal)
0.877
F1 for Dent class
0.902
F1 for Scratch class
106 ms
End-to-end pipeline latency (~9.4 fps)
9.3 ms
YOLOv11 inference time per image
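Two scorecard figures follow from simple arithmetic, sketched here. The 106 ms latency, the 200 ms per-sheet budget, and the 8 Good test samples are taken from elsewhere in this deck. Defining the false-detection rate as false positives over Good samples is inferred from the statement that a single FP moves FDR by 12.5 points, and the function names are hypothetical.

```python
def throughput_fps(latency_ms):
    """Inspection cycles per second implied by a per-cycle latency."""
    return 1000.0 / latency_ms


def false_detection_rate(false_positives, good_samples):
    """Share of Good samples wrongly flagged, in percent (inferred definition)."""
    return 100.0 * false_positives / good_samples


print(round(throughput_fps(106), 1))   # 9.4
print(106 <= 200)                      # True
print(false_detection_rate(1, 8))      # 12.5
```

With only 8 Good samples the rate can only move in steps of 12.5 points, so any single false positive puts it above the 10% target.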
CHAPTER 04

Conclusions

How we met each of the five objectives — with the specific evidence.

4.1 · OBJ 1: Dataset (MET)
4.2 · OBJ 2: Detection (MET)
4.3 · OBJ 3: Web application (MET)
4.4 · OBJ 4: Prototype (MET)
4.5 · OBJ 5: ISO compliance (PARTIALLY MET)
4.6 · Honest limitations
4.1 · Objective 1
A custom-collected, augmented dataset. The model trained on it reaches mAP@0.5 of 0.904, exceeding the 90% literature threshold for prototype-level surface defect detection.
How we met it
500 images
Custom-collected on the prototype rig, supplemented with a remapped NEU-DET subset.
682 boxes
Bounding-box annotations across two defect classes (Dent, Scratch) via Roboflow.
350/100/50
Train / Validation / Test partition · 425 NG · 75 Good
mAP 0.904
Industry-standard verification — exceeds the 90% literature threshold cited for surface defect detection.
Evidence · §4.1, Tables 8 / 9 / 10
4.2 · Objective 2
YOLOv11 selected after benchmarking three variants. Convergent mounting geometry adopted for consistent capture across both sensors.
How we met it
3 variants
YOLOv8 / YOLOv11 / YOLO26 trained under identical conditions for a fair comparison.
F1 0.877/0.902
YOLOv11 per-class scores for Dent / Scratch, the best of the three.
Validation loss 0.754 at epoch 100 · no train/val divergence
9.3 ms
Inference time per image — compatible with real-time conveyor-line inspection.
Convergent mount
Both sensors aimed at the inspection zone at comparable working distances — simplifies ensemble calibration.
Evidence · §4.2, Tables 10 / 11, Figure 4
4.3 · Objective 3
Steel IRIS is live at kirabase.net: a full quality-management platform with audit trail, role-based access, and approval workflow.
How we met it
3-tier stack
Vite/React on Vercel · Supabase PostgreSQL · Python hardware layer over HTTPS.
9 modules
Auth · Live inspection · Manual review · Analytics · Kanban · Steel sheets · Scan logs · Inspections · Equipment.
5 RBAC roles
Operator → Inspector → Quality Manager → Admin → Super Admin
Enforced at the DB row level via Supabase RLS, so it can't be bypassed client-side.
Kanban approval
Quality Manager sign-off required before any sheet advances — supports ISO 9001 §8.6 release control.
Evidence · §4.5, Figures 9–20, Table 16
4.4 · Objective 4
A continuous-inspection rig integrating both sensors, YOLOv11, thermal evaluator, soft voting, and the web upload — running within real-time budget.
How we met it
106 ms
End-to-end pipeline latency per inspection cycle (~9.4 fps) — well within the 200 ms per-sheet budget.
Mini-PC + ESP32-S3
Separates compute (YOLO + fusion) from sensor acquisition (thermal serial) so neither blocks the other.
₱31,655
Total prototype budget: consumer-grade components, fully reproducible.
Thermal cam · webcam · ESP32 · Mini-PC · diffused light · steel samples
Conveyor stop
Physical button + Live Inspection dashboard control — mitigates speed-induced accuracy degradation.
Evidence · §4.4, Figures 4–7, Tables 3–6
4.5 · Objective 5
Partially met
96% accuracy clears the threshold by 6 points. 8 of 12 ISO clauses fully compliant. The remaining gaps are defined engineering tasks, not architectural failures.
Compliance audit
9001 · 5.3 · Roles & authorities · Compliant
9001 · 7.1.5 · Measuring resources · Partial
9001 · 7.5 · Records control · Compliant
9001 · 8.5.2 · Traceability · Compliant
9001 · 8.6 · Release control · Compliant
9001 · 8.7 · Nonconforming control · Compliant
9001 · 9.1.3 · Data analysis · Compliant
9001 · 10.2 · Corrective action · Partial
10012 · 6.3 · Material resources · Compliant
10012 · 7.1 · Metrological confirmation · Partial
10012 · 7.2 · Measurement process · Compliant
10012 · 7.3 · Measurement uncertainty · Not compliant
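The "8 of 12" figure can be checked by tallying the clause statuses listed above. The snippet is only a bookkeeping sketch: the tuple layout and the name `CLAUSES` are assumptions, while the statuses are copied from the audit list.

```python
from collections import Counter

# (standard, clause, title, status) as listed in the compliance audit.
CLAUSES = [
    ("ISO 9001", "5.3", "Roles & authorities", "Compliant"),
    ("ISO 9001", "7.1.5", "Measuring resources", "Partial"),
    ("ISO 9001", "7.5", "Records control", "Compliant"),
    ("ISO 9001", "8.5.2", "Traceability", "Compliant"),
    ("ISO 9001", "8.6", "Release control", "Compliant"),
    ("ISO 9001", "8.7", "Nonconforming control", "Compliant"),
    ("ISO 9001", "9.1.3", "Data analysis", "Compliant"),
    ("ISO 9001", "10.2", "Corrective action", "Partial"),
    ("ISO 10012", "6.3", "Material resources", "Compliant"),
    ("ISO 10012", "7.1", "Metrological confirmation", "Partial"),
    ("ISO 10012", "7.2", "Measurement process", "Compliant"),
    ("ISO 10012", "7.3", "Measurement uncertainty", "Not compliant"),
]

tally = Counter(status for *_, status in CLAUSES)
print(tally["Compliant"], tally["Partial"], tally["Not compliant"])  # 8 3 1
```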
Evidence · §4.6, Tables 17 / 18
4.6 · Limits & next steps
Where the boundaries are

Honest limits. Defined fixes.

Every claim in this thesis is bounded by the conditions we actually tested. Every gap has an explicit next step.

L1

Test size n=50

Only 8 Good samples — a single FP moves FDR by 12.5 pts. Fix: expand to balanced n ≥ 200.

L2

Lab vs. factory

Trained at 5 cm/s, stable LED. Real factories have variable lighting, vibration, dust. Fix: site-specific re-calibration before deployment.

L3

Corner detection

Webcam underperforms at sheet corners due to barrel distortion and uneven edge illumination. Fix: lens correction + directional side lighting.

L4

Class coverage

Only two defect classes trained. Pitting, scale, laminar cracks excluded. Fix: dataset expansion + retraining.

L5

Measurement uncertainty

No formal R&R budget. Fix: per-modality repeatability/reproducibility study to close ISO 10012 §7.3.

L6

FDR 12.5% > 10%

One borderline surface-oxidation case (LIVE-003). Fix: fusion veto rule — reject NG when thermal Δ<1°C and YOLO conf<0.55.
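The fusion veto proposed under L6 could look like the sketch below. It is a proposed fix, not something the deck reports as implemented. The function name `apply_veto` is hypothetical, and the verdict that replaces a vetoed NG is not stated in the source, so the `fallback="GOOD"` default is an assumption that a deployment might replace with a manual-review route.

```python
def apply_veto(verdict, thermal_delta_c, yolo_conf,
               min_delta_c=1.0, min_conf=0.55, fallback="GOOD"):
    """Override an NG verdict when both modalities give only weak evidence."""
    weak_thermal = thermal_delta_c < min_delta_c
    weak_visual = yolo_conf < min_conf
    if verdict == "NO GOOD" and weak_thermal and weak_visual:
        return fallback  # assumption: what a vetoed NG becomes is not specified
    return verdict


print(apply_veto("NO GOOD", thermal_delta_c=0.4, yolo_conf=0.50))  # GOOD
print(apply_veto("NO GOOD", thermal_delta_c=1.5, yolo_conf=0.50))  # NO GOOD
print(apply_veto("NO GOOD", thermal_delta_c=0.4, yolo_conf=0.90))  # NO GOOD
```

Requiring both conditions keeps the veto narrow: a confident visual detection or a clear thermal delta is enough on its own to preserve the NG verdict.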

CHAPTER 05

Demo

It's deployed. You can open it now.

5.1 · Steel IRIS live at kirabase.net
5.2 · Defense-prep resources
5.1 · Live demo
Steel IRIS · production

It's real. It's deployed.

React frontend on Vercel, Supabase PostgreSQL backend, Python hardware acquisition layer streaming from the Mini-PC. The same nine modules you saw earlier — running live.

Defense panel

Thank you. Questions?

We've prepared backing materials in case the panel wants to dig deeper.

Eugene Gabriel Narvadez
System architect
Jazther Hebrado
Hardware integration
Ana Marie Ramos
Dataset & ML
Lynzy Anne Revilla
Compliance & QA