1. Technology Selection
1.1 OCR Engine Comparison
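Engine | Accuracy | Complexity | Cost | Typical use
Tesseract OCR | 85-92% | Medium | Open source | Custom training, limited budget
Baidu AI OCR | 98%+ | Low | Pay per use | Commercial projects that need high accuracy
PaddleOCR | 95-97% | Medium | Open source | Balance of accuracy and customization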
Choice: Tesseract 5.3.1 with OpenCV 4.8.0. Training a plate-specific character set is intended to raise accuracy to 95% or higher.
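1.2 Key Spring Boot 3.5 Features
- Virtual threads: spring.threads.virtual.enabled=true handles requests on virtual threads to improve concurrency (the property has existed since Spring Boot 3.2 and requires Java 21 or later)
- Loading from an environment variable: spring.config.import=env:APP_CONFIG imports properties from the APP_CONFIG environment variable, which simplifies deployment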
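2. System Architecture
2.1 Layered Architecture
- Controller: endpoints for image upload and result return
- Service: preprocessing, OCR recognition, plate validation
- Common: OpenCV utility classes, Tesseract configuration
2.2 Flow
Input → Preprocess → Recognize → Validate → Output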
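The flow above can be sketched as a small chain of stages. This is only an illustration under assumptions: the class name PlatePipeline, the Result type, and the stub stages are hypothetical and stand in for the real services shown in section 3.

```java
import java.util.function.Function;
import java.util.function.Predicate;

/** Chains the stages input -> preprocess -> recognize -> validate -> output. */
final class PlatePipeline {
    /** Output of one run: the recognized text and whether it passed validation. */
    static final class Result {
        final String plate;
        final boolean valid;

        Result(String plate, boolean valid) {
            this.plate = plate;
            this.valid = valid;
        }
    }

    private final Function<String, String> preprocess; // image path -> processed image path
    private final Function<String, String> recognize;  // processed image path -> plate text
    private final Predicate<String> validate;          // plate text -> format is acceptable

    PlatePipeline(Function<String, String> preprocess,
                  Function<String, String> recognize,
                  Predicate<String> validate) {
        this.preprocess = preprocess;
        this.recognize = recognize;
        this.validate = validate;
    }

    Result run(String imagePath) {
        String processedPath = preprocess.apply(imagePath);
        String plate = recognize.apply(processedPath);
        return new Result(plate, validate.test(plate));
    }

    public static void main(String[] args) {
        // Stub stages so the sketch runs without OpenCV or Tesseract
        PlatePipeline pipeline = new PlatePipeline(
                path -> "processed_" + path,
                path -> "XA12345", // placeholder text, not a real plate
                text -> text.length() == 7 || text.length() == 8);
        Result result = pipeline.run("car.jpg");
        System.out.println(result.plate + " valid=" + result.valid);
    }
}
```

Passing the stages in as functions keeps each one replaceable in tests, which is one reason to keep the Service layer split into three classes.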
3. Core Implementation
3.1 Image Preprocessing (OpenCV)
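java
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.CLAHE;
import org.opencv.imgproc.Imgproc;
import org.opencv.photo.Photo;
import org.springframework.stereotype.Service;

@Service
public class PlateImagePreprocessService {
    // Load the OpenCV native library once per JVM
    static { System.loadLibrary(Core.NATIVE_LIBRARY_NAME); }

    public String preprocess(String path) {
        // Grayscale -> denoise -> contrast enhancement -> binarize -> morphological close
        Mat gray = Imgcodecs.imread(path, Imgcodecs.IMREAD_GRAYSCALE);
        if (gray.empty()) {
            throw new IllegalArgumentException("Cannot read image: " + path);
        }
        Mat denoised = new Mat();
        // Photo module; arguments are h, templateWindowSize, searchWindowSize (window sizes should be odd)
        Photo.fastNlMeansDenoising(gray, denoised, 15, 7, 25);
        // CLAHE works on gray levels, so it runs before thresholding rather than after
        CLAHE clahe = Imgproc.createCLAHE(4.0, new Size(8, 8));
        Mat enhanced = new Mat();
        clahe.apply(denoised, enhanced);
        Mat binary = new Mat();
        Imgproc.threshold(enhanced, binary, 0, 255, Imgproc.THRESH_BINARY + Imgproc.THRESH_OTSU);
        Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(2, 2));
        Mat finalImg = new Mat();
        Imgproc.morphologyEx(binary, finalImg, Imgproc.MORPH_CLOSE, kernel);
        String outPath = "processed_" + System.currentTimeMillis() + ".png";
        Imgcodecs.imwrite(outPath, finalImg);
        return outPath;
    }
}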
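To make the binarization step less of a black box, here is a minimal pure-Java sketch of Otsu's method, the technique that THRESH_OTSU selects. It picks the threshold that maximizes the between-class variance of the gray-level histogram. The class OtsuSketch and its method names are hypothetical, and OpenCV's own implementation may choose a different value when several thresholds tie.

```java
/** Otsu's method on an array of 8-bit gray values (0..255). */
final class OtsuSketch {
    /** Returns the threshold t that maximizes between-class variance; values <= t form the dark class. */
    static int otsuThreshold(int[] pixels) {
        int[] hist = new int[256];
        for (int p : pixels) hist[p]++;

        long total = pixels.length;
        double sumAll = 0;
        for (int i = 0; i < 256; i++) sumAll += (double) i * hist[i];

        double sumDark = 0;
        long countDark = 0;
        double bestVariance = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            countDark += hist[t];
            sumDark += (double) t * hist[t];
            if (countDark == 0) continue;          // no dark pixels yet
            long countBright = total - countDark;
            if (countBright == 0) break;           // everything is already in the dark class
            double meanDark = sumDark / countDark;
            double meanBright = (sumAll - sumDark) / countBright;
            double diff = meanDark - meanBright;
            double variance = (double) countDark * countBright * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                best = t;
            }
        }
        return best;
    }

    /** Maps values above the threshold to 255 and the rest to 0, like THRESH_BINARY. */
    static int[] binarize(int[] pixels, int threshold) {
        int[] out = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) out[i] = pixels[i] > threshold ? 255 : 0;
        return out;
    }

    public static void main(String[] args) {
        int[] pixels = {12, 15, 20, 18, 210, 220, 205, 215};
        int t = otsuThreshold(pixels);
        System.out.println("threshold=" + t);
        System.out.println(java.util.Arrays.toString(binarize(pixels, t)));
    }
}
```

Because the threshold is derived from the histogram of each image, no fixed value has to be tuned per camera or lighting condition.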
3.2 OCR Recognition (Tesseract)
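java
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.leptonica.global.leptonica; // class name varies across JavaCPP Presets releases
import org.bytedeco.tesseract.TessBaseAPI;
import org.springframework.stereotype.Service;

@Service
public class PlateRecognitionService {
    private static final String TESS_PATH = "tessdata";
    // Letters, digits, and the 31 province abbreviations (same set as PlateValidationService)
    private static final String CHAR_SET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
            + "京津冀晋蒙辽吉黑沪苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新渝";

    public String recognize(String imgPath) {
        try (TessBaseAPI api = new TessBaseAPI()) {
            // Init returns non-zero when the traineddata files cannot be loaded
            if (api.Init(TESS_PATH, "chi_sim+eng") != 0) throw new IllegalStateException("Tesseract init failed: " + TESS_PATH);
            api.SetVariable("tessedit_char_whitelist", CHAR_SET);
            api.SetVariable("load_system_dawg", "0");
            PIX img = leptonica.pixRead(imgPath);
            if (img == null) throw new IllegalArgumentException("Cannot read image: " + imgPath);
            try {
                api.SetImage(img);
                BytePointer text = api.GetUTF8Text();
                String res = text.getString("UTF-8").replaceAll("\\s+", "");
                text.deallocate();
                return res.length() < 7 || res.length() > 8 ? "Abnormal result: " + res : res;
            } finally {
                leptonica.pixDestroy(img); // release native memory even if recognition fails
                api.End();
            }
        } catch (Exception e) {
            throw new RuntimeException("Recognition failed: " + e.getMessage(), e);
        }
    }
}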
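Raw OCR text often needs a little cleanup before validation. The sketch below strips whitespace and, as an assumption that should be checked against the plate standard you target, treats the letters O and I in the serial part (everything after the first two characters) as misread digits 0 and 1. The class OcrTextCleanup is hypothetical and is not part of the service above.

```java
import java.util.Locale;

/** Post-processing for raw OCR output before it reaches validation. */
final class OcrTextCleanup {
    /**
     * Removes whitespace, upper-cases letters, and maps O -> 0 and I -> 1
     * from the third character onward.
     * Assumption: the serial part of a plate never contains the letters O or I.
     */
    static String normalize(String raw) {
        String s = raw.replaceAll("\\s+", "").toUpperCase(Locale.ROOT);
        if (s.length() < 3) return s;
        // Keep the province character and the issuing-authority letter untouched
        StringBuilder out = new StringBuilder(s.substring(0, 2));
        for (int i = 2; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == 'O') c = '0';
            else if (c == 'I') c = '1';
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u6e1d is the Chongqing province character, written as an escape to keep the file ASCII
        System.out.println(normalize(" \u6e1dA 1O3I5\n"));
    }
}
```

Running the cleanup before the length check avoids rejecting a plate only because Tesseract inserted a space or confused two look-alike glyphs.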
3.3 Plate Validation
java
import java.util.Set;
import java.util.regex.Pattern;
import org.springframework.stereotype.Service;

@Service
public class PlateValidationService {
    private static final Set<String> PROVINCES = Set.of("京","津","冀","晋","蒙","辽","吉","黑","沪","苏","浙","皖","闽","赣","鲁","豫","鄂","湘","粤","桂","琼","川","贵","云","藏","陕","甘","青","宁","新","渝");
    // Standard plate: province + letter + 5 alphanumerics (7 characters)
    private static final Pattern NORMAL = Pattern.compile("^[\\u4e00-\\u9fa5][A-Z][A-Z0-9]{5}$");
    // New-energy plate: province + letter + D or F + 5 alphanumerics (8 characters)
    private static final Pattern NEW_ENERGY = Pattern.compile("^[\\u4e00-\\u9fa5][A-Z][DF][A-Z0-9]{5}$");

    public boolean validate(String plate) {
        if (plate == null || (plate.length() != 7 && plate.length() != 8)) return false;
        if (!PROVINCES.contains(plate.substring(0, 1))) return false;
        return NORMAL.matcher(plate).matches() || NEW_ENERGY.matcher(plate).matches();
    }
}
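A few sample inputs show how the two patterns and the province check interact. This standalone sketch repeats the rules without Spring so it can run on its own. The class PlateFormatCheck is hypothetical, and it lists only three province characters (Beijing, Guangdong, Chongqing) as Unicode escapes to keep the file ASCII.

```java
import java.util.Set;
import java.util.regex.Pattern;

/** Standalone copy of the validation rules, reduced to three sample provinces. */
final class PlateFormatCheck {
    // \u4eac = Beijing, \u7ca4 = Guangdong, \u6e1d = Chongqing
    private static final Set<String> PROVINCES = Set.of("\u4eac", "\u7ca4", "\u6e1d");
    // 7 characters: province + letter + 5 alphanumerics
    private static final Pattern NORMAL =
            Pattern.compile("^[\\u4e00-\\u9fa5][A-Z][A-Z0-9]{5}$");
    // 8 characters: province + letter + D or F + 5 alphanumerics
    private static final Pattern NEW_ENERGY =
            Pattern.compile("^[\\u4e00-\\u9fa5][A-Z][DF][A-Z0-9]{5}$");

    static boolean isValid(String plate) {
        if (plate == null || (plate.length() != 7 && plate.length() != 8)) return false;
        if (!PROVINCES.contains(plate.substring(0, 1))) return false;
        return NORMAL.matcher(plate).matches() || NEW_ENERGY.matcher(plate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "\u6e1dA12345",   // standard plate, 7 characters
            "\u7ca4BD12345",  // new-energy plate, third character is D
            "\u7ca4BX12345",  // 8 characters but third character is neither D nor F
            "XA12345",        // first character is not a province
            "\u6e1dA1234"     // too short
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

The length and province checks run first because they are cheap and reject most OCR garbage before any regular expression is evaluated.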
4. Performance Optimization
4.1 Locating the Plate by Template Matching
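java
// img must be an 8-bit grayscale Mat, the same type as the template
public Mat locatePlate(Mat img) {
    Mat template = Imgcodecs.imread("template.jpg", Imgcodecs.IMREAD_GRAYSCALE);
    if (template.empty()) throw new IllegalStateException("Cannot read template.jpg");
    Mat result = new Mat();
    Imgproc.matchTemplate(img, template, result, Imgproc.TM_CCOEFF_NORMED);
    // The maximum of the score map is the top-left corner of the best match
    Rect rect = new Rect(Core.minMaxLoc(result).maxLoc, template.size());
    return new Mat(img, rect); // a view into img, not a copy
}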
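For intuition about what TM_CCOEFF_NORMED computes, the following pure-Java sketch evaluates the same kind of score on small arrays: both the window and the template have their means subtracted, and the correlation is divided by the product of their norms, so an exact match scores 1. The class TemplateMatchSketch is hypothetical and ignores the optimizations OpenCV applies.

```java
/** Pure-Java illustration of a zero-mean normalized correlation score. */
final class TemplateMatchSketch {
    /** Score in [-1, 1] for the window of img whose top-left corner is (x, y). */
    static double score(double[][] img, double[][] tpl, int x, int y) {
        int h = tpl.length, w = tpl[0].length;
        double meanImg = 0, meanTpl = 0;
        for (int r = 0; r < h; r++) {
            for (int c = 0; c < w; c++) {
                meanImg += img[y + r][x + c];
                meanTpl += tpl[r][c];
            }
        }
        meanImg /= h * w;
        meanTpl /= h * w;

        double num = 0, varImg = 0, varTpl = 0;
        for (int r = 0; r < h; r++) {
            for (int c = 0; c < w; c++) {
                double dImg = img[y + r][x + c] - meanImg;
                double dTpl = tpl[r][c] - meanTpl;
                num += dImg * dTpl;
                varImg += dImg * dImg;
                varTpl += dTpl * dTpl;
            }
        }
        double denom = Math.sqrt(varImg * varTpl);
        return denom == 0 ? 0 : num / denom; // a flat window or template carries no pattern
    }

    /** Scans every window position and returns {x, y} of the highest score. */
    static int[] bestMatch(double[][] img, double[][] tpl) {
        int[] best = {0, 0};
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int y = 0; y + tpl.length <= img.length; y++) {
            for (int x = 0; x + tpl[0].length <= img[0].length; x++) {
                double s = score(img, tpl, x, y);
                if (s > bestScore) {
                    bestScore = s;
                    best = new int[] {x, y};
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] img = {
            {0, 0, 0, 0, 0},
            {0, 0, 9, 1, 0},
            {0, 0, 2, 8, 0},
            {0, 0, 0, 0, 0}
        };
        double[][] tpl = {{9, 1}, {2, 8}};
        int[] pos = bestMatch(img, tpl);
        System.out.println("x=" + pos[0] + " y=" + pos[1]);
    }
}
```

Subtracting the means makes the score insensitive to uniform brightness changes, which is why this mode is a common choice for plates photographed under varying light. It does not handle scale or rotation, so a single template only works when plates appear at a similar size and angle.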
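4.2 Tesseract Training
- Label 500+ samples
- Generate box files: tesseract plate.font.exp0.tif plate.font.exp0 --psm 7 batch.nochop makebox
- Train: lstmtraining --model_output plate_model --continue_from chi_sim.lstm --max_iterations 4000
The training command is abbreviated. chi_sim.lstm is first extracted from the base model with combine_tessdata -e chi_sim.traineddata chi_sim.lstm, and a real run also needs --traineddata (the starting traineddata file) and --train_listfile (the list of .lstmf training files). Fine-tuning should start from the tessdata_best model, since the fast models are integer-only and cannot be trained further.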
5. Deployment and Testing
5.1 Docker Configuration
Dockerfile:
dockerfile
# Virtual threads require Java 21 or later
FROM eclipse-temurin:21-jre
WORKDIR /app
COPY target/*.jar app.jar
COPY tessdata /app/tessdata
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar", "--spring.threads.virtual.enabled=true"]
docker-compose.yml:
yaml
version: '3.8'
services:
  plate-recog:
    build: .
    ports: ["8080:8080"]
    environment: [TESSDATA_PREFIX=/app/tessdata]
5.2 Testing with Postman
- POST /api/plate/recognize
- FormData: file (the plate image)
- Response: {"code":200,"data":{"plateNumber":"渝A12345","processTime":"230ms"}}
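6. Pitfalls to Avoid
- Loading the OpenCV library: start the JVM with -Djava.library.path=/path/to/lib. Calling System.setProperty("java.library.path", "/path/to/lib") at runtime is too late, because the JVM reads the path at startup
- Tesseract path: set the TESSDATA_PREFIX environment variable
- Preprocessing parameters: for blurry images use Photo.fastNlMeansDenoising(gray, denoised, 10, 7, 21)
- Concurrency safety: a TessBaseAPI instance must not be shared between threads; isolate instances with ThreadLocal
- Skew correction: detect straight lines with the Hough transform and rotate the image to level them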
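One caveat on the concurrency item: with virtual threads enabled as in section 1.2, every request runs on a fresh thread, so a ThreadLocal would create a new engine per request instead of reusing one. A bounded pool is a possible alternative, and the sketch below shows the idea. It swaps ThreadLocal for a queue-based pool; the class EnginePool is hypothetical and generic, so a TessBaseAPI factory could be plugged in where the demo uses StringBuilder.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/** Fixed-size pool that lends each caller exclusive use of one engine instance. */
final class EnginePool<E> {
    private final BlockingQueue<E> idle;

    EnginePool(int size, Supplier<E> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // engines are created once, up front
        }
    }

    /** Borrows an engine (waiting if none is idle), runs the task, and always returns the engine. */
    <R> R withEngine(Function<E, R> task) {
        E engine;
        try {
            engine = idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("Interrupted while waiting for an engine", e);
        }
        try {
            return task.apply(engine);
        } finally {
            idle.add(engine);
        }
    }

    public static void main(String[] args) {
        // StringBuilder stands in for an expensive, non-thread-safe OCR engine
        EnginePool<StringBuilder> pool = new EnginePool<>(2, StringBuilder::new);
        String out = pool.withEngine(sb -> {
            sb.setLength(0);
            return sb.append("recognized").toString();
        });
        System.out.println(out);
    }
}
```

The pool size caps both native memory use and the number of concurrent recognitions, independent of how many virtual threads are waiting.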