TrainOcr_SVM
Module: OCR
Trains an OCR support vector machine (SVM) classifier.
Name | Type | Range | Description |
---|---|---|---|
inCharacterSamples | CharacterSampleArray | | Training font created from sample regions |
inNormalizationSize | Size | | The character size after normalization |
inNu | Real* | 0.0 - 1.0 | Trade-off between training accuracy and number of support vectors |
inKernelGamma | Real* | | Gamma parameter for the RBF kernel |
inRegularizationConstant | Real | 0.0 - ∞ | Regularization constant that helps prevent overfitting |
inStopEpsilon | Real | | Epsilon for the stopping criterion |
inUseShrinkingHeuristics | Bool | | Enables shrinking heuristics, which may speed up computations |
inCharacterSize | Size* | | Size of a fixed-width font |
inRandomSeed | Integer* | 0 - +∞ | Random seed used to train the classifier |
inCharacterFeatures | CharacterFeatures | | Character features used to identify characters |
outOcrModel | OcrModel | | Trained OcrSvmModel used to recognize characters |
outTrainingAccuracy | Real | | The overall training score |
diagNormalizedCharacters | ImageArray | | Images of normalized characters used to train the classifier |
Description
This filter trains an SVM classifier for subsequent OCR operations.
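The parameter table names an RBF kernel controlled by inKernelGamma. As a rough illustration of what gamma does, the following sketch computes the standard RBF kernel value for two feature vectors. The function name `rbf_kernel` is illustrative only, and the filter's internal implementation is not documented here; only the textbook kernel formula is shown.

```python
import math


def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * ||x - y||^2)."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    # Squared Euclidean distance between the two feature vectors.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

With this definition, a larger gamma makes the kernel value fall off faster as two feature vectors move apart, so each training sample influences a narrower neighborhood.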
The filter requires a set of prepared CharacterSample objects, which can be created using MakeCharacterSamples.
The inCharacterSize parameter defines the size of the character cropping box. It is especially useful when characters are much larger than the normalization size. When it is set to Nil, each character is cropped to its bounding box.
Selecting too small a normalization size may result in loss of character details, while too large a size increases the classifier training time. The best recognition results are obtained when the character size is close to the normalization size.
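The cropping and normalization steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the filter's implementation: the character is a binary mask given as a list of rows, it is cropped to its bounding box (the Nil case of inCharacterSize), and nearest-neighbor resampling is assumed for the resize because the interpolation used by the filter is not documented here. All function names are hypothetical.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right) of the foreground; bottom/right are exclusive."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask has no foreground pixels")
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(mask, box):
    """Cut the rectangle described by box out of the mask."""
    top, left, bottom, right = box
    return [row[left:right] for row in mask[top:bottom]]


def resize_nearest(image, width, height):
    """Resize to width x height using nearest-neighbor sampling (an assumption)."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def normalize_character(mask, norm_width, norm_height):
    """Crop a character to its bounding box, then scale it to the normalization size."""
    return resize_nearest(crop(mask, bounding_box(mask)), norm_width, norm_height)
```

The sketch also shows why a normalization size far below the character size loses detail: several source pixels collapse into one output pixel.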
Character classification depends on the character features selected in the inCharacterFeatures parameter. At least one feature must be selected. By default, the Pixels feature is selected.
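One way to picture feature selection is as concatenating the outputs of the selected feature extractors into a single vector for the classifier. The sketch below assumes this layout; the `EXTRACTORS` registry, the function names, and the 8-bit pixel range used for scaling are hypothetical and are not taken from the filter's implementation.

```python
def pixels(image):
    """Pixels: raw pixel values of the normalized image, flattened row by row."""
    return [float(v) for row in image for v in row]


def normalized_pixels(image, max_value=255.0):
    """NormalizedPixels: pixel values scaled to 0..1 (8-bit input is an assumption)."""
    return [v / max_value for row in image for v in row]


# Hypothetical registry mapping feature names to extractor functions.
EXTRACTORS = {"Pixels": pixels, "NormalizedPixels": normalized_pixels}


def build_feature_vector(image, selected):
    """Concatenate the values of all selected features into one vector."""
    if not selected:
        # Mirrors the documented rule that at least one feature must be selected.
        raise ValueError("At least one feature must be selected")
    vector = []
    for name in selected:
        vector.extend(EXTRACTORS[name](image))
    return vector
```

For example, `build_feature_vector(image, ["Pixels"])` corresponds to the default selection, while an empty selection is rejected.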
The table below describes each available character feature:
Feature name | Description | Filter origin | Normalized |
---|---|---|---|
Pixels | Values of the image pixels after normalization. | | False |
NormalizedPixels | Values of the image pixels after normalization, scaled to the range [0, 1.0]. | | True |
Convexity | Ratio of the input region area to area of its convex hull. | RegionConvexity | True |
Circularity | Ratio of the region area to area of its bounding circle. | RegionCircularity | True |
NumberOfHoles | Number of holes found in the input region. | RegionHoles | True |
AspectRatio | Ratio of input region width to its height. | RegionBoundingBox | False |
Width | Region bounding box width. | RegionBoundingBox | False |
Height | Region bounding box height. | RegionBoundingBox | False |
AreaRatio | Ratio of the input region area to area of its bounding box. | | True |
DiameterRatio | Ratio of the input region diameter to diameter of its bounding box. | RegionDiameter | True |
Elongation | Ratio of longer axis of the approximating ellipse to the shorter one. | RegionElongation | False |
Orientation | Further details in the filter RegionOrientation documentation. | RegionOrientation | True |
Zoning4x4 | Normalized pixel values of region reduced to size 4x4 pixels. | | True |
HorizontalProjection | Values of normalized image projection normalized by region height. | ImageProjection | True |
VerticalProjection | Values of normalized image projection normalized by region height. | ImageProjection | True |
HoughCircles | Count of circles found in the normalized image. | | True |
Moment_11 | Character geometric moment type M11. | RegionMoment | False |
Moment_20 | Character geometric moment type M20. | RegionMoment | False |
Moment_02 | Character geometric moment type M02. | RegionMoment | False |
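To make the bounding-box based features in the table concrete, the sketch below computes Width, Height, AspectRatio and AreaRatio from a binary mask, following the descriptions given in the table. It is a minimal illustration: the function name is hypothetical, the region is assumed to be a list of rows of 0/1 values, and the filter's own computation (for example, any further scaling of values) may differ.

```python
def bounding_box_features(mask):
    """Compute Width, Height, AspectRatio and AreaRatio for a binary region mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("region is empty")
    width = cols[-1] - cols[0] + 1
    height = rows[-1] - rows[0] + 1
    # Region area is the number of foreground pixels.
    area = sum(1 for row in mask for v in row if v)
    return {
        "Width": width,
        "Height": height,
        "AspectRatio": width / height,          # region width to its height
        "AreaRatio": area / (width * height),   # region area to bounding box area
    }
```

Note that Width, Height and AspectRatio depend on the character's scale or shape in pixels, whereas AreaRatio always falls between 0 and 1, which matches the Normalized column for that feature.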
Remarks
For more information on using OCR, refer to Machine Vision Guide: Optical Character Recognition.
Errors
This filter can throw an exception to report an error. To learn how to handle errors, see Error Handling.
List of possible exceptions:
Error type | Description |
---|---|
DomainError | At least a single feature must be selected in inCharacterFeatures in TrainOcr_SVM. |
DomainError | Invalid character sample in TrainOcr_SVM. |
DomainError | Invalid OcrSvmModel in TrainOcr_SVM. |
Complexity Level
This filter is available at the Advanced Complexity Level.
Filter Group
This filter is a member of the TrainOcr filter group.
See Also
- TrainOcr_SVM – Trains an OCR support vector machines classifier.
- RecognizeCharacters – Classifies input regions into characters. Based on the Multi-Layer Perceptron model.