MergeCharactersIntoLines


Header: FIL.h
Namespace: fil
Module: DL_OCR

Converts the output of DL_ReadCharacters into lines of text.

Syntax

void fil::MergeCharactersIntoLines
(
	const ftl::Array<fil::OcrResult>& inCharacters,
	float inMaxGap,
	float inMaxShift,
	float inMargin,
	int inMinLength,
	bool inFlatten,
	ftl::Array<fil::Rectangle2D>& outLines,
	ftl::Array<ftl::String>& outStrings,
	ftl::Array<ftl::Conditional<int>>& outMapping
)

Parameters

| | Name | Type | Range | Default | Description |
|---|---|---|---|---|---|
| Input value | inCharacters | const Array<OcrResult>& | | | Output of DL_ReadCharacters |
| Input value | inMaxGap | float | 0.0 - 10.0 | 0.25f | Maximum horizontal gap between the boxes of joined characters, expressed as a fraction of the height of the character 'A' |
| Input value | inMaxShift | float | 0.0 - 1.0 | 0.25f | Maximum vertical misalignment between the boxes of joined characters, expressed as a fraction of the height of the character 'A' |
| Input value | inMargin | float | 0.0 - 10.0 | | Additional margin added to the result, expressed as a fraction of the height of the character 'A' |
| Input value | inMinLength | int | 1 - 200 | 1 | Minimum number of characters required to create a line |
| Input value | inFlatten | bool | | False | If True, the words on a line are concatenated into a single result string; otherwise each word is a separate result string |
| Output value | outLines | Array<Rectangle2D>& | | | Minimal box that covers all selected character boxes |
| Output value | outStrings | Array<String>& | | | Text of the merged characters |
| Output value | outMapping | Array<Conditional<int>>& | | | Mapping between input characters and output lines: outMapping[i] stores the index of the line to which inCharacters[i] belongs. If outMapping[i] is NIL, inCharacters[i] has not been added to any line |
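
The outMapping convention can be illustrated with a small, library-independent sketch. It assumes C++17 and uses std::vector<std::optional<int>> as a stand-in for ftl::Array<ftl::Conditional<int>>, with std::nullopt playing the role of NIL. The names Mapping and GroupCharactersByLine are hypothetical and not part of the FabImage Library.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Stand-in for ftl::Array<ftl::Conditional<int>> (assumption for illustration):
// an empty optional corresponds to NIL.
using Mapping = std::vector<std::optional<int>>;

// For every output line, collect the indices of the input characters assigned to it.
// Characters whose mapping entry is empty (NIL) belong to no line and are skipped.
std::vector<std::vector<std::size_t>> GroupCharactersByLine(const Mapping& mapping,
                                                            std::size_t lineCount)
{
    std::vector<std::vector<std::size_t>> groups(lineCount);
    for (std::size_t i = 0; i < mapping.size(); ++i)
    {
        if (!mapping[i].has_value())
            continue; // character i was not added to any line

        const int line = *mapping[i];
        // A valid mapping only refers to existing lines.
        assert(line >= 0 && static_cast<std::size_t>(line) < lineCount);
        groups[static_cast<std::size_t>(line)].push_back(i);
    }
    return groups;
}
```

With the real library types, the same loop would presumably test each outMapping[i] for NIL before reading the line index; check the ftl::Conditional documentation for the exact accessors.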

Description

This tool takes the array of OcrResult produced by DL_ReadCharacters and merges the recognized characters into lines of text.

Hints

  1. The number of resulting lines depends on the inMaxGap and inMaxShift values. For example, increasing inMaxGap can merge the characters into a single line of text, whereas a smaller value returns two separate lines.
  2. The lines are sorted by their Y value.
  3. The tool can also be used to discard false characters by increasing the inMinLength parameter. For example, setting inMinLength to 2 filters out isolated single characters falsely returned by DL_ReadCharacters.
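
The effect of the thresholds described in the hints can be explored with a simplified sketch. This is not the library's actual algorithm: it is a greedy left-to-right merge written only to show how a gap limit, a shift limit, and a minimum length interact. CharBox, MergeSketch, and refHeight (standing in for the height of the character 'A') are hypothetical names, y is assumed to be the top edge of a box, and ascending Y ordering of the result is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical, simplified character box (not fil::OcrResult).
struct CharBox
{
    float x;      // left edge
    float y;      // top edge (assumption)
    float width;
    float height;
    char symbol;
};

// Illustrative greedy merge; thresholds are fractions of refHeight.
std::vector<std::string> MergeSketch(std::vector<CharBox> chars, float maxGap,
                                     float maxShift, int minLength, float refHeight)
{
    assert(refHeight > 0.0f && minLength >= 1);

    // Visit characters from left to right.
    std::sort(chars.begin(), chars.end(),
              [](const CharBox& a, const CharBox& b) { return a.x < b.x; });

    std::vector<std::vector<CharBox>> lines;
    for (const CharBox& c : chars)
    {
        bool placed = false;
        for (std::vector<CharBox>& line : lines)
        {
            const CharBox& last = line.back();
            const float gap = c.x - (last.x + last.width); // negative if boxes overlap
            const float shift = std::fabs(c.y - last.y);
            if (gap <= maxGap * refHeight && shift <= maxShift * refHeight)
            {
                line.push_back(c);
                placed = true;
                break;
            }
        }
        if (!placed)
            lines.push_back(std::vector<CharBox>{c}); // start a new line
    }

    // Discard lines with fewer than minLength characters.
    lines.erase(std::remove_if(lines.begin(), lines.end(),
                               [minLength](const std::vector<CharBox>& l) {
                                   return static_cast<int>(l.size()) < minLength;
                               }),
                lines.end());

    // Order the remaining lines by the Y value of their first character.
    std::stable_sort(lines.begin(), lines.end(),
                     [](const std::vector<CharBox>& a, const std::vector<CharBox>& b) {
                         return a.front().y < b.front().y;
                     });

    std::vector<std::string> result;
    for (const std::vector<CharBox>& line : lines)
    {
        std::string text;
        for (const CharBox& c : line)
            text += c.symbol;
        result.push_back(text);
    }
    return result;
}
```

In this sketch, two boxes separated by a gap of 3 units with refHeight equal to 10 stay in separate lines when maxGap is 0.25 (limit 2.5) and are joined when maxGap is 0.5 (limit 5), which mirrors the behaviour described in the first hint. Raising minLength to 2 drops any line that consists of a single stray character.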

