Tools API Reference
AIMQ provides a set of built-in tools for document processing and storage operations.
OCR Tools
Image OCR {#image-ocr}
aimq.tools.ocr.image_ocr
PDF Processor {#pdf-processor}
aimq.tools.ocr.processor
OCR module for text extraction and processing from images.
This module provides functionality for extracting and processing text from images using the EasyOCR library. It includes utilities for handling text bounding boxes, merging overlapping detections, and creating debug visualizations.
Classes
OCRProcessor(languages=None)
Processor for performing OCR on images using EasyOCR.
This class provides a high-level interface for performing OCR on images. It handles initialization of the EasyOCR reader, image preprocessing, text detection, and optional debug visualization.
Attributes:
Name | Type | Description |
---|---|---|
languages | | List of language codes for OCR |
_reader | | Lazy-loaded EasyOCR reader instance |
Initialize OCR processor with specified languages.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
languages | Optional[List[str]] | List of language codes (default: ['en']) | None |
Source code in src/aimq/tools/ocr/processor.py
Attributes
reader (property)
Get or initialize the EasyOCR reader.
Returns:
Type | Description |
---|---|
Reader | easyocr.Reader: Initialized EasyOCR reader instance |
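The lazy-loading behavior described for `_reader` and `reader` can be illustrated with a minimal sketch. This is not the AIMQ implementation: the class name `LazyReaderSketch` and the `reader_factory` argument are hypothetical, added only so the pattern can run without EasyOCR installed. The only real library call is `easyocr.Reader(languages)`.

```python
from typing import Any, Callable, List, Optional


class LazyReaderSketch:
    """Illustrative stand-in showing a lazily created OCR reader."""

    def __init__(
        self,
        languages: Optional[List[str]] = None,
        reader_factory: Optional[Callable[[List[str]], Any]] = None,
    ) -> None:
        # Documented default: ['en'] when no languages are given.
        self.languages = languages or ["en"]
        # Nothing is created yet; the reader is built on first access.
        self._reader: Any = None
        # Hypothetical injection point (not part of the documented API).
        self._reader_factory = reader_factory or self._default_factory

    @staticmethod
    def _default_factory(languages: List[str]) -> Any:
        # Imported here so the dependency is only needed on first use.
        import easyocr

        return easyocr.Reader(languages)

    @property
    def reader(self) -> Any:
        # Create the reader once, then reuse the cached instance.
        if self._reader is None:
            self._reader = self._reader_factory(self.languages)
        return self._reader
```

Deferring construction this way keeps object creation cheap, since loading OCR models is typically the expensive step.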
Functions
process_image(image, save_debug_image=False)
Process an image and return OCR results.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image | Union[str, Path, Image, bytes] | The image to process: a path to an image file (str or Path), a PIL Image object, or raw image bytes | required |
save_debug_image | bool | If True, includes debug image in output | False |
Returns:
Type | Description |
---|---|
Dict[str, Any] | OCR results including: processing_time (time taken to process, in seconds), text (extracted text content), debug_image (optional base64-encoded debug image), detections (list of text detections with coordinates) |
Raises:
Type | Description |
---|---|
ValueError | If image format is invalid or unreadable |
Source code in src/aimq/tools/ocr/processor.py
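As a rough illustration of consuming the result dictionary, the sketch below decodes the optional `debug_image` entry and writes it to disk. The helper name `write_debug_image` is hypothetical. It assumes the value is a plain base64 string (no data-URI prefix) and that the key is missing or empty when no debug image was requested; neither detail is confirmed by the reference above.

```python
import base64
from pathlib import Path
from typing import Any, Dict, Optional


def write_debug_image(result: Dict[str, Any], out_path: Path) -> Optional[Path]:
    """Decode result['debug_image'] to a file; return None if it is absent."""
    # Assumption: the key may be missing or None when save_debug_image=False.
    encoded = result.get("debug_image")
    if not encoded:
        return None
    # Assumption: the value is plain base64 text without a data-URI prefix.
    out_path.write_bytes(base64.b64decode(encoded))
    return out_path


# Possible usage with the documented API (requires aimq and EasyOCR):
#   result = OCRProcessor().process_image("scan.png", save_debug_image=True)
#   write_debug_image(result, Path("scan_debug.img"))
# Choose the file extension to match the actual debug image format.
```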
Functions
boxes_overlap(box1, box2)
Check if two boxes overlap at all.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
box1 | Dict[str, int] | Dictionary with x, y, width, height | required |
box2 | Dict[str, int] | Dictionary with x, y, width, height | required |
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if boxes overlap |
Source code in src/aimq/tools/ocr/processor.py
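An overlap test on boxes of this shape can be sketched as a standard axis-aligned rectangle check. The function name `boxes_overlap_sketch` is hypothetical, and two details are assumptions rather than documented behavior: `x` and `y` are taken to be the top-left corner, and boxes that only touch along an edge are treated as not overlapping.

```python
from typing import Dict


def boxes_overlap_sketch(box1: Dict[str, int], box2: Dict[str, int]) -> bool:
    """Return True if two axis-aligned boxes share any interior area."""
    # Assumption: (x, y) is the top-left corner of each box.
    # The boxes are disjoint if one lies entirely left or right of the other.
    if box1["x"] + box1["width"] <= box2["x"]:
        return False
    if box2["x"] + box2["width"] <= box1["x"]:
        return False
    # Likewise if one lies entirely above or below the other.
    if box1["y"] + box1["height"] <= box2["y"]:
        return False
    if box2["y"] + box2["height"] <= box1["y"]:
        return False
    return True
```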
group_text_boxes(detections, width_growth=0, height_growth=0)
Group text boxes that are spatially related.
This function groups text boxes that are spatially related, starting with overlapping boxes. It can optionally expand boxes horizontally and vertically before grouping to capture nearby text that may be related.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
detections | List[Dict[str, Any]] | List of detection dictionaries containing text and bounding boxes | required |
width_growth | int | Number of pixels to expand boxes horizontally | 0 |
height_growth | int | Number of pixels to expand boxes vertically | 0 |
Returns:
Type | Description |
---|---|
List[Dict[str, Any]] | List of grouped text detections with merged bounding boxes |
Source code in src/aimq/tools/ocr/processor.py
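The grouping idea (expand each box, then merge everything that overlaps) can be sketched with a small union-find. This is one possible approach, not necessarily the one AIMQ uses. Several details are assumptions: the detection keys `text` and `bounding_box`, symmetric growth on both sides of each box, joining grouped text with spaces, and computing the merged box from the original (unexpanded) boxes.

```python
from typing import Any, Dict, List

Box = Dict[str, int]


def _grow(box: Box, dw: int, dh: int) -> Box:
    # Assumption: growth is applied on both sides of each axis.
    return {"x": box["x"] - dw, "y": box["y"] - dh,
            "width": box["width"] + 2 * dw, "height": box["height"] + 2 * dh}


def _overlap(a: Box, b: Box) -> bool:
    return (a["x"] < b["x"] + b["width"] and b["x"] < a["x"] + a["width"]
            and a["y"] < b["y"] + b["height"] and b["y"] < a["y"] + a["height"])


def _union(boxes: List[Box]) -> Box:
    left = min(b["x"] for b in boxes)
    top = min(b["y"] for b in boxes)
    right = max(b["x"] + b["width"] for b in boxes)
    bottom = max(b["y"] + b["height"] for b in boxes)
    return {"x": left, "y": top, "width": right - left, "height": bottom - top}


def group_text_boxes_sketch(
    detections: List[Dict[str, Any]], width_growth: int = 0, height_growth: int = 0
) -> List[Dict[str, Any]]:
    """Group detections whose (optionally expanded) boxes overlap."""
    n = len(detections)
    # Hypothetical key names: "text" and "bounding_box".
    grown = [_grow(d["bounding_box"], width_growth, height_growth) for d in detections]
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of detections whose expanded boxes overlap.
    for i in range(n):
        for j in range(i + 1, n):
            if _overlap(grown[i], grown[j]):
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    return [
        {"text": " ".join(detections[i]["text"] for i in members),
         "bounding_box": _union([detections[i]["bounding_box"] for i in members])}
        for members in groups.values()
    ]
```

The pairwise comparison is quadratic in the number of detections, which is usually acceptable for a single page of text.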
merge_boxes(boxes)
Merge a list of bounding boxes into a single box that encompasses all of them.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
boxes | List[Dict[str, int]] | List of dictionaries with x, y, width, height | required |
Returns:
Name | Type | Description |
---|---|---|
dict | Optional[Dict[str, int]] | Merged bounding box or None if input is empty |
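Merging boxes into one enclosing box amounts to taking the extreme edges of all inputs. The sketch below shows that calculation under the assumption that `x` and `y` mark the top-left corner; the name `merge_boxes_sketch` is hypothetical and the code is not taken from the AIMQ source.

```python
from typing import Dict, List, Optional


def merge_boxes_sketch(boxes: List[Dict[str, int]]) -> Optional[Dict[str, int]]:
    """Return the smallest box enclosing all inputs, or None for empty input."""
    if not boxes:
        return None
    # Assumption: (x, y) is the top-left corner of each box.
    left = min(b["x"] for b in boxes)
    top = min(b["y"] for b in boxes)
    right = max(b["x"] + b["width"] for b in boxes)
    bottom = max(b["y"] + b["height"] for b in boxes)
    return {"x": left, "y": top, "width": right - left, "height": bottom - top}
```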