Tools API Reference¶
AIMQ provides built-in tools for OCR on images and for reading and writing Supabase Storage files and Supabase database records.
OCR Tools¶
Image OCR {#image-ocr}¶
aimq.tools.ocr.image_ocr
¶
Tool for performing OCR on images.
Classes¶
ImageOCR(**kwargs)
¶
Bases: BaseTool
Tool for performing OCR on images.
Initialize the OCR processor.
Source code in src/aimq/tools/ocr/image_ocr.py
Attributes¶
args_schema = ImageOCRInput (class-attribute, instance-attribute)
description = 'Extract text from images using OCR' (class-attribute, instance-attribute)
model_config = ConfigDict(arbitrary_types_allowed=True) (class-attribute, instance-attribute)
name = 'image_ocr' (class-attribute, instance-attribute)
processor = Field(default_factory=OCRProcessor) (class-attribute, instance-attribute)
ImageOCRInput
¶
Bases: BaseModel
Input for ImageOCR.
OCR Processor {#pdf-processor}¶
aimq.tools.ocr.processor
¶
OCR module for text extraction and processing from images.
This module provides functionality for extracting and processing text from images using the EasyOCR library. It includes utilities for handling text bounding boxes, merging overlapping detections, and creating debug visualizations.
Classes¶
OCRProcessor(languages=None)
¶
Processor for performing OCR on images using EasyOCR.
This class provides a high-level interface for performing OCR on images. It handles initialization of the EasyOCR reader, image preprocessing, text detection, and optional debug visualization.
Attributes:
Name | Type | Description |
---|---|---|
languages | | List of language codes for OCR |
_reader | | Lazy-loaded EasyOCR reader instance |

Initialize OCR processor with specified languages.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
languages | Optional[List[str]] | List of language codes; ['en'] is used when None | None |

Source code in src/aimq/tools/ocr/processor.py
Attributes¶
languages = languages or ['en'] (instance-attribute)
reader (property)
Get or initialize the EasyOCR reader.
Returns:
Type | Description |
---|---|
Reader | Initialized easyocr.Reader instance |

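The lazy-loading behaviour described for `reader` can be illustrated with a small stand-in. This is a sketch of the general pattern only, not AIMQ's implementation: the class name `LazyReaderHolder`, the `loader` argument, and the `load_count` counter are hypothetical, and a plain dict stands in for the real EasyOCR reader so that the example runs without extra dependencies.

```python
from typing import Any, Callable, List, Optional


class LazyReaderHolder:
    """Hypothetical stand-in that mirrors the lazy-loading pattern."""

    def __init__(
        self,
        languages: Optional[List[str]] = None,
        loader: Optional[Callable[[List[str]], Any]] = None,
    ) -> None:
        # Same fallback as documented above: None means English only.
        self.languages = languages or ["en"]
        self._reader: Any = None
        # `loader` is an assumption of this sketch; it stands in for
        # whatever builds the real OCR reader.
        self._loader = loader or (lambda langs: {"languages": list(langs)})
        self.load_count = 0

    @property
    def reader(self) -> Any:
        # Build the reader on first access only, then reuse it.
        if self._reader is None:
            self._reader = self._loader(self.languages)
            self.load_count += 1
        return self._reader


holder = LazyReaderHolder()
first = holder.reader
second = holder.reader
print(holder.languages, first is second, holder.load_count)
```

Deferring construction this way keeps object creation cheap when the reader is expensive to build and may never be used.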
Functions¶
process_image(image, save_debug_image=False)
¶
Process an image and return OCR results.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image | Union[str, Path, Image, bytes] | The image to process: a path to an image file (str or Path), a PIL Image object, or the bytes of image data | required |
save_debug_image | bool | If True, includes a debug image in the output | False |

Returns:
Type | Description |
---|---|
Dict[str, Any] | OCR results with the keys processing_time (time taken to process, in seconds), text (extracted text content), debug_image (optional base64-encoded debug image), and detections (list of text detections with coordinates) |

Raises:
Type | Description |
---|---|
ValueError | If the image format is invalid or unreadable |

Source code in src/aimq/tools/ocr/processor.py
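As a rough illustration of how a caller might consume the result dictionary, the sketch below summarizes a hand-written sample with the four documented keys. The helper `summarize_result` is hypothetical, the sample values are made up rather than real OCR output, and the `text` and `bounding_box` keys inside each detection are assumptions, since the exact detection layout is not specified here.

```python
from typing import Any, Dict, List


def summarize_result(result: Dict[str, Any]) -> str:
    """Build a one-line summary from a process_image-style result dict."""
    detections: List[Dict[str, Any]] = result.get("detections", [])
    has_debug = result.get("debug_image") is not None
    return (
        f"{len(detections)} detection(s) in {result['processing_time']:.2f}s, "
        f"{len(result['text'])} characters, "
        f"debug image: {'yes' if has_debug else 'no'}"
    )


# Hand-written sample data, not real OCR output. The per-detection keys
# ("text", "bounding_box") are assumed for illustration.
sample = {
    "processing_time": 0.42,
    "text": "Hello world",
    "debug_image": None,
    "detections": [
        {"text": "Hello", "bounding_box": {"x": 10, "y": 10, "width": 50, "height": 20}},
        {"text": "world", "bounding_box": {"x": 70, "y": 10, "width": 55, "height": 20}},
    ],
}
print(summarize_result(sample))
```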
Functions¶
boxes_overlap(box1, box2)
¶
Check if two boxes overlap at all.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
box1 | Dict[str, int] | Dictionary with x, y, width, height | required |
box2 | Dict[str, int] | Dictionary with x, y, width, height | required |

Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if boxes overlap |

Source code in src/aimq/tools/ocr/processor.py
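An overlap test for boxes given as x, y, width, height can be sketched as an interval intersection on both axes. This is an illustrative version, not the library's source: the name `boxes_overlap_sketch` is hypothetical, and treating boxes that only touch along an edge as non-overlapping is a choice made for this example.

```python
from typing import Dict


def boxes_overlap_sketch(box1: Dict[str, int], box2: Dict[str, int]) -> bool:
    """Return True when two axis-aligned boxes share some interior area."""
    # The horizontal extents intersect if each box starts before the other ends.
    horizontal = (
        box1["x"] < box2["x"] + box2["width"]
        and box2["x"] < box1["x"] + box1["width"]
    )
    # Same test on the vertical axis.
    vertical = (
        box1["y"] < box2["y"] + box2["height"]
        and box2["y"] < box1["y"] + box1["height"]
    )
    return horizontal and vertical


a = {"x": 0, "y": 0, "width": 10, "height": 10}
b = {"x": 5, "y": 5, "width": 10, "height": 10}
c = {"x": 10, "y": 0, "width": 10, "height": 10}
print(boxes_overlap_sketch(a, b), boxes_overlap_sketch(a, c))
```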
group_text_boxes(detections, width_growth=0, height_growth=0)
¶
Group text boxes that are spatially related.
This function groups text boxes that are spatially related, starting with overlapping boxes. It can optionally expand boxes horizontally and vertically before grouping to capture nearby text that may be related.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
detections | List[Dict[str, Any]] | List of detection dictionaries containing text and bounding boxes | required |
width_growth | int | Number of pixels to expand boxes horizontally | 0 |
height_growth | int | Number of pixels to expand boxes vertically | 0 |

Returns:
Type | Description |
---|---|
List[Dict[str, Any]] | List of grouped text detections with merged bounding boxes |

Source code in src/aimq/tools/ocr/processor.py
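The grouping idea (expand each box, then merge boxes that overlap) can be sketched with a small union-find. This is one possible approach under stated assumptions, not the function's actual algorithm: the `text` and `bounding_box` keys, growth applied on both sides of each box, text joined with spaces in input order, and every name in the block are hypothetical.

```python
from typing import Any, Dict, List

Box = Dict[str, int]


def _grow(box: Box, dw: int, dh: int) -> Box:
    # Assumption: growth is applied on both sides of the box.
    return {"x": box["x"] - dw, "y": box["y"] - dh,
            "width": box["width"] + 2 * dw, "height": box["height"] + 2 * dh}


def _overlap(a: Box, b: Box) -> bool:
    return (a["x"] < b["x"] + b["width"] and b["x"] < a["x"] + a["width"]
            and a["y"] < b["y"] + b["height"] and b["y"] < a["y"] + a["height"])


def _union(boxes: List[Box]) -> Box:
    left = min(b["x"] for b in boxes)
    top = min(b["y"] for b in boxes)
    right = max(b["x"] + b["width"] for b in boxes)
    bottom = max(b["y"] + b["height"] for b in boxes)
    return {"x": left, "y": top, "width": right - left, "height": bottom - top}


def group_sketch(detections: List[Dict[str, Any]],
                 width_growth: int = 0, height_growth: int = 0) -> List[Dict[str, Any]]:
    grown = [_grow(d["bounding_box"], width_growth, height_growth) for d in detections]
    parent = list(range(len(detections)))

    def find(i: int) -> int:
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of detections whose expanded boxes overlap.
    for i in range(len(grown)):
        for j in range(i + 1, len(grown)):
            if _overlap(grown[i], grown[j]):
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(len(detections)):
        groups.setdefault(find(i), []).append(i)

    # Merge each group using the original (unexpanded) boxes.
    return [
        {"text": " ".join(detections[i]["text"] for i in members),
         "bounding_box": _union([detections[i]["bounding_box"] for i in members])}
        for members in groups.values()
    ]


words = [
    {"text": "Hello", "bounding_box": {"x": 0, "y": 0, "width": 50, "height": 20}},
    {"text": "world", "bounding_box": {"x": 55, "y": 0, "width": 50, "height": 20}},
    {"text": "Footer", "bounding_box": {"x": 0, "y": 200, "width": 60, "height": 20}},
]
print(len(group_sketch(words)), len(group_sketch(words, width_growth=5)))
```

In this sample the first two words sit 5 pixels apart, so they stay separate with no growth and join into one group once each box is widened by 5 pixels.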
merge_boxes(boxes)
¶
Merge a list of bounding boxes into a single box that encompasses all of them.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
boxes | List[Dict[str, int]] | List of dictionaries with x, y, width, height | required |

Returns:
Name | Type | Description |
---|---|---|
dict | Optional[Dict[str, int]] | Merged bounding box or None if input is empty |

Source code in src/aimq/tools/ocr/processor.py
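Merging boxes into one enclosing box amounts to taking the minimum left and top edges and the maximum right and bottom edges. The following is a minimal sketch of that calculation, with the hypothetical name `merge_boxes_sketch`; it follows the documented contract (None for empty input) but is not copied from the library.

```python
from typing import Dict, List, Optional


def merge_boxes_sketch(boxes: List[Dict[str, int]]) -> Optional[Dict[str, int]]:
    """Return the smallest box that contains every input box, or None."""
    if not boxes:
        return None
    left = min(b["x"] for b in boxes)
    top = min(b["y"] for b in boxes)
    right = max(b["x"] + b["width"] for b in boxes)
    bottom = max(b["y"] + b["height"] for b in boxes)
    # Convert the outer edges back to the x, y, width, height form.
    return {"x": left, "y": top, "width": right - left, "height": bottom - top}


merged = merge_boxes_sketch([
    {"x": 10, "y": 20, "width": 30, "height": 10},
    {"x": 0, "y": 25, "width": 15, "height": 40},
])
print(merged)
```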
Storage Tools¶
Supabase Storage¶
aimq.tools.supabase.read_file
¶
Tool for reading files from Supabase Storage.
Classes¶
ReadFile
¶
Bases: BaseTool
Tool for reading files from Supabase Storage.
Attributes¶
args_schema = ReadFileInput (class-attribute, instance-attribute)
bucket = Field('{{bucket}}', description='The storage bucket template to read the file from') (class-attribute, instance-attribute)
description = 'Read a file from Supabase Storage' (class-attribute, instance-attribute)
formater = Field('mustache', description='The format to use for the template') (class-attribute, instance-attribute)
name = 'read_file' (class-attribute, instance-attribute)
path = Field('{{path}}', description='The path template to use for the file') (class-attribute, instance-attribute)
ReadFileInput
¶
Bases: BaseModel
Input for ReadFile.
Attributes¶
bucket = Field('files', description='The storage bucket to read the file from') (class-attribute, instance-attribute)
metadata = Field(None, description='Additional metadata to attach to the file') (class-attribute, instance-attribute)
path = Field(..., description='The path values to apply to the template path') (class-attribute, instance-attribute)
aimq.tools.supabase.write_file
¶
Tool for writing files to Supabase Storage.
Classes¶
WriteFile
¶
Bases: BaseTool
Tool for writing files to Supabase Storage.
Attributes¶
args_schema = WriteFileInput (class-attribute, instance-attribute)
bucket = Field('{{bucket}}', description='The storage bucket template to read the file from') (class-attribute, instance-attribute)
description = 'Write a file to Supabase Storage' (class-attribute, instance-attribute)
formater = Field('mustache', description='The format to use for the template') (class-attribute, instance-attribute)
name = 'write_file' (class-attribute, instance-attribute)
path = Field('{{path}}', description='The path template to use for the file') (class-attribute, instance-attribute)
WriteFileInput
¶
Bases: BaseModel
Input for WriteFile.
Attributes¶
bucket = Field('files', description='The storage bucket to read the file from') (class-attribute, instance-attribute)
file = Field(..., description='The file to write') (class-attribute, instance-attribute)
metadata = Field(None, description='Additional metadata to attach to the file') (class-attribute, instance-attribute)
path = Field(None, description='The path values to apply to the template path') (class-attribute, instance-attribute)
Supabase Database¶
aimq.tools.supabase.read_record
¶
Tool for reading records from Supabase.
Classes¶
ReadRecord
¶
Bases: BaseTool
Tool for reading records from Supabase.
Attributes¶
args_schema = ReadRecordInput (class-attribute, instance-attribute)
description = 'Read a record from Supabase' (class-attribute, instance-attribute)
name = 'read_record' (class-attribute, instance-attribute)
select = '*' (class-attribute, instance-attribute)
table = 'records' (class-attribute, instance-attribute)
ReadRecordInput
¶
Bases: BaseModel
Input for ReadRecord.
aimq.tools.supabase.write_record
¶
Tool for writing records to Supabase.
Classes¶
WriteRecord
¶
Bases: BaseTool
Tool for writing records to Supabase.
WriteRecordInput
¶
Bases: BaseModel
Input for WriteRecord.