
Tools API Reference

AIMQ provides a set of built-in tools for document processing and storage operations.

OCR Tools

Image OCR {#image-ocr}

aimq.tools.ocr.image_ocr

Tool for performing OCR on images.

Classes

ImageOCR(**kwargs)

Bases: BaseTool

Tool for performing OCR on images.

Initialize the OCR processor.

Source code in `src/aimq/tools/ocr/image_ocr.py`:

```python
def __init__(self, **kwargs):
    """Initialize the OCR processor."""
    super().__init__(**kwargs)
    self.processor = OCRProcessor()
```

ImageOCRInput

Bases: BaseModel

Input for ImageOCR.

OCR Processor {#ocr-processor}

aimq.tools.ocr.processor

OCR module for text extraction and processing from images.

This module provides functionality for extracting and processing text from images using the EasyOCR library. It includes utilities for handling text bounding boxes, merging overlapping detections, and creating debug visualizations.

Classes

OCRProcessor(languages=None)

Processor for performing OCR on images using EasyOCR.

This class provides a high-level interface for performing OCR on images. It handles initialization of the EasyOCR reader, image preprocessing, text detection, and optional debug visualization.

Attributes:

| Name | Type | Description |
|------|------|-------------|
| `languages` | `List[str]` | List of language codes for OCR |
| `_reader` | `Optional[Reader]` | Lazy-loaded EasyOCR reader instance |

Initialize OCR processor with specified languages.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `languages` | `Optional[List[str]]` | List of language codes (default: `['en']`) | `None` |
Source code in `src/aimq/tools/ocr/processor.py`:

```python
def __init__(self, languages: Optional[List[str]] = None) -> None:
    """Initialize OCR processor with specified languages.

    Args:
        languages: List of language codes (default: ['en'])
    """
    self.languages = languages or ['en']
    self._reader = None
```
Attributes
reader property

Get or initialize the EasyOCR reader.

Returns:

| Type | Description |
|------|-------------|
| `Reader` | Initialized `easyocr.Reader` instance |
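The lazy-loading pattern behind this property can be sketched in isolation. This is a minimal sketch, not the actual implementation: the `easyocr.Reader` construction is replaced by a stand-in string, since the real reader downloads models on first use.

```python
class LazyReader:
    """Minimal sketch of the lazy reader pattern used by OCRProcessor."""

    def __init__(self, languages=None):
        self.languages = languages or ['en']
        self._reader = None  # nothing is built at construction time

    @property
    def reader(self):
        # First access builds the reader; later accesses reuse the cached one.
        # In OCRProcessor this would be easyocr.Reader(self.languages).
        if self._reader is None:
            self._reader = f"reader for {self.languages}"  # stand-in object
        return self._reader

proc = LazyReader()
assert proc._reader is None              # not created yet
first = proc.reader                      # created on first access
assert proc.reader is proc._reader       # cached on the instance
```

Deferring construction this way keeps `OCRProcessor()` cheap to instantiate; the cost of loading models is only paid if OCR is actually performed.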

Functions
process_image(image, save_debug_image=False)

Process an image and return OCR results.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `image` | `Union[str, Path, Image, bytes]` | The image to process: a path to an image file (`str` or `Path`), a PIL `Image` object, or raw image bytes | *required* |
| `save_debug_image` | `bool` | If `True`, includes a debug image in the output | `False` |

Returns:

| Type | Description |
|------|-------------|
| `Dict[str, Any]` | OCR results including `processing_time` (time taken to process, in seconds), `text` (extracted text content), `detections` (list of text detections with coordinates), and optionally `debug_image` (PNG-encoded debug image bytes) |

Raises:

| Type | Description |
|------|-------------|
| `ValueError` | If the image format is invalid or unreadable |

Source code in `src/aimq/tools/ocr/processor.py`:

```python
def process_image(
    self,
    image: Union[str, Path, Image.Image, bytes],
    save_debug_image: bool = False,
) -> Dict[str, Any]:
    """Process an image and return OCR results.

    Args:
        image: The image to process. Can be one of:
            - Path to image file (str or Path)
            - PIL Image object
            - Bytes of image data
        save_debug_image: If True, includes debug image in output

    Returns:
        Dict[str, Any]: OCR results including:
            - processing_time: Time taken to process in seconds
            - text: Extracted text content
            - debug_image: PNG-encoded debug image bytes (when requested)
            - detections: List of text detections with coordinates

    Raises:
        ValueError: If image format is invalid or unreadable
    """
    start_time = time.time()

    # Convert input to a format EasyOCR can process
    if isinstance(image, (str, Path)):
        image_path = str(image)
        pil_image = Image.open(image_path)
    elif isinstance(image, bytes):
        image_stream = io.BytesIO(image)
        pil_image = Image.open(image_stream)
        image_path = None
    elif isinstance(image, Image.Image):
        pil_image = image
        image_path = None
    else:
        raise ValueError("Image must be a file path, PIL Image, or bytes")

    # Convert PIL Image to numpy array for EasyOCR
    if pil_image.mode != 'RGB':
        pil_image = pil_image.convert('RGB')
    np_image = np.array(pil_image)

    # Read the image with optimized parameters
    results = self.reader.readtext(
        np_image,
        paragraph=False,
        min_size=20,
        text_threshold=0.7,
        link_threshold=0.4,
        low_text=0.4,
        width_ths=0.7,
        height_ths=0.9,
        ycenter_ths=0.9,
    )

    # Format initial results
    detections = []
    for result in results:
        if len(result) == 2:
            bbox, text = result
            confidence = 1.0
        else:
            bbox, text, confidence = result

        x1, y1 = int(bbox[0][0]), int(bbox[0][1])
        x2, y2 = int(bbox[1][0]), int(bbox[1][1])
        x3, y3 = int(bbox[2][0]), int(bbox[2][1])
        x4, y4 = int(bbox[3][0]), int(bbox[3][1])

        detections.append({
            "text": str(text),
            "confidence": float(round(float(confidence), 3)),
            "bounding_box": {
                "x": x1,
                "y": y1,
                "width": x2 - x1,
                "height": y3 - y1
            }
        })

    # Group the detections
    grouped_detections = group_text_boxes(
        detections,
        width_growth=20,
        height_growth=1
    )

    end_time = time.time()
    output = {
        "processing_time": float(round(end_time - start_time, 2)),
        "detections": grouped_detections,
        "text": " ".join(d["text"] for d in grouped_detections)
    }

    if save_debug_image:
        debug_image = self._create_debug_image(pil_image, grouped_detections)
        # Convert debug image to bytes
        debug_bytes = io.BytesIO()
        debug_image.save(debug_bytes, format='PNG')
        output["debug_image"] = debug_bytes.getvalue()

    return output
```
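EasyOCR returns each detection's bounding box as four corner points; the code above flattens that quadrilateral into an axis-aligned `x`/`y`/`width`/`height` rectangle using the top-left, top-right, and bottom-right corners. That conversion can be sketched on its own (the demo box below is hypothetical):

```python
def quad_to_rect(bbox):
    """Convert a 4-corner EasyOCR-style bounding box to x/y/width/height,
    mirroring the arithmetic in process_image."""
    x1, y1 = int(bbox[0][0]), int(bbox[0][1])  # top-left corner
    x2, y2 = int(bbox[1][0]), int(bbox[1][1])  # top-right corner
    x3, y3 = int(bbox[2][0]), int(bbox[2][1])  # bottom-right corner
    return {"x": x1, "y": y1, "width": x2 - x1, "height": y3 - y1}

box = quad_to_rect([[10, 20], [110, 20], [110, 50], [10, 50]])
assert box == {"x": 10, "y": 20, "width": 100, "height": 30}
```

Note that this assumes a roughly axis-aligned quadrilateral; for rotated text the resulting rectangle is only an approximation of the detected region.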

Functions

boxes_overlap(box1, box2)

Check if two boxes overlap at all.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `box1` | `Dict[str, int]` | Dictionary with `x`, `y`, `width`, `height` | *required* |
| `box2` | `Dict[str, int]` | Dictionary with `x`, `y`, `width`, `height` | *required* |

Returns:

| Type | Description |
|------|-------------|
| `bool` | `True` if the boxes overlap |

Source code in `src/aimq/tools/ocr/processor.py`:

```python
def boxes_overlap(box1: Dict[str, int], box2: Dict[str, int]) -> bool:
    """
    Check if two boxes overlap at all.

    Args:
        box1: Dictionary with x, y, width, height
        box2: Dictionary with x, y, width, height

    Returns:
        bool: True if boxes overlap
    """
    h_overlap = (
        box1['x'] < box2['x'] + box2['width'] and
        box2['x'] < box1['x'] + box1['width']
    )

    v_overlap = (
        box1['y'] < box2['y'] + box2['height'] and
        box2['y'] < box1['y'] + box1['height']
    )

    return h_overlap and v_overlap
```
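Because the comparisons are strict inequalities, two boxes that merely share an edge do not count as overlapping. A quick demonstration (the function is reproduced from above; the sample boxes are hypothetical):

```python
def boxes_overlap(box1, box2):
    """Strict-inequality overlap test, as defined in this module."""
    h_overlap = (box1['x'] < box2['x'] + box2['width'] and
                 box2['x'] < box1['x'] + box1['width'])
    v_overlap = (box1['y'] < box2['y'] + box2['height'] and
                 box2['y'] < box1['y'] + box1['height'])
    return h_overlap and v_overlap

a = {'x': 0, 'y': 0, 'width': 10, 'height': 10}
b = {'x': 5, 'y': 5, 'width': 10, 'height': 10}   # intersects a
c = {'x': 10, 'y': 0, 'width': 10, 'height': 10}  # touches a's right edge
assert boxes_overlap(a, b) is True
assert boxes_overlap(a, c) is False  # shared edge only, not an overlap
```

This edge-exclusive behavior is why `group_text_boxes` grows boxes before testing: without growth, adjacent words that touch but do not intersect would never be grouped.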

group_text_boxes(detections, width_growth=0, height_growth=0)

Group text boxes that are spatially related.

This function groups text boxes that are spatially related, starting with overlapping boxes. It can optionally expand boxes horizontally and vertically before grouping to capture nearby text that may be related.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `detections` | `List[Dict[str, Any]]` | List of detection dictionaries containing text and bounding boxes | *required* |
| `width_growth` | `int` | Number of pixels to expand boxes horizontally | `0` |
| `height_growth` | `int` | Number of pixels to expand boxes vertically | `0` |

Returns:

| Type | Description |
|------|-------------|
| `List[Dict[str, Any]]` | List of grouped text detections with merged bounding boxes |

Source code in `src/aimq/tools/ocr/processor.py`:

```python
def group_text_boxes(
    detections: List[Dict[str, Any]],
    width_growth: int = 0,
    height_growth: int = 0
) -> List[Dict[str, Any]]:
    """Group text boxes that are spatially related.

    This function groups text boxes that are spatially related, starting with
    overlapping boxes. It can optionally expand boxes horizontally and vertically
    before grouping to capture nearby text that may be related.

    Args:
        detections: List of detection dictionaries containing text and bounding boxes
        width_growth: Number of pixels to expand boxes horizontally
        height_growth: Number of pixels to expand boxes vertically

    Returns:
        List[Dict[str, Any]]: List of grouped text detections with merged bounding boxes
    """
    if not detections:
        return []

    def grow_box(box: Dict[str, int]) -> Dict[str, int]:
        """Helper to expand a box by the growth parameters"""
        return {
            'x': box['x'],
            'y': box['y'],
            'width': box['width'] + width_growth,
            'height': box['height'] + height_growth
        }

    groups = [[det] for det in detections]

    while True:
        merged = False
        new_groups = []
        used = set()

        for i, group1 in enumerate(groups):
            if i in used:
                continue

            merged_group = group1.copy()
            used.add(i)

            box1 = grow_box(merge_boxes([det['bounding_box'] for det in merged_group]))

            for j, group2 in enumerate(groups):
                if j in used:
                    continue

                box2 = merge_boxes([det['bounding_box'] for det in group2])

                if boxes_overlap(box1, box2):
                    merged_group.extend(group2)
                    used.add(j)
                    box1 = grow_box(merge_boxes([det['bounding_box'] for det in merged_group]))
                    merged = True

            new_groups.append(merged_group)

        if not merged:
            break

        groups = new_groups

    return [{
        "text": ' '.join(det['text'] for det in sorted(
            group,
            key=lambda d: (d['bounding_box']['y'], d['bounding_box']['x'])
        )),
        "confidence": float(round(
            sum(det['confidence'] for det in group) / len(group),
            3
        )),
        "bounding_box": merge_boxes([det['bounding_box'] for det in group])
    } for group in groups]
```
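The grow-and-merge loop can be exercised end-to-end together with the two helpers it depends on. This is a self-contained sketch: the helpers are reproduced from this module, and the sample detections are hypothetical.

```python
def merge_boxes(boxes):
    """Smallest box covering all input boxes (as defined in this module)."""
    if not boxes:
        return None
    min_x = min(b['x'] for b in boxes)
    min_y = min(b['y'] for b in boxes)
    max_x = max(b['x'] + b['width'] for b in boxes)
    max_y = max(b['y'] + b['height'] for b in boxes)
    return {'x': int(min_x), 'y': int(min_y),
            'width': int(max_x - min_x), 'height': int(max_y - min_y)}

def boxes_overlap(box1, box2):
    """Strict-inequality overlap test (as defined in this module)."""
    h = box1['x'] < box2['x'] + box2['width'] and box2['x'] < box1['x'] + box1['width']
    v = box1['y'] < box2['y'] + box2['height'] and box2['y'] < box1['y'] + box1['height']
    return h and v

def group_text_boxes(detections, width_growth=0, height_growth=0):
    """Grow boxes, merge overlapping groups, and repeat until stable."""
    if not detections:
        return []

    def grow_box(box):
        return {'x': box['x'], 'y': box['y'],
                'width': box['width'] + width_growth,
                'height': box['height'] + height_growth}

    groups = [[det] for det in detections]
    while True:
        merged, new_groups, used = False, [], set()
        for i, group1 in enumerate(groups):
            if i in used:
                continue
            merged_group = group1.copy()
            used.add(i)
            box1 = grow_box(merge_boxes([d['bounding_box'] for d in merged_group]))
            for j, group2 in enumerate(groups):
                if j in used:
                    continue
                box2 = merge_boxes([d['bounding_box'] for d in group2])
                if boxes_overlap(box1, box2):
                    merged_group.extend(group2)
                    used.add(j)
                    box1 = grow_box(merge_boxes([d['bounding_box'] for d in merged_group]))
                    merged = True
            new_groups.append(merged_group)
        if not merged:
            break
        groups = new_groups

    # Collapse each group: text in reading order, averaged confidence, merged box.
    return [{
        "text": ' '.join(d['text'] for d in sorted(
            g, key=lambda d: (d['bounding_box']['y'], d['bounding_box']['x']))),
        "confidence": float(round(sum(d['confidence'] for d in g) / len(g), 3)),
        "bounding_box": merge_boxes([d['bounding_box'] for d in g]),
    } for g in groups]

detections = [
    {"text": "Hello", "confidence": 0.9,
     "bounding_box": {"x": 0, "y": 0, "width": 50, "height": 20}},
    {"text": "world", "confidence": 0.8,
     "bounding_box": {"x": 60, "y": 0, "width": 50, "height": 20}},
    {"text": "Footer", "confidence": 1.0,
     "bounding_box": {"x": 0, "y": 200, "width": 60, "height": 20}},
]

# With 20px of horizontal growth, "Hello" reaches "world"; "Footer" stays alone.
groups = group_text_boxes(detections, width_growth=20)
assert len(groups) == 2
assert groups[0]["text"] == "Hello world"
assert groups[0]["confidence"] == 0.85
assert groups[1]["text"] == "Footer"
```

Note how the growth only affects the overlap test: the merged `bounding_box` in the output is computed from the original, ungrown boxes.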

merge_boxes(boxes)

Merge a list of bounding boxes into a single box that encompasses all of them.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `boxes` | `List[Dict[str, int]]` | List of dictionaries with `x`, `y`, `width`, `height` | *required* |

Returns:

| Type | Description |
|------|-------------|
| `Optional[Dict[str, int]]` | Merged bounding box, or `None` if the input list is empty |

Source code in `src/aimq/tools/ocr/processor.py`:

```python
def merge_boxes(boxes: List[Dict[str, int]]) -> Optional[Dict[str, int]]:
    """
    Merge a list of bounding boxes into a single box that encompasses all of them.

    Args:
        boxes: List of dictionaries with x, y, width, height

    Returns:
        dict: Merged bounding box or None if input is empty
    """
    if not boxes:
        return None

    min_x = min(box['x'] for box in boxes)
    min_y = min(box['y'] for box in boxes)
    max_x = max(box['x'] + box['width'] for box in boxes)
    max_y = max(box['y'] + box['height'] for box in boxes)

    return {
        'x': int(min_x),
        'y': int(min_y),
        'width': int(max_x - min_x),
        'height': int(max_y - min_y)
    }
```
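For instance, merging two disjoint boxes yields their common bounding rectangle. The function is reproduced from above; the sample boxes are hypothetical:

```python
def merge_boxes(boxes):
    """Bounding box of all input boxes, as defined in this module."""
    if not boxes:
        return None
    min_x = min(b['x'] for b in boxes)
    min_y = min(b['y'] for b in boxes)
    max_x = max(b['x'] + b['width'] for b in boxes)
    max_y = max(b['y'] + b['height'] for b in boxes)
    return {'x': int(min_x), 'y': int(min_y),
            'width': int(max_x - min_x), 'height': int(max_y - min_y)}

merged = merge_boxes([
    {'x': 0, 'y': 0, 'width': 10, 'height': 10},
    {'x': 20, 'y': 5, 'width': 10, 'height': 10},
])
assert merged == {'x': 0, 'y': 0, 'width': 30, 'height': 15}
assert merge_boxes([]) is None
```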

Storage Tools

Supabase Storage

aimq.tools.supabase.read_file

Tool for reading files from Supabase Storage.

Classes

ReadFile

Bases: BaseTool

Tool for reading files from Supabase Storage.

ReadFileInput

Bases: BaseModel

Input for ReadFile.

aimq.tools.supabase.write_file

Tool for writing files to Supabase Storage.

Classes

WriteFile

Bases: BaseTool

Tool for writing files to Supabase Storage.

WriteFileInput

Bases: BaseModel

Input for WriteFile.

Supabase Database

aimq.tools.supabase.read_record

Tool for reading records from Supabase.

Classes

ReadRecord

Bases: BaseTool

Tool for reading records from Supabase.

ReadRecordInput

Bases: BaseModel

Input for ReadRecord.

aimq.tools.supabase.write_record

Tool for writing records to Supabase.

Classes

WriteRecord

Bases: BaseTool

Tool for writing records to Supabase.

WriteRecordInput

Bases: BaseModel

Input for WriteRecord.