DATA CURATION AND ENHANCEMENT SAMPLES.
Image and PDF file processing.
Images are handled and improved carefully, like other data.
Images are sometimes overlooked in favor of other data, yet they require the same treatment as any other field.
✓ | Enhance the image by adjusting its brightness and contrast; removing artifacts; resizing, rotating, and leveling it; reducing noise; increasing sharpness; etc. |
✓ | Clean and insert data in EXIF fields. |
✓ | Insert watermarks. |
✓ | Remove empty, invalid, or broken images. Identify 'not available' placeholder images. |
✓ | Migrate image files into specific formats. |
✓ | Split multi-page images and files into separate parts, and vice versa. |
✓ | Identify unrelated images or images that include information that belongs to other fields. |
✓ | Isolate images within PDF and other files. |
Brightness, contrast and exposure:
Input: | Our output: |
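As a rough illustration of what a brightness and contrast adjustment involves, the sketch below works on a grayscale image represented as a plain list of rows with values from 0 to 255. The function names and the 128 midpoint are assumptions made for this example; a production pipeline would normally rely on an imaging library rather than raw lists.

```python
def adjust_brightness_contrast(pixels, brightness=0, contrast=1.0):
    """Scale each pixel around the midpoint 128, then shift it.

    `pixels` is a list of rows of grayscale values (0..255).
    Results are clamped back into the 0..255 range.
    """
    def clamp(value):
        return max(0, min(255, int(round(value))))

    return [
        [clamp(contrast * (p - 128) + 128 + brightness) for p in row]
        for row in pixels
    ]


def stretch_contrast(pixels):
    """Linearly rescale so the darkest pixel maps to 0 and the brightest to 255."""
    flat = [p for row in pixels for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in pixels]
    return [
        [int(round((p - lo) * 255 / (hi - lo))) for p in row]
        for row in pixels
    ]
```

For example, `stretch_contrast([[50, 100], [150, 200]])` spreads a dull mid-gray patch across the full tonal range.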
Rotation and leveling:
Input: | Our output: |
Artifacts, noise and sharpness:
Input: | Our output: |
Resizing and border removal:
Input: | Our output: |
Empty, broken and invalid:
Input: | Our output: |
[Removed from dataset] |
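One possible way to screen for empty, invalid, or broken files before removing them is to inspect the raw bytes. The sketch below is a heuristic only, under the assumption that checking well-known file signatures and end-of-file markers is good enough for a first pass; the `classify` function and its labels are hypothetical names for this example, and a thorough check would fully decode each file.

```python
# Well-known leading byte signatures for a few common formats.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"%PDF-": "pdf",
}

# Expected trailing bytes for formats with a fixed end marker.
TRAILERS = {
    "jpeg": b"\xff\xd9",                       # End Of Image marker
    "png": b"\x00\x00\x00\x00IEND\xaeB`\x82",  # empty IEND chunk plus its CRC
}


def sniff_format(data: bytes):
    """Return a format name based on the leading bytes, or None if unknown."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def classify(data: bytes) -> str:
    """Label raw file content as 'empty', 'invalid', 'broken', or 'ok'."""
    if not data:
        return "empty"
    fmt = sniff_format(data)
    if fmt is None:
        return "invalid"
    trailer = TRAILERS.get(fmt)
    if trailer is not None and not data.endswith(trailer):
        # The file starts correctly but appears truncated.
        return "broken"
    return "ok"
```

Records whose file is labeled anything other than `"ok"` would be candidates for removal or manual review. Note that some valid JPEG files carry extra bytes after the end marker, so a `"broken"` label is a hint rather than proof.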
Placeholders:
Input: | Our output: |
[Removed from dataset] |
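A simple way to spot 'not available' placeholders, sketched here under the assumption that a placeholder is the exact same file attached to many unrelated records, is to hash every image and flag content that recurs. The function names and the `min_records` threshold are hypothetical choices for this example.

```python
import hashlib
from collections import Counter


def find_placeholder_hashes(images, min_records=3):
    """Return the SHA-256 digests shared by at least `min_records` records.

    `images` maps a record identifier to the raw bytes of its image.
    """
    counts = Counter(
        hashlib.sha256(data).hexdigest() for data in images.values()
    )
    return {digest for digest, n in counts.items() if n >= min_records}


def flag_placeholders(images, min_records=3):
    """Return the sorted record identifiers whose image looks like a placeholder."""
    suspects = find_placeholder_hashes(images, min_records)
    return sorted(
        record_id
        for record_id, data in images.items()
        if hashlib.sha256(data).hexdigest() in suspects
    )
```

This only catches byte-identical copies. A placeholder that has been resized or re-encoded would need a perceptual comparison instead, and legitimately shared images (for example, one photo reused across product variants) would need to be excluded by a reviewer.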
Multi-part:
Input: | Our output: |