DATA CURATION AND ENHANCEMENT SAMPLES.

Image and PDF files processing.

Images are handled and improved carefully, like other data.


Data curation details.

Images are sometimes overlooked in favor of other data, yet they require the same treatment as any other field.

  ✓  Enhance images: adjust brightness and contrast; remove artifacts; resize, rotate, and level; reduce noise; increase sharpness.
  ✓  Clean and insert data in EXIF fields.
  ✓  Insert watermarks.
  ✓  Remove empty, invalid, or broken images. Identify 'not available' placeholder images.
  ✓  Migrate image files into specific formats.
  ✓  Split multi-page images and files into separate parts, and vice versa.
  ✓  Identify unrelated images or images containing information that belongs in other fields.
  ✓  Isolate images within PDF and other files.

Brightness, contrast and exposure:

Input: Raw image
Our output: Curated image
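
As a rough illustration of a brightness and contrast adjustment, the sketch below applies a linear point operation to 8-bit grayscale pixel values. It is a minimal example under stated assumptions, not the production pipeline: the function name, the parameters, the mid-gray pivot of 128, and the sample values are all hypothetical.

```python
def adjust_brightness_contrast(pixels, brightness=0, contrast=1.0):
    """Apply a linear point operation to 8-bit grayscale pixel values.

    new = contrast * (old - 128) + 128 + brightness, clamped to 0..255.
    The pivot of 128 (mid-gray) is an assumption made for this sketch.
    """
    adjusted = []
    for value in pixels:
        new_value = contrast * (value - 128) + 128 + brightness
        # Clamp to the valid 8-bit range so highlights and shadows do not wrap.
        adjusted.append(max(0, min(255, round(new_value))))
    return adjusted


# A dull, low-contrast row of pixels (hypothetical sample values).
raw_row = [100, 110, 120, 130, 140]
curated_row = adjust_brightness_contrast(raw_row, brightness=10, contrast=2.0)
print(curated_row)  # → [82, 102, 122, 142, 162]
```

The spread between the darkest and brightest pixel doubles (from 40 to 80 levels), which is what a contrast increase looks like numerically. Real images would normally go through an imaging library rather than plain lists.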


Rotation and leveling:

Input: Raw image
Our output: Curated image


Artifacts, noise and sharpness:

Input: Raw image
Our output: Curated image


Resizing and border removal:

Input: Raw image
Our output: Curated image


Empty, broken and invalid:

Input: Raw image
Our output: [Removed from dataset]


Placeholders:

Input: Raw image
Our output: [Removed from dataset]
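
One way to flag empty, invalid, and placeholder files before removal can be sketched as follows. This is a simplified example under stated assumptions: it only checks the leading signature bytes, so a truncated file with a valid header would still need a full decode to be caught, and it only recognizes placeholders that are byte-identical to a known sample. The function names, labels, and sample bytes are hypothetical.

```python
import hashlib

# Leading signature ("magic") bytes for a few common formats.
MAGIC_BYTES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "gif": b"GIF8",
    "pdf": b"%PDF-",
}


def sniff_format(data):
    """Return the format name whose signature starts the data, else None."""
    for name, signature in MAGIC_BYTES.items():
        if data.startswith(signature):
            return name
    return None


def classify_image(data, placeholder_hashes):
    """Label raw file bytes as 'empty', 'invalid', 'placeholder', or 'keep'."""
    if not data:
        return "empty"
    if sniff_format(data) is None:
        return "invalid"
    # Exact-match check: only byte-identical placeholder files are detected.
    if hashlib.sha256(data).hexdigest() in placeholder_hashes:
        return "placeholder"
    return "keep"


# Hypothetical bytes standing in for a known 'not available' image.
not_available = b"\x89PNG\r\n\x1a\n" + b"placeholder-bytes"
known_placeholders = {hashlib.sha256(not_available).hexdigest()}

print(classify_image(b"", known_placeholders))             # → empty
print(classify_image(b"<html>", known_placeholders))       # → invalid
print(classify_image(not_available, known_placeholders))   # → placeholder
```

Files labeled anything other than 'keep' would be candidates for removal from the dataset. Placeholders that were resized or re-encoded would need a perceptual comparison instead of an exact hash.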


Multi-part:

Input: Raw image, Raw image
Our output: Curated image

Do you have other data improvement needs?

Tell us about your needs.