315766: Add batch picture import functionality #404
- Introduced `BatchPictureImportForm` for uploading ZIP files containing images.
- Implemented methods for processing and validating uploaded images, including handling duplicates and generating previews.
- Added a new template for the import form to guide users through the upload and matching process.
- Enhanced unit tests to cover the new functionality, ensuring correct handling of image imports and matching logic.
Codecov Report
@@ Coverage Diff @@
## develop #404 +/- ##
===========================================
+ Coverage 96.93% 97.10% +0.17%
===========================================
Files 237 238 +1
Lines 9729 9927 +198
Branches 1072 1107 +35
===========================================
+ Hits 9431 9640 +209
+ Misses 143 133 -10
+ Partials 155 154 -1
- Modified `compose.yaml` to include additional environment variables for API integration and debugging.
- Updated service configurations for backend, celery worker, and health checks.
- Enhanced unit tests in `test_batch_admin.py` to cover new functionalities, including reprocessing forms and batch picture imports.
- Added helper functions for middleware and ZIP file uploads to streamline test setup and validation.
- Added `BatchPictureImportForm` to the imports in `__init__.py` for better accessibility.
- Updated the `__all__` list to include the new form, ensuring it is publicly available for module users.
- Updated the import statement for `Base64ImageField` to reflect its new location in the `country_workspace.utils.flex_fields` module, ensuring consistency across the codebase.
… handling
- Updated the `_guess_image_mimetype` method to return only valid image MIME types.
- Added a filter to remove duplicate entries based on their keys during ZIP file extraction.
- Modified the test for importing pictures to include the ZIP file in the request data.
- Refactored the mock setup for the individual checker in the batch tests for clarity.
…fficiently
- Removed the filtering of duplicate entries from the list of processed images, instead skipping duplicates during the ZIP file extraction process.
- Updated the logic to ensure that only unique keys are processed, improving performance and clarity in handling image imports.
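The two commits above describe skipping duplicate keys while the archive is being read and accepting only image MIME types. A minimal sketch of that idea follows. It is not the code from this PR: the function names, the choice of the file stem as the key, and the return shape are assumptions made for illustration.

```python
import io
import mimetypes
import zipfile
from pathlib import PurePosixPath
from typing import Optional


def guess_image_mimetype(filename: str) -> Optional[str]:
    """Return the guessed MIME type only when it is an image type."""
    mime, _ = mimetypes.guess_type(filename)
    return mime if mime and mime.startswith("image/") else None


def extract_images(zip_bytes: bytes) -> tuple[dict[str, tuple[str, bytes]], list[str]]:
    """Collect image entries keyed by file stem (assumed key), skipping duplicates."""
    entries: dict[str, tuple[str, bytes]] = {}
    duplicates: list[str] = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            mime = guess_image_mimetype(info.filename)
            if mime is None:
                continue  # not an image, ignore the entry
            key = PurePosixPath(info.filename).stem
            if key in entries:
                # Skip before reading, so duplicate bytes are never decompressed.
                duplicates.append(info.filename)
                continue
            entries[key] = (mime, archive.read(info))
    return entries, duplicates
```

Skipping inside the loop means a duplicate is recorded by name only, which is the performance point the second commit message makes.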
    "batch_id": obj.pk,
    "match_field": form.cleaned_data["match_field"],
    "target_field": form.cleaned_data["target_field"],
    **preview,
Am I right that all matching images are base64-encoded and stored in the session until confirmation?
If I’m not mistaken, SESSION_ENGINE is set to django.contrib.sessions.backends.db. Could this result in oversized rows in django_session?
Do we need limits on archive size, file count, and total uncompressed size before reading and base64-encoding all entries?
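To make the question concrete, here is a rough sketch of a pre-check that could run before any entry is read. The limit values and the function name are hypothetical, not taken from this PR. Note that base64 encoding grows the payload by roughly one third, so the uncompressed limit understates what would end up in the session.

```python
import io
import zipfile

# Hypothetical limits, chosen only for illustration.
MAX_ARCHIVE_BYTES = 50 * 1024 * 1024
MAX_FILES = 500
MAX_TOTAL_UNCOMPRESSED = 200 * 1024 * 1024


def check_zip_limits(
    zip_bytes: bytes,
    *,
    max_archive_bytes: int = MAX_ARCHIVE_BYTES,
    max_files: int = MAX_FILES,
    max_total_uncompressed: int = MAX_TOTAL_UNCOMPRESSED,
) -> list[str]:
    """Return a list of limit violations; an empty list means the archive passes."""
    if len(zip_bytes) > max_archive_bytes:
        # Do not even open an archive that is too large on disk.
        return [f"archive is {len(zip_bytes)} bytes, limit is {max_archive_bytes}"]
    errors: list[str] = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        infos = [info for info in archive.infolist() if not info.is_dir()]
    if len(infos) > max_files:
        errors.append(f"archive has {len(infos)} files, limit is {max_files}")
    # file_size is the size declared in the ZIP directory; it is read without
    # decompressing, but a crafted archive can misreport it.
    total = sum(info.file_size for info in infos)
    if total > max_total_uncompressed:
        errors.append(f"uncompressed size is {total} bytes, limit is {max_total_uncompressed}")
    return errors
```

Because the declared sizes can be forged, a stricter version would also cap the bytes actually read per entry.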
    current[target_field] = item["data_uri"]
    if current != individual.flex_fields:
        individual.flex_fields = current
        individual.save(update_fields=["flex_fields"])
Should importing pictures invalidate the existing validation status, for example by setting last_checked to None without triggering immediate revalidation?
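One way the suggestion could look, sketched against a stand-in class instead of the real model. The `Individual` dataclass and the `apply_picture` helper are hypothetical; only `flex_fields`, `last_checked`, and the `save(update_fields=...)` call shape come from the discussion above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


@dataclass
class Individual:
    """Stand-in for the real model, holding only the fields used here."""

    flex_fields: dict[str, Any] = field(default_factory=dict)
    last_checked: Optional[datetime] = None
    saved_fields: list[str] = field(default_factory=list)

    def save(self, update_fields: list[str]) -> None:
        # Record which fields would be written, in place of a database UPDATE.
        self.saved_fields = list(update_fields)


def apply_picture(individual: Individual, target_field: str, data_uri: str) -> bool:
    """Store the picture and reset last_checked only when the data changed."""
    current = dict(individual.flex_fields)
    current[target_field] = data_uri
    if current == individual.flex_fields:
        return False  # nothing changed, keep the existing validation status
    individual.flex_fields = current
    individual.last_checked = None  # mark as unvalidated, no revalidation here
    individual.save(update_fields=["flex_fields", "last_checked"])
    return True
```

Resetting `last_checked` inside the same `if` keeps unchanged records validated, so re-importing an identical archive would not invalidate the whole batch.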
        return entries, duplicates

    def get_match_field_choices(self) -> list[tuple[str, str]]:
        first_individual = CountryIndividual.objects.filter(batch=self.batch, removed=False).only("raw_data").first()
Are all individuals in a batch guaranteed to have the same raw_data keys? Otherwise, using only the first active individual may hide valid match fields that are present in other records.
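If the keys are not guaranteed to be uniform, the choices could be built from the union of keys across all active individuals. A small sketch, assuming each `raw_data` value is a mapping or `None`; the helper name is hypothetical.

```python
from collections.abc import Iterable, Mapping
from typing import Any, Optional


def collect_match_field_choices(
    raw_data_rows: Iterable[Optional[Mapping[str, Any]]],
) -> list[tuple[str, str]]:
    """Build (value, label) choices from every key seen, in first-seen order."""
    seen: dict[str, None] = {}  # dict keeps insertion order and deduplicates
    for raw_data in raw_data_rows:
        for key in raw_data or {}:
            seen.setdefault(key, None)
    return [(key, key) for key in seen]
```

In the form, the rows could come from something like `CountryIndividual.objects.filter(batch=self.batch, removed=False).values_list("raw_data", flat=True).iterator()`. This scans every record in the batch, so it trades one cheap query for a full pass over the data.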
AB#315766