-
Notifications
You must be signed in to change notification settings - Fork 4.2k
GH-49644: [Python] Support converting list of multi-dimensional arrays to FixedShapeTensor #50203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 5 commits
84154db
e695d01
daed54f
832acee
e7b7f69
1f8307f
fb2b78d
290b330
7c1bbce
cf4aa91
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -908,13 +908,32 @@ class PyListConverter : public ListConverter<T, PyConverter, PyConverterTrait> { | |
|
|
||
| Status AppendNdarray(PyObject* value) { | ||
| PyArrayObject* ndarray = reinterpret_cast<PyArrayObject*>(value); | ||
| if (PyArray_NDIM(ndarray) != 1) { | ||
| return Status::Invalid("Can only convert 1-dimensional array values"); | ||
| } | ||
| if (PyArray_ISBYTESWAPPED(ndarray)) { | ||
| // TODO | ||
| return Status::NotImplemented("Byte-swapped arrays not supported"); | ||
| } | ||
| OwnedRef flattened; | ||
| if (PyArray_NDIM(ndarray) != 1) { | ||
|
aboderinsamuel marked this conversation as resolved.
|
||
| // GH-49644: a fixed-size list (e.g. fixed-shape-tensor storage) can be | ||
| // built from a multi-dimensional array, always flattened in C order | ||
| // regardless of the input's memory layout. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would remove the comment here. Maybe only mention the 0-dimensional case we are catching here, in short.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done, replaced it with a short note about the 0-D / variable-sized-list case.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I missed this comment before. @AlenkaF how do you feel about allowing 0-dimensional cases through? I know they are useless, but allowing them could be helpful as it removes edge cases. E.g. your application returns arrays with dynamical array dimensions. And sometime those just come out as 0 dim which you consider semantically correct. Instead of having to now fight with PyArrow and introducing a workaround you just pass 0-dim and you're done. What do you think? |
||
| if (PyArray_NDIM(ndarray) < 2 || this->list_type_->id() != Type::FIXED_SIZE_LIST) { | ||
| return Status::Invalid( | ||
| "Can only convert 1-dimensional array values to a variable-sized list"); | ||
| } | ||
|
rok marked this conversation as resolved.
Outdated
aboderinsamuel marked this conversation as resolved.
|
||
| // Get an aligned, C-contiguous array (copying only if needed), then view | ||
| // it as 1-D so its values can be read directly in C order. | ||
|
rok marked this conversation as resolved.
Outdated
|
||
| PyObject* contiguous = | ||
| PyArray_CheckFromAny(value, nullptr, /*min_depth=*/0, /*max_depth=*/0, | ||
| NPY_ARRAY_C_CONTIGUOUS | NPY_ARRAY_ALIGNED, nullptr); | ||
| RETURN_IF_PYERROR(); | ||
| flattened.reset( | ||
| PyArray_Ravel(reinterpret_cast<PyArrayObject*>(contiguous), NPY_CORDER)); | ||
|
rok marked this conversation as resolved.
|
||
| Py_DECREF(contiguous); | ||
| RETURN_IF_PYERROR(); | ||
| value = flattened.obj(); | ||
| ndarray = reinterpret_cast<PyArrayObject*>(value); | ||
| } | ||
| const int64_t size = PyArray_SIZE(ndarray); | ||
| RETURN_NOT_OK(AppendTo(this->list_type_, size)); | ||
| RETURN_NOT_OK(this->list_builder_->ValidateOverflow(size)); | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.