(and also support nested tasks)
Fixes #22
Also @shoyer, this brings us to the point where

    In [19]: np.array(d[[5, 3, 0]].sum(axis=0))
    Out[19]: array([ 80, 83, 86, 89, 92, 95, 98, 101, 104])

which, I think, is likely sufficient for your common use cases.
Do you really want to explicitly restrict array indexing to lists?
Assuming numpy is a hard dependency of dask (which I think it is?), I would rather cast anything that is not an integer or a slice to an ndarray and then allow only 1d arrays of integers. For large arrays, using lists is going to be a bottleneck.
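A minimal sketch of that suggestion, for illustration only. The helper name `normalize_index_element` is hypothetical and is not part of dask; it passes integers and slices through unchanged, casts everything else with `np.asarray`, and rejects anything that is not a 1d integer array.

```python
import numbers

import numpy as np


def normalize_index_element(ind):
    """Hypothetical helper (not dask's API): validate one element of an index."""
    # Integers and slices are left for the ordinary slicing path.
    if isinstance(ind, (numbers.Integral, slice)):
        return ind
    # Everything else (lists, tuples, arrays) is cast to an ndarray once,
    # so later code never iterates over a large Python list.
    arr = np.asarray(ind)
    if arr.ndim != 1:
        raise IndexError("only 1d index arrays are supported in this sketch")
    if not np.issubdtype(arr.dtype, np.integer):
        raise IndexError("index arrays must have an integer dtype")
    return arr
```

With this shape, a list such as `[5, 3, 0]` and an existing integer ndarray follow the same code path, which is the point of the suggestion.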
We can do both. I was just at about my limit for complexity while I was building this and didn't want to think about other cases. Both of those sound good though.
Handling 1D boolean arrays is also pretty easy -- you can just convert them into integer arrays with `np.nonzero`.
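As a rough illustration of that conversion (the function name `bool_to_int_index` is made up for this sketch and is not dask's API):

```python
import numpy as np


def bool_to_int_index(mask):
    """Hypothetical helper: turn a 1D boolean mask into integer positions."""
    mask = np.asarray(mask)
    if mask.ndim == 1 and mask.dtype == bool:
        # np.nonzero returns one array per dimension; a 1D mask has one.
        return np.nonzero(mask)[0]
    # Anything else is returned unchanged for the caller to handle.
    return mask


x = np.arange(10, 14)
mask = np.array([True, False, True, True])
positions = bool_to_int_index(mask)
# Indexing with the mask and with the integer positions selects the same elements.
same = np.array_equal(x[mask], x[positions])
```

After this step, the boolean case reduces to the integer-array case and needs no separate handling downstream.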
I've handled the edge cases (I think). Merging. This doesn't yet handle multiple lists or things like slicing with arrays.
OK, so we do a dual approach to achieve fancy indexing.
Given an index, like
We first do the normal `dask_slice` solution on the array with the slice replaced with an empty list. We then follow with the final list. I suspect that we could repeat these for multiple lists and achieve Matlab-style orthogonal indexing.
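The two-pass idea can be sketched on a plain NumPy array. This is not the dask implementation: `two_step_index` is a hypothetical helper, and it assumes the first pass holds the list's position open with a full slice while the remaining slices and integers are applied, with the list applied in a second pass.

```python
import numbers

import numpy as np


def two_step_index(x, index):
    """Hypothetical sketch: apply an index holding at most one list in two passes."""
    index = tuple(index)
    list_positions = [i for i, ind in enumerate(index) if isinstance(ind, list)]
    if not list_positions:
        return x[index]
    if len(list_positions) > 1:
        raise NotImplementedError("only one list index is handled in this sketch")
    pos = list_positions[0]

    # Pass 1: ordinary slicing, with the list position held open by a full slice.
    first = index[:pos] + (slice(None),) + index[pos + 1:]
    partial = x[first]

    # Integer indices drop an axis, so the list's axis shifts left by their count.
    axis = pos - sum(isinstance(ind, numbers.Integral) for ind in index[:pos])

    # Pass 2: pick the listed positions along that axis.
    second = (slice(None),) * axis + (index[pos],)
    return partial[second]


x = np.arange(60).reshape(6, 10)
result = two_step_index(x, ([5, 3, 0], slice(2, 7)))
matches_numpy = np.array_equal(result, x[[5, 3, 0], 2:7])
```

Repeating the second pass once per list, one axis at a time, would give orthogonal (outer) indexing rather than NumPy's pointwise behavior for multiple lists; that extension is not shown here.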
It mostly works
Example
The actual dask looks like the following
Some known problems
`d[0]` fails

cc @nevermindewe @shoyer