|
|
[Up to jsr166y.forkjoin Examples]
A mapping is an operation which operates on one or two elements and returns a new element. A filter is a boolean predicate which indicates whether a particular element is included in the result. A bounds operation selects an index based subarray of the input.
The ParallelArray API is designed so that these operations can be chained together. That is, the methods for applying these operations are not void methods but rather return objects which may be further processed. Thus, you can apply multiple mappings; the output of applying one mapping may be used as the input to the next mapping. Similarly, the output of one filter can feed into the next filter, and the output of a bounds can feed the next bounds.
ParallelArray:withMapping*() methods yields a ParallelArrayWithMappingParallelArrayWithMapping, calling one of the withMapping*() methods yields another ParallelArrayWithMapping instance.ParallelArrayWithMapping, you cannot apply filters without first calling all() which executes the pending filters and mappings to produce a ParallelArray.withFilter() methods yields a ParallelArrayWithFilterParallelArrayWithFilter you can apply additional filters with various withFilter() methods yields another ParallelArrayWithFilter instanceParallelArrayWithFilter you can apply mappings with various withMapping*() methods which alter the filtered values, which yields a ParallelArrayWithMapping instance.withBounds(int firstIndex, int upperBound) yields a ParallelArrayWithBoundsParallelArrayWithBounds you extract another ParallelArrayWithBounds instance via a withBounds() callParallelArrayWithBounds you can call all() which produces a ParallelArray. This example starts with with a ParallelDoubleArray which is filled with values emitted by the myGenerator instance. The mapping myMapping then alters the contents; a second mapping takes the output of the first mapping as input and outputs a new double value. The result is then stored in the intermediate ParallelDoubleArray. To compute the result, the intermediate results are filtered so that only those values for which myPredicate.op(double) returns true are included, and then those values are passed through the final mapping, myLastMapping.ParallelDoubleArray intermediate, result;
intermediate = ParallelDoubleArray.create(size, ParallelDoubleArray.defaultExecutor())
.replaceWithGeneratedValue(myGenerator)
.withMapping(myMapping)
.withMapping(myOtherMapping)
.all();
result = intermediate.withFilter(myFilter)
.withMapping(myLastMapping)
.all();
Doug Lea explains this restriction:ParallelDoubleArray result;
reslult = ParallelDoubleArray.create(size, ParallelDoubleArray.defaultExecutor)
.replaceWithGeneratedValue(myGenerator)
.withMapping(myMapping)
.withMapping(myOtherMapping)
.withFilter(myFilter) // Note: this filter operation is not allowed
.withMapping(myLastMapping)
.all();
The current internal support is set up to allow at most one (parallel) selection per expression, which I think is a good rule because it provides an understandable (and efficient) performance model -- it doesn't secretly require multiple temporary workspaces or multiple per-element evaluations of mapping functions. Bear in mind that fast parallel filtering is inherently a two-phase operation (sometimes with internal pipelined overlap): Find indices of all matching elements, and then when you know how many there are, create and put them in new array. (That's one downside of arrays as opposed to other data structures -- you need to know size before you can create.)
Note that the above case differs in that the all() call must evaluate mappings, and then put elements somewhere to collect into returned array in case they pass filter. It is possible to either do slower filtering to avoid multiple calls to mapping functions that would otherwise be needed in this case, or to create extra internal workspace to hold evaluated elements. But in keeping with desire for simple performance model, since there are two options I cruelly make you be explicit about it. To get the evaluate-and-store version, use all(). To do the slower-filter version, rewrite the filter so that it internally does the mapping needed to decide whether to keep the element.and Tim Peierls adds
There might be a good always-acceptable approach for this case though, which would make for a better argument for allowing it. I'll explore the possibilities.
Further complicating this story is that if all you want is a reduction, then these concerns don't even apply, and we might as well support it, but doing this leads to an even less regular API, so it is currently out just on those grounds.
It is by design. The return types of the methods restrict you to chains that can be performed efficiently in one recursive fork-join "pass". The calls to specify bounds, filters, or mappings just set things up; the real work starts when you call an "action" method like all, reduce, replace, etc.-- David Biesack
But in your example, since you aren't keeping the original array, you could avoid the intermediate allocation by using replaceWithMapping. In other words, convert this code:pa.withBounds(b).withMapping(m1).all() // allocates a new PA
into
.withFilter(f).withMapping(m2).all(); // allocates another new PApa.withBounds(b).replaceWithMapping(m1) // no new allocation
The latter allocates only one array besides pa, where the former allocates two additional.
.withFilter(f).withMapping(m2).all(); // allocates a new PA