When should streams be preferred over traditional loops for best performance? Do streams take advantage of branch prediction?

Problem Description

I just read about branch prediction and wanted to see how it works with Java 8 Streams.

However, the performance with Streams always turns out to be worse than with traditional loops.

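import java.util.Arrays;
import java.util.Random;

// The statements below are assumed to run inside a method such as main.
int totalSize = 32768;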
int filterValue = 1280;
int[] array = new int[totalSize];
Random rnd = new Random(0);
int loopCount = 10000;

for (int i = 0; i < totalSize; i++) {
    // array[i] = rnd.nextInt() % 2560; // Unsorted Data
    array[i] = i; // Sorted Data
}

long start = System.nanoTime();
long sum = 0;
for (int j = 0; j < loopCount; j++) {
    for (int c = 0; c < totalSize; ++c) {
        sum += array[c] >= filterValue ? array[c] : 0;
    }
}
long total = System.nanoTime() - start;
System.out.printf("Conditional Operator Time : %d ns, (%f sec) %n", total, total / Math.pow(10, 9));

start = System.nanoTime();
sum = 0;
for (int j = 0; j < loopCount; j++) {
    for (int c = 0; c < totalSize; ++c) {
        if (array[c] >= filterValue) {
            sum += array[c];
        }
    }
}
total = System.nanoTime() - start;
System.out.printf("Branch Statement Time : %d ns, (%f sec) %n", total, total / Math.pow(10, 9));

start = System.nanoTime();
sum = 0;
for (int j = 0; j < loopCount; j++) {
    sum += Arrays.stream(array).filter(value -> value >= filterValue).sum();
}
total = System.nanoTime() - start;
System.out.printf("Streams Time : %d ns, (%f sec) %n", total, total / Math.pow(10, 9));

start = System.nanoTime();
sum = 0;
for (int j = 0; j < loopCount; j++) {
    sum += Arrays.stream(array).parallel().filter(value -> value >= filterValue).sum();
}
total = System.nanoTime() - start;
System.out.printf("Parallel Streams Time : %d ns, (%f sec) %n", total, total / Math.pow(10, 9));

Output:

For the sorted array:

Conditional Operator Time : 294062652 ns, (0.294063 sec) 
Branch Statement Time : 272992442 ns, (0.272992 sec) 
Streams Time : 806579913 ns, (0.806580 sec) 
Parallel Streams Time : 2316150852 ns, (2.316151 sec)

For the unsorted array:

Conditional Operator Time : 367304250 ns, (0.367304 sec) 
Branch Statement Time : 906073542 ns, (0.906074 sec) 
Streams Time : 1268648265 ns, (1.268648 sec) 
Parallel Streams Time : 2420482313 ns, (2.420482 sec)

I tried the same code using a List:
list.stream() instead of Arrays.stream(array)
list.get(c) instead of array[c]
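
For reference, the two substitutions above might look like the following. This is only a minimal sketch of the List variant, not the code that was actually timed: the class name `ListVariantSketch`, the helper method names, and the use of `mapToInt(Integer::intValue)` to make `sum()` available are all assumptions, since the post does not show the List version in full.

```java
import java.util.ArrayList;
import java.util.List;

public class ListVariantSketch {

    // Builds the sorted test data as a List<Integer> instead of an int[].
    static List<Integer> sortedList(int totalSize) {
        List<Integer> list = new ArrayList<>(totalSize);
        for (int i = 0; i < totalSize; i++) {
            list.add(i); // each int is autoboxed to an Integer on insertion
        }
        return list;
    }

    // Loop variant: list.get(c) replaces array[c]; each read unboxes an Integer.
    static long sumWithLoop(List<Integer> list, int filterValue) {
        long sum = 0;
        for (int c = 0; c < list.size(); ++c) {
            if (list.get(c) >= filterValue) {
                sum += list.get(c);
            }
        }
        return sum;
    }

    // Stream variant: list.stream() replaces Arrays.stream(array).
    // mapToInt is an assumption made here so that sum() can be called;
    // the post does not say how the List stream was summed.
    static long sumWithStream(List<Integer> list, int filterValue) {
        return list.stream()
                .filter(value -> value >= filterValue)
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> list = sortedList(32768);
        System.out.println("Loop sum   : " + sumWithLoop(list, 1280));
        System.out.println("Stream sum : " + sumWithStream(list, 1280));
    }
}
```

Unlike the `int[]` version, every element here is a boxed `Integer`, so both variants pay for unboxing on each comparison and addition. That may account for part of the difference between the array and List timings, although the sketch itself does not measure it.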

Output:

For the sorted list:

Conditional Operator Time : 860514446 ns, (0.860514 sec) 
Branch Statement Time : 663458668 ns, (0.663459 sec) 
Streams Time : 2085657481 ns, (2.085657 sec) 
Parallel Streams Time : 5026680680 ns, (5.026681 sec)

For the unsorted list:

Conditional Operator Time : 704120976 ns, (0.704121 sec) 
Branch Statement Time : 1327838248 ns, (1.327838 sec) 
Streams Time : 1857880764 ns, (1.857881 sec) 
Parallel Streams Time : 2504468688 ns, (2.504469 sec)

I referred to a few blogs (this and this) which report the same performance issue with streams.

I agree that programming with streams is nicer and easier in some scenarios, but if we lose performance, why should we use them? Is there something I'm missing?
In which scenario do streams perform as well as loops? Is it only when the function you apply takes so long that the loop overhead becomes negligible?
In none of the scenarios could I see streams taking advantage of branch prediction. (I tried sorted and unordered streams, but to no avail; the result was more than double the performance hit compared to normal streams.)
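
One way to probe the second question is to make the per-element work deliberately expensive and check whether the gap between the loop and the stream narrows. The following is a minimal sketch under stated assumptions: `slowTransform` is a hypothetical stand-in for "a function that takes a lot of time", and the class and method names are invented for illustration. No timings are claimed here, because single-shot `System.nanoTime()` measurements without JIT warm-up are noisy.

```java
import java.util.Arrays;

public class ExpensiveFunctionSketch {

    // Hypothetical expensive per-element function (an assumption, not from the post).
    static long slowTransform(int value) {
        long acc = value;
        for (int k = 0; k < 1_000; k++) {
            acc = (acc * 31 + k) % 1_000_003;
        }
        return acc;
    }

    // Traditional loop: filter with a branch, then apply the expensive function.
    static long sumWithLoop(int[] array, int filterValue) {
        long sum = 0;
        for (int c = 0; c < array.length; ++c) {
            if (array[c] >= filterValue) {
                sum += slowTransform(array[c]);
            }
        }
        return sum;
    }

    // Stream pipeline performing the same filter and transformation.
    static long sumWithStream(int[] array, int filterValue) {
        return Arrays.stream(array)
                .filter(value -> value >= filterValue)
                .mapToLong(ExpensiveFunctionSketch::slowTransform)
                .sum();
    }

    public static void main(String[] args) {
        int[] array = new int[32768];
        for (int i = 0; i < array.length; i++) {
            array[i] = i; // sorted data, as in the benchmark above
        }

        long start = System.nanoTime();
        long loopSum = sumWithLoop(array, 1280);
        long loopTime = System.nanoTime() - start;

        start = System.nanoTime();
        long streamSum = sumWithStream(array, 1280);
        long streamTime = System.nanoTime() - start;

        System.out.printf("Loop   : sum=%d, %d ns%n", loopSum, loopTime);
        System.out.printf("Stream : sum=%d, %d ns%n", streamSum, streamTime);
    }
}
```

Both methods compute the same sum, so any timing difference comes from how the work is driven rather than from what is computed. For figures that can be trusted, a harness such as JMH would be a better fit than this hand-rolled timing.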

Recommended Answer