
Complete Guide to the Java Stream API


The main topic of this article is advanced data processing using functionality added in Java 8: the Stream API and the Collector API. To get the most out of this article, you should already be familiar with the core Java APIs, the Object and String classes, and the Collection API.

Stream API

 

The java.util.stream package consists of classes, interfaces, and other types that allow you to perform functional-style operations on elements. Java 8 introduces the concept of a stream, which allows a programmer to process data declaratively and take advantage of a multi-core architecture without having to write any special code.

What is Stream?

 

A stream is a sequence of objects obtained from a source on which aggregate operations can be performed.

From a purely technical point of view, a stream is a typed interface, Stream<T>. This means that a stream can be defined for any kind of object: a stream of numbers, a stream of characters, a stream of people, or even a stream of cities.

From a developer's perspective, it's a new concept that may look just like a collection, but it is actually very different from one. There are a few key definitions we need to go through to understand this notion of a stream and why it differs from a collection.

The stream does not contain any data

The first point I'd like to address, because it is the one most often misunderstood, is that a stream doesn't contain any data. This is very important to keep in mind and understand.

There is no data in the stream; the data is stored in the collection.

A collection is a structure that stores its own data. A stream exists only to process data: it retrieves the data from a given source or moves it to its destination. The source can be a collection, although it can also be an array or an I/O resource. The stream connects to the source, consumes the data, and processes the elements in some way.

The stream must not change the source

A stream must not modify the data source it processes. This is not enforced by the compiler or by the JVM itself, so it is just a contract. If I build my own stream implementation, I must not change the source of the data I'm processing. It is, however, perfectly normal to transform the data as it flows through the stream.

Why? Because if we want to process this data in parallel, we're going to distribute it across all the cores of our processors, and we don't want any visibility or timing issues that could lead to poor performance or errors. Avoiding this kind of interference means that we must not change the source of the data while it's being processed.
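As a minimal sketch of this contract (the class and method names are illustrative and not part of the original article), the pipeline below transforms the elements as they pass through the stream and collects them into a new list, leaving the source untouched:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; all names here are illustrative only.
class NonInterferenceSketch {

    // Builds a new list from the source instead of mutating the source.
    static List<String> upperCased(List<String> source) {
        return source.stream()
                .map(String::toUpperCase)      // transforms elements inside the stream
                .collect(Collectors.toList()); // results go into a separate collection
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("Scott", "David", "Josh"));
        List<String> shouted = upperCased(names);
        System.out.println(names);   // the source list is unchanged
        System.out.println(shouted); // the transformed copy
    }
}
```

Writing to the source from inside the pipeline (for example, calling names.add(...) in a lambda) would break the contract described above.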

The source can be unbounded

This means that the stream itself can process as much data as we want. Unbounded does not mean that the source must be infinite. In fact, a source may be finite, but we may not know in advance how many elements it contains.

Suppose the source is a simple text file. A text file has a known size, even if it is very large. Also assume that the elements of this source are the lines of this text file.

We can know the exact size of this text file, but unless we open it and go through its contents, we will never know how many lines are in it. This is what unbounded means: we can't always know in advance the number of elements that the stream will process from the source.
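The text-file case can be sketched as follows. To keep the example self-contained, an in-memory string stands in for the file (an assumption of this sketch; the class and method names are illustrative). The second method shows a source that really is infinite and only ends because limit() cuts it off:

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.stream.Stream;

// Hypothetical example class; an in-memory string stands in for a text file.
class UnboundedSourceSketch {

    // The number of lines is only known after the stream has consumed them.
    static long countLines(String text) {
        BufferedReader reader = new BufferedReader(new StringReader(text));
        return reader.lines().count();
    }

    // An infinite source: 1, 2, 3, ... truncated to the first n elements.
    static long sumOfFirstNaturals(int n) {
        return Stream.iterate(1L, i -> i + 1)
                .limit(n)
                .mapToLong(Long::longValue)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(countLines("first\nsecond\nthird")); // 3
        System.out.println(sumOfFirstNaturals(4));              // 1 + 2 + 3 + 4 = 10
    }
}
```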

These are the three defining properties of a stream. From them we can see that a stream actually has little in common with a collection. A collection contains its own data. A collection can change the data it contains. And, of course, a collection contains a known and finite amount of data.

A brief overview of terminal stream methods

 

A terminal operation in Java is a method applied to a stream as the final step. No further stream operations are allowed afterwards, because a terminal operation never returns a stream. A typical example of a terminal operation is the forEach method, which is often used to print the elements of a stream. Another useful example is the reduce method, which produces a single result (such as a sum) from a stream. Other good examples include min and max.

Consider a program in which a static block, executed once when the class is loaded, sets up a list of integers. The program displays the list and then applies several terminal operations. The first terminal operation is forEach, which prints each element of the stream.

Next, the min operation is called. This method returns the minimum value in the stream, wrapped in an Optional. Optional is a new class in Java 8 that provides a get method to return the contained value if one is present.

The next operation is max. This method returns the maximum value in the stream. Finally, the reduce method is called. The parameter passed to this method is a lambda expression. In this case, the expression adds one item to the next, accumulating a total that is returned as an Optional.
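The program described above is not reproduced in this article. A minimal sketch of what it might look like, assuming a list of integers initialized in a static block (the class name, method names, and values are illustrative):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical reconstruction; the class name and the numbers are assumptions.
class TerminalOperationsSketch {

    static final List<Integer> NUMBERS;

    // Static initializer: runs once, when the class is loaded.
    static {
        NUMBERS = Arrays.asList(5, 3, 8, 1, 9);
    }

    static Optional<Integer> smallest() {
        return NUMBERS.stream().min(Integer::compare);
    }

    static Optional<Integer> largest() {
        return NUMBERS.stream().max(Integer::compare);
    }

    // reduce without an identity value returns an Optional,
    // because the stream might be empty.
    static Optional<Integer> total() {
        return NUMBERS.stream().reduce((a, b) -> a + b);
    }

    public static void main(String[] args) {
        NUMBERS.stream().forEach(System.out::println); // prints each element
        System.out.println("min: " + smallest().get());
        System.out.println("max: " + largest().get());
        System.out.println("sum: " + total().get());
    }
}
```

Calling get() is safe here only because the list is known to be non-empty; for a possibly empty stream, orElse or ifPresent is the more defensive choice.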

Creating a stream

We can generate a stream using several methods:

  • stream(). The stream() method returns a sequential stream with the collection as the source. You can use any collection of objects as a source:
private List<String> list = Arrays.asList("Scott", "David", "Josh");
list.stream();
  • parallelStream(). The parallelStream() method returns a parallel stream with the collection as the source:
private List<String> list = Arrays.asList("Scott", "David", "Josh");
list.parallelStream().forEach(element -> method(element));

With parallel streams, the Java runtime splits the stream into several substreams when performing such an operation. It performs the aggregate operations on each substream and combines the results.

In our case, it calls the method with each element of the stream in parallel.
This can be a double-edged sword, though: heavy operations performed this way occupy the threads of the shared pool and can therefore block other parallel streams.

  • Stream.of(). You can use the static of() method to create a stream from an array of objects or from individual objects:
Stream.of(new Employee("Gosha"), new Employee("Vlad"), new Employee("Egor"));
  • Stream.builder(). Finally, you can use the static builder() method to create a stream of objects:
Stream.Builder<String> streamBuilder = Stream.builder();
streamBuilder.accept("Gosha");
streamBuilder.accept("Vlad");
streamBuilder.accept("Egor");
Stream<String> stream = streamBuilder.build();

By calling the build() method, we package the accepted objects into a regular stream.

Filtering with a stream

import java.util.Arrays;
import java.util.List;

public class FilterExample {
    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "Banana", "Cherry", "Orange");

        // Traditional approach
        for (String fruit : fruits) {
            if (!fruit.equals("Orange")) {
                System.out.println(fruit + " ");
            }
        }

        // Stream approach
        fruits.stream()
              .filter(fruit -> !fruit.equals("Orange"))
              .forEach(fruit -> System.out.println(fruit));
    }
}

The traditional approach to filtering out a single fruit is a classic for-each loop. The second approach uses a stream: the filter() method selects the stream elements that match a given predicate and returns them as a new stream.

In addition, this approach uses the forEach() method, which performs an action on each element of the returned stream. You can replace the lambda passed to it with something called a method reference. In Java 8, a method reference is a shortened syntax for a lambda expression that calls only one method.

The method reference syntax is very simple, and you can even replace the previous lambda expression .filter(fruit -> !fruit.equals("Orange")) with it:
Object::method;

Let's update the example and use the method references and see what it looks like:

import java.util.Arrays;
import java.util.List;

public class FilterExample {
    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "Banana", "Cherry", "Orange");

        fruits.stream()
              .filter(FilterExample::isNotOrange)
              .forEach(System.out::println);
    }

    // Predicate used through a method reference in the pipeline above
    private static boolean isNotOrange(String fruit) {
        return !fruit.equals("Orange");
    }
}

Streams are easier and more pleasant to use with lambda expressions, and this example shows how simple and clean the syntax looks compared to the traditional approach.

Mapping with Stream

The traditional approach would be to iterate through the list using an enhanced for loop:

List<String> models = Arrays.asList("BMW", "Audi", "Peugeot", "Fiat");
System.out.print("Imperative style: " + "\n");
for (String car : models) {
    if (!car.equals("Fiat")) {
        Car model = new Car(car);
        System.out.println(model);
    }
}

On the other hand, a more modern approach is to use a stream for the mapping:

List<String> models = Arrays.asList("Volkswagen", "Mercedes", "Lada", "Mazda");
System.out.print("Functional style: " + "\n");
models.stream()
      .filter(model -> !model.equals("Lada"))
      .map(Car::new) // Method reference approach
      // .map(model -> new Car(model)) // Lambda approach
      .forEach(System.out::println);

To illustrate the mapping, consider this class:

public class Car {
    private String name;

    public Car(String model) {
        this.name = model;
    }

    // getters and setters

    @Override
    public String toString() {
        return "name='" + name + "'";
    }
}

It is important to note that the list of models is a list of strings, not a list of cars. The map() method takes a function that accepts an object of type T and returns an object of type R. Essentially, we are turning each String into a Car.

If you run both versions on the same list with the same filter, the imperative style and the functional style print the same thing.

Collecting with Stream

 

Sometimes you want to convert a stream to a collection or a map. You can do this with the Collectors utility class and the methods it offers:

List<String> models = Arrays.asList("Volkswagen", "Mercedes", "Lada", "Mazda");
List<Car> carList = models.stream()
        .filter(model -> !model.equals("Fiat"))
        .map(Car::new)
        .collect(Collectors.toList());
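The example above only collects into a list. As a hedged sketch of the "collection or map" case, the following uses three other standard collectors (toMap, groupingBy, and joining) on the same model names; the class and method names are illustrative, and plain strings are used instead of Car to keep the example self-contained:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical example class; the names are illustrative only.
class CollectSketch {

    static final List<String> MODELS =
            Arrays.asList("Volkswagen", "Mercedes", "Lada", "Mazda");

    // Model name -> length of the name. toMap throws if two keys collide.
    static Map<String, Integer> nameLengths() {
        return MODELS.stream()
                .collect(Collectors.toMap(model -> model, String::length));
    }

    // First letter -> all models starting with that letter.
    static Map<Character, List<String>> byInitial() {
        return MODELS.stream()
                .collect(Collectors.groupingBy(model -> model.charAt(0)));
    }

    // All models as one comma-separated string.
    static String joined() {
        return MODELS.stream().collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(nameLengths().get("Lada")); // 4
        System.out.println(byInitial().get('M'));      // [Mercedes, Mazda]
        System.out.println(joined());
    }
}
```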

Matching with Stream

A classic task is to classify objects according to certain criteria. We can do this by testing the elements of a stream against a predicate and checking whether they are really what we need:

List<Car> models = Arrays.asList(new Car("Volkswagen", 2013), new Car("Mercedes", 2017), new Car("Lada", 2014));
boolean all = models.stream().allMatch(model -> model.getYear() > 2010);
System.out.println("Are all of the models newer than 2010?: " + all);
boolean any = models.stream().anyMatch(model -> model.getYear() > 2016);
System.out.println("Are there any models newer than 2016?: " + any);
boolean none = models.stream().noneMatch(model -> model.getYear() < 2010);
System.out.println("Are there no cars older than 2010?: " + none);
  1. allMatch() - returns true if all elements of this stream match the provided predicate.
  2. anyMatch() - returns true if any element of this stream matches the provided predicate.
  3. noneMatch() - returns true if no element of this stream matches the provided predicate.

In the previous code example, all three conditions are satisfied, so all three calls return true.

Why the order of operations in a stream matters

The beauty of Java streams lies in the ability to combine several operations into a "pipeline". A pipeline can replace most of the for loops in your code, especially those that simply move data from one data structure to another (for example, from List<YourObject> to Map<String, YourObject>). But you have to remember one thing: each step in the stream is called for an element until that element is rejected.

By filtering first, we keep the map/sorted operations to a minimum: filter 5 times, map 2 times, sort 1 time, and forEach 2 times, which saves us a total of 10 operations. With so few elements it may not seem like a big deal, but usually we are dealing with more than 5 elements, and the map operation can be expensive, so doing less is always better.

If necessary, prepare the data so that it is easier to filter. Filter first, if possible. Fewer items means fewer operations along the way. If you can't filter first, try to run the cheaper operations first, then filter, and only then run the more expensive ones.
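The effect of the ordering can be observed by counting how often the mapping function runs. The following is a minimal sketch under assumed data (five illustrative strings, two of which pass the filter); it is not the benchmark the figures above refer to, and the class and method names are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

// Hypothetical example class; the data and names are illustrative only.
class OperationOrderSketch {

    static final List<String> ITEMS = Arrays.asList("a1", "b2", "a3", "c4", "d5");

    // filter -> map: map only sees the elements that passed the filter.
    static int mapCallsFilterFirst() {
        AtomicInteger mapCalls = new AtomicInteger();
        ITEMS.stream()
                .filter(s -> s.startsWith("a"))
                .map(s -> { mapCalls.incrementAndGet(); return s.toUpperCase(); })
                .collect(Collectors.toList());
        return mapCalls.get();
    }

    // map -> filter: map runs for every element, even those rejected later.
    static int mapCallsMapFirst() {
        AtomicInteger mapCalls = new AtomicInteger();
        ITEMS.stream()
                .map(s -> { mapCalls.incrementAndGet(); return s.toUpperCase(); })
                .filter(s -> s.startsWith("A"))
                .collect(Collectors.toList());
        return mapCalls.get();
    }

    public static void main(String[] args) {
        System.out.println("filter first: " + mapCallsFilterFirst() + " map calls"); // 2
        System.out.println("map first:    " + mapCallsMapFirst() + " map calls");    // 5
    }
}
```

Both pipelines produce the same result; only the amount of work differs.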

Disadvantages of the Java Stream API

Parallel streams can really slow you down.

Parallelism was one of the most anticipated new features of Java 8. The parallelStream() method provides it for collections and streams. It breaks the work down into subtasks, which are then run in separate threads; they can go to different cores and are merged when they are finished. All this happens under the hood using the fork/join framework. Okay, sounds cool, this should speed up large data sets in multi-core environments, right?

No, it can actually make your code run slower if it's not used correctly. In one benchmark we ran about 15% slower, but it could have been worse. Let's say we already run several threads and use .parallelStream() in some of them, adding more and more work to the pool. This could easily turn into more than our cores can handle and slow things down due to increased context switching.

Why Jigsaw?

The goal of Project Jigsaw is to make Java modular and break the JRE into interoperable components. The motivation behind this comes primarily from the desire for a better, faster, and stronger embedded Java. I try to avoid mentioning the Internet of Things, but there, I said it. Smaller JARs, better performance, and increased security are some of the promises of this ambitious project.

So where is it? Jigsaw has just entered Phase 2: it has passed the exploratory phase and is now shifting to production-quality design and implementation, says Mark Reinhold, Oracle's chief Java architect. The project was originally planned to be completed in Java 8 and was delayed until Java 9, where it is expected to be one of the flagship new features.

Problems that still exist

Checked exceptions. No one likes boilerplate code, and that's one of the reasons why lambdas have become so popular. With checked exceptions, whether or not you logically need to catch the exception or do anything with it, you still have to catch it. This holds even for an exception that can never actually be thrown.
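As an illustration (not the example from the original article, which is not reproduced here), the sketch below uses URLEncoder.encode(String, String). It declares the checked UnsupportedEncodingException even though every Java platform is required to support UTF-8, so the catch block is pure boilerplate. Because the functional interfaces used by streams do not declare checked exceptions, the call also has to be wrapped before it can be used in map(). The class and method names are hypothetical:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; the names are illustrative only.
class CheckedExceptionSketch {

    // Wraps the checked exception so the method can be used in a lambda
    // or method reference inside a stream pipeline.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Cannot happen in practice: UTF-8 is always available.
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    static List<String> encodeAll(List<String> values) {
        return values.stream()
                .map(CheckedExceptionSketch::encode)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(encodeAll(Arrays.asList("a b", "x&y"))); // [a+b, x%26y]
    }
}
```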

Functional programming

Functional programming was possible in Java before, although it was rather inconvenient. Java 8 improves this, in part, with lambdas. This is highly welcome, but it is not the huge shift it was earlier portrayed to be. It is definitely more elegant than in Java 7, but some bending over backwards is still needed to be truly functional.

One of the most vehement critiques on the subject comes from Pierre-Yves Saumont, who in a series of posts takes a close look at the differences between functional programming paradigms and how they are implemented in Java.

So, Java or Scala? The adoption of more modern functional paradigms in Java is a sign of approval for Scala, which has been playing with lambdas for some time. Lambdas do make a lot of noise, but there are many more features, such as traits, lazy evaluation, and immutable objects, that make a big difference.

Default methods are distracting

Default methods allow you to provide a default implementation of a method in the interface itself. It's certainly one of the coolest new features in Java 8, but it somewhat interferes with the way we're used to doing things. So why was this introduced? And what should you not do with it?

The main motivation behind default methods was that if at some point we need to add a method to an existing interface, we can do so without rewriting its implementations. This keeps the interface compatible with older code.
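A minimal sketch of this motivation, with hypothetical interface and class names: greet() is imagined to have been added to Greeter after LegacyGreeter was written, and LegacyGreeter still compiles and works without any changes.

```java
// Hypothetical example; Greeter and its implementations are illustrative only.
class DefaultMethodSketch {

    interface Greeter {
        String name();

        // Imagined as added later: existing implementations inherit this body.
        default String greet() {
            return "Hello, " + name() + "!";
        }
    }

    // Written before greet() existed; needs no modification.
    static class LegacyGreeter implements Greeter {
        @Override
        public String name() {
            return "Gosha";
        }
    }

    // Newer implementations may still override the default.
    static class PoliteGreeter implements Greeter {
        @Override
        public String name() {
            return "Vlad";
        }

        @Override
        public String greet() {
            return "Good day, " + name() + ".";
        }
    }

    public static void main(String[] args) {
        System.out.println(new LegacyGreeter().greet()); // Hello, Gosha!
        System.out.println(new PoliteGreeter().greet()); // Good day, Vlad.
    }
}
```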

Functional Interfaces in Java 8 Streams

 

Some of the most commonly used functional interfaces in the Java 8 Stream API methods are:
Function and BiFunction: a Function represents a function that takes an argument of one type and returns a result of another type. A BiFunction does the same for two arguments.

Function<T, R> is the generic form, where T is the input type of the function and R is the result type of the function.

For processing primitive types, there are specialized function interfaces: ToIntFunction, ToLongFunction, ToDoubleFunction, ToIntBiFunction, ToLongBiFunction, ToDoubleBiFunction, LongToIntFunction, LongToDoubleFunction, IntToLongFunction, IntToDoubleFunction, etc.
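The following sketch shows one interface from each group in use. The constant and method names are illustrative assumptions; the interfaces themselves come from java.util.function:

```java
import java.util.Arrays;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.ToIntFunction;

// Hypothetical example class; the constant names are illustrative only.
class FunctionSketch {

    // Function<T, R>: takes a String, returns an Integer.
    static final Function<String, Integer> LENGTH = String::length;

    // BiFunction<T, U, R>: takes two Strings, returns an Integer.
    static final BiFunction<String, String, Integer> COMBINED_LENGTH =
            (a, b) -> a.length() + b.length();

    // ToIntFunction<T>: returns a primitive int, so no boxing is needed.
    static final ToIntFunction<String> LENGTH_AS_INT = String::length;

    // mapToInt accepts a ToIntFunction and yields an IntStream.
    static int totalLength(String... words) {
        return Arrays.stream(words).mapToInt(LENGTH_AS_INT).sum();
    }

    public static void main(String[] args) {
        System.out.println(LENGTH.apply("Java"));              // 4
        System.out.println(COMBINED_LENGTH.apply("ab", "cde")); // 5
        System.out.println(totalLength("a", "bb", "ccc"));      // 6
    }
}
```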

Java Stream: Intermediate and Terminal Operations

 

The Java Stream API operations that return a new stream are called intermediate operations. These operations are lazy: they do nothing until a terminal operation is invoked, and only then do they start producing new stream elements and passing them to the next operation. Intermediate operations never produce the final result. Commonly used intermediate operations are filter and map.

A terminal operation in the Java 8 Stream API is an operation that returns a result or produces a side effect. Once a terminal method is called on a stream, it consumes the stream, and after that the stream can no longer be used.

Terminal operations are eager in nature, meaning they process the elements of the stream before returning the result. Terminal methods include forEach, toArray, min, max, findFirst, anyMatch, allMatch, etc. You can identify terminal methods by their return type: they never return a stream.
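Both properties can be observed directly. In the sketch below (class and method names are illustrative), trace() records when the filter lambda actually runs, and reuseFails() checks that a second terminal operation on the same stream is rejected with an IllegalStateException:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical example class; the names are illustrative only.
class LazinessSketch {

    // Records the order in which the pipeline stages really execute.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream<String> pipeline = Stream.of("Apple", "Banana", "Cherry")
                .filter(s -> { log.add("filter " + s); return s.startsWith("B"); });
        log.add("pipeline built");                      // nothing has been filtered yet
        pipeline.forEach(s -> log.add("forEach " + s)); // terminal operation triggers the work
        return log;
    }

    // A stream can be consumed only once.
    static boolean reuseFails() {
        Stream<String> stream = Stream.of("a", "b");
        stream.count(); // first terminal operation consumes the stream
        try {
            stream.count(); // second one is not allowed
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
        System.out.println("reuse fails: " + reuseFails());
    }
}
```

Note that "pipeline built" is logged before any "filter" entry, and that each element travels through the whole pipeline before the next one is touched.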

Java stream short-circuiting operations

An intermediate operation is called short-circuiting if it can produce a finite stream from an infinite stream. For example, limit() is a short-circuiting intermediate operation. Note that skip(), although often mentioned alongside it, is not short-circuiting: it still passes on every remaining element.


A terminal operation is called short-circuiting if it can complete in finite time for an infinite stream. For example, anyMatch, allMatch, noneMatch, findFirst, and findAny are short-circuiting terminal operations.
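A minimal sketch of both kinds, using Stream.iterate as the infinite source (the class and method names are illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class; the names are illustrative only.
class ShortCircuitSketch {

    // limit() turns the infinite stream 0, 2, 4, ... into a finite one.
    static List<Integer> firstEvenNumbers(int n) {
        return Stream.iterate(0, i -> i + 2)
                .limit(n)
                .collect(Collectors.toList());
    }

    // anyMatch() stops as soon as one element satisfies the predicate,
    // so it terminates even though the source 1, 2, 3, ... never ends.
    static boolean hasValueAbove(int threshold) {
        return Stream.iterate(1, i -> i + 1)
                .anyMatch(i -> i > threshold);
    }

    public static void main(String[] args) {
        System.out.println(firstEvenNumbers(4)); // [0, 2, 4, 6]
        System.out.println(hasValueAbove(100));  // true
    }
}
```

Without limit(), the collect in the first method would never finish; a short-circuiting terminal operation such as anyMatch only terminates if some element eventually satisfies the predicate.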

What's New in Java 13

  • JEP 350 Dynamic CDS Archives

Java 10 introduced JEP 310, Application Class-Data Sharing. JEP 350 simplifies the process of creating CDS archives. This command creates a CDS archive file for the classes in the .jar when the application exits.

$ java -XX:ArchiveClassesAtExit=hello.jsa -cp hello.jar Hello

This command runs the .jar with an existing CDS archive.

$ bin/java -XX:SharedArchiveFile=hello.jsa -cp hello.jar Hello

Class Data Sharing (CDS) improves startup performance by archiving class data once and reusing it, so that the JVM doesn't have to recreate it.

  • JEP 351 ZGC: Uncommit Unused Memory

Java 11 introduced JEP 333, the Z Garbage Collector (experimental), which provides short pause times when cleaning up heap memory. However, it did not return unused heap memory to the operating system, even if that memory had been unused for a long time. JEP 351 enhances ZGC to return unused heap memory to the operating system.

  • JEP 353 Reimplement the Legacy Socket API

The underlying implementations of java.net.Socket and java.net.ServerSocket are an ancient mixture of legacy Java and C code, dating back to JDK 1.0, that is difficult to maintain and debug. This JEP introduces new underlying implementations for the socket APIs, which are the default implementation in Java 13.

Prior to Java 13, PlainSocketImpl was used as the SocketImpl.

ServerSocket.java
public class ServerSocket implements java.io.Closeable {
    /**
     * The implementation of this Socket.
     */
    private SocketImpl impl;

}

Java 13 introduced the new NioSocketImpl class as a replacement for PlainSocketImpl. However, if something goes wrong, we can still revert to the old PlainSocketImpl implementation by setting the system property jdk.net.usePlainSocketImpl.

Consider a simple example of a socket:

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class JEP353 {
    public static void main(String[] args) {
        try (ServerSocket serverSocket = new ServerSocket(8888)) {
            boolean running = true;
            while (running) {
                Socket clientSocket = serverSocket.accept();
                // do something with clientSocket
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Conclusion

Most people today use Java 8. Not everyone uses streams, though. Streams represent a new approach to programming and, together with lambda expressions, bring a touch of functional-style programming to Java, but that doesn't necessarily mean it's a better approach. They just offer a new way to do things.

It is up to developers to decide whether to rely on functional or imperative programming. With enough practice, a combination of both styles can help you improve your software.