Apache Camel: integrating systems with Java


Hi, dear readers! Welcome to my blog. In this post, we will talk about Apache Camel, a robust solution for deploying system integrations across various technologies, such as REST, WS, JMS, JDBC, AWS products, LDAP, SMTP, Elasticsearch etc.

So, let's get started!

Origin

Apache Camel was born inside the Apache ServiceMix project. Apache ServiceMix was a project powered by the Spring Framework and implemented following the JBI specification. The Java Business Integration (JBI) specification defines a plug-and-play platform for system integrations, following the EIP (Enterprise Integration Patterns) catalog.

Terminology

Exchange

Exchanges – which carry a MEP (Message Exchange Pattern) – are like frames in which we transport our data across integrations on Camel. An Exchange can hold two messages, one representing the input and another one representing the output of an integration.

The output message on Camel is optional, since we could have an integration that doesn't produce a response. Also, an Exchange can have properties, represented as key-value entries, that hold data meant to be used across the whole route (we will see more about routes very soon).

Message

Messages are the data itself that is transferred inside a Camel route. A Message has a body, which is the payload itself, and headers, which are, like properties on an Exchange, key-value entries that can be used along the processing.

One important aspect to keep in mind, however, is that along a Camel route our Messages can be replaced – when we convert the body with a type converter, for instance – and when this happens, we may lose our headers. So, Message headers must be seen as ephemeral data that will not necessarily survive the whole route. For data that must live through the whole route, it is better to use Exchange properties.
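As a minimal sketch of the difference (the class name, property and header names below are hypothetical, and Processors are covered later in this post), a Processor could store long-lived data as an Exchange property and transient data as a Message header:

import org.apache.camel.Exchange;
import org.apache.camel.Processor;

public class AuditProcessor implements Processor {

    @Override
    public void process(Exchange exchange) throws Exception {
        // an Exchange property survives body conversions along the route
        exchange.setProperty("correlationId", "abc-123");
        // a Message header is tied to the current message and may be lost if the message is replaced
        exchange.getIn().setHeader("receivedAt", System.currentTimeMillis());
    }
}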

The Message body can be made of several types of data, such as binaries, JSON, etc.

Camel context

The Camel context is the runtime container where Camel runs. It initializes type converters, routes, endpoints, EIPs etc.

A Camel context has 3 possible statuses: started, suspended and stopped. When started, the context will serve the routes' processing as normal.

When suspended, the Camel context will stop the processing – after the Exchanges already in flight are completed – but will keep all the caches, resources etc. still loaded. A suspended context can be restarted.

Finally, there's the stopped status. When stopped, the context will stop the processing just like when suspended, but will also release all the resources, caches etc., making a complete shutdown. As with the suspended status, Camel will also guarantee that all the Exchanges being processed are finished before the shutdown.

Route

Routes on Camel are the heart of the processing. A route consists of a flow that starts on an endpoint, passes through a stream of processors/converters and finishes on another endpoint. It is possible to chain routes by calling another route as the final endpoint of a previous route.

A route can also use other features, such as EIPs, asynchronous and parallel processing.
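As a minimal sketch of how a route looks in the Java DSL (the folder path and endpoint options below are just illustrative), a route is declared inside a RouteBuilder:

import org.apache.camel.builder.RouteBuilder;

public class SimpleFileRoute extends RouteBuilder {

    @Override
    public void configure() throws Exception {
        // polls a folder, converts each file's content to String and logs it
        from("file:/tmp/input?delete=true")
                .convertBodyTo(String.class)
                .to("log:com.example.simple?level=INFO");
    }
}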

Channel

When Camel executes a route, the controller through which it executes the route is called a Channel.

A Channel is responsible for chaining the processors' execution, passing the Exchange from one to another, alongside monitoring the route execution. It also allows us to implement interceptors to run custom logic on certain route events, such as when an Exchange is sent to a specific Endpoint.

Processor

Processors are the primary extension points on Camel. By creating classes that implement the org.apache.camel.Processor interface, we create programming units that we can use to include our own code on a Camel route, inside a convenient process method.
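A minimal sketch of a custom Processor (the class name and the upper-casing logic are just an example):

import org.apache.camel.Exchange;
import org.apache.camel.Processor;

public class UpperCaseProcessor implements Processor {

    @Override
    public void process(Exchange exchange) throws Exception {
        // reads the incoming body and replaces it with an upper-cased version
        String body = exchange.getIn().getBody(String.class);
        exchange.getIn().setBody(body.toUpperCase());
    }
}

Inside a route, such a processor can be plugged in with .process(new UpperCaseProcessor()).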

Component

A Component acts like a factory that instantiates Endpoints for our use. We don't use a Component directly; instead, we reference it by defining an Endpoint URI, from which Camel infers the Component it needs to use in order to create the Endpoint.

Camel provides dozens of Components, from file and JMS to connectors for AWS products and so on.

Registry

In order to use beans from IoC systems, such as OSGi, Spring and JNDI, Camel supplies us with a bean Registry. The Registry's mission is to resolve the beans referred to on Camel routes against the ones created on its associated context, such as an OSGi container, a Spring context etc.

Type converter

Type converters, as the name implies, are used in order to convert the body of a message from one type to another. The uses for a converter are varied, ranging from converting a binary format to a String to converting XML to JSON.

We can create our own type converter by implementing the org.apache.camel.TypeConverters interface (as we will do on the lab). After creating our converter, we need to register it on the type converter registry.

Endpoint

An Endpoint is the entity responsible for communication into or out of a Camel route's execution process. It comprises several types of sources and destinations, as mentioned before, such as SQS, files, relational databases and so on. An Endpoint is instantiated and configured by providing a URI to a Camel route, following the pattern below:

component:resource?option1=value1&option2=value2
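For instance, a file endpoint polling a hypothetical folder could be declared with a URI like this (path and options are illustrative):

file:/tmp/input?delay=1000&charset=utf-8&delete=true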

We can create our own Endpoints by implementing the org.apache.camel.Endpoint interface. When implementing the interface, we need to supply the logic to create a polling consumer, an event-driven consumer and a producer.

Lab

So, without further delay, let's start our lab! In this lab, we will create a route that polls an access-log-style file from a folder, sends the log lines to an SQS queue and backs the file up on S3.

Setup

The setup for our lab is pretty simple: it is a Spring Boot application, configured to work with Camel. Our build.gradle file is as follows:

apply plugin: 'java'
apply plugin: 'eclipse'
apply plugin: 'org.springframework.boot'
apply plugin: 'maven'
apply plugin: 'idea'

jar {
    baseName = 'apache-camel-handson'
    version = '1.0'
}

project.ext {
    springBootVersion = '1.5.4.RELEASE'
    camelVersion = '2.18.3'

}

sourceCompatibility = 1.8
targetCompatibility = 1.8

repositories {
    mavenLocal()
    mavenCentral()
}

bootRun {
    systemProperties = System.properties
}


dependencies {

    compile group: 'org.apache.camel', name: 'camel-spring-boot-starter', version: camelVersion
    compile group: 'org.apache.camel', name: 'camel-commands-spring-boot', version: camelVersion
    compile group: 'org.apache.camel',name: 'camel-aws', version: camelVersion
    compile group: 'org.apache.camel',name: 'camel-mail', version: camelVersion
    compile group: 'org.springframework.boot', name: 'spring-boot-autoconfigure', version: springBootVersion
    

}
group 'com.alexandreesl.handson'
version '1.0'

buildscript {
    repositories {
        mavenLocal()
        maven {
            url "https://plugins.gradle.org/m2/"
        }
        mavenCentral()
    }
    dependencies {
        classpath("org.springframework.boot:spring-boot-gradle-plugin:1.5.4.RELEASE")
    }
}

And the Java main file is a simple Java Spring Boot Application file, as follows:

package com.alexandreesl.handson;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.ComponentScan;

/**
 * Created by alexandrelourenco on 28/06/17.
 */

@ComponentScan(basePackages = {"com.alexandreesl.handson"})
@SpringBootApplication
@EnableAutoConfiguration
public class ApacheCamelHandsonApp {

    public static void main(String[] args) {
        SpringApplication.run(ApacheCamelHandsonApp.class, args);
    }

}

We also create a configuration class, where we will register a type converter that we will build in the next section:

package com.alexandreesl.handson.configuration;

import org.apache.camel.CamelContext;
import org.apache.camel.spring.boot.CamelContextConfiguration;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CamelConfiguration {


    @Bean
    public CamelContextConfiguration camelContextConfiguration() {

        return new CamelContextConfiguration() {

            @Override
            public void beforeApplicationStart(CamelContext camelContext) {
               

            }

            @Override
            public void afterApplicationStart(CamelContext camelContext) {

            }

        };

    }

}

We also create a configuration class which provides an AmazonS3Client and an AmazonSQSClient, which will be used by the AWS-S3 and AWS-SQS Camel endpoints:

package com.alexandreesl.handson.configuration;

import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.internal.StaticCredentialsProvider;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.sqs.AmazonSQSClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;

@Configuration
public class AWSConfiguration {


    @Autowired
    private Environment environment;

    @Bean(name = "s3Client")
    public AmazonS3Client s3Client() {
        return new AmazonS3Client(staticCredentialsProvider()).withRegion(Regions.fromName("us-east-1"));
    }

    @Bean(name = "sqsClient")
    public AmazonSQSClient sqsClient() {
        return new AmazonSQSClient(staticCredentialsProvider()).withRegion(Regions.fromName("us-east-1"));
    }

    @Bean
    public StaticCredentialsProvider staticCredentialsProvider() {
        return new StaticCredentialsProvider(new BasicAWSCredentials("<access key>", "<secret access key>"));
    }

}

PS: this lab assumes that the reader is familiar with AWS and already has an account. For the lab, a bucket called “apache-camel-handson” and an SQS queue called “MyInputQueue” were created.

Configuring the route

Now that we have our Camel environment set up, let’s begin creating our route. First, we create a type converter called “StringToAccessLogDTOConverter” with the following code:

package com.alexandreesl.handson.converters;

import com.alexandreesl.handson.dto.AccessLogDTO;
import org.apache.camel.Converter;
import org.apache.camel.TypeConverters;

import java.util.StringTokenizer;

/**
 * Created by alexandrelourenco on 30/06/17.
 */

public class StringToAccessLogDTOConverter implements TypeConverters {

    @Converter
    public AccessLogDTO convert(String row) {

        AccessLogDTO dto = new AccessLogDTO();

        StringTokenizer tokens = new StringTokenizer(row);

        dto.setIp(tokens.nextToken());
        dto.setUrl(tokens.nextToken());
        dto.setHttpMethod(tokens.nextToken());
        dto.setDuration(Long.parseLong(tokens.nextToken()));

        return dto;

    }

}

Next, we change our Camel configuration, registering the converter:

package com.alexandreesl.handson.configuration;

import com.alexandreesl.handson.converters.StringToAccessLogDTOConverter;
import org.apache.camel.CamelContext;
import org.apache.camel.spring.boot.CamelContextConfiguration;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CamelConfiguration {


    @Bean
    public CamelContextConfiguration camelContextConfiguration() {

        return new CamelContextConfiguration() {

            @Override
            public void beforeApplicationStart(CamelContext camelContext) {

                camelContext.getTypeConverterRegistry().addTypeConverters(new StringToAccessLogDTOConverter());

            }

            @Override
            public void afterApplicationStart(CamelContext camelContext) {

            }

        };

    }

}

Our converter reads a String and converts it to a DTO with the following attributes:

package com.alexandreesl.handson.dto;

/**
 * Created by alexandrelourenco on 30/06/17.
 */
public class AccessLogDTO {

    private String ip;

    private String url;

    private String httpMethod;

    private long duration;

    public String getIp() {
        return ip;
    }

    public void setIp(String ip) {
        this.ip = ip;
    }

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }

    public String getHttpMethod() {
        return httpMethod;
    }

    public void setHttpMethod(String httpMethod) {
        this.httpMethod = httpMethod;
    }

    public long getDuration() {
        return duration;
    }

    public void setDuration(long duration) {
        this.duration = duration;
    }

    @Override
    public String toString() {

        StringBuffer buffer = new StringBuffer();
        buffer.append("[");
        buffer.append(ip);
        buffer.append(",");
        buffer.append(url);
        buffer.append(",");
        buffer.append(httpMethod);
        buffer.append(",");
        buffer.append(duration);
        buffer.append("]");

        return buffer.toString();

    }
}

Finally, we have our route, defined on our RouteBuilder:

package com.alexandreesl.handson.routes;

import com.alexandreesl.handson.dto.AccessLogDTO;
import org.apache.camel.LoggingLevel;
import org.apache.camel.spring.SpringRouteBuilder;
import org.springframework.context.annotation.Configuration;

/**
 * Created by alexandrelourenco on 30/06/17.
 */

@Configuration
public class MyFirstCamelRoute extends SpringRouteBuilder {


    @Override
    public void configure() throws Exception {

        from("file:/Users/alexandrelourenco/Documents/apachecamelhandson?delay=1000&charset=utf-8&delete=true")
                .setHeader("CamelAwsS3Key", header("CamelFileName"))
                .to("aws-s3:arn:aws:s3:::apache-camel-handson?amazonS3Client=#s3Client")
                .convertBodyTo(String.class)
                .split().tokenize("\n")
                    .convertBodyTo(AccessLogDTO.class)
                    .log(LoggingLevel.INFO, "${body}")
                    .to("aws-sqs://MyInputQueue?amazonSQSClient=#sqsClient");

    }
}

In the route above, we define a file endpoint that polls for files on a folder every second and removes the file if processing completes successfully. Then we send the file to Amazon, using S3 as backup storage.

Next, we split the file using a splitter, which generates a String for each line of the file. For each line, we convert the line to a DTO, log the data and finally send the data to an SQS queue.

Now that we have our code done, let’s run it!

Running

First, we start our Camel route. To do this, we simply run the main Spring Boot class, as we would do with any common Java program.

After firing up Spring Boot, we should see on our console the output showing that the route was successfully started:

. ____ _ __ _ _
 /\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/ ___)| |_)| | | | | || (_| | ) ) ) )
 ' |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot :: (v1.5.4.RELEASE)

2017-07-01 12:52:02.224 INFO 3042 --- [ main] c.a.handson.ApacheCamelHandsonApp : Starting ApacheCamelHandsonApp on Alexandres-MacBook-Pro.local with PID 3042 (/Users/alexandrelourenco/Applications/git/apache-camel-handson/build/classes/main started by alexandrelourenco in /Users/alexandrelourenco/Applications/git/apache-camel-handson)
2017-07-01 12:52:02.228 INFO 3042 --- [ main] c.a.handson.ApacheCamelHandsonApp : No active profile set, falling back to default profiles: default
2017-07-01 12:52:02.415 INFO 3042 --- [ main] s.c.a.AnnotationConfigApplicationContext : Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@475e586c: startup date [Sat Jul 01 12:52:02 BRT 2017]; root of context hierarchy
2017-07-01 12:52:03.325 INFO 3042 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'org.apache.camel.spring.boot.CamelAutoConfiguration' of type [org.apache.camel.spring.boot.CamelAutoConfiguration$$EnhancerBySpringCGLIB$$72a2a9b] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2017-07-01 12:52:09.520 INFO 3042 --- [ main] o.a.c.i.converter.DefaultTypeConverter : Loaded 192 type converters
2017-07-01 12:52:09.612 INFO 3042 --- [ main] roperties$SimpleAuthenticationProperties :

Using default password for shell access: b738eab1-6577-4f9b-9a98-2f12eae59828




2017-07-01 12:52:15.463 WARN 3042 --- [ main] tarterDeprecatedWarningAutoConfiguration : spring-boot-starter-remote-shell is deprecated as of Spring Boot 1.5 and will be removed in Spring Boot 2.0
2017-07-01 12:52:15.511 INFO 3042 --- [ main] o.s.j.e.a.AnnotationMBeanExporter : Registering beans for JMX exposure on startup
2017-07-01 12:52:15.519 INFO 3042 --- [ main] o.s.c.support.DefaultLifecycleProcessor : Starting beans in phase 0
2017-07-01 12:52:15.615 INFO 3042 --- [ main] o.a.camel.spring.boot.RoutesCollector : Loading additional Camel XML routes from: classpath:camel/*.xml
2017-07-01 12:52:15.615 INFO 3042 --- [ main] o.a.camel.spring.boot.RoutesCollector : Loading additional Camel XML rests from: classpath:camel-rest/*.xml
2017-07-01 12:52:15.616 INFO 3042 --- [ main] o.a.camel.spring.SpringCamelContext : Apache Camel 2.18.3 (CamelContext: camel-1) is starting
2017-07-01 12:52:15.618 INFO 3042 --- [ main] o.a.c.m.ManagedManagementStrategy : JMX is enabled
2017-07-01 12:52:25.695 INFO 3042 --- [ main] o.a.c.i.DefaultRuntimeEndpointRegistry : Runtime endpoint registry is in extended mode gathering usage statistics of all incoming and outgoing endpoints (cache limit: 1000)
2017-07-01 12:52:25.810 INFO 3042 --- [ main] o.a.camel.spring.SpringCamelContext : StreamCaching is not in use. If using streams then its recommended to enable stream caching. See more details at http://camel.apache.org/stream-caching.html
2017-07-01 12:52:27.853 INFO 3042 --- [ main] o.a.camel.spring.SpringCamelContext : Route: route1 started and consuming from: file:///Users/alexandrelourenco/Documents/apachecamelhandson?charset=utf-8&delay=1000&delete=true
2017-07-01 12:52:27.854 INFO 3042 --- [ main] o.a.camel.spring.SpringCamelContext : Total 1 routes, of which 1 are started.
2017-07-01 12:52:27.854 INFO 3042 --- [ main] o.a.camel.spring.SpringCamelContext : Apache Camel 2.18.3 (CamelContext: camel-1) started in 12.238 seconds
2017-07-01 12:52:27.858 INFO 3042 --- [ main] c.a.handson.ApacheCamelHandsonApp : Started ApacheCamelHandsonApp in 36.001 seconds (JVM running for 36.523)

PS: Don't forget to replace the access key and secret with your own!

Now, to test it, we place a file in the polling folder. For testing, we create a file with content like the following:

10.12.64.3 /api/v1/test1 POST 123
10.12.67.3 /api/v1/test2 PATCH 125
10.15.64.3 /api/v1/test3 GET 166
10.120.64.23 /api/v1/test1 POST 100

We put a file with that content in the folder and, after one second, the file is gone! Where did it go?

If we check the Amazon S3 bucket interface, we will see that the file was created on the storage:

 

Screen Shot 2017-07-01 at 13.12.39

And if we check the Amazon SQS interface, we will see 4 messages on the queue, proving that our integration is a success:

Screen Shot 2017-07-01 at 13.38.12

If we check the messages, we will see that Camel correctly parsed the information from the file, as we can see in the example below:

[10.12.64.3,/api/v1/test1,POST,123]

Implementing Error Handling

On Camel, we can implement logic designed for handling errors. This is also done by defining routes, whose inputs are the exceptions thrown by the other routes.

In our lab, let's implement error handling. First, we add an option on the file endpoint that moves the file to a .error folder when an error occurs, and then we send an email to ourselves to alert us of the failure. We can do this by changing the route as follows:

package com.alexandreesl.handson.routes;

import com.alexandreesl.handson.dto.AccessLogDTO;
import org.apache.camel.LoggingLevel;
import org.apache.camel.spring.SpringRouteBuilder;
import org.springframework.context.annotation.Configuration;

/**
 * Created by alexandrelourenco on 30/06/17.
 */

@Configuration
public class MyFirstCamelRoute extends SpringRouteBuilder {


    @Override
    public void configure() throws Exception {

        onException(Exception.class)
                .handled(false)
                .log(LoggingLevel.ERROR, "An Error processing the file!")
                .to("smtps://smtp.gmail.com:465?password=xxxxxxxxxxxxxxxx&username=alexandreesl@gmail.com&subject=A error has occurred!");

        from("file:/Users/alexandrelourenco/Documents/apachecamelhandson?delay=1000&charset=utf-8&delete=true&moveFailed=.error")
                .setHeader("CamelAwsS3Key", header("CamelFileName"))
                .to("aws-s3:arn:aws:s3:::apache-camel-handson?amazonS3Client=#s3Client")
                .convertBodyTo(String.class)
                .split().tokenize("\n")
                    .convertBodyTo(AccessLogDTO.class)
                    .log(LoggingLevel.INFO, "${body}")
                    .to("aws-sqs://MyInputQueue?amazonSQSClient=#sqsClient");

    }
}

Then, we restart the route and feed it a file like the following, which will cause a parsing exception:

10.12.64.3 /api/v1/test1 POST 123
10.12.67.3 /api/v1/test2 PATCH 125
10.15.64.3 /api/v1/test3 GET 166
10.120.64.23 /api/v1/test1 POST 10a

After the processing, we can check the console and watch how the error was handled:

2017-07-01 14:18:48.695  INFO 3230 --- [           main] o.a.camel.spring.SpringCamelContext      : Apache Camel 2.18.3 (CamelContext: camel-1) started in 11.899 seconds
2017-07-01 14:18:48.699  INFO 3230 --- [           main] c.a.handson.ApacheCamelHandsonApp        : Started ApacheCamelHandsonApp in 35.612 seconds (JVM running for 36.052)
2017-07-01 14:18:52.737  WARN 3230 --- [checamelhandson] c.amazonaws.services.s3.AmazonS3Client   : No content length specified for stream data.  Stream contents will be buffered in memory and could result in out of memory errors.
2017-07-01 14:18:53.105  INFO 3230 --- [checamelhandson] route1                                   : [10.12.64.3,/api/v1/test1,POST,123]
2017-07-01 14:18:53.294  INFO 3230 --- [checamelhandson] route1                                   : [10.12.67.3,/api/v1/test2,PATCH,125]
2017-07-01 14:18:53.504  INFO 3230 --- [checamelhandson] route1                                   : [10.15.64.3,/api/v1/test3,GET,166]
2017-07-01 14:18:53.682 ERROR 3230 --- [checamelhandson] route1                                   : An Error processing the file!
2017-07-01 14:19:02.058 ERROR 3230 --- [checamelhandson] o.a.camel.processor.DefaultErrorHandler  : Failed delivery for (MessageId: ID-Alexandres-MacBook-Pro-local-52251-1498929510223-0-9 on ExchangeId: ID-Alexandres-MacBook-Pro-local-52251-1498929510223-0-10). Exhausted after delivery attempt: 1 caught: org.apache.camel.InvalidPayloadException: No body available of type: com.alexandreesl.handson.dto.AccessLogDTO but has value: 10.120.64.23 /api/v1/test1 POST 10a of type: java.lang.String on: Message[ID-Alexandres-MacBook-Pro-local-52251-1498929510223-0-9]. Caused by: Error during type conversion from type: java.lang.String to the required type: com.alexandreesl.handson.dto.AccessLogDTO with value 10.120.64.23 /api/v1/test1 POST 10a due java.lang.NumberFormatException: For input string: "10a". Exchange[ID-Alexandres-MacBook-Pro-local-52251-1498929510223-0-10]. Caused by: [org.apache.camel.TypeConversionException - Error during type conversion from type: java.lang.String to the required type: com.alexandreesl.handson.dto.AccessLogDTO with value 10.120.64.23 /api/v1/test1 POST 10a due java.lang.NumberFormatException: For input string: "10a"]. Processed by failure processor: FatalFallbackErrorHandler[Pipeline[[Channel[Log(route1)[An Error processing the file!]], Channel[sendTo(smtps://smtp.gmail.com:465?password=xxxxxx&subject=A+error+has+occurred%21&username=alexandreesl%40gmail.com)]]]]

Message History
---------------------------------------------------------------------------------------------------------------------------------------
RouteId              ProcessorId          Processor                                                                        Elapsed (ms)
[route1            ] [route1            ] [file:///Users/alexandrelourenco/Documents/apachecamelhandson?charset=utf-8&del] [      9323]
[route1            ] [convertBodyTo2    ] [convertBodyTo[com.alexandreesl.handson.dto.AccessLogDTO]                      ] [      8370]
[route1            ] [log1              ] [log                                                                           ] [         1]
[route1            ] [to1               ] [smtps://smtp.gmail.com:xxxxxx@gmail.com&subject=A error ha                    ] [      8366]

Stacktrace
---------------------------------------------------------------------------------------------------------------------------------------
org.apache.camel.InvalidPayloadException: No body available of type: com.alexandreesl.handson.dto.AccessLogDTO but has value: 10.120.64.23 /api/v1/test1 POST 10a of type: java.lang.String on: Message[ID-Alexandres-MacBook-Pro-local-52251-1498929510223-0-9]. Caused by: Error during type conversion from type: java.lang.String to the required type: com.alexandreesl.handson.dto.AccessLogDTO with value 10.120.64.23 /api/v1/test1 POST 10a due java.lang.NumberFormatException: For input string: "10a". Exchange[ID-Alexandres-MacBook-Pro-local-52251-1498929510223-0-10]. Caused by: [org.apache.camel.TypeConversionException - Error during type conversion from type: java.lang.String to the required type: com.alexandreesl.handson.dto.AccessLogDTO with value 10.120.64.23 /api/v1/test1 POST 10a due java.lang.NumberFormatException: For input string: "10a"]
    at org.apache.camel.impl.MessageSupport.getMandatoryBody(MessageSupport.java:107) ~[camel-core-2.18.3.jar:2.18.3]
    at org.apache.camel.processor.ConvertBodyProcessor.process(ConvertBodyProcessor.java:91) ~[camel-core-2.18.3.jar:2.18.3]
    at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77) [camel-core-2.18.3.jar:2.18.3]
    at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:542) [camel-core-2.18.3.jar:2.18.3]
    at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:197) [camel-core-2.18.3.jar:2.18.3]

If we look at the folder, we will see that a .error folder was created and the file was moved into it:

Screen Shot 2017-07-01 at 14.25.18

And if we check the mailbox, we will see that we received the failure alert:

Screen Shot 2017-07-01 at 14.27.40

Conclusion

And so we conclude our tour through Apache Camel. With an easy-to-use architecture and dozens of components, it is a highly pluggable and robust option for integration development. Thank you for following me on this post, until next time.


Refactoring: Improving the Design of Existing Code (book review)


Hi, dear readers! Welcome to my blog. In this post, I will review a famous book by Martin Fowler, which focuses on refactoring techniques. But after all, why does refactoring matter?

Definition

According to Fowler, a refactoring consists of modifying code in order to improve its readability and capacity to change, without changing its behavior. When refactoring, our objective is to make the code easier for humans to read and also to improve its structure and design, making changes motivated by business rules easier to implement. Other benefits are that cleaner code makes it easier to spot bugs, alongside speeding up the development of new code on top of well-organized production code.

When to refactor?

Fowler argues that refactoring should be done in three situations:

  • When you add a new functionality;
  • When you find a bug;
  • When you do a code review;

In these situations, you are already forced to make changes to the code structure, which makes them ideal opportunities for refactoring.

Pitfalls

When refactoring, there are some common pitfalls that can hinder the work. The most common ones are databases and the interfaces exposed by the code.

Database schemas can be hard to change, especially if the database is old, with millions of rows. Changes there produce a ripple effect on the code that manipulates the database, making code changes more difficult. Interfaces (APIs, libraries or even a single class inside a component) can also be a challenge to refactor, since a change on an interface could cascade into changes on lots of client code.

In order to solve this problem, the best approach for databases is to isolate the database logic in its own layer, allowing the “dirtier” code to be evolved in a more controlled manner. As for the interfaces issue, the best approach is to allow the old and new interfaces to coexist while migration work is conducted.

When not to refactor?

According to Fowler, there is one situation when you shouldn't refactor: when the code is so bad that it is better to rewrite it from scratch. It is difficult to measure when the code is bad enough to be rewritten. Some good hints are when the code is infested with bugs, or when it has so many refactoring points that fixing it would end up rewriting most of the code anyway.

Refactoring and performance

Sometimes a refactoring may cause some performance degradation. Of course, it is up to the business to evaluate whether this degradation is acceptable for the requirements, but as a general rule, we can assume that more organized code is easier to fine-tune. So, if we refactor first and improve readability and design, it will be easier to do performance tuning later.

Unit testing

Another key point defended by Fowler is the need to develop unit tests for the code. With unit tests, we can refactor in small steps (“baby steps”), receiving rapid feedback from the tests, so if anything breaks we can quickly and easily fix it during the refactoring process.

Refactoring catalog

Here are brief descriptions of some of the refactoring patterns that I found most interesting. Complete descriptions with examples can be found in Fowler's book, which you can find through the links at the end of this post.

Extract Method

This refactoring consists of taking some code that can be grouped together and extracting it into its own method, improving readability.
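A small sketch in the spirit of the pattern (the class and field names are invented here, not taken from the book):

class Customer {

    private String name = "John";

    // before: banner and details are tangled in one method
    void printOwing(double amount) {
        System.out.println("*****************");
        System.out.println("**** Customer ****");
        System.out.println("*****************");
        System.out.println("name: " + name);
        System.out.println("amount: " + amount);
    }

    // after: each group of statements is extracted to its own method
    void printOwingRefactored(double amount) {
        printBanner();
        printDetails(amount);
    }

    private void printBanner() {
        System.out.println("*****************");
        System.out.println("**** Customer ****");
        System.out.println("*****************");
    }

    private void printDetails(double amount) {
        System.out.println("name: " + name);
        System.out.println("amount: " + amount);
    }
}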

Introduce Explaining Variable

This refactoring consists of taking a big and complex conditional and simplifying it by turning its operands into variables, making the conditional more self-explanatory.
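A small sketch of the idea (the class name and the condition itself are invented for illustration):

class PlanChecker {

    // before: return platform.toUpperCase().contains("MAC")
    //                 && browser.toUpperCase().contains("IE")
    //                 && resize > 0;
    boolean isSpecialDeal(String platform, String browser, int resize) {
        // after: each operand becomes an explaining variable
        boolean isMacOs = platform.toUpperCase().contains("MAC");
        boolean isIeBrowser = browser.toUpperCase().contains("IE");
        boolean wasResized = resize > 0;
        return isMacOs && isIeBrowser && wasResized;
    }
}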

Replace Method with Method Object

This refactoring applies when you have a method from which you would like to extract code into its own method, but the code refers to so many local variables that the extraction becomes hard. In this case, we take the variables and the method and move them to a new object, creating an easier environment in which to apply Extract Method.

Move Method

This refactoring consists of moving a method from one class to another. This makes sense when the old class uses the method less than the class it is moved to.

Extract Class

This refactoring applies when you have a class that is doing work that could be better organized if divided in two. In this case, we take the behavior and data (methods and fields) that could form a new class, move them there, and delegate from the old one.

Remove Middle Man

This refactoring applies when a class has lots of methods that just delegate to another class, which introduces unnecessary code. In this case, we create an accessor to the delegate instance itself, so the callers can call its methods directly, and after creating the accessor we remove all the delegating methods.

Consolidate Conditional Expression

This refactoring applies when you have several conditionals that return the same value. In this case, we consolidate the conditionals into a single one, commonly by creating a method, making the code clearer and simpler.

Remove Control Flag

This refactoring applies when you have a boolean flag that controls the behavior inside a loop. By using control statements like break and continue, we can remove the control flag, simplifying the code.

Replace Conditional with Polymorphism

This refactoring applies when you have a method on a class whose behavior is conditional on the type of the object. In this case, we extract each leg of the conditional and create a subclass around each different behavior, until the original method turns out to be empty, at which point we make the method abstract on the now superclass of the hierarchy. We may also have to change the constructor of the class into a factory method.
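A small sketch of the end result (the employee hierarchy is an invented example):

abstract class Employee {

    protected final double monthlySalary;

    Employee(double monthlySalary) {
        this.monthlySalary = monthlySalary;
    }

    // the old switch over the employee type becomes an abstract method
    abstract double payAmount();
}

class Engineer extends Employee {

    Engineer(double monthlySalary) {
        super(monthlySalary);
    }

    @Override
    double payAmount() {
        return monthlySalary;
    }
}

class Manager extends Employee {

    private final double bonus;

    Manager(double monthlySalary, double bonus) {
        super(monthlySalary);
        this.bonus = bonus;
    }

    @Override
    double payAmount() {
        return monthlySalary + bonus;
    }
}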

Introduce Null Object

This refactoring applies when the callers of an object perform lots of null checks on its data. In this case, we create an object to represent null, which returns all the default data that should be used when the data is null. This way, we don't have to check for null in the callers anymore, since the behavior of the null object covers those cases.
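A small sketch of a null object (the Customer class and the "occupant" default are invented for illustration):

class Customer {

    private final String name;

    Customer(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // returned wherever the old code would have returned null
    static Customer newNull() {
        return new NullCustomer();
    }
}

class NullCustomer extends Customer {

    NullCustomer() {
        // default data used when there is no real customer
        super("occupant");
    }
}

With this in place, callers can simply do customer.getName() without any null check.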

Preserve Whole Object

This refactoring applies when a method call is preceded by calls to several data accessors to get lots of data from another object to pass as parameters. In this case, we change the method to receive the object itself, moving the accessor calls inside the method. This refactoring not only simplifies the code on the caller's side, but also simplifies changes if the method needs more data from the passed object in the future.

Replace Constructor with Factory Method

This refactoring applies when we want to include more behavior in a constructor than it normally has. This is especially true in class hierarchies, where the object construction must produce the corresponding subclass depending on the type of the object. In this case, we change the constructors to more restrictive access (private or protected) and create a static factory method at the top of the hierarchy, allowing the dynamic creation of the objects.

Replace Error Code with Exception

This refactoring comes in handy when we have code that returns error codes when something breaks. Error codes are common in environments such as Unix and languages like C, but in Java we have a much more powerful tool: exceptions. With exceptions, we can easily separate the code that handles the errors from the normal code. So in this case, our best approach is to change the error code returns into exception throws, which makes for a much more readable and organized structure.

Replace Exception with Test

This refactoring applies when we have code in a try-catch block whose catch block contains logic that could be performed as a check before the error occurs. This is typically found when we have expected errors that occur in some cases, but we use the catch block as part of our program's logic. By changing the logic in the catch block into a test (if) before the code that breaks, we can remove the try-catch altogether, making the code more readable and consistent, and no longer reliant on errors to work.
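A small sketch of the change (the class and array are invented for illustration):

class PeriodValues {

    private final double[] values = {10.0, 20.0, 30.0};

    // before: the catch block is part of the normal logic
    double getValueForPeriod(int period) {
        try {
            return values[period];
        } catch (ArrayIndexOutOfBoundsException e) {
            return 0;
        }
    }

    // after: a plain test replaces the try-catch
    double getValueForPeriodRefactored(int period) {
        if (period >= values.length) {
            return 0;
        }
        return values[period];
    }
}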

Extract Subclass

This refactoring applies when we have some methods and fields that are used only by some instances of the class. In this case, we move those methods and fields to a new subclass, where they can be better organized and maintained.

Extract Superclass

This refactoring is the opposite of the previous one, since we create a superclass instead of a subclass. If we have identified common behavior in two different classes, we create a superclass with the common behavior and make both of them subclasses of the created superclass.

Form Template Method

This refactoring applies when we have two classes that have methods with equal or very similar logic that needs to be called in a certain order. In this case, we make the signatures of the methods of both classes equal and create a superclass to which we move the common methods, along with an orchestration method that defines the order of the calls. This improves reusability and the organization of the hierarchy.

Conclusion

And so we conclude our introduction to Martin Fowler's Refactoring book. With good didactics and good examples, the book is a must-read that I highly recommend!

Thank you for following me on this post, until next time.

Buy the book now!

Curator: Implementing purge routines on your Elasticsearch cluster


Hi, dear readers! Welcome to my blog. In this post, we will learn how to use the Curator project to create purge routines on an Elasticsearch cluster.

When we have a cluster crunching logs and other data from our systems, it is necessary to configure processes that manage this data, performing actions like purges and backups. For this purpose, the Curator project comes in handy.

Curator is a Python tool that allows several types of actions. In this post, we will focus on two actions: purge and backup. To install Curator, we can use pip, as in the command below:

sudo pip install elasticsearch-curator

Once installed, let's begin preparing our cluster to make the backups by creating a backup repository. A backup repository is an Elasticsearch feature that processes backups and saves them to a persistent store. In this case, we will configure the backups to be stored in an Amazon S3 bucket. First, let's install the AWS Cloud plugin for Elasticsearch, by running the following command on each of the cluster's nodes:

bin/plugin install cloud-aws

And before we restart our nodes, we set the AWS credentials the cluster will use to connect to AWS, by configuring them in the elasticsearch.yml file:

cloud:
  aws:
    access_key: <access key>
    secret_key: <secret key>

Finally, let’s configure our backup repository, using Elasticsearch REST API:

PUT /_snapshot/elasticsearch_backups
{
  "type": "s3",
  "settings": {
    "bucket": "elastic-bckup",
    "region": "us-east-1"
  }
}

In the command above, we created a new backup repository called “elasticsearch_backups”, also defining the bucket where the backups will be stored. With our repository created, let's create our YAML files to configure Curator.

The first YAML is “curator-config.yml”, where we configure details such as the cluster address. A configuration example could be as follows:

client:
  hosts:
    - localhost
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  aws_key:
  aws_secret_key:
  aws_region:
  ssl_no_validate: False
  http_auth:
  timeout: 240
  master_only: False
logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']

The other YAML is “curator-action.yml”, where we configure an action list to be executed by Curator. In the example, we have indices of data from Twitter, with the prefix “twitter”, where we first create a backup of indices that are more than 2 days old and, after the backup, purge the data:

actions:
  1:
    action: snapshot
    description: >-
      Make backups of indices older than 2 days.
    options:
      repository: elasticsearch_backups
      name: twitter-%Y.%m.%d
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 2
      exclude:
  2:
    action: delete_indices
    description: >-
      Delete indices older than 2 days (based on index name).
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: twitter-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 2
      exclude:

With the YAMLs configured, we can execute Curator, with the following command:

curator --config curator-config.yml curator-action.yml

The command will generate a log from the actions performed, showing that our configurations were a success:

2016-08-27 16:14:36,576 INFO Action #1: snapshot
2016-08-27 16:14:40,814 INFO Creating snapshot "twitter-2016.08.27" from indices: [u'twitter-2016.08.14', u'twitter-2016.08.25']
2016-08-27 16:15:34,725 INFO Snapshot twitter-2016.08.27 successfully completed.
2016-08-27 16:15:34,725 INFO Action #1: completed
2016-08-27 16:15:34,725 INFO Action #2: delete_indices
2016-08-27 16:15:34,769 INFO Deleting selected indices: [u'twitter-2016.08.14', u'twitter-2016.08.25']
2016-08-27 16:15:34,769 INFO ---deleting index twitter-2016.08.14
2016-08-27 16:15:34,769 INFO ---deleting index twitter-2016.08.25
2016-08-27 16:15:34,860 INFO Action #2: completed
2016-08-27 16:15:34,861 INFO Job completed.

That's it! Now we just need to schedule this command to execute from time to time – once per day, for example – and we will have automated backups and purges.

Thank you for following me on this post, until next time.

Docker: using containers to implement a Microservices architecture


Hello, dear readers! Welcome to another post from my blog. In this post, we will talk about the virtualization technology called Docker, which has gained a ton of popularity these days, especially when we talk about microservices architectures. But after all, what is virtualization?

Virtualization

Virtualization, as the name implies, consists in using software to emulate any part of an environment, ranging from hardware components to an entire OS. In this world, there are several traditional technologies such as VMware, VirtualBox, etc. These technologies, however, use hypervisors, which create an intermediate layer responsible for isolating the virtual machine from the physical machine.

With containers, however, we don't have this hypervisor layer. Instead, containers run directly on top of the kernel of the host machine they are running on. This leads to a much more lightweight virtualization, allowing us to run many more containers on a single host than we could run VMs with a hypervisor.

Of course, using virtualization with hypervisors instead of containers has its benefits too, such as more flexibility – since you can't, for example, run a Windows container from a Linux machine – and security, since the shared kernel approach leads to a scenario where, if an attacker compromises the kernel, they automatically get access to all the containers served by that kernel, although the same can be said if the attacker compromises the hypervisor of a machine running some VMs. This is a very heated debate, and a lot of discussion about it can be found on the Internet.

The fact is that containers are here to stay, and their lightweight implementation provides a very good foundation for a lot of applications, such as a microservices architecture. So, without further delay, let's dive into Docker, one of the most popular container engines on the market today.

Docker Architecture

Docker was developed by Docker Inc. (formerly dotCloud Inc.) as an open-source engine for deploying applications on containers. By providing a logical layer that manages the lifecycle of the containers, it lets us focus our development on the applications themselves, leaving the implementation of container management to Docker.

The Docker architecture follows a client-server model, where we have a Docker client – which could be the command-line one provided by Docker, or a consumer of the RESTful API also provided in the toolset – and the Docker server, also known as the Docker daemon, which receives requests to create/start/stop containers and is responsible for container management. The diagram below illustrates this architecture:

docker-architecture

In the image above we can see the Docker clients connecting to the daemon, making requests to start/stop/create containers etc. We can also see that the daemon communicates with an image repository and uses images while processing the requests. What are those images for? That's what we will find out in the next section.

Containers & Images

When we talk about containers on Docker, we talk about processes running from instructions that were defined on images. We can understand images as the building blocks from which containers are built, organizing the construction of containers in a Docker environment.

These images are distributed through repositories, also known as Docker registries, and are versioned in a git-like fashion. The main repository for the distribution of Docker images is, of course, Docker Hub, managed by Docker itself.

An interesting thing is that the code behind Docker's registry is open, so it is perfectly possible to host your own image repository. To learn more about this, please visit:

github.com/docker/distribution

Docker Union File System

If the reader is familiar with traditional virtualization with hypervisors, they could be wondering: “How does Docker manage the file system so the images don't get ‘dirty’ from the files generated by the execution of all the created containers?”. This could seem even worse when we consider that images can inherit from other images, forming a whole image tree.

The answer lies in how Docker uses the file system, with a union file system. In this system, each image is stored as a read-only layer. The layers are then stacked on top of each other and, finally, on top of the chain, a read-write layer is created for the use of the container. This way, none of the images' file systems are altered, keeping the images clean for use across multiple containers.

Using Docker in other OS

Docker is designed for use on Linux distributions. However, for use as a development environment, Docker Inc. released the Docker Toolbox, which allows Docker to run on OSX and Windows environments as well.

In order to do this, the toolbox installs VirtualBox as part of its distribution, as well as a micro Linux VM that has only the minimum kernel packages necessary for launching containers. This way, when we launch the toolbox's terminal, the launcher starts the VM on VirtualBox, starting Docker inside the VM.

One important thing to notice is that, when we launch Docker this way, we can't use localhost anymore to access Docker from the same host it is running on, since it is bound to a specific IP. We can see the IP Docker is bound to in the initialization messages, as shown in the image below, from a Docker installation on OSX:

Another thing to notice is that, when we use Docker on OSX or Windows, we don't need sudo to execute the commands, while on Linux we need to execute the commands as root. Another way to use Docker on Linux without sudo is to have a group called 'docker' on the machine, which makes Docker grant that group the permissions necessary to run Docker without sudo. Notice, however, that users in this group effectively have root permissions, so caution is required.

Docker commands

Well, now that we've got the concepts out of the way, let's start using Docker!

As said before, Docker uses image repositories to resolve the images it needs to run the containers. Let’s begin by running a simple container, to make a famous ‘hello world!’, Docker style.

To do this, let’s open a terminal – or the Docker QuickStart Terminal on other OS – and type:

docker run ubuntu echo 'hello, Docker!'

This is the run command, which creates a new container each time it is called. In the command above, we simply asked Docker to create a new container using the ubuntu image – if it doesn't already have it locally, Docker will search Docker Hub for the latest version of the image and download it; if we wanted a specific version, we could specify it like this: 'ubuntu:14.04'. At the end, we specify a command for the container to run, in this case a simple “echo 'hello, Docker!'”.

The image below shows the command in action, downloading the image and executing it:

To list the containers created, we run the command 'docker ps'. If we run the command this way now, however, we won't see anything, because by default it only shows active containers, and in our case the container simply stopped after running its command. If we run the command with the -a flag, we can then see the stopped containers, as in the image below:

However, there's a problem: our hello world container was just a test, so we don't want to keep it around. Let's fix this by removing the container, with the command:

docker rm 2f134296daa6

PS: the hash you can see in the command is part of the ID of the container, which you can see when you list the containers with docker ps.

Now, what if we wanted to run our hello world container again – or any other container that we need just once – without having to manually remove the container afterwards? We can do this using the --rm flag, like this:

docker run --rm ubuntu echo 'hello, Docker!'

If we run docker ps -a again after running the command above, we can see that there are no containers from our previous execution, proving that the flag worked correctly.

Let's now do a slightly more complex example, starting a container with a Tomcat server. First, let's search for the name of the image we want on Docker Hub. To do this, open a browser and type:

https://hub.docker.com

Once inside Docker Hub, let's search for the image by typing 'tomcat' in the search box at the top right corner of the site. We will be sent to a screen like the one below. You can see the images with some classifiers like 'official' and 'automated'.

Official means that the image is maintained by Docker itself, while automated means that the image is built by a CI workload. In the section 'Publishing on Docker Hub' we will learn more about the options to publish our own images to Docker Hub.

From the Docker Hub we get that the name of the official image is ‘tomcat’, so let’s use this image. To use it, we simply run the following command, which will start a new container with the tomcat image:

docker run --name mytomcat -p 8888:8080 tomcat:8.0

You can see that we used some new flags in this command. First, we used the --name flag, which gives our container a name, so we can refer to it when running commands afterwards.

The -p flag is used to bind a port of the host machine to a port on the container. In the next sections, we will talk about creating our own images, and we will see that we can expose ports from the container to be accessed by clients – as in this case, where we mapped the port Tomcat serves on inside the container (8080) to port 8888 of the host.

Lastly, when declaring the image we will use, we specified the version, 8.0, meaning that we want to use Tomcat 8.0. Note that we didn't pass any command to the container, since the image is already configured to start the Tomcat server on startup. After running the command, we can see that Tomcat is running inside our container:

The problem is that now our terminal is occupied by the Tomcat process, so we can't issue more commands. Let's press Ctrl+C and type the docker ps command. The container is not active! The reason for this is that, by default, Docker doesn't run containers in the background when we create a new container. To do this, we use the -d flag, so on the previous command all we had to do was include this flag to make the container start in the background.
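For reference – assuming the container had not been created yet – the previous run command with the -d flag would look like this:

docker run -d --name mytomcat -p 8888:8080 tomcat:8.0

Since our container already exists, we will simply start it again instead.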

Let’s start the container again with the command docker start:

docker start mytomcat

After the start, we can see that the container is running, if we run docker ps:

Before we open the server on the browser, let’s just check if the container is exposed on the port we defined. To do this, we use the command:

docker port mytomcat

This command will return the following result:

8080/tcp -> 0.0.0.0:8888

This means that the container is exposed on port 8888, as defined. If we open the browser at localhost:8888 – or the IP bound by Docker on a non-Linux environment – we will see the following screen:

Excellent! Now we have our own dockerized Tomcat! However, by starting the container in the background, we couldn't see the server's logs to check whether there was any problem during the startup of the web server. Let's see the last lines of our container's log by entering the command:

docker logs mytomcat

This will show the latest output of our container on stdout. We could also use the command:

docker attach mytomcat

This command has an effect similar to tailing a file, making it possible to watch the logs of the container. Take notice that after attaching to a container, pressing Ctrl+C will kill it!

Before ending our container, let’s see more detailed information about our container, such as the Java version, network mappings etc. To do this, we run the following command:

docker inspect mytomcat

This will produce a result like the following, in JSON format:

[
  {
    "Id": "14598a44a3b4fe2a5b987ea365b2c83a0399f4fda476ad1754acee23c96fcc22",
    "Created": "2016-01-03T17:59:49.632050403Z",
    "Path": "catalina.sh",
    "Args": [
      "run"
    ],
    "State": {
      "Status": "running",
      "Running": true,
      "Paused": false,
      "Restarting": false,
      "OOMKilled": false,
      "Dead": false,
      "Pid": 4811,
      "ExitCode": 0,
      "Error": "",
      "StartedAt": "2016-01-03T17:59:49.728287604Z",
      "FinishedAt": "0001-01-01T00:00:00Z"
    },
    "Image": "af28fa31b54b2e45d53e80c5a7cbfd2693f198fdb8ba53d44d8a432832ad1012",
    "ResolvConfPath": "/mnt/sda1/var/lib/docker/containers/14598a44a3b4fe2a5b987ea365b2c83a0399f4fda476ad1754acee23c96fcc22/resolv.conf",
    "HostnamePath": "/mnt/sda1/var/lib/docker/containers/14598a44a3b4fe2a5b987ea365b2c83a0399f4fda476ad1754acee23c96fcc22/hostname",
    "HostsPath": "/mnt/sda1/var/lib/docker/containers/14598a44a3b4fe2a5b987ea365b2c83a0399f4fda476ad1754acee23c96fcc22/hosts",
    "LogPath": "/mnt/sda1/var/lib/docker/containers/14598a44a3b4fe2a5b987ea365b2c83a0399f4fda476ad1754acee23c96fcc22/14598a44a3b4fe2a5b987ea365b2c83a0399f4fda476ad1754acee23c96fcc22-json.log",
    "Name": "/mytomcat",
    "RestartCount": 0,
    "Driver": "aufs",
    "ExecDriver": "native-0.2",
    "MountLabel": "",
    "ProcessLabel": "",
    "AppArmorProfile": "",
    "ExecIDs": null,
    "HostConfig": {
      "Binds": null,
      "ContainerIDFile": "",
      "LxcConf": [],
      "Memory": 0,
      "MemoryReservation": 0,
      "MemorySwap": 0,
      "KernelMemory": 0,
      "CpuShares": 0,
      "CpuPeriod": 0,
      "CpusetCpus": "",
      "CpusetMems": "",
      "CpuQuota": 0,
      "BlkioWeight": 0,
      "OomKillDisable": false,
      "MemorySwappiness": -1,
      "Privileged": false,
      "PortBindings": {
        "8080/tcp": [
          {
            "HostIp": "",
            "HostPort": "8888"
          }
        ]
      },
      "Links": null,
      "PublishAllPorts": false,
      "Dns": [],
      "DnsOptions": [],
      "DnsSearch": [],
      "ExtraHosts": null,
      "VolumesFrom": null,
      "Devices": [],
      "NetworkMode": "default",
      "IpcMode": "",
      "PidMode": "",
      "UTSMode": "",
      "CapAdd": null,
      "CapDrop": null,
      "GroupAdd": null,
      "RestartPolicy": {
        "Name": "no",
        "MaximumRetryCount": 0
      },
      "SecurityOpt": null,
      "ReadonlyRootfs": false,
      "Ulimits": null,
      "LogConfig": {
        "Type": "json-file",
        "Config": {}
      },
      "CgroupParent": "",
      "ConsoleSize": [
        0,
        0
      ],
      "VolumeDriver": ""
    },
    "GraphDriver": {
      "Name": "aufs",
      "Data": null
    },
    "Mounts": [],
    "Config": {
      "Hostname": "14598a44a3b4",
      "Domainname": "",
      "User": "",
      "AttachStdin": false,
      "AttachStdout": false,
      "AttachStderr": false,
      "ExposedPorts": {
        "8080/tcp": {}
      },
      "Tty": false,
      "OpenStdin": false,
      "StdinOnce": false,
      "Env": [
        "PATH=/usr/local/tomcat/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "LANG=C.UTF-8",
        "JAVA_VERSION=7u91",
        "JAVA_DEBIAN_VERSION=7u91-2.6.3-1~deb8u1",
        "CATALINA_HOME=/usr/local/tomcat",
        "TOMCAT_MAJOR=8",
        "TOMCAT_VERSION=8.0.30",
        "TOMCAT_TGZ_URL=https://www.apache.org/dist/tomcat/tomcat-8/v8.0.30/bin/apache-tomcat-8.0.30.tar.gz"
      ],
      "Cmd": [
        "catalina.sh",
        "run"
      ],
      "Image": "tomcat:8.0",
      "Volumes": null,
      "WorkingDir": "/usr/local/tomcat",
      "Entrypoint": null,
      "OnBuild": null,
      "Labels": {},
      "StopSignal": "SIGTERM"
    },
    "NetworkSettings": {
      "Bridge": "",
      "SandboxID": "37fe977f6a45f4495eae95260a67f0e99a8168c73930151926ad521d211b4ac6",
      "HairpinMode": false,
      "LinkLocalIPv6Address": "",
      "LinkLocalIPv6PrefixLen": 0,
      "Ports": {
        "8080/tcp": [
          {
            "HostIp": "0.0.0.0",
            "HostPort": "8888"
          }
        ]
      },
      "SandboxKey": "/var/run/docker/netns/37fe977f6a45",
      "SecondaryIPAddresses": null,
      "SecondaryIPv6Addresses": null,
      "EndpointID": "09477abf22e8019e2f8d3e0deeb18d9362136954a09e336f4a93286cb3b8b027",
      "Gateway": "172.17.0.2",
      "GlobalIPv6Address": "",
      "GlobalIPv6PrefixLen": 0,
      "IPAddress": "172.17.0.3",
      "IPPrefixLen": 16,
      "IPv6Gateway": "",
      "MacAddress": "02:42:ac:11:00:03",
      "Networks": {
        "bridge": {
          "EndpointID": "09477abf22e8019e2f8d3e0deeb18d9362136954a09e336f4a93286cb3b8b027",
          "Gateway": "172.17.0.2",
          "IPAddress": "172.17.0.3",
          "IPPrefixLen": 16,
          "IPv6Gateway": "",
          "GlobalIPv6Address": "",
          "GlobalIPv6PrefixLen": 0,
          "MacAddress": "02:42:ac:11:00:03"
        }
      }
    }
  }
]
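When we only need a specific field from this output, we don't have to read the whole JSON: docker inspect accepts a Go template through the --format flag. A small example, extracting the container's IP address (the field path follows the structure shown above):

docker inspect --format '{{ .NetworkSettings.IPAddress }}' mytomcat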

Now, let’s stop our container, by entering:

docker stop mytomcat

Finally, let's remove the container, since we won't use it anymore on this practice:

docker rm mytomcat

Let’s also remove the image, since we won’t use it either on our next steps:

docker rmi tomcat:8.0

And that concludes our crash course on Docker's commands! There are other useful commands as well, of course, such as:

  • docker build: builds an image from a Dockerfile (see next sections);
  • docker commit: creates an image from a container;
  • docker push: pushes an image to a registry (Docker Hub by default);
  • docker exec: submits a command to a running container;
  • docker export: exports the file system of a container as a tar file;
  • docker images: lists the images inside Docker;
  • docker kill: force kills a running container;
  • docker restart: restarts a container;
  • docker network: manages Docker networks (see next sections);
  • docker volume: manages Docker volumes (see next sections).

Creating your own images

On the previous section, we used an image from the Docker Hub that creates a container with a Tomcat web server. This image is built from a script with building instructions called a Dockerfile. On our lab we will create a Dockerfile, but for now, let's just examine the Dockerfile from the Tomcat image in order to learn some of the instructions available:

FROM java:7-jre

ENV CATALINA_HOME /usr/local/tomcat
ENV PATH $CATALINA_HOME/bin:$PATH

RUN mkdir -p "$CATALINA_HOME"
WORKDIR $CATALINA_HOME

# see https://www.apache.org/dist/tomcat/tomcat-8/KEYS
RUN gpg --keyserver pool.sks-keyservers.net --recv-keys \
    05AB33110949707C93A279E3D3EFE6B686867BA6 \
    07E48665A34DCAFAE522E5E6266191C37C037D42 \
    47309207D818FFD8DCD3F83F1931D684307A10A5 \
    541FBE7D8F78B25E055DDEE13C370389288584E7 \
    61B832AC2F1C5A90F0F9B00A1C506407564C17A3 \
    79F7026C690BAA50B92CD8B66A3AD3F4F22C4FED \
    9BA44C2621385CB966EBA586F72C284D731FABEE \
    A27677289986DB50844682F8ACB77FC2E86E29AC \
    A9C5DF4D22E99998D9875A5110C01C5A2F6059E7 \
    DCFD35E0BF8CA7344752DE8B6FB21E8933C60243 \
    F3A04C595DB5B6A5F1ECA43E3B7BBB100D811BBE \
    F7DA48BB64BCB84ECBA7EE6935CD23C10D498E23

ENV TOMCAT_MAJOR 8
ENV TOMCAT_VERSION 8.0.30
ENV TOMCAT_TGZ_URL https://www.apache.org/dist/tomcat/tomcat-$TOMCAT_MAJOR/v$TOMCAT_VERSION/bin/apache-tomcat-$TOMCAT_VERSION.tar.gz

RUN set -x \
    && curl -fSL "$TOMCAT_TGZ_URL" -o tomcat.tar.gz \
    && curl -fSL "$TOMCAT_TGZ_URL.asc" -o tomcat.tar.gz.asc \
    && gpg --verify tomcat.tar.gz.asc \
    && tar -xvf tomcat.tar.gz --strip-components=1 \
    && rm bin/*.bat \
    && rm tomcat.tar.gz*

EXPOSE 8080

CMD ["catalina.sh", "run"]

As we can see, the first instruction is called FROM. This instruction defines the base image upon which the image will be constructed. In this case, the java:7-jre image will create a basic Linux environment with Java 7 configured.

Then we see some ENV instructions. These commands set environment variables on the OS, as part of Tomcat's configuration. We can also see a WORKDIR instruction, which defines the directory that, from that point on, Docker will use to run the commands.

We also see some RUN instructions. These instructions, as the name implies, run commands on the container. In the case of our image, these instructions perform Tomcat's installation process.

Next we see the EXPOSE instruction, which exposes port 8080 to the host. Finally, we see the CMD instruction, which defines the start of the server. One important distinction between the CMD command and the RUN ones is that, while the RUN instructions execute only during the build phase, the CMD instruction is executed on the container's startup, making it the command to be executed on the creation and start of a container. For this reason, only one CMD command is allowed per Dockerfile.

One important side note about the CMD command is that, when starting a container, it allows the command on the Dockerfile to be overridden, so it could lead to security holes when used. For this reason, it is recommended to use the ENTRYPOINT instruction instead, which, like the CMD instruction, defines the command the container will run at startup, but in this case does not permit the command to be overridden, making it more secure.

We could also use ENTRYPOINT and CMD combined, making it possible to restrict the user to only passing flags and/or arguments to the running command.
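As a hypothetical sketch of this combination (the jar path and the flag are placeholders, not part of the Tomcat image we examined), the ENTRYPOINT below fixes the executable while CMD only provides a default argument that can be overridden at docker run time:

FROM java:8-jre
ENTRYPOINT ["java", "-jar", "/opt/app/app.jar"]
CMD ["--server.port=8080"]

Running docker run myimage --server.port=9090 would replace only the CMD part, keeping the java -jar invocation intact.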

Of course, those are not the only instructions we can use on a Dockerfile. Other instructions are listed below (a short example combining some of them follows the list):

  • MAINTAINER: Defines the author of the image;
  • LABEL: Adds metadata to an image, in a key=value format;
  • ADD: Adds a file, folder or remote file URL to a folder inside the container. The source paths of this instruction are resolved relative to the directory of the Dockerfile;
  • COPY: Analogous to ADD, this instruction also copies files and folders from the host to the container. The two major differences are that COPY doesn't allow the use of URLs and doesn't uncompress known archive formats, like ADD can;
  • VOLUME: Creates a volume. On the section "creating the base image" we will learn in more detail what volumes are on Docker;
  • USER: Defines the name of the user used to run the RUN, CMD and ENTRYPOINT instructions;
  • ARG: Defines build-time variables, which can be used to pass parameters to the image's build, using the --build-arg flag;
  • ONBUILD: Defines a command to be triggered when the image is used as a base image for other images;
  • STOPSIGNAL: Defines the signal that will be sent to the container to stop it, as a number or in the SIGNAME format, for instance SIGKILL.
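A minimal sketch putting a few of these instructions together (the file names, label and build argument are invented, just to illustrate the syntax):

FROM java:8-jre
MAINTAINER Alexandre Lourenco <alexandreesl@example.com>
LABEL version="1.0"
ARG APP_VERSION=1.0
COPY target/myapp-$APP_VERSION.jar /opt/myapp/app.jar
VOLUME [ "/opt/myapp/logs" ]
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/opt/myapp/app.jar"]

Such an image could be built passing the argument with something like docker build --build-arg APP_VERSION=1.0 -t myrepo/myapp .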

Publishing on Docker Hub

As we saw on the previous section, in order to create our own images, all we have to do is create a Dockerfile with the instructions necessary for the building. However, in order to distribute our images, we need to register them on an image registry, like the Docker Hub, for example.

The simplest way to publish images is with the commit and push commands. For example, if we had made changes to our Tomcat container and wanted to save those changes as a new image, we could do it with this command:

docker commit mytomcat alexandreesl/mytomcat

Where the second argument is <repository ID>/<image name>. In order to push images to Docker Hub, the repository ID must be equal to the username of our Docker Hub account.

After committing the changes, all we have to do is push the changes to the Hub, using the following command:

docker push alexandreesl/mytomcat

And that's it! After the push, if we check our Docker Hub account, we can see that our image was successfully created:

NOTE: before pushing, you may have to run the docker login command to register your Docker Hub credentials.

Another way of publishing is by linking our images with a git repository; this way the Docker Hub will rebuild the image each time a new version of the Dockerfile is pushed to the repository. It is this method of publishing that generates automated build images, since this way Docker can establish a CI workload with the git repository. We will see this method in action on the next section, 'Creating the base image'.

Practice: Using Docker to implement a microservices architecture

On our lab, we will take a step forward from my previous post about microservices with Spring Boot (haven't seen it? you can find it here, I would be really grateful if you read that post as well!), by using a Service Registry called Eureka, designed by Netflix.

On the Service Registry pattern, we have a registry where microservices can register/unregister and also find the addresses of a service dynamically, this way decoupling them from one another. We will dockerize (run inside a container) the services of that lab and make them use Eureka, which will also run inside a container. In order to integrate with Eureka, we will modify our applications to use the Spring Cloud project, which, according to the project's description:

Spring Cloud provides tools for developers to quickly build some of the common patterns in distributed systems (e.g. configuration management, service discovery, circuit breakers, intelligent routing, micro-proxy, control bus, one-time tokens, global locks, leadership election, distributed sessions, cluster state). Coordination of distributed systems leads to boiler plate patterns, and using Spring Cloud developers can quickly stand up services and applications that implement those patterns. They will work well in any distributed environment, including the developer’s own laptop, bare metal data centres, and managed platforms such as Cloud Foundry.

Also, to "Springfy" our example even more, we will exchange the implementation from the previous post, which uses plain JAX-RS, for the RestController implementation present on the Spring Web project.

Creating the base image

Well, so let’s begin our practice! First, we will create a Docker Network to accommodate the containers.

If we look at the network adapters of the host machine when the Docker daemon is up, we will see that it creates a bridge adapter on the host, subsequently appending the containers to network interfaces inside that bridge adapter. This forms a subnet where the containers can see each other; Docker also facilitates the work for us by mapping the containers' IPs to their names on the /etc/hosts files inside them.

On our scenario, we will use this feature so the microservices can easily find our Eureka registry, by mapping the address to the container's name. Of course, on a real scenario we would have a cluster of Eureka instances with a load balancer on separate hosts, but for simplicity's sake, on our lab we will use just one Eureka instance.

So, in order to create a network to accommodate our architecture, we run the following command:

docker network create microservicesnet

After running the command, we will see a hash ID indicating our network was created. We can see the details of our network by running the following command:

docker network inspect microservicesnet

Which will produce a result like the following:

[
  {
    "Name": "microservicesnet",
    "Id": "e8d401a00de26b74f4f2461c13dbf848c14f37127b3337a3e40465eff5897910",
    "Scope": "local",
    "Driver": "bridge",
    "IPAM": {
      "Driver": "default",
      "Config": [
        {}
      ]
    },
    "Containers": {},
    "Options": {}
  }
]

Notice that, for now, the Containers object is empty. This is because we haven't added any containers to the network yet, but this will soon change.

Now we will create the Dockerfile. Create a folder in the directory of your choice and create a file called 'Dockerfile' (without any extension). In my case, I will push my Dockerfile to a Github repository, in order to create the image on Docker Hub as an automated build image. You can find the repositories with the source for this lab at the end of the post.

So, without further delay, let’s code the Dockerfile. The code for our image will be the following:

FROM java:8-jre

MAINTAINER Alexandre Lourenco <alexandreesl@example.com>

VOLUME [ "/data" ]

WORKDIR /data

EXPOSE 8080
ENTRYPOINT [ "java" ]
CMD ["-?"]

 

As we can see, it is a simple Dockerfile. We use Java 8's official image as the base, define a volume and set its folder as our workdir, expose port 8080, which is the default for Spring Boot, and finally combine the ENTRYPOINT and CMD instructions, meaning that anything we pass when we start a container will be treated as parameters for the java command. But what is this volume we have spoken of?

The reader may remember what we saw about the Union File System and how each image is layered as a read-only layer. Volumes on Docker are a technique that bypasses the Union File System, by defining a mount point on the containers that points to a shared space on the host machine. This technique is used to provide a place where the container can read/write information that needs to be persisted, and also to share data between containers.

On our scenario, we will use the volume to point to the place where the jars of our microservices will be generated, so when a container runs a microservice, it runs the latest version of the software. On a real-world scenario, this place could be the output of a CI workload, for example from a Jenkins job.
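Just as a hypothetical illustration of how a volume is mounted at runtime, once the image is built (next step) a container could be started with the -v flag, mapping a host folder (the path below is just an example) to the /data mount point defined on the image:

docker run -v /home/user/jars:/data alexandreesl/microservices-spring-boot-base -version

Since our entrypoint is the java command, the -version argument above would simply print the Java version and exit, which is a quick way to check that the image and the volume mapping work.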

Now that we have our Dockerfile, let's build the image to test its correctness:

docker build -t alexandreesl/microservices-spring-boot-base .

At the end, we will see a message that our build was successful, validating the Dockerfile. Out of curiosity, if you watch the build process, you will notice that Docker creates several temporary containers, one for each instruction of our script, keeping the status of the last successful instruction executed, so if we have to rebuild the image after a failure, we can restart from the failed point. We can disable this feature with the --no-cache flag if we want.

Now, let's register the image on Docker Hub. I have created a Github repository here, so I will register the image directly on the site. To do this, we log in to our account and click on the "create automated build" menu item, as shown on the screen below:

After clicking, if we haven't linked a Github account yet, we will be prompted to do so. After making the link, we will see a page with the list of our git repositories. We select the one with the Dockerfile and finally create the image, entering a name and a description for it, as on the screen below:

After clicking on the create button, we have completed our step, successfully registering our image on the Docker Hub! You can see my image on this link.

Lastly, just to test that our Docker Hub upload was really successful, we will remove the image from the local cache (the one we added with our docker build command) and download it using the docker pull command. First, let's remove the image with the command:

docker rmi alexandreesl/microservices-spring-boot-base

And after, let’s pull the image with the command:

docker pull alexandreesl/microservices-spring-boot-base

After the download, we will have the image from the Docker Hub repository.

Preparing the service registry image

For the service registry image, we actually don't have to do any coding, since we will use an already made image. We will start the image on the "Launching the containers" section, but the reader can see the image's page here to satisfy their curiosity.

Preparing the services to use the registry

Now, let's prepare the services for the registering/deregistering of microservices, alongside service discovery when a service has dependencies.

To focus on the explanation, I will omit some details, like the pom configuration, since the reader can easily get the configuration on the Github repository from the lab. On this lab, we have 3 microservices: a Customer service, a Product service and an Order service. The Order service has dependencies on the other 2 services, in order to mount an Order.

On the 3 projects we will have the same configuration for Spring Boot's main class, called Application, as follows:

@SpringBootApplication
@Configuration
@EnableAutoConfiguration
@EnableEurekaClient
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}

The key line here is the annotation @EnableEurekaClient, which configures the Spring Boot application to work with Eureka. With this annotation alone, we configure Spring Boot to connect to Eureka at startup and register itself, to send heartbeats during its lifecycle to keep the registration alive and, finally, to unregister when the process is terminated. Also, it instantiates a RestTemplate with a Ribbon load balancer, also made by Netflix, that can easily look up service addresses from the registry. All of this with just one annotation!
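One hedge worth making: depending on the Spring Cloud release in use, the load-balanced RestTemplate may not be created automatically anymore, and we have to declare the bean ourselves. A minimal sketch of such a declaration (the class name is arbitrary):

@Configuration
public class RestTemplateConfiguration {

    @Bean
    @LoadBalanced
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
}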

In order to connect to Eureka, we have to provide the registry's address. This is done by configuring a YAML file, called application.yml, that we put on the resources source folder. We also configure an application name on this file, in order to inform Eureka which application ID we would like to use for the microservice.

So, in order to make this configuration, we create the files on our projects. Looking at an example, for the Order service, we have the following YAML configuration:

spring:
  application:
    name: OrderService
eureka:
  client:
    serviceUrl:
      defaultZone: http://eureka:8761/eureka/

Notice that, when we configure Eureka's address, we used "eureka" as the host name. This is the name of the container that we will use to deploy the Eureka server on our Docker network, so on the /etc/hosts files of our containers this name will be mapped to the Eureka container's IP, making it possible to resolve the address dynamically.

Now let’s see how our Microservices were implemented. For the Customer service, we have the following code:

@RestController
@RequestMapping("/")
public class CustomerRest {

private static List<Customer> clients = new ArrayList<Customer>();

static {
Customer customer1 = new Customer();
 customer1.setId(1);
 customer1.setName("Cliente 1");
 customer1.setEmail("customer1@gmail.com");

Customer customer2 = new Customer();
 customer2.setId(2);
 customer2.setName("Cliente 2");
 customer2.setEmail("customer2@gmail.com");

Customer customer3 = new Customer();
 customer3.setId(3);
 customer3.setName("Cliente 3");
 customer3.setEmail("customer3@gmail.com");

Customer customer4 = new Customer();
 customer4.setId(4);
 customer4.setName("Cliente 4");
 customer4.setEmail("customer4@gmail.com");

Customer customer5 = new Customer();
 customer5.setId(5);
 customer5.setName("Cliente 5");
 customer5.setEmail("customer5@gmail.com");

clients.add(customer1);
 clients.add(customer2);
 clients.add(customer3);
 clients.add(customer4);
 clients.add(customer5);
}

@RequestMapping(method = RequestMethod.GET, produces = MediaType.APPLICATION_JSON_VALUE)
 public List<Customer> getClientes() {
 return clients;
 }

@RequestMapping(value = "customer/{id}", method = RequestMethod.GET, produces = MediaType.APPLICATION_JSON_VALUE)
 public Customer getCliente(@PathVariable long id) {

Customer cli = null;

for (Customer c : clients) {

if (c.getId() == id)
 cli = c;

}

return cli;
 }
}

As we can see, nothing out of the ordinary, just very simple usage of Spring's REST controllers. Now, on to the Product service:

@RestController
@RequestMapping("/")
public class ProductRest {

private static List<Product> products = new ArrayList<Product>();

static {

Product product1 = new Product();
 product1.setId(1);
 product1.setSku("abcd1");
 product1.setDescription("Produto1");

Product product2 = new Product();
 product2.setId(2);
 product2.setSku("abcd2");
 product2.setDescription("Produto2");

Product product3 = new Product();
 product3.setId(3);
 product3.setSku("abcd3");
 product3.setDescription("Produto3");

Product product4 = new Product();
 product4.setId(4);
 product4.setSku("abcd4");
 product4.setDescription("Produto4");

products.add(product1);
 products.add(product2);
 products.add(product3);
 products.add(product4);

}

@RequestMapping(method = RequestMethod.GET, produces = MediaType.APPLICATION_JSON_VALUE)
 public List<Product> getProdutos() {
 return products;
 }

@RequestMapping(value = "product/{id}", method = RequestMethod.GET, produces = MediaType.APPLICATION_JSON_VALUE)
 public Product getProduto(@PathVariable long id) {

Product prod = null;

for (Product p : products) {

if (p.getId() == id)
 prod = p;

}

return prod;
 }

}

Again, nothing unusual on the code. Now let's see the Order service, where we will see the registry being used for consuming other microservices:

@RestController
@RequestMapping("/")
public class OrderRest {

private static long id = 1;

// Created automatically by Spring Cloud
 @Autowired
 @LoadBalanced
 private RestTemplate restTemplate;

private Logger logger = Logger.getLogger(OrderRest.class);

@RequestMapping(value = "order/{idCustomer}/{idProduct}/{amount}", method = RequestMethod.GET, produces = MediaType.APPLICATION_JSON_VALUE)
 public Order submitOrder(@PathVariable long idCustomer, @PathVariable long idProduct, @PathVariable long amount) {

Order order = new Order();

 Map map = new HashMap();

 map.put("id", idCustomer);

ResponseEntity<Customer> customer = restTemplate.exchange("http://CUSTOMERSERVICE/customer/{id}",
 HttpMethod.GET, null, Customer.class, map);

 map = new HashMap();

 map.put("id", idProduct);

ResponseEntity<Product> product = restTemplate.exchange("http://PRODUCTSERVICE/product/{id}", HttpMethod.GET,
 null, Product.class, map);

order.setCustomer(customer.getBody());
order.setProduct(product.getBody());
order.setId(id);
order.setAmount(amount);
order.setOrderDate(new Date());

logger.warn("The order " + id + " for the client " + customer.getBody().getName() + " with the product "
 + product.getBody().getSku() + " with the amount " + amount + " was created!");

id++;

return order;
 }
}

The first thing we notice is the autowired RestTemplate, which also has a @LoadBalanced annotation. This RestTemplate is automatically instantiated by Spring Cloud, creating an interface which we can use to communicate with the services registered on Eureka. The annotation indicates that we want to use Ribbon to do the load balancing, in case we have a cluster of instances of the same service.

The exchange method we use inside the submitOrder method is where the magic happens. When we make our calls using this method, internally a lookup is made to the Eureka server, the addresses of the service are passed to Ribbon in order to load balance the calls, and then the call is made. Notice that we used names like "CUSTOMERSERVICE" as the host address on our URIs. This pattern tells the framework the ID of the application we really want to call, which is replaced at runtime with one of the IP addresses of the service we want to call.

And that concludes our quick explanation of the Java code involved on our lab. Again, if the reader wants the full code, just head to the Github repositories at the end of this post. In order to run the lab, I recommend that you clone the whole repository and execute a mvn package command on each of the 3 projects. If you don't have Maven, you can get it here.

Launching the containers

Now that we are almost finished with the lab, it is time for the fun part: running our containers! To do this, we will use docker-compose. With docker-compose, we can start/stop/kill etc. full stacks of containers, without having to instantiate everything by hand. In order to make our docker-compose stack, let's create a YAML file called docker-compose.yml and include the following configuration:

eureka:
  image: springcloud/eureka
  container_name: eureka
  ports:
    - "8761:8761"
  net: "microservicesnet"
customer1:
  image: alexandreesl/microservices-spring-boot-base
  container_name: customer1
  hostname: customer1
  net: "microservicesnet"
  ports:
    - "8080:8080"
  volumes:
    - /Users/alexandrelourenco/Applications/git/docker-handson:/data
  command: -jar /data/Customer-backend/target/Customer-backend-1.0.jar
product1:
  image: alexandreesl/microservices-spring-boot-base
  container_name: product1
  hostname: product1
  net: "microservicesnet"
  ports:
    - "8081:8080"
  volumes:
    - /Users/alexandrelourenco/Applications/git/docker-handson:/data
  command: -jar /data/Product-backend/target/Product-backend-1.0.jar
order1:
  image: alexandreesl/microservices-spring-boot-base
  container_name: order1
  hostname: order1
  net: "microservicesnet"
  ports:
    - "8082:8080"
  volumes:
    - /Users/alexandrelourenco/Applications/git/docker-handson:/data
  command: -jar /data/Order-backend/target/Order-backend-1.0.jar

Some things to notice from our configuration:

  • In all the containers we configured our "microservicesnet" network, in order to use the Docker Network feature to resolve our needs;
  • On the microservices' containers, we also defined the hostname, to force Spring Cloud's registration control to register the correct alias for the services' addresses;
  • We have exposed Eureka's port to the physical host at 8761, so we can see the web interface from outside the Docker environment;
  • On the volumes sections, I have mapped the base folder where the Maven projects of my microservices are placed, binding it to the /data folder inside the container, which is defined as the workdir on the image we created previously. This way, on the command section, when we point to the location of the microservice's Spring Boot jar, we resolve the location from the /data folder.

Finally, after saving our file, we start the stack by simply running the following command on the same folder as the YAML file:

docker-compose up
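If we prefer to keep the terminal free, the stack could also be started in detached mode and inspected afterwards with the standard docker-compose commands below (exact behavior may vary a little between compose versions). For this lab, though, we will stay in the foreground to watch the logs:

docker-compose up -d
docker-compose ps
docker-compose logs
docker-compose stop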

We will see some information about our containers getting up, and at the end we will see log information from all the containers mixed together, with the name of the container as a prefix for identification:

After waiting some moments, we will see that our containers are up. Shall we check if the microservices are up as well? Let's open a browser and point it to port 8761, using localhost or the IP used by Docker in case you are on OSX/Windows:

Excellent! Not only has Eureka booted up successfully, but also all of our services have registered with it. Let’s now toy a little with our stack, to test it out.

Let's begin by searching for the customer with ID 1 on the Customer service:

curl -XGET 'http://<your ip>:8080/customer/1'

This will produce a JSON response like the following:

{"id":1,"name":"Cliente 1","email":"customer1@gmail.com"}

Next, let's test out the Product service, with a call for the details of the product with ID 4:

curl -XGET 'http://<your ip>:8081/product/4'

This time it will produce a response like the following:

{"id":4,"sku":"abcd4","description":"Produto4"}

Finally, the moment of truth: let's call the Order service, which uses our other 2 services, to see if the services' addresses are resolved with Eureka's help. To test it, let's make the following call:

curl -XGET 'http://<your ip>:8082/order/2/1/4'

If everything goes well, we will see a response like:

{"id":1,"amount":4,"orderDate":1452216090392,"customer":{"id":2,"name":"Cliente 2","email":"customer2@gmail.com"},"product":{"id":1,"sku":"abcd1","description":"Produto1"}}

Success! But can we have further proof? If we look at the logs of the order1 container, we can see some entries showing Ribbon in action, searching for the services we need on the call and initializing load balancers, in order to serve our needs:

Let's make one last test before closing in: let's add one more instance of the Product service, to see if the new instance is registered under the same application ID. To do this, let's run the following command. Notice that we didn't specify a port to expose the service, since we just want to add the instance to load balance the internal calls from the Order service:

docker run --rm --name product2 --hostname=product2 --net=microservicesnet -v /Users/alexandrelourenco/Applications/git/docker-handson:/data alexandreesl/microservices-spring-boot-base -jar /data/Product-backend/target/Product-backend-1.0.jar

If we open the Eureka web interface again, we will see that the second address is mapped:

Finally, to test the balancing, let's call the previous URL of the Order service again. After the call, we will see that our service is aware of both addresses, proving that the load balancing is in place, as we can see on the log fragment below:

[2016-01-08 14:14:09.703] boot – 1  INFO [http-nio-8080-exec-1] — ChainedDynamicProperty: Flipping property: PRODUCTSERVICE.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 2147483647

[2016-01-08 14:14:09.709] boot – 1  INFO [http-nio-8080-exec-1] — BaseLoadBalancer: Client:PRODUCTSERVICE instantiated a LoadBalancer:DynamicServerListLoadBalancer:{NFLoadBalancer:name=PRODUCTSERVICE,current list of Servers=[],Load balancer stats=Zone stats: {},Server stats: []}ServerList:null

[2016-01-08 14:14:09.714] boot – 1  INFO [http-nio-8080-exec-1] — ChainedDynamicProperty: Flipping property: PRODUCTSERVICE.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 2147483647

[2016-01-08 14:14:09.718] boot – 1  INFO [http-nio-8080-exec-1] — DynamicServerListLoadBalancer: DynamicServerListLoadBalancer for client PRODUCTSERVICE initialized: DynamicServerListLoadBalancer:{NFLoadBalancer:name=PRODUCTSERVICE,current list of Servers=[product1:8080, product2:8080],Load balancer stats=Zone stats: {defaultzone=[Zone:defaultzone; Instance count:2; Active connections count: 0; Circuit breaker tripped count: 0; Active connections per server: 0.0;]

},Server stats: [[Server:product2:8080; Zone:defaultZone; Total Requests:0; Successive connection failure:0; Total blackout seconds:0; Last connection made:Thu Jan 01 00:00:00 UTC 1970; First connection made: Thu Jan 01 00:00:00 UTC 1970; Active Connections:0; total failure count in last (1000) msecs:0; average resp time:0.0; 90 percentile resp time:0.0; 95 percentile resp time:0.0; min resp time:0.0; max resp time:0.0; stddev resp time:0.0]

, [Server:product1:8080; Zone:defaultZone; Total Requests:0; Successive connection failure:0; Total blackout seconds:0; Last connection made:Thu Jan 01 00:00:00 UTC 1970; First connection made: Thu Jan 01 00:00:00 UTC 1970; Active Connections:0; total failure count in last (1000) msecs:0; average resp time:0.0; 90 percentile resp time:0.0; 95 percentile resp time:0.0; min resp time:0.0; max resp time:0.0; stddev resp time:0.0]

]}ServerList:org.springframework.cloud.netflix.ribbon.eureka.DomainExtractingServerList@2561f098

[2016-01-08 14:14:09.738] boot – 1  INFO [http-nio-8080-exec-1] — ConnectionPoolCleaner: Initializing ConnectionPoolCleaner for NFHttpClient:PRODUCTSERVICE

And that concludes our lab. As we can see, we built a robust stack to deploy our microservices with their dependencies, with little effort.

Clustering container hosts

In all of our examples, we have always worked with a single Docker host, our own machine in this case. On a real production environment, we can have dozens or even hundreds of Docker hosts. In order to make all the hosts work together, it is necessary to implement a layer that manages the hosts, making them appear as a single cluster to the consumers.

This is the goal of Docker Swarm. With Docker Swarm, we can connect multiple Docker hosts across a network, using them as a single cluster by using Docker Swarm’s interface.

For further information about Docker Swarm alongside a good example of utilization, I suggest consulting the excellent “The Docker Book”, which I describe on the next section.

References

This post was inspired by my studies on Docker with The Docker Book, written by James Turnbull. It is an excellent source of information that I highly recommend buying! You can find the book available for purchase online at:

www.dockerbook.com

Conclusion

And this concludes our tour of the world of Docker. With a simple interface, Docker allows us to use all the power of the container world, letting us quickly deploy and scale our applications. Never has it been so simple to deploy our applications! Thank you for following me on another post, until next time.

Continue reading

TDD: Designing our code, test-first

Standard

Hi, dear readers! Welcome to my blog. On this post, we will talk about TDD, a methodology that preaches focusing first on tests, instead of on production code. I have already talked about test technologies on my post about Mockito, so on this post we will focus more on the theoretical aspects of the subject. So, without further delay, let's begin talking about TDD!

Test-driven development

TDD, also known as Test-driven development, is a development technique created by Kent Beck, who also created JUnit, a highly known Java test library.

TDD, as the name implies, dictates that development must be guided by tests, which must be written before even the "real" code that implements the requirements is written! Let's see in more detail how the steps of TDD work on the next section.

TDD steps

As we can see on the picture above, the following steps must be followed when using the TDD paradigm:

  1. Write a test (that fails): Represented by the red circle, it means that before we implement the code (a method on a class, for example) we implement a test case, like a JUnit method for instance, and invoke our code, in this case a simple empty method. Of course, on this scenario the test will fail, since there is no implementation, which leads to our next step;
  2. Write code just enough to pass: Represented by the green circle, it means that we write just enough code on our method to make our test case pass with success. The principle behind this step is that we write code with simplicity in mind, in other words, keeping it simple, like the famous KISS principle;
  3. Refactor the code, improving its quality: Represented by the blue circle, it means that, after we make our code pass the test, we analyse our code, looking for chances to improve its quality. Is there duplicate code? Remove the duplication! Are there hard-coded constants? Consider changing them to an enum or a properties table, for instance.

An important thing to notice is that the focus of the second and third steps is to not implement more functionality than the test code is testing. The reason for this is pretty obvious: since our focus is to implement the tests for our scenarios first, any functionality we implement without a test reflecting the scenario is automatically code without test coverage.

After we conclude the third step, the construction returns to the first step, restarting the cycle. On the next iteration, we implement another test case for our method, representing another scenario. We then code the minimum needed to implement the new scenario and conclude the process by refactoring the code. This keeps going until all the scenarios are met, leaving at the end not only our final code, but also all the test cases necessary to effectively test our method, with all the functionality it is intended to implement.
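To make the cycle concrete, here is a minimal, hypothetical JUnit illustration of one iteration; the class and method names are invented for the example. The test is written first and fails (red), and then the production method receives just enough code to turn it green:

// CalculatorTest.java - written first; it fails while sum() does not exist yet
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class CalculatorTest {

    @Test
    public void shouldSumTwoNumbers() {
        Calculator calculator = new Calculator();
        assertEquals(5, calculator.sum(2, 3));
    }
}

// Calculator.java - just enough production code to make the test pass
public class Calculator {

    public int sum(int a, int b) {
        return a + b;
    }
}

The refactoring step would then look for duplication or hard-coded values before the next test case is added.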

The reader may be asking: "but couldn't we get the same result by coding the tests after we write our production code?". The answer is yes, we could get the same results by adding our tests at the end of the process, but then we wouldn't have something called feedback from our code.

Feedback

When we talk about feedback, we mean the perception we have about how our code will perform. If we think about it, the test code is the first client of our code: it will prepare the data necessary for the invocation, invoke our code and then collect the results. By implementing the tests first, we receive this feedback from our implementation more rapidly, not only regarding correctness, but also at the design level.

For instance, imagine that we are building a test case and realize that the test is getting difficult to implement because of the complex class structure we have to populate just to make the invocation of our test’s target. This could mean that our class model is too complex and a refactoring is necessary.

By getting this feedback when we are just beginning the development (remember that we always code just enough for the test to pass!), it becomes a lot cheaper and less complex to refactor the model than if we received this feedback only at the end of the construction, when a lot of time and effort has already been spent! From this point of view, we can easily see the benefits of the TDD approach.

Types of tests

When we talked about tests on the previous sections, we talked about a type of test called unit test. Unit tests are tests focused on testing a single program unit, like a class, without testing the access to external resources, for example. Of course, there are other types of tests we can implement, as follows:

Unit tests: Unit tests focus on testing individually each program unit that composes the system, without external interference.

Integration tests: Integration tests also focus on testing program units, but in this case testing the integration of the system with external resources, like a database, for example. A good example of an integration test is a set of test cases for a DAO class, testing the code that implements insertions, updates, etc.

System tests: System tests, as the name implies, are tests that focus on testing the whole system, across all its layers. In a web application, for example, that means an automated test that starts a server, executes some HTTP requests against the web application and analyses the responses. A good example of a technology that could be used to test the web tier is a tool called Selenium.

Acceptance tests: Acceptance tests, commonly, are tests made by the end user of the system. The focus of this kind of test is to measure the quality of the implementation of the requirements specified by the user. Other requirements, such as usability, are also evaluated by this kind of test.

An important thing to notice is that this kind of test can also be implemented as an automated system test, with the objective of testing the requirement as it is, for example:

  • Requirement: the invoice must be inserted on the system by the web interface;
  • Test: create a system test that executes the insertion by the web interface and verifies if the data is correctly inserted;

This technique, called ATDD (Acceptance Test-Driven Development), preaches that first an acceptance test must be created, and then the TDD technique is applied until the acceptance test is satisfied. The diagram below shows this technique in practice:

Mock objects and unit tests

When we talk about unit tests, as said before, it is important to isolate the program unit we are testing, avoiding interference from other tiers and dependencies that the program unit uses. On the Java world, a good framework we can use to create mocks is Mockito, which we talked about on my previous post.

With mocks, following the principles of TDD, we can, for example, create just the interfaces our code depends on and mock those interfaces, this way already establishing the communication channel that will be created, without shifting our focus away from the program unit we are creating.

Another benefit of this approach is on the creation of the interfaces themselves: since our focus is always to implement the minimum necessary for the tests to pass, the resulting interfaces will be simple and concise, improving their quality.
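A hypothetical sketch of this approach with Mockito (all names are invented): the OrderService below is driven by a test that mocks a PaymentGateway interface that has no real implementation yet, so we can design the collaboration before writing it.

import static org.junit.Assert.assertTrue;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import org.junit.Test;

public class OrderServiceTest {

    // the collaborator exists only as an interface at this point
    public interface PaymentGateway {
        boolean charge(double amount);
    }

    // minimal production code driven by the test
    public static class OrderService {

        private final PaymentGateway gateway;

        public OrderService(PaymentGateway gateway) {
            this.gateway = gateway;
        }

        public boolean placeOrder(double amount) {
            return gateway.charge(amount);
        }
    }

    @Test
    public void shouldChargeTheGatewayWhenPlacingAnOrder() {
        PaymentGateway gateway = mock(PaymentGateway.class);
        when(gateway.charge(100.0)).thenReturn(true);

        OrderService service = new OrderService(gateway);

        assertTrue(service.placeOrder(100.0));
        verify(gateway).charge(100.0);
    }
}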

Considerations

When do I use mocks?

An important note about mocks is that it is not always good to use them. Taking for example a DAO class, which essentially is just a class that implements the code necessary to interact with a database, the use of mocks won't bring any value, since the code of the class itself is too simple to benefit from a unit test. On these cases, typically just an integration test is enough, using for example an in-memory database such as HSQLDB to act as the database.

Should I code more than one test case at a time?

In general, it is considered a bad practice to write more than one test case at once before running it and getting the failing result. The reason for this is that the point of the technique is to test one functionality at a time, which of course is "ruined" by the practice of coding more than one test at once.

How much of a functionality do I code on the test?

On the TDD terminology, we can also call the "amounts" of functionality we code at each iteration baby steps. There is no universal convention for how much must be implemented in each of these baby steps, but it is good advice to follow common sense. If the developer is experienced and the functionality is simple, it could even be possible to implement almost the whole code in just one iteration.

If the developer is less experienced and/or the functionality is more complex, it makes more sense to spend more time with more baby steps, since it will create more feedback for the developer, making it easier to implement the complex requirements.

Should I test private code?

Private code, as the name implies, is code (like a method, for instance) that is accessible only inside the same program unit, like a class. Since the test cases are normally implemented on another program unit (class), the test code can't access this code, which in turn begs the question: should I write any code to test that private code?

Generally speaking, the common consensus is that private code is intended to implement little things, like a portion of code that is duplicated across multiple methods, for example. In that scenario, if you have, for instance, a private method on a Java class that is enormous, with lots of functionality, then maybe it means that this method should be made public, maybe even moved to a different class, like a utility class.

If that is not the case, then it is really necessary to design test cases to efficiently exercise the method, by invoking it indirectly through its public consumer. Talking specifically about the Java world, one way to test the code without the "barrier" of the public consumer is by using reflection.
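A minimal, hypothetical sketch of that reflection approach (the class and the private method are invented for the example):

import java.lang.reflect.Method;

public class PrivateMethodTest {

    public static void main(String[] args) throws Exception {
        InvoiceCalculator target = new InvoiceCalculator();

        // obtain the private method applyDiscount(double) and make it accessible
        Method method = InvoiceCalculator.class
                .getDeclaredMethod("applyDiscount", double.class);
        method.setAccessible(true);

        // invoke it directly, bypassing the public consumer
        Double result = (Double) method.invoke(target, 100.0);
        System.out.println("Result: " + result);
    }
}

class InvoiceCalculator {

    private double applyDiscount(double value) {
        return value * 0.9;
    }
}

It is worth remembering, as said above, that needing this kind of trick too often is usually a sign that the private code deserves a refactoring.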

My target code has too many test cases, is this normal?

When implementing the test cases for a target production code (a method, for example), the developer could end up in a scenario where lots of test cases are created just to test all the functionality that single piece of code comprises. When this kind of situation happens, it is a good practice for the developer to analyse whether the code has too much functionality implemented on itself, which leads to what in OO we call low cohesion.

To avoid this situation, a refactoring of the code is needed, maybe splitting the code into more methods, or even more classes on Java's case.

Conclusion

And this concludes our post about TDD. By shifting the focus from implementing the code first to implementing the tests first, we can easily make our code more robust, simple and testable.

In a time when software is more important than ever, being in places like airplanes, cars and even inside human beings, testing the code efficiently is really important, since the consequences of bad code are getting more and more devastating. Thank you for following me on another post, until next time.

Java 8: Knowing the new features – the new Date API

Standard

Hi, dear readers! Welcome to my blog. On this post, the last of the series, we talk about the new library for Date & Time manipulation, which was inspired by the Joda-Time library.

So, without further delay, let’s begin our journey through this feature!

Manipulating Dates & Time on Java

It is an old complaint on the Java community that the Java APIs for manipulating dates have their issues, like limitations, being difficult to work with, etc. Thinking about this, Java 8 comes with a new API that brings simplicity and strength to the way we work with datetimes on Java. Let's start by learning how to create instances of the new classes.

To create a new Date instance (without time), representing the current date, all we have to do is:

LocalDate date = LocalDate.now();

To create a new Time instance, based on the time the instance was created, we do this:

LocalTime time = LocalTime.now();

And finally, to create a datetime, in other words, a date and time representation, we use this:

LocalDateTime dateTime = LocalDateTime.now();

The instance above has no timezone information, using only the local timezone. If a specific timezone is needed, we can use a class called ZonedDateTime. For example, if we wanted to create an instance on our timezone and then change it to Sydney's timezone, we could do it like this:

ZonedDateTime zonedDateTime = ZonedDateTime.now();

System.out.println("Time at my timezone: " + zonedDateTime);

zonedDateTime = zonedDateTime.withZoneSameInstant(ZoneId
.of("Australia/Sydney"));

System.out.println("Time at Sidney: " + zonedDateTime);

The code above print the following at my location:

Time at my timezone: 2015-06-04T14:42:30.850-03:00[America/Sao_Paulo]
Time at Sidney: 2015-06-05T03:42:30.850+10:00[Australia/Sydney]

Another way of instantiating these classes is with a predefined date and/or time. We can do this like the following:

 date = LocalDate.of(2015, Month.DECEMBER, 25);

 dateTime = LocalDateTime.of(2015, Month.DECEMBER, 25, 10, 30);

With all those classes, it is really simple to add and/or remove days, months or years from a date, or do the same with a time object. The code below illustrates this simplicity:

System.out.println("Date before adding days: " + date);

date = date.plusDays(10);

System.out.println("Date after adding days: " + date);

date = date.plusMonths(6);

System.out.println("Date after adding months: " + date);

date = date.plusYears(5);

System.out.println("Date after adding years: " + date);

date = date.minusDays(7);

System.out.println("Date after subtracting days: " + date);

date = date.minusMonths(6);

System.out.println("Date after subtracting months: " + date);

date = date.minusYears(10);

System.out.println("Date after subtracting years: " + date);

time = time.plusHours(12);

System.out.println("Time after adding hours: " + time);

time = time.plusMinutes(30);

System.out.println("Time after adding minutes: " + time);

time = time.plusSeconds(120);

System.out.println("Time after adding seconds: " + time);

time = time.minusHours(12);

System.out.println("Time after subtracting hours: " + time);

time = time.minusMinutes(30);

System.out.println("Time after subtracting minutes: " + time);

time = time.minusSeconds(120);

System.out.println("Time after subtracting seconds: " + time);

Running the above code, it prints:

Date before adding days: 2015-12-25
Date after adding days: 2016-01-04
Date after adding months: 2016-07-04
Date after adding years: 2021-07-04
Date after subtracting days: 2021-06-27
Date after subtracting months: 2020-12-27
Date after subtracting years: 2010-12-27
Time after adding hours: 09:28:24.380
Time after adding minutes: 09:58:24.380
Time after adding seconds: 10:00:24.380
Time after subtracting hours: 22:00:24.380
Time after subtracting minutes: 21:30:24.380
Time after subtracting seconds: 21:28:24.380

One important thing to notice is that on all methods we had to "catch" the return of the operations. The reason for this is that, unlike the old classes we used, such as Calendar, the instances on the new date API are immutable, so they always return a new value. This is useful for scenarios with concurrent access, for example, since the instances won't carry state.

Another simplification is the way we get the values from a date or time. On the old days, when we wanted to get a year or month from a Calendar, for example, we would need to use the generic get method, with an indication of the field we wanted, like Calendar.YEAR. With the new API, we can use specific methods with ease, like the following:

System.out.println("For the date: " + date);

System.out.println("The year from the date is: " + date.getYear());

System.out.println("The month from the date is: " + date.getMonth());

System.out.println("The day from the date is: " + date.getDayOfMonth());

System.out.println("The era from the date is: " + date.getEra());

System.out.println("The day of the week is: " + date.getDayOfWeek());

System.out.println("The day of the year is: " + date.getDayOfYear());

After we run the code above, the following result will be produced:

For the date: 2010-12-27
The year from the date is: 2010
The month from the date is: DECEMBER
The day from the date is: 27
The era from the date is: CE
The day of the week is: MONDAY
The day of the year is: 361

Another simple thing to do is comparing dates with the API. If we code the following:

  // comparing dates
  LocalDate today = LocalDate.now();
  LocalDate tomorrow = today.plusDays(1);

   System.out.println("Is today before tomorrow? "
			+ today.isBefore(tomorrow));

   System.out.println("Is today after tomorrow? "
			+ today.isAfter(tomorrow));

   System.out.println("Is today equal tomorrow? "
			+ today.isEqual(tomorrow));

On the code above, as expected, only the first print will print true.

One interesting feature of the new API is the locale support. On the code below, for example, we print the month of a date in different languages:

System.out.println("English: "+today.getMonth().getDisplayName(TextStyle.FULL, Locale.ENGLISH));
		
System.out.println("Portuguese: "+today.getMonth().getDisplayName(TextStyle.FULL, Locale.forLanguageTag("pt")));
		
System.out.println("German: "+today.getMonth().getDisplayName(TextStyle.FULL, Locale.GERMAN));
		
System.out.println("Italian: "+today.getMonth().getDisplayName(TextStyle.FULL, Locale.ITALIAN));
		
System.out.println("Japanese: "+today.getMonth().getDisplayName(TextStyle.FULL, Locale.JAPANESE));
		
System.out.println("Chinese: "+today.getMonth().getDisplayName(TextStyle.FULL, Locale.CHINESE));

Running the above code, on my current date, we will get the following result:

English: June
Portuguese: Junho
German: Juni
Italian: giugno
Japanese: 6月
Chinese: 六月

Formatting dates is also an easy task with the new API. If we want to format a date to a "dd/MM/yyyy" format, all we have to do is pass a DateTimeFormatter with the desired format:

System.out.println(today.format(DateTimeFormatter
			.ofPattern("dd/MM/yyyy")));

One very common requirement we encounter from time to time is the need to calculate the time between two dates. With the new API, we can calculate this very easily, with the ChronoUnit class:

 LocalDateTime oneDate = LocalDateTime.now();
 LocalDateTime anotherDate = LocalDateTime.of(1982, Month.JUNE, 21, 20,
 00);

 System.out.println("Days between the dates: "
 + ChronoUnit.DAYS.between(anotherDate, oneDate));

 System.out.println("Months between the dates: "
 + ChronoUnit.MONTHS.between(anotherDate, oneDate));

 System.out.println("Years between the dates: "
 + ChronoUnit.YEARS.between(anotherDate, oneDate));

System.out.println("Hours between the dates: "
 + ChronoUnit.HOURS.between(anotherDate, oneDate));

 System.out.println("Minutes between the dates: "
 + ChronoUnit.MINUTES.between(anotherDate, oneDate));

 System.out.println("Seconds between the dates: "
 + ChronoUnit.SECONDS.between(anotherDate, oneDate));

On my current day (08/06/2015), the above code produced:

Days between the dates: 12040
Months between the dates: 395
Years between the dates: 32
Hours between the dates: 288962
Minutes between the dates: 17337771
Seconds between the dates: 1040266275

One thing to note is that, if we use the same methods with the arguments swapped, we will receive negative numbers. If our logic needs the calculations to always be positive, we could use the Period and Duration classes to calculate the time between the dates, which have the isNegative() and negated() methods to produce the desired effect.
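A short sketch of how Period and Duration could be used for that (the dates are arbitrary):

LocalDate start = LocalDate.of(1982, Month.JUNE, 21);
LocalDate end = LocalDate.now();

// Period works with date-based amounts (years, months, days)
Period period = Period.between(end, start); // arguments inverted on purpose

if (period.isNegative()) {
    period = period.negated();
}

System.out.println("Years: " + period.getYears() + ", months: "
        + period.getMonths() + ", days: " + period.getDays());

// Duration works with time-based amounts (hours, minutes, seconds...)
Duration duration = Duration.between(LocalDateTime.now().plusHours(5),
        LocalDateTime.now());

if (duration.isNegative()) {
    duration = duration.negated();
}

System.out.println("Hours: " + duration.toHours());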

One final feature of the new API that we will visit is the concept of invalid dates. When we were using a Calendar, if we tried to input the date of February 30 on a year where the month goes up to 28 days, the Calendar would adjust the date to March 2; in other words, it would go past the inputted date without throwing any error. This is not always the desired effect, since sometimes this could lead to unpredictable behavior. On the new API, if we try, for example, to do the following:

LocalDate invalidDate = LocalDate.of(2014, Month.FEBRUARY, 30);
		
System.out.println(invalidDate);

We will receive an invalid date exception, providing an easier way to treat this kind of bug:

Exception in thread "main" java.time.DateTimeException: Invalid date 'FEBRUARY 30'
	at java.time.LocalDate.create(LocalDate.java:431)
	at java.time.LocalDate.of(LocalDate.java:249)
	at com.alexandreesl.handson.DateAPIShowcase.main(DateAPIShowcase.java:174)

References

This series was inspired by a book from the publisher "Casa do Código", which I used on my studies. Unfortunately the book is in Portuguese, but it is a good source for developers who want to quickly learn about the new features of Java 8:

Java 8 prático

Conclusion

And that concludes our series about the new features of Java 8. Of course, there are other subjects we didn't talk about, like the end of the PermGen, which was replaced by another memory area called metaspace. If the reader wants to know more about this, this article is very interesting on the subject. However, with this series, the reader has a good base to start developing on Java 8.

On a programming language like Java, it is normal to have changes from time to time. For a language with so many years, it is impressive how Java can still evolve, reflecting the new tendencies of more modern languages. Will Java continue like this forever? Only time will tell…

Thank you for following me on another post from my blog, until next time.

Continue reading

Java 8: Knowing the new features – Streams

Standard

Hi, dear readers! Welcome to my blog. On this post, the second of the series, we talk about streams, a new way to manipulate collections.

So, without further delay, let’s begin our journey through this feature!

Streams

Streams were introduced on Java 8 as a new way of manipulating Collections. Normally, when we use a Collection, we prepare a list of items, perform several operations over this collection, like filtering, sums, etc., and finally we use a final result, which could be evaluated as a single operation. That is exactly the goal of the Streams API: allowing us to program our Collection's logic as a single operation, using the functional programming paradigm.

So, let’s get started with the preparations for the examples.

First, we create a Client class, which we will use as the POJO for our examples:

public class Client {

private String name;

private Long phone;

private String sex;

private List<Order> orders;

public List<Order> getOrders() {
return orders;
}

public void setOrders(List<Order> orders) {
this.orders = orders;
}

public String getName() {
return name;
}

public void setName(String name) {
this.name = name;
}

public Long getPhone() {
return phone;
}

public void setPhone(Long phone) {
this.phone = phone;
}

public String getSex() {
return sex;
}

public void setSex(String sex) {
this.sex = sex;
}

public void markClientSpecial() {

System.out.println("The client " + getName() + " is special! ");

}

}

Our Client class this time has a reference to another POJO, the Order class, which we will use to enrich our examples:

public class Order {

private Long id;

private String description;

private Double total;

public Long getId() {
return id;
}

public void setId(Long id) {
this.id = id;
}

public String getDescription() {
return description;
}

public void setDescription(String description) {
this.description = description;
}

public Double getTotal() {
return total;
}

public void setTotal(Double total) {
this.total = total;
}

}

Finally, for all the examples, we will use the same Collection of data, so we create a utility class to populate it:

public class CollectionUtils {

public static List<Client> getData() {

List<Client> list = new ArrayList<>();

List<Order> orders;

Order order;

Client clientData = new Client();

clientData.setName("Alexandre Eleuterio Santos Lourenco");
clientData.setPhone(33455676l);
clientData.setSex("M");

list.add(clientData);

orders = new ArrayList<>();

order = new Order();

order.setDescription("description 1");
order.setId(1L);
order.setTotal(32.33);
orders.add(order);

order = new Order();

order.setDescription("description 2");
order.setId(2L);
order.setTotal(42.33);
orders.add(order);

order = new Order();

order.setDescription("description 3");
order.setId(3L);
order.setTotal(72.54);
orders.add(order);

clientData.setOrders(orders);

clientData = new Client();

clientData.setName("Lucebiane Santos Lourenco");
clientData.setPhone(456782387L);
clientData.setSex("F");

list.add(clientData);

orders = new ArrayList<>();

order = new Order();

order.setDescription("description 4");
order.setId(4L);
order.setTotal(52.33);
orders.add(order);

order = new Order();

order.setDescription("description 2");
order.setId(5L);
order.setTotal(102.33);
orders.add(order);

order = new Order();

order.setDescription("description 5");
order.setId(6L);
order.setTotal(12.54);
orders.add(order);

clientData.setOrders(orders);

clientData = new Client();

clientData.setName("Ana Carolina Fernandes do Sim");
clientData.setPhone(345622189L);
clientData.setSex("F");

list.add(clientData);

orders = new ArrayList<>();

order = new Order();

order.setDescription("description 6");
order.setId(7L);
order.setTotal(12.43);
orders.add(order);

order = new Order();

order.setDescription("description 7");
order.setId(8L);
order.setTotal(98.11);
orders.add(order);

order = new Order();

order.setDescription("description 8");
order.setId(9L);
order.setTotal(130.22);
orders.add(order);

clientData.setOrders(orders);

return list;

}

}

So, let’s begin with the examples!

To use the Streams API, all we have to do is call the stream() method on any Collection to get a stream already prepared for our use. The stream() method was added to the Collection interface as a default method, so existing implementations don't need to implement anything new. Another good point of this approach is that, consequently, every Collection already has support for the Streams feature, so if the reader has a favorite collections framework (like the Commons Collections from Apache), all you have to do is upgrade the JVM of your projects and the support is added!
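Just to illustrate that the entry point is the same for any Collection implementation, here is a minimal sketch; the class name and the sample values are only for illustration:

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class AnyCollectionStream {

    public static void main(String[] args) {

        // a Set is also a Collection, so it exposes stream() as well
        Set<String> names = new HashSet<>(
                Arrays.asList("Alexandre", "Lucebiane", "Ana Carolina"));

        // same fluent style, regardless of the concrete Collection type
        names.stream()
                .filter(n -> n.startsWith("A"))
                .forEach(System.out::println);
    }
}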

The first thing to notice about streams is that they don’t change the Collection. That means that if we do something like this:

import java.util.List;

public class StreamsExample {

public static void main(String[] args) {

List<Client> clients = CollectionUtils.getData();

clients.stream().filter(
c -> c.getName().equals("Alexandre Eleuterio Santos Lourenco"));

clients.forEach(c -> System.out.println(c.getName()));

}

}

and run the code, we will see that the program still prints the 3 clients from our test data, not just the one we filtered on our stream! This is an important concept to keep in mind, since it means we don't have to populate multiple collections with different data in order to execute different logic over it.
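To make this point concrete, here is a minimal sketch, assuming the Client and CollectionUtils classes shown above, where two independent pipelines run over the very same, unmodified list:

import java.util.List;

public class ReuseExample {

    public static void main(String[] args) {

        List<Client> clients = CollectionUtils.getData();

        // first pipeline: how many clients are female?
        long femaleClients = clients.stream()
                .filter(c -> c.getSex().equals("F"))
                .count();

        // second pipeline: how many clients have 3 or more orders?
        // same source list, no copy needed
        long frequentBuyers = clients.stream()
                .filter(c -> c.getOrders() != null && c.getOrders().size() >= 3)
                .count();

        System.out.println("Female clients: " + femaleClients);
        System.out.println("Clients with 3 or more orders: " + frequentBuyers);
    }
}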

So, how could we print the result of our previous filter? All we have to do is chain the methods, like this:

.

.

.

clients.stream()
.filter(c -> c.getName().equals(
"Alexandre Eleuterio Santos Lourenco"))
.forEach(c -> System.out.println(c.getName()));

If we run our code again, we will see that now the code only prints the elements we filtered. In this example, as said before, we didn't receive the filtered list itself. If we need to retrieve the Collection formed by the transformations we made on our Stream, we can use the collect method. This method receives 3 functional interfaces as parameters, but fortunately Java 8 already comes with a class called Collectors, which supplies common implementations of the interfaces we need to pass to the collect method. Using this feature, we could retrieve the Collection with code like this:

.

.

.

List<Client> filteredList = clients
.stream()
.filter(c -> c.getName().equals(
"Alexandre Eleuterio Santos Lourenco"))
.collect(Collectors.toList());

filteredList.forEach(c -> System.out.println(c.getName()));

In our previous examples, we retrieved the whole Client objects with our filtering. But what if we wanted to retrieve a List with the names of the Clients that have orders with a total greater than 90 and print it on the console? We could do this:

.

.

.

System.out.println("USING THE MAP METHOD!");

clients.stream()
.filter(c -> c.getOrders().stream()
.anyMatch(o -> o.getTotal() > 90))
.map(Client::getName)
.forEach(System.out::println);

The code above could seem a little strange at first, but if we imagine the amount of code we would need to do the same with traditional Java – iterating over multiple Collections, creating another collection with just the names and iterating again for the prints – we can see that the new features really help us write simpler and cleaner code. We also see the use of the anyMatch method, which receives a predicate as a parameter and returns true if any element of the stream satisfies the predicate, and false otherwise.
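Since anyMatch is itself a terminal operation, it can also be used on its own to answer a yes/no question about the whole Collection. A small sketch, continuing from the same clients list used above, could be:

// true if at least one client has at least one order with total above 90
boolean hasBigOrder = clients.stream()
        .anyMatch(c -> c.getOrders().stream()
                .anyMatch(o -> o.getTotal() > 90));

System.out.println("Any order above 90? " + hasBigOrder);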

Besides the all-purpose map method, there are also specialized implementations for integers, longs and doubles (mapToInt, mapToLong and mapToDouble). The reason for this is to prevent the so-called “boxing effect”, where the primitive values would be wrapped and unwrapped during the operations, causing a performance overhead. Since we have already informed the type of value we are working with, these implementations also provide some interesting methods that return things like the average or the max value of our mapping. Let's see an example. Imagine that we want to retrieve the highest order total of each client and print the name and that total on the console. We could do it like this:

.

.

.
clients.stream().forEach(
c -> System.out.println("Name: "
+ c.getName()
+ " Highest Order Total: "
+ c.getOrders().stream().mapToDouble(Order::getTotal)
.max().getAsDouble()));

The reader may notice that the max method's return is not the primitive itself, but an object. This object is an OptionalDouble which, together with other classes like java.util.Optional, allows us to provide a default behavior for the cases in which the operation behind the Optional – in our case, the max() method – has no value to return, for example when the stream is empty. So, if we want our previous operation to return 0 when there is no maximum to compute, we could modify the code as follows:

.

.

.

clients.stream().forEach(
c -> System.out.println("Name: "
+ c.getName()
+ " Highest Order Total: "
+ c.getOrders().stream().mapToDouble(Order::getTotal)
.max().orElse(0)));
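The same defaulting idea applies to the other specialized operations, such as average(), and to java.util.Optional itself. A small illustrative sketch, still over the same clients list (findFirst is just another terminal operation that returns an Optional):

// average() also returns an OptionalDouble, so it can be defaulted the same way
double firstClientAverage = clients.get(0).getOrders().stream()
        .mapToDouble(Order::getTotal)
        .average()
        .orElse(0);

// java.util.Optional offers the same protection for object references
// (requires import java.util.Optional)
Optional<Client> firstFemale = clients.stream()
        .filter(c -> c.getSex().equals("F"))
        .findFirst();

System.out.println("Average of the first client's orders: " + firstClientAverage);
System.out.println("First female client: "
        + firstFemale.map(Client::getName).orElse("none found"));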

One interesting characteristic of streams is their lazy behavior. That means that when we create a flow – also called a pipeline – of stream operations, the intermediate operations only execute when they are really needed to produce the final result, that is, when a terminal operation is invoked. We can see this behavior using a method called peek(). Let's see an example that clearly shows this behavior:

.

.

.

clients.stream()

.filter(c -> c.getName().equals(
"Alexandre Eleuterio Santos Lourenco"))
.peek(System.out::println);

System.out.println("*********** SECOND PEEK TEST ******************");

clients.stream()
.filter(c -> c.getName().equals(
"Alexandre Eleuterio Santos Lourenco"))
.peek(System.out::println)
.forEach(c -> System.out.println(c.getName()));

If we run the example above, we can see that on the first stream the peek method doesn't print anything. That's because the filter operation was never executed, since we didn't invoke any terminal operation after the filtering. On the second stream, we used the forEach operation afterwards, so the peek method prints the toString() of every object that passes the filter.

In our previous examples, we saw the max method, which returns the highest value from a stream of numbers. That type of operation, which produces a single result from a stream, is called a reduction. We can build our own reductions, simply providing an initial value and the operation itself to the reduce method. For example, if we wanted to subtract the values of the stream:

.

.

.

clients.stream().forEach(
c -> System.out.println("Name: "
+ c.getName()
+ " TOTAL SUBTRACTED: "
+ c.getOrders().stream().mapToDouble(Order::getTotal)
.reduce(0, (a, b) -> a - b)));

This is a really useful feature to keep in mind when the default arithmetic operations don’t suffice.
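As another illustration, reduce also has a form without the initial value, which on the primitive streams returns an OptionalDouble, exactly like max did above. A small sketch summing the totals of the first client's orders both ways:

// with an identity value: always returns a primitive double
double total = clients.get(0).getOrders().stream()
        .mapToDouble(Order::getTotal)
        .reduce(0, (a, b) -> a + b);

// without an identity value: returns an OptionalDouble, defaulted here to 0
double totalOrZero = clients.get(0).getOrders().stream()
        .mapToDouble(Order::getTotal)
        .reduce((a, b) -> a + b)
        .orElse(0);

System.out.println("Total: " + total + " / " + totalOrZero);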

Parallel Streams

At last, let’s talk about the final subject of our streams journey: parallel streams. When using parallel streams, we run all the operations we saw previously in parallel processing mode, instead of just on the main thread as usual. The JDK will choose the number of threads, how to split the processing into segments and how to join the parts into the final result. The reader may be asking “what do I have to configure to help the JDK with these settings?”. The answer is: nothing! That’s right, all we have to do to use parallel streams is change the beginning of our calls, like the example below:

.

.

.
clients.parallelStream()
.filter(c -> c.getOrders().stream()
.anyMatch(o -> o.getTotal() > 90)).map(Client::getName)
.forEach(System.out::println);

As we can see, all we have to do is change from stream() to parallelStream(). One important thing to keep in mind is when to use parallel streams. Since there is an overhead in preparing the thread pool and managing the splitting and joining of the results, unless we have a really big volume of data or a really heavy operation to apply to it, we will normally use single-threaded streams.
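As a rough illustration of that trade-off, a minimal sketch could compare the two modes over a large generated range; the size used here is purely illustrative and the timings are not a proper benchmark (no JIT warm-up, single run):

import java.util.stream.LongStream;

public class ParallelComparison {

    public static void main(String[] args) {

        long start = System.currentTimeMillis();
        long sequentialSum = LongStream.rangeClosed(1, 50_000_000L).sum();
        System.out.println("Sequential: " + sequentialSum
                + " in " + (System.currentTimeMillis() - start) + " ms");

        start = System.currentTimeMillis();
        long parallelSum = LongStream.rangeClosed(1, 50_000_000L).parallel().sum();
        System.out.println("Parallel:   " + parallelSum
                + " in " + (System.currentTimeMillis() - start) + " ms");
    }
}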

Other features

Of course, there are more features we could cover in this post, like the sorted method, which, as the name implies, sorts the items of our stream. Another really powerful set of features lives in the Collectors methods, which offer impressive transformation options such as grouping, partitioning, joining and so on. However, with this post we have made a very good start with the feature, paving the way for its adoption.
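Just to give a taste of those features, a small sketch over the same clients list, using sorted and Collectors.groupingBy (assuming java.util.Map, java.util.List and java.util.stream.Collectors are imported):

// sort the clients by name and print them
clients.stream()
        .sorted((a, b) -> a.getName().compareTo(b.getName()))
        .forEach(c -> System.out.println(c.getName()));

// group the clients by sex into a Map<String, List<Client>>
Map<String, List<Client>> bySex = clients.stream()
        .collect(Collectors.groupingBy(Client::getSex));

bySex.forEach((sex, group) ->
        System.out.println(sex + " -> " + group.size() + " client(s)"));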

Conclusion 

And so we conclude another part of our series. As we can easily see, streams are a very powerful tool, which can help us a lot in keeping our code really short when processing our collections. That is one of the keys – or maybe the master key – of the Java 8 philosophy. For years, the Java scene was plagued with “accusations” of not being a simple language, since it is so verbose, especially after the appearance of languages like Python and Ruby, for example. With these new features, maybe the burden of “being complex” will finally be gone for Java. I thank the reader for following me on another post and invite you to return for the last part of our series, when we will talk about the last of our pillars, the new Date API. Until next time.

Continue reading