adangel.org - Java by Comparison and PMD

The book
The structure
Summary

This is now a different kind of blog post. I want to present a little bit the book “Java by Comparison” by Simon Harrer, Jörg Lenhard, and Linus Dietz. But not just present it, instead evaluate, which examples could be automated to be detected by static code analysis and maybe which PMD rules already exist.

The book

The book itself is not new, it is from 2018. It contains 70 examples, which are presented in a concise form: The left page shows the example with the “bad” code and the right page shows the improved code. So you can see the example side-by-side. Additionally, there are lots of explanations of course.

I like the approach taken, to explain common programming “rules” and best practices with examples. That is much more catchy than just a written down explanation.

The book is available from the Pragmatic Bookshelf and has an own page: https://pragprog.com/titles/javacomp/java-by-comparison/.

The structure

Organized by wider topics. I’ll follow the same structure. Maybe sometimes I’ll combine two examples.

Small code cleanups

This first chapter is about small code style cleanups, like superfluous code or inconvenient written code, that is hard to read.

“Avoid Unnecessary Comparisons” are conditional expressions, that are compared against a boolean, such as if (a == true) .... This can be written more concise as just if (a), no need for this extra comparison. It turns out, there is already a PMD rule for that: SimplifyBooleanExpressions. The reasoning behind this is: readability and code clutter.
“Avoid Negations” makes the point, that an expression is easier to read and understand, if there is no negation involved. E.g. instead of if (!isEmpty()) use better if (isNotEmpty()). There is no direct PMD rule that finds such code, but it could be easily written as an XPath expression. E.g. //UnaryExpression[@Operator = '!']. There is however a related rule, that is ConfusingTernary. This makes the point, if you have an if-else-expression and you use a negated condition, just swap them around and use a positive condition.
“Return Boolean Expressions Directly” is about avoiding clutter again. Instead of return true inside a if condition, you can directly return the condition and avoid the if statement entirely. There is already a PMD rule: SimplifyBooleanReturns.
“Simplify Boolean Expressions” is about readability and easier understanding. There is a PMD rule with the same name, but that rule does something different, see above. If the boolean expression is a bit more complicated, it makes sense to introduce some local variables to give the parts of the expression a name and make it easier to comprehend. This has currently no equivalent PMD rule. However, we could count the number of parts the expression consists of (literally how many && and || we have). The suggestion would be to split this up and introduce named local variables or methods. It would be similar like CyclomaticComplexity but focused on a single expression.
“Avoid NullPointerException in Conditionals” talks about the issue of forgetting to check the input parameters of public (or protected or default) methods against null before using them. There are some related PMD rules, that help to avoid such NPEs: LiteralsFirstInComparisons, BrokenNullCheck, EqualsNull (which is a bit silly), MisplacedNullCheck, UnusedNullCheckInEquals. But there is no rule that e.g. checks whether a method’s parameter is used without a null check. This rule seems to be possible, but might yield a lot of false positives. It might be a code style rule, as the rule of thumb would be then: always check parameters for null before using them.
“Avoid Switch Fallthrough” is self-explanatory. And there is already a PMD rule available: ImplicitSwitchFallThrough.
“Always Use Braces” is a code-style topic. PMD used to have multiple rules for each statement that checked, whether braces are used. Since PMD 6.2.0, there is one rule for that: ControlStatementBraces.
“Ensure Code Symmetry” is an interesting one. This is about readability and understandability of the code. The examples shows a if-else-if structure, where some branches do deny access and other branches grant access. In order to make the code more symmetrical, these two different concerns should be split up and separated, so that when reading the code, one has to understand only one part at a time and not both concerns at once. Detecting such an asymmetry sounds not so easy. We could use some heuristics like (in case of if statements) the number of statements in each branch or the accessed variables in each branch. It’s definitively a rule, that would only give a hint, that something could be improved in the code, but there are no hard indicators.

Improved code style

The next chapter is about even better code style.

“Replace Magic Numbers with Constants” is a simple rule that helps in understanding the code. If you use only literals for the numbers, that you would need to explain the meaning of the number. This can be improved by defining a constant with a name for the number and use the constant. Now the number has a name and a meaning and the code is easier to understand. Surprisingly, PMD doesn’t have such a rule. I know however, that Checkstyle has one, e.g. MagicNumber.
“Favor Enums Over Integer Constants” is a follow-up on the previous example - enums might even be better in some circumstances. There is no rule in PMD for that. The rule would need to be a bit more intelligent than just pointing to constants (static final) - it would need to identify the usages of these constants. In the example in the book, there is a method which takes an input parameter and compares this against a bunch of constants and only doing something, if the input parameter matches one of the constants. This essentially limits the possible values for the input parameter by using the constants, which makes it a perfect case for enums. That means, a PMD rule would need to identify such a method, which probably would only work reliably in simple cases such as the given example. It could look a bit broader and consider the cohesion between constants and methods and if there is a single method where a bunch of constants are used (and the constants are not used somewhere else), then we could suggest to introduce an enum. All this would require, that the constants are not public, as otherwise we would need to check the usages across the whole project.
“Favor For-Each Over For Loops” is an easy one and there is a PMD rule: ForLoopCanBeForeach. The rule description however misses an explanation, why a foreach loop should be used instead. The reason is, that the code is simpler, because it doesn’t expose unnecessary variables such as the index variable. That makes the code less prone to mistakes (e.g. using the wrong variable for accessing the element by index, etc.).
“Avoid Collection Modification During Iteration” finds calls to remove or add or any other modification of a java.util.Collection, while it is iterated over in a foreach loop. In most implementations this will lead to a “ConcurrentModificationException”. In case we need to remove an element during iteration, we should use the Iterator explicitly, which has a remove-method (given this operation is actually supported). Another way would be, to iterate over a copy of the collection, so that we can modify the original collection. PMD doesn’t have such a rule, but it should be easy to implement this. It would be in the category “error prone”.
“Avoid Compute-Intense Operations During Iteration” belongs to the category “performance”. In the example, a call to Pattern.matches is done inside a loop over and over again, always with the same regex pattern. The method internally compiles the regex and this is an “expensive” operation, which should be done only once. The resulting pattern should be reused. There are other methods around regex, that suffer from the same “not cached pattern”: e.g. String.replaceAll(String, String) - if this is called within a loop with the first argument always being the same, the same pattern would be compiled over and over again. There are probably more cases of “expensive” operation, that should be avoided inside a loop. For the Apex language, there is the rule OperationWithHighCostInLoop, which serves a similar purpose (but finds other method calls). For Java, we have already AvoidInstantiatingObjectsInLoop, which is at least loop-related. But there is no rule yet, that checks specifically for the mentioned regex compilation calls.
“Group With New Lines” suggests to use empty new lines in the code, to form code blocks, that belong together semantically. I guess, that would be very difficult for a static analyzer to figure this out, as it would need to really understand the code. It goes into the concept of cohesion. Theoretically, your class would be so small, that it does only one thing, that there would be only one group and no need for new lines. But that’s not the reality. In that sense, it might be easier to just detect big classes like GodClass or just ExcessivePublicCount or TooManyMethods.
“Favor Format Over Concatenation” reminds us, that java nowadays also supports C-like format strings. If you are building a big string combining labels and values, it quickly becomes a mess and is very difficult to read. A format string is most likely easier to read and should be used instead. While PMD has a couple of rules that deal with StringBuilder and the concatenation strings in this context, it doesn’t have such a rule. It shouldn’t be too difficult to detect InfixExpressions (“+”) of type String (e.g. //InfixExpression[@Operator='+'][pmd-java:typeIs("java.lang.String")]).
“Favor Java API Over DIY” suggest to use the Java API and the ready to use methods. The book brings two examples: Instead of doing a null check manually and throwing a NullPointerException, just use Objects.requireNonNull and instead of manually counting how often a given element is in a collection, use Collections.frequency. PMD doesn’t have such a generic rule, that would find all the cases. And such a rule would needed to be updated regularly and would also need to consider the used Java version, so that not a suggestion of a method is made, that cannot be used, as it was only added in a later Java version. There is however one rule, that goes into this: AvoidArrayLoops, which suggests to use System.arraycopy or Arrays.copyOf. And the PMD rule UseArraysAsList suggest to use Arrays.asList instead of a manual loop. So, there are already two PMD rules in this area, but for sure there could be much more cases.

Commenting code

This chapter is all about comments. And while comments are welcomed in most cases, unhelpful or outdated comments are bad.

“Remove Superfluous Comments” addresses old comments (with wrong content) and comments, that don’t add information, as they merely repeat the code (e.g. comments telling what is done instead of why is something done). This reminds me of JAutodoc which can generate JavaDoc comments without any useful content. PMD has some rules in the category documentation but these rules mostly do the opposite: They demand a comment. It would be difficult to write a rule, that interprets the content and figures out whether it is superfluous (and can be removed) or not. But we could create a rule, that identifies comments, that just repeats the method name and parameters and doesn’t add anything additional (finding the generated JAutodoc comments that have never been changed). Another thing is “TODO” comments: The recommendation in the book is - either fix the TODO immediately and remove the TODO comment, and if this is not possible right away, create a new issue in the issue tracker and remove the TODO comment. We could create a rule that find the TODO comments, like the checkstyle rule TodoComment.
“Remove Commented-Out Code” is about using your VCS to remember old implementations rather than using comments. There is also no PMD rules for that. Commented out code has the problem, that it often doesn’t work anymore when it is commented in, as the surrounding projects has evolved (e.g. refactored) and the commented out code has been updated. In the end, it’s very likely, that the commented out code cannot be used anyway. In regard to detecting this, there could be some regular expressions that detect method calls. And the position of such a comment can also give a hint - e.g. inside an argument list.
“Replace Comments with Constants” is a bit similar to “Replace Magic Numbers with Constants”. In the example given in the book, the comments explained the meaning of the magic numbers. The comment could be used as an additional information/suggestion when creation a violation message for the magic number, like “Did you mean to create a constant named XYZ?”. In that sense, it would be an enhancement to the magic number rule, but not a rule on its own.
“Replace Comments with Utility Methods” gives an example, that sometimes a code block (some contiguous code lines) are preceded by a single line comment. That seems to be an opportunity to refactor out this code block and give it a name - exactly the name from the comment. I think, it would be quite difficult to reliably find such comments, where such a refactoring would make sense, while ignoring the other comments.
“Document Implementation Decisions” is about expanding short comments (like “fast implementation”) into a more detailed comments, explaining the what, why, how of the decision that is made. There is no PMD rule that can identify such comments and it would probably be quite difficult to implement such a rule.
“Document Using Examples” makes the point, that the JavaDoc for a regex pattern is much more easier to understand when examples are provided of some strings, that match, and some, that doesn’t. This however is again about interpreting the content of a comment, which is out of scope for a static analyzer like PMD. Of course, we could say, that a java.util.Pattern should be preceded with a comment, that has the two keywords “Valid examples” and “Invalid examples”. That could identify exactly one special case (regex patterns), but wouldn’t find other missing examples.
“Structure JavaDoc of Packages” is about the content of the JavaDoc for packages. It should mention the most important classes of the package and maybe give a usage example of those. All this is again out of scope for PMD. There is the rule CommentRequired which can detect missing comments, but doesn’t consider, whether the comment contains useful content. And also, this rule is able to find missing JavaDocs on packages (e.g. in package-info.java).
“Structure JavaDoc of Classes and Interfaces” is about ensuring a specific structure for the JavaDoc of types. It should start with a summary sentence. While PMD itself can only ensure that there is a JavaDoc at all (see CommentRequired), it doesn’t check the contents. Checkstyle has a rule called SummaryJavadoc which does some checks on the first (summary) sentence. While a static analyzer can definitely verify the there is a first sentence and that this is separated by a blank line from the rest of the doc, it can’t verify that the content is useful at all.
“Structure JavaDoc of Methods” and “Structure JavaDoc of Constructors” are similar like the one before: PMD can only verify the existence of such Javadocs, but can’t check the contents. Checkstyle has some more sophisticated checks for completeness, like all method parameters need to be documented (see JavadocMethod), but it can’t verify the content.

Naming

Naming is said to be one of the hardest things to get right in computer science. If you want to know the other, see Two Hard Things. So, this chapter is all about naming things in Java.

“Use Java Naming Conventions” is about “correct” casing of types, fields, variables, methods and all that. Luckily PMD has a rule already, actually several rules: ClassNamingConventions, FieldNamingConventions, FormalParameterNamingConventions, GenericsNaming, LocalVariableNamingConventions, MethodNamingConventions, PackageCase. Most of these rules can be configured, but that’s not the goal of the example the book makes: You should simply use the Java Code Conventions as defined by Oracle, and don’t invent your own conventions.
“Follow Getter/Setter Conventions for Frameworks” says that if you create a JavaBean, you should follow the bean conventions. The getter and setter methods and the field names must be connected. There are two PMD rules for that purpose: InvalidJavaBean and BooleanGetMethodName. Following such conventions spares you from surprises, that something doesn’t work as expected, when your bean is used in a “standard” Java framework.
“Avoid Single-Letter Names” is about naming of parameter names and local variables. Today, with the IDE’s help, it not difficult anymore to avoid any too short names. The IDEs can most times help you with autocompleting the variable names. So you often don’t have to type the full name. Longer names can provide so much more information about the variables, that the code is much easier to understand. PMD has a couple of rules for that case as well: ShortClassName, ShortMethodName, ShortVariable.
“Avoid Abbreviations” might be picked up by the other short naming rules, since an abbreviation might be short enough. But not always, e.g. “bufW” is a correct variable name and 4 letters long, but it isn’t helpful. In the book’s example, it’s used as an abbreviation for “bufferedWriter” - which contains too much detail - so “writer” is the sensible solution here. There is no PMD rule for that, but a similar rule in Checkstyle: AbbreviationAsWordInName. In general, we could split the name into single words (e.g. a case change starts a new word) and verify the length of the single words. The “bufW” would become “buf” and “W” and we have a short name. Whether that’s an abbreviation can only be a guess.
“Avoid Meaningless Terms” is a nice example about bad naming. In today’s software, you often find a bunch of “Manager” classes or “Impl” suffixes. The most famous example referenced in the book is with the following quote:

Roses are red, leaves are green but only Java has AbstractSingletonProxyFactoryBean.

It’s from http://bash.org/?962108 (wayback machine) and it relates to the AbstractSingletonProxyFactoryBean in the spring framework.

There is no rule in PMD that finds such things. We could write a rule, that searches for some given meaningless terms, but whether they are really meaningless can’t be decided. We could only highlight suspicious names.
“Use Domain Terminology” is about using the correct terms when modeling the classes and methods. That way the software is easier to understand (if you now the domain at least). Don’t invent new names for something, that is already known. This is again very difficult for static analyzer, as it would require some level of understanding, what the software really does.

Error handling

This chapter is about proper error handling using exceptions and other means.

“Fail Fast” is a simple code style, that avoids nested structures and keeps the logic flat. If you check for an error condition at the start of a method and return or throw an exception directly, then you can be sure, that for the remaining part of the method, this particular condition is already handled. This makes the normal path of execution again more visible, as it is not hidden between other error condition checks. There is no PMD rule for this, but it shouldn’t be too difficult to write a rule, that detects methods that start with an “if” statement with multiple branches and each branch (except one) ends with a throw statement.
“Always Catch Most Specific Exception” is about the problem, when you simply catch “Exception”, which is very unspecific: It could be any problem - and you handle all the different problems the same way. And especially with unchecked exception, you might handle more cases than you wanted. But you want to have your program crash when e.g. there is a NullPointerException, as this is a programming error. There exists already the PMD rule AvoidCatchingGenericException.
“Explain Cause in Message” addresses the difficulty of good error messages. The simple text “An unexpected error occurred” is not helpful - and this is already clear from the exception anyway. The book suggests a simple pattern that one can follow: “Expected [EXPECTED], but got [ACTUAL] in [CONTEXT]”. This is much more helpful in understand later on, what actually led to the error. For this, there is no direct PMD rule. We could check however, that an exception isn’t created without a message at all. But the rule would be very inaccurate. I guess, this means, that such a rule would be a bit more difficult to implement.
“Avoid Breaking the Cause Chain” is about passing the original exception, when throwing a new exception. This is easy: PMD has already a rule for that: PreserveStackTrace.
“Expose Cause in Variable” reminds us, that exceptions are classes on its own and can contain more structured information other than just the message. They can contain fields and methods as well. This can be useful for other parts of the application, where one needs to handle the exceptional situation. The suggestion is to create a custom exception subclass, that contains the interesting information pieces as fields. There is a PMD rule AvoidThrowingRawExceptionTypes, but it doesn’t match this case exactly. I guess, it will be difficult to automatically detect, when a custom exception would be useful. One heuristic could be: If the message is built as a string concatenation out of multiple single components, then this would be an indication to just pass the components into the exception’s constructor and have variables for these components.
“Always Check Type Before Cast” suggests to always do a instanceof check before you do the cast. Not checking the type before might result in a ClassCastException at runtime. There is no PMD rule for that, but it shouldn’t be too difficult, to write a rule. The cast should be happening as the first statement in a if block, that checks the type using “instanceof” - then it’s ok. A cast without such an if might be problematic.
“Always Close Resources” suggest to simply use try-with-resources statement for dealing with resources. This is available since Java 7 and ensures, that the resource is closed at the end. Luckily there is already a PMD rule for that: CloseResource. Actually, there is another rule: UseTryWithResources.
“Always Close Multiple Resources” is a variation of the case before and makes an example when you manually close multiple resources in a finally block. You better just use a try-with-resources statement with declaring multiple resources. The following PMD rule applies here: UseTryWithResources.
“Explain Empty Catch” focuses on catch blocks, that contain no statements and just swallowing the exception. It is unclear then, whether the exception handling has been forgotten to implement or whether this is intended behavior in that case. PMD has the rule EmptyCatchBlock for this.

Unit testing

This chapter gives some examples on what to pay attention to when writing unit tests.

“Structure Tests Into Given-When-Then” gives advice how to make unit tests a bit more readable. Often you first need to setup some test preconditions like the data. After that, you are executing the functionality under test. And then you need to verify the final state and compare it your expectations. There is no PMD rule to check for this example. And I think, it would be difficult to write one: PMD can’t understand, which lines of the unit test belong to which semantic part (“given”, “when”, or “then”). We could check for the length of a unit test - if it is too long (whatever this means), it is suspicious to be testing too much (e.g. several functionalities). But that would be a different case.
“Use Meaningful Assertions” is about using the correct assert-helper-methods when possible. E.g. instead of assertTrue(result == 1), one should use the more appropriate method assertEquals(1, result). This method also gives a better failure message, when the test fails: “Expected 1 but was xyz” instead of “Expected true but was false”. PMD has a rule for that, that checks a couple of cases: SimplifiableTestAssertion.
“Expected Before Actual Value” is about calling assertEquals but using the wrong parameter order. For JUnit5 Assertions and JUnit4 Assert, the first parameter should be the expected value and the second parameter should be the actual value. Mixing up the two parameters results in confusing test failure messages. Note, that TestNG Assert uses a different argument order. There is no PMD rule to detect this. To be 100% reliable, it would be very difficult to implement. But for simple cases, where the expected value is a compile time constant, while the actual value is the result of a method call or a variable, we could actually find such cases.
“Use Reasonable Tolerance Values” suggests to use a delta when comparing floating point values. It’s easy to forget this and there won’t be a compiler error, since the two double values would be boxed to Double and then compared using equals. But exact comparisons of two double values aren’t reliable. JUnit provides an overloaded assertEquals method for doubles that takes a third double argument as delta. There is no PMD rule for this, but thanks to type resolution, it should be simple to implement a rule, that finds assertEquals calls with doubles or floats, that only use 2 parameters.
“Let JUnit Handle Exceptions” covers the old-style way of testing exception, before assertThrows() was available or manually failing the test in a catch-block with fail(). That has the downside, that you don’t see the exception and therefore it might not be clear, why the test was failing. Interestingly, PMD doesn’t have a rule for that, but it should be quite easy to implement.
“Describe Your Tests” is about two similar things, and it is about naming. First, you test names should have a meaningful name. With JUnit, you can be more descriptive by adding a @DisplayName annotation. This helps to understand, which functionality exactly is broken in case a unit test fails. However, PMD won’t be able to assess whether the description is good enough and is exactly correct. It could only require that every test is additionally annotated with @DisplayName and that the test methods e.g. don’t start with the prefix test - as this is not necessary anymore since JUnit4.
The other aspect of this example is, to give a clear reason, why a test is ignored. You can ignore tests with the @Disabled annotation - and this annotation allows you to specify some explanation, why the test has been disabled. PMD could require, that such an explanation is always there for disabled tests, but it won’t be able to asses, whether the explanation makes sense.
“Favor Standalone Tests” is about a problem that arises, when setup methods like @BeforeEach are used: In order to understand a test, you also need to read the setup method in addition to the test itself. That means you need to jump around in the test source. The tests are better comprehensible, if they have the complete pattern “given-when-then” contained in each test. The given-part would otherwise be separated out into the setup-method. So, the recommendation here is, to simply avoid @BeforeEach and @BeforeAll. The setup-method could still be shared, but just needs to be called explicitly. We could easily write a PMD rule to detect any usages of these setup-methods.
“Parametrize Your Tests” looks at a test pattern, where you test inside one method the same code, but with different values. This means, you usually have already factored out the actual testcode, e.g. in a custom assert method, which is called repeatedly with varying values. For this, @ParameterizedTest is a good option: It makes the test report more understandable, and you directly see, which of the different values is causing a problem. There is no PMD rule for that, and I think it wouldn’t be so easy to identify a test, that would be suitable to be refactored into a parameterized test. PMD could maybe detect the very simple cases, but you immediately need to have some understanding of control flow, as there are some method calls involved.
“Cover the Edge Cases” is about test coverage and writing not only one test (“the happy case”) but also considering failure cases and - as the title mentions - edge cases. As this all depends on the subject under test, it seems to be very difficult to write a rule for that. We could write a rule, that detects test classes, that only have one test case defined - but that’s probably a too simplistic heuristic.

Design examples

This chapter is about design and how functions should not only be functional, but also beautiful and convenient.

“Split Method with Boolean Parameters” focuses on methods, that take a boolean parameter and contain a big if-statement with two execution branches. This is a opportunity to make two methods out of this one method, give a better name and avoid the ugly boolean parameter. If a PMD rule would exist for that, it would be in the category “Design”. But there is no rule. Simple cases like the example in the book can easily be detected, though.
“Split Method with Optional Parameters” looks at methods, which take a parameter and that parameter is part of a null check and early return. This might also be a candidate for a simple rule, that finds such simple cases like in the book.
“Favor Abstract Over Concrete Types” addresses the same problem, that the existing PMD rule LooseCoupling does. Interestingly, this rule is not in the “Design” category, but in “Best Practices”.
“Favor Immutable Over Mutable State” is about designing value objects. If you objects are immutable, you avoid lots of the problems in multi threading and consistency. Instead of mutating the object, a new instance is returned instead, similar like BigDecimal or LocalDate. PMD has some related rules, like ImmutableField or UselessOperationOnImmutable, but no rule that suggests to write such immutable value classes in the first place. It might be a bit difficult, to limit this rule and avoid too many false-positives. And it would be a rule, that requires more refactoring and not a simple quick fix.
“Combine State and Behavior” is about object oriented design and looks out for classes without methods but only fields. Then there would be a second class, which contains only methods and take the first class always as a parameter. First class represents the “State” and the second class represents the “Behavior”. Another way to look at this is, that the methods could be static, as they don’t use any instance variables of their own class. So, maybe that is an indication, that the method should be moved somewhere else. Right now, there is no PMD rule for that. A first try could be rule, that finds methods, that could be static.
“Avoid Leaking References” is about encapsulating and protecting the inner state of a class. PMD has a rule for arrays, that are leaking: MethodReturnsInternalArray. But the book is right, this is not only about array, but also about collections.
“Avoid Return Null”: PMD has a rule that suggests to return a empty collection, rather than null: ReturnEmptyCollectionRatherThanNull. The book is more strict about this and suggests, that you should think about this, whenever you return null from a method. Avoiding null and returning a null object instead is a common pattern, that can avoid null pointer exception later.

Using streams

Streams have been introduced in Java 8 with JEP 107. These provide a functional way to express, what should happen in data processing instead of how. Using streams often means, that lambdas and method references are used. And there are a couple of things to consider in terms of readability and maintainability.

“Favor Lambdas Over Anonymous Classes” shows an example with a simple anonymous class, that implements a a functional interface. This can be written more concise without the boilerplate as a lambda expression. The specific example here is by the way not directly related to streams - it uses the method computeIfAbsent, which is part of a Map and takes a single Function<T, R> as an argument. There is no PMD rule for that. But it should be simple thanks to type resolution, to figure out whether the implemented anonymous class is actually implementing a single abstract method (or a functional interface) and suggest to use a lambda instead.
“Favor Functional Over Imperative Style” is now really an example, that suggest to use streams. The example uses a for-each loop with an if-condition inside and collecting the result in a list after performing a mapping of the values. There is no PMD rule for this. Creating such a rule might result in lots of false positives, so that rule could only say: “This for-each loop probably can be converted into a stream pipeline, but I’m not 100% sure”. I guess, this rule would be a bit more difficult to implement reliably.
“Favor Method References Over Lambdas” is easy: There is a PMD rule already existing: LambdaCanBeMethodReference. Using a method reference can express the intention more concise and is therefore simpler to read and understand. The used methods themselves can be tested independently of their usage in the stream pipeline.
“Avoid Side Effects” provides an example of a stream pipeline, that uses as the last stage the forEach method. In this last stage, a object defined outside of the stream is modified - the side effect. And this is not pure functional programming style anymore. There is no PMD rule for that and it seems, that a reliable rule is not so easy to implement. We could start with verifying the forEach call - it can be used as a terminating (last) operation. But if it modifies some other object, this might cause problems, if the stream is executed in parallel, as the terminating operation might not be executed in the original order or even in multiple threads.
“Use Collect for Terminating Complex Streams” suggests to use instead of a forEach operation, which then collects each element of the stream manually into a collection, to use collect instead with the various provided collectors, such as grouping. In the given example, the end result of the method is a Map of some kind, but the stream doesn’t create this map with collect, but rather using forEach with side effects. The Map is a local variable outside of the stream. Similar to the last example, there is no PMD rule for this. And it seems to so simple to implement a good rule for that. We could only point to suspicious stream pipelines.
“Avoid Exceptions in Streams” gives an example on how to deal with (checked) exceptions that need to be handled in the lambda expressions used for the stream operations. In functional style programming, functions should produce an output given a specific input - exceptions don’t fit into that scheme. Throwing a RuntimeException inside a stream pipeline make the whole stream not producing a result anymore and just end with an exception. To avoid this, in this example it is suggested to ignore the element of the stream, that caused the exception and let the stream continue to compute the result. This can be achieved by using flatMap and returning a single element Stream via Stream.of() in the successful case and an empty stream (Stream.empty()) in the exceptional case - effectively skipping the element if there is a problem with that. This of course depends heavily on the actual use case, whether you can ignore an exception or not. There is no PMD rule for that, but one could be written that detects throwing of exceptions inside a lambda expression, that is used in a stream pipeline.
“Favor Optional Over Null” is about communicating explicitly, if a return value of a method could return nothing (aka. an empty optional or null). Just returning null would require to do a null check by the caller of the method, but that could be easily forgotten and resulting in a NullPointerException. Returning an Optional<T> instead forces the caller to think about, what to do, if there is no result from the method call, and thus can avoid NullPointerExceptions. There doesn’t seem to be a PMD rule for detecting this. We could start with simple cases, where a method explicitly returns null in some execution path. But this wouldn’t detect all potential cases. The rule would error on the false negative side, which might be ok for the beginning. Another angle for detection could be to look at method calls. If there is directly after the method call a null-check, then the code assumes that the method could return null, so we could report a violation here.
“Avoid Optional Fields or Parameters” is about a different usage of Optional. The last example suggested to use Optional as a return type of methods. When returning an Optional, you never return null for that method. But if you have a method parameter with Optional, a caller could provide null - and now you need to deal with three possible values: null, empty Optional, and a value wrapped in Optional. This just makes implementing the method just more complicated. It would be simpler, if the method just allowed to be called with a non-null value and throws an exception, if it is called with null (e.g. via Objects.requireNonNull). The same applies for fields, as they can then have potentially the same three possible values and every access to the field in the class is more complicated. There is no PMD rule for this, but the rule is straightforward. We just need to look for usages of Optional in field declarations and in the argument list of method declarations.
“Use Optionals as Streams” suggests a more functional approach when using Optionals: Instead of checking with isPresent() and then getting the value explicitly, the Optional class has a lot of convenience functions that are very similar to the stream functions: like map, filter. Additionally, it as terminating functions like orElseThrow. There is no PMD rule for this. A potential rule could look for a “isPresent” method call followed by “get” method call to find suspicious usages of Optional. The rule might find usages, that maybe are not so easily adaptable to the suggested functional style.

Production readiness

This chapter is partially about general advice, such as “Use Static Code Analysis Tools”. Even PMD is suggested, among other tools like SpotBugs, Checkstyle, Error Prone, or IntelliJ IDEAs Code Inspections. Other general advice is: Use a common code format, use automated builds and continuous integration. But there also some more concrete examples.

“Favor Logging Over Console Output” targets the usage of System.out and System.err. Instead of these, you should use a proper logging framework. Such a framework can easily be configured with a log level to output debug messages only in certain situations and they also provide a way to log exception stacktraces properly. PMD has a couple of related rules for using logging, such as GuardLogStatement, InvalidLogMessageFormat, MoreThanOneLogger, ProperLogger, and UseCorrectExceptionLogging. All these rules already assume, that a logging framework is used. There is two additional rules: AvoidPrintStackTrace finds calls to exception.printStackTrace() and it says “use a logger call instead”. And there is the rule SystemPrintln which flags any usages of System.out/err. Exactly what this is about. It simply suggests to use a logger.
“Speed Up Your Program” is a simple tip when using Streams: Streams might be sped up when using parallel streams. This however is only possible, if the single stream operations are indeed free of side effects and depending on the stream (if it is sorted), a parallel stream might not always be faster. We don’t have a rule in PMD for that. Creating a rule, that identifies streams, that are not yet parallelized, is simple. But the rule can only give the suggestion, to try out, whether using a parallel stream would make your program faster - as it depends not only on the stream operations but also on the amount of data, that is being processed.
“Know Your Falsehoods” is the last concrete example in the book. It is about making too simplistic assumptions about the real world. The example show some code, that parses a given name of a person and extracts the surname. However, the surname might not always be provided as the last word - and it could be composed of actually two words separated by a space. Not to mention other regions (i18n, l10n). The solution for this concrete problem is, to make no assumption at all and just ask for the surname directly. This is definitely an example, that can’t be formalized as a PMD rule.

Summary

The book contains a lot of examples. And if you like to learn from examples (bad and good), that’s are very useful book. My goal here was to map the examples and suggestions to existing PMD rules and to rules, that could be potentially implemented. That way, you have an easy reminder, if you don’t follow the suggested practices.

Even if you are not new anymore to developing software, it’s still a good book to remind you of the good practices, that are maybe forgotten in the hectic everyday life.

Im closing with a couple of tables, that contain the mapping between the examples from the book and the PMD rules. It would be great to update the rule documentations, to mention the book and improve the reasoning behind the rules.

Existing PMD rules

Example Title	PMD Rule	Category in PMD
Avoid Unnecessary Comparisons	SimplifyBooleanExpressions	Design
Return Boolean Expressions Directly	SimplifyBooleanReturns	Design
Avoid Switch Fallthrough	ImplicitSwitchFallThrough	Error Prone
Always Use Braces	ControlStatementBraces	Code Style
Favor For-Each Over For Loops	ForLoopCanBeForeach	Best Practices
Use Java Naming Conventions	ClassNamingConventions, FieldNamingConventions, FormalParameterNamingConventions, GenericsNaming, LocalVariableNamingConventions, MethodNamingConventions, PackageCase	Code Style
Follow Getter/Setter Conventions for Frameworks	InvalidJavaBean, BooleanGetMethodName	Design and Code Style
Avoid Single-Letter Names	ShortClassName, ShortMethodName, ShortVariable	Code Style
Always Catch Most Specific Exception	AvoidCatchingGenericException	Design
Avoid Breaking the Cause Chain	PreserveStackTrace	Best Practices
Always Close Resources	CloseResource, UseTryWithResources	Error Prone and Best Practices
Always Close Multiple Resources	UseTryWithResources	Best Practices
Explain Empty Catch	EmptyCatchBlock	Error Prone
Use Meaningful Assertions	SimplifiableTestAssertion	Best Practices
Favor Abstract Over Concrete Types	LooseCoupling	Best Practices
Favor Method References Over Lambdas	LambdaCanBeMethodReference	Code Style
Favor Logging Over Console Output	AvoidPrintStackTrace, SystemPrintln	Best Practices

Potential PMD rules

These are the examples, for which no direct counterpart in PMD exists. But on first glance, it sounds feasibly to implement PMD rules in the future.

Example Title	Related existing rules
Avoid Negations	ConfusingTernary
Simplify Boolean Expressions	CyclomaticComplexity
Avoid NullPointerException in Conditionals	LiteralsFirstInComparisons, BrokenNullCheck, EqualsNull , MisplacedNullCheck, UnusedNullCheckInEquals
Replace Magic Numbers with Constants	Checkstyle MagicNumber
Avoid Collection Modification During Iteration	n/a
Avoid Compute-Intense Operations During Iteration	OperationWithHighCostInLoop (Apex), AvoidInstantiatingObjectsInLoop
Favor Format Over Concatenation	n/a
Favor Java API Over DIY	AvoidArrayLoops, UseArraysAsList
Remove Superfluous Comments	Checkstyle TodoComment, JAutodoc
Remove Commented-Out Code	n/a
Avoid Abbreviations	Checkstyle AbbreviationAsWordInName
Fail Fast	n/a
Expose Cause in Variable	AvoidThrowingRawExceptionTypes
Always Check Type Before Cast	n/a
Expected Before Actual Value	n/a
Use Reasonable Tolerance Values	n/a
Let JUnit Handle Exceptions	n/a
Favor Standalone Tests	n/a
Split Method with Boolean Parameters	n/a
Split Method with Optional Parameters	n/a
Avoid Leaking References	MethodReturnsInternalArray
Avoid Return Null	ReturnEmptyCollectionRatherThanNull
Favor Lambdas Over Anonymous Classes	n/a
Avoid Exceptions in Streams	n/a
Favor Optional Over Null	n/a
Avoid Optional Fields or Parameters	n/a
Use Optionals as Streams	n/a

Difficult rules

These are the examples, for which is sounds very difficult to implement a reliable PMD rule for detection.

Example Title
Ensure Code Symmetry
Favor Enums Over Integer Constants
Group With New Lines
Replace Comments with Constants
Replace Comments with Utility Methods
Document Implementation Decisions
Document Using Examples
Structure JavaDoc of Packages
Structure JavaDoc of Classes and Interfaces
Structure JavaDoc of Methods
Structure JavaDoc of Constructors
Avoid Meaningless Terms
Use Domain Terminology
Explain Cause in Message
Structure Tests Into Given-When-Then
Describe Your Tests
Parametrize Your Tests
Cover the Edge Cases
Favor Immutable Over Mutable State
Combine State and Behavior
Favor Functional Over Imperative Style
Avoid Side Effects
Use Collect for Terminating Complex Streams
Speed Up Your Program
Know Your Falsehoods

Java by Comparison and PMD