I had not written Java in more than three years when I started working on Java applications earlier this year. This post documents some of the Java quirks, tooling, and best practices that caught my eye in my first three months back. It is intended to be useful for others (and my future self) who return to Java after a gap of a few years. The items and references mentioned here are not comprehensive; they are meant to be entry points for interested readers. I might write detailed posts about specific topics in the future.
Java Quirks
-
Effective Java (Blo18). This is the best refresher. After many years, it is still the best book on Java programming best practices. The updated third edition adds guidelines on the functional features, lambdas and streams. It has always been a must-read, and it still is.
-
Pseudo pass by value. I cannot choose the parameter-passing semantics, e.g. by value, by reference, or by pointer, as I would in C++. The mode of parameter passing is fixed in Java. It is technically always pass by value, but that terminology is confusing. For object parameters, it is effectively pass by value of the pointer: the reference is copied, not the object. If the parameter is reassigned inside the method, the caller's object is unaffected. If the object's members are modified through the parameter, the caller's object is affected. For primitive types, it is plain pass by value: if the parameter is reassigned, the caller's value is unaffected.
-
Checked Exceptions. See che14, Blo18.
I still do not have a strong opinion on this feature. In many code bases, the way checked exceptions are passed up and down the call chain feels pretty random. In the rare cases when I am writing a small library, it makes sense to indicate expected issues through checked exceptions. Still, the feature feels awkward. Everyone has an opinion on it in code reviews, and it uses up a lot of back-and-forth. I am not convinced one way or the other. I recommend minimizing its usage as much as possible. I would not object to a style guide and linting tool that warn against all checked exceptions.
-
Prefer protobuf or thrift over Java serialization.
-
Prefer StringBuilder. StringBuffer is synchronized; StringBuilder is not, so it avoids the locking overhead when the builder is only used by one thread.
-
Use file utilities from Guava or Apache Commons.
-
JVM exiting could be tricky.
-
Java class path ordering matters
-
AtomicInteger is lock free, using the compare-and-swap (CAS) hardware instruction on x86. The synchronized keyword is lock based. However, synchronized only uses CAS and works similarly to a lock-free algorithm when there is no contention. If contended, it uses spin locking and then falls back to system calls (e.g. futex). See Tho11, tho11, Dic06, Goe04, ato14.
-
The utilities in the concurrent collections make extensive use of lock-free algorithms.
-
Java non-blocking IO (NIO) is not the same as asynchronous IO (AIO). AIO is confusingly termed NIO.2. See disambiguation and this post.
-
Java has good support for memory-mapped file IO
-
Java Concurrency in Practice (Goe06). Going through this book, even just the table of contents, is a good use of time to get an overview of most of the core concurrency concepts in Java. It does not cover the actor model made popular by Scala and Akka.
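To make the parameter-passing item in the list above concrete, here is a minimal sketch. The class and method names (PassByValueDemo, reassign, mutate, increment) are made up for illustration and are not from any library.

```java
import java.util.ArrayList;
import java.util.List;

class PassByValueDemo {

    // Reassigning the parameter only rebinds the method's local copy of the reference.
    static void reassign(List<String> names) {
        names = new ArrayList<>();
        names.add("replaced");
    }

    // Mutating through the parameter changes the object the caller also holds.
    static void mutate(List<String> names) {
        names.add("added");
    }

    // Primitives are copied, so the caller's variable never changes.
    static void increment(int count) {
        count = count + 1;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("original");

        reassign(names);
        System.out.println(names); // [original]

        mutate(names);
        System.out.println(names); // [original, added]

        int count = 1;
        increment(count);
        System.out.println(count); // 1
    }
}
```

The reassignment inside reassign is invisible to the caller, while the add inside mutate is visible, which is the "pass by value of the pointer" behavior described above.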
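The contrast between AtomicInteger and synchronized in the list above can be sketched as two counters. This is only an illustration: CounterDemo, SynchronizedCounter, CasCounter, and countConcurrently are hypothetical names, and the explicit retry loop is written out to show the CAS pattern rather than to mirror how the JDK implements AtomicInteger internally.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {

    // Lock-based counter: the intrinsic lock guards every read-modify-write.
    static final class SynchronizedCounter {
        private int value;

        synchronized int increment() {
            return ++value;
        }

        synchronized int get() {
            return value;
        }
    }

    // Lock-free counter: retry compareAndSet until no other thread won the race.
    // AtomicInteger.incrementAndGet() gives the same result in one call.
    static final class CasCounter {
        private final AtomicInteger value = new AtomicInteger();

        int increment() {
            while (true) {
                int current = value.get();
                int next = current + 1;
                if (value.compareAndSet(current, next)) {
                    return next;
                }
            }
        }

        int get() {
            return value.get();
        }
    }

    // Runs both counters under contention and returns {lockedTotal, lockFreeTotal}.
    static int[] countConcurrently(int threads, int incrementsPerThread) {
        SynchronizedCounter locked = new SynchronizedCounter();
        CasCounter lockFree = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    locked.increment();
                    lockFree.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] {locked.get(), lockFree.get()};
    }

    public static void main(String[] args) {
        int[] totals = countConcurrently(4, 10_000);
        System.out.println(totals[0] + " " + totals[1]); // 40000 40000
    }
}
```

Both counters end at the same total; the difference is in how they get there, by blocking on a monitor versus retrying a failed CAS.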
Library Choices
-
Google Guava: Its collection idioms are still the best.
-
Jackson for JSON.
-
HTTP Client Library: If I am developing in Java 11, I would go with the client that comes with the standard library. In older versions, I would choose one of these four:
- Jersey: My top pick.
- Google HTTP Java Client
- Apache HttpClient
- OkHttp
-
HTTP Server Library. I had never been fully on board with any of the HTTP REST web frameworks in the past. As of the summer of 2019, my opinion has not changed. I would like to see a lightweight framework that is easy to get started with, logically compact, and easy to upgrade.
- Easy to get started. Spring Boot and Dropwizard are good in this regard.
- Opinionated and hardened designs for its API and usage.
- A minimum set of well-chosen dependencies. The library maintainers need to be disciplined to keep the dependency graph small. Existing frameworks are hard to upgrade due to dependency conflicts.
- Make things optional: features such as ORM, Swagger, and metrics should be extensions.
If I have to create a lightweight REST API server, I would base it on Jersey (see an example). It is worth mentioning a few REST alternatives: GraphQL, gRPC, and Twirp.
-
YourKit is a decent profiling tool.
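For the HTTP client item above, here is a minimal sketch of the client that ships with Java 11 (java.net.http.HttpClient). To keep it runnable without network access, it starts a throwaway local server using the JDK's com.sun.net.httpserver.HttpServer. The class name HttpClientDemo, the /hello endpoint, and the timeouts are assumptions made for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

class HttpClientDemo {

    // Sends a blocking GET and returns "<status> <body>".
    static String get(HttpClient client, URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    // Starts a local server on an ephemeral port, calls it, and shuts it down.
    static String fetchFromLocalServer() {
        HttpServer server = null;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/hello", exchange -> {
                byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();

            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .connectTimeout(Duration.ofSeconds(5))
                    .build();
            URI uri = URI.create(
                    "http://127.0.0.1:" + server.getAddress().getPort() + "/hello");
            return get(client, uri);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchFromLocalServer()); // 200 hello
    }
}
```

The same client also offers sendAsync, which returns a CompletableFuture, for callers that do not want to block.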
Dependency
The diamond dependency problem will show up one way or another. Whether it is a monorepo or a multi-repo setup, dependency issues cause significant headaches and time spent on troubleshooting. The problem will show up unexpectedly when there are attempts to upgrade the Java version, introduce a new computing framework, use a new build tool, upgrade old libraries, or add features to legacy applications.
The worst offenders are also the most popular libraries and frameworks: Guava, Jackson, Jetty, Jersey, Dropwizard, Spring, etc. Issues with libraries such as Guava or Jackson are usually easy to resolve. Issues caused by Jetty and Jersey are still manageable. When it comes to Spring, it requires a healthy dose of mental fortitude as much as debugging experience. It speaks volumes about the importance of choosing technologies with a long view.
The most common strategy is to pin the conflicting library to a fixed version. However, that is easier said than done. In most realistic scenarios, the dependency graph is generated by walking down transitive dependencies, and there are likely too many conflicting libraries to inspect by hand. The hard part is to identify the smallest set of pinned versions that works for the application in question. It is an art.
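When troubleshooting such a conflict, one thing that sometimes helps is checking which jar a class was actually loaded from at runtime. The following is a small sketch of that idea, not a full diagnosis tool; ClassOrigin and locate are hypothetical names, and the placeholder string is an arbitrary choice.

```java
import java.security.CodeSource;

class ClassOrigin {

    // Returns the location (jar or directory) a class was loaded from, or a
    // placeholder for classes that carry no code source, such as JDK core classes.
    static String locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass fully qualified class names as arguments, for example a class
        // from the library whose version is in doubt.
        for (String name : args) {
            System.out.println(name + " -> " + locate(Class.forName(name)));
        }
        System.out.println(String.class.getName() + " -> " + locate(String.class));
    }
}
```

If two versions of a library are on the class path, the printed location shows which one won, which is a starting point for deciding which version to pin.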
Tooling
For dependency management, I would only consider Maven and Ivy. Both work well enough, and most build tools are able to pull and push dependencies from both types of repositories. However, each build tool has a strong preference for one or the other.
-
Maven: This is by far the most popular build tool. It has been around for a long time. It is the simplest to use. It works out of the box. It is designed for Java projects. I would not try to make this tool work with any other language.
-
Gradle: It is a general purpose build tool. I do not have experience using it for a monorepo that includes multiple languages. I would choose Bazel instead if that is the goal.
-
Ant: It is a general purpose build tool.
-
Make: Good luck! If it is a toy project with multiple languages, it is fun to use make. It would not be practical for any production system.
-
Bazel: With the maturing of the Bazel rules for using Maven artifacts, I am choosing Bazel for both small and large projects, for Java-only repositories as well as monorepos supporting multiple languages. I will write a post and example on how I maintain my Bazel-based playground monorepo supporting Go, Python, C++, Java, Scala, and JavaScript.
Application Setup
-
Spring Framework: NO! I would avoid using Spring for dependency injection, configuration, bean life cycle, task scheduling, data source mapping, etc.
- No compile time checks.
- Misconfigurations show up at runtime and are hard to debug.
- A bloated library. It always causes unexpected dependency problems. It always did, and always will.
- Components are hard to test.
- Too flexible, which makes it hard to enforce usage guidelines. Invariably, bad patterns will show up, and it will be a losing battle to maintain sanity.
-
Dependency Injection: My current vote is a no. I would architect my applications to be sufficiently small and wire them by hand. If I have to use a framework because, for whatever reasons, the application has to be monolithically large and complex, I would go with Dagger. The history of major dependency injection frameworks in Java is a starting point to learn more about Spring, Guice, and Dagger.
-
Configuration: There are three common ways to store configuration: command line args, config files, and environment variables. There are good reasons to use any of the three. A design principle like 12-factor might or might not work well for your CI/CD and production environments. But in terms of simplicity of user experience, I would say using only command line args is the simplest, followed by config files, and then environment variables. I prefer staying with the simplest model until it is necessary to add complexity.
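As an illustration of the three configuration sources in the last item, here is a minimal sketch that resolves a setting from command line args, then a config file, then environment variables. The lookup order, the --key=value argument format, the upper-case environment variable naming, and the names ConfigDemo, parseArgs, and resolve are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

class ConfigDemo {

    // Parses arguments of the form --key=value; anything else is ignored here.
    static Map<String, String> parseArgs(String[] args) {
        Map<String, String> parsed = new HashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("--") && eq > 2) {
                parsed.put(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return parsed;
    }

    // Assumed lookup order: command line, then config file, then environment.
    // Returns null when the key is not set anywhere.
    static String resolve(String key, Map<String, String> cli,
                          Properties file, Map<String, String> env) {
        if (cli.containsKey(key)) {
            return cli.get(key);
        }
        String fromFile = file.getProperty(key);
        if (fromFile != null) {
            return fromFile;
        }
        // Assumed convention: "log-level" is read from LOG_LEVEL.
        String envName = key.toUpperCase(Locale.ROOT).replace('-', '_').replace('.', '_');
        return env.get(envName);
    }

    public static void main(String[] args) {
        // In real use the properties would come from file.load(reader).
        Properties file = new Properties();
        file.setProperty("port", "8080");

        Map<String, String> cli = parseArgs(args);
        System.out.println("port = " + resolve("port", cli, file, System.getenv()));
    }
}
```

Running it with no arguments uses the file value, and running it with --port=9090 overrides it. Staying with command line args only, as preferred above, would mean dropping the other two lookups entirely.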
Style
I prefer to choose a style and set up a formatter. Debating how to break lines while reviewing a feature release is counter-productive, but I totally get that odd indents and long lines can lead to compulsively typing “ugly”. I would pick the Google Java style guide and integrate its formatter into the build tool. The formatter could be run automatically. gofmt says it best. Formatted code is:
- easier to write: never worry about minor formatting concerns while hacking away,
- easier to read: when all code looks the same you need not mentally convert others’ formatting style into something you can understand.
- easier to maintain: mechanical changes to the source don’t cause unrelated changes to the file’s formatting; diffs show only the real changes.
- uncontroversial: never have a debate about spacing or brace position ever again!
Fun reads in Java
Citations
- Bloch, Joshua. Effective Java, Third Edition. Pearson Education Inc., 2018.
- Checked exceptions: Java’s biggest mistake. 2014. URL: http://literatejava.com/exceptions/checked-exceptions-javas-biggest-mistake/.
- Thompson, Martin. Java lock implementations. 2011. URL: https://mechanical-sympathy.blogspot.com/2011/11/java-lock-implementations.html.
- Biased locking, OSR, and benchmarking fun. 2011. URL: https://mechanical-sympathy.blogspot.com/2011/11/biased-locking-osr-and-benchmarking-fun.html.
- Dice, Dave. Biased locking in HotSpot. 2006. URL: https://blogs.oracle.com/dave/biased-locking-in-hotspot.
- Goetz, Brian. Going atomic. 2004.
- When is AtomicInteger preferrable over synchronized? 2014. URL: https://stackoverflow.com/questions/11670687/when-is-atomicinteger-preferrable-over-synchronized.
- Goetz, Brian. Java Concurrency in Practice. Pearson Education Inc., 2006.
- It's all about buffers: zero-copy, mmap and Java NIO. 2016. URL: https://xunnanxu.github.io/2016/09/10/It-s-all-about-buffers-zero-copy-mmap-and-Java-NIO/.
- Palaniappan, Sathish and Nagaraja, Pramod. Efficient data transfer through zero copy. 2008. URL: https://developer.ibm.com/languages/java/articles/j-zerocopy/.
- Kreps, Jay. The Log: What every software engineer should know about real-time data's unifying abstraction. 2013. URL: https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying.
- Šor, Vladimir. Why does my Java process consume more memory than Xmx? 2013. URL: https://plumbr.io/blog/memory-leaks/why-does-my-java-process-consume-more-memory-than-xmx.