Monday, 25 July 2016

Microservices - Part 1, what are they?

We're getting a lot of requests lately to produce a Microservices course. It's been high up on our (growing) todo list so it will definitely happen, but for some reason I've held off. I think this might be because we've tended to favour courses based on specific tools and concrete techniques rather than high level architectural stuff.
But we're definitely doing it - possibly as a standalone course, possibly spread across multiple videos (the upcoming courses on Messaging in Spring Boot and Wildfly would both be good candidates to contain some microservices, and our series on DevOps would be a good place to cover how to deploy microservices). Or maybe we'll do both, a short course addressing the overall ideas and then we'll apply those ideas in the appropriate courses.
This will happen over the coming months. In the meantime, here's a bit of an unstructured exploration of Microservices.

What are Microservices?


Simply put, the Microservice movement is a shift away from the old-school technique of building huge, single applications that have cross business scope (these are commonly known as Monoliths).
As an honest example, our website at VirtualPairProgrammers has been developed as a traditional monolith - a single WAR file containing, effectively, our entire business.
Ok, we're not a huge business. We're not Spotify (yet). It's just a simple shopping cart site I hear you cry! Well, there's a lot more going on behind the web interface:
  • Video production (rendering pipelines)
  • Video Subtitling
  • Newsletter Production
  • Customer Lists
  • Sales Data
  • Viewing Figures
  • System Administration
  • Support and Ticketing
  • International Currencies
  • International Business Rules (eg VAT, tax rates)
  • The usual CMS stuff (content management)
  • Standard eCommerce/Shopping Cart
  • Affiliate Management
  • Subscriptions billing and re-billing
Probably more. Our marketing manager loves nothing more than shoving in hideous cartesian join SQLs and exports to Excel, because - that's what marketing managers do. It annoys the purist developers who think that their Hadoop based Viewing figures code is so beautiful it should be framed and put in an art gallery - but that's a consequence of running a Monolith - we have different business areas with very different needs treading on one another's toes.
We're not ashamed of this monolith. It made absolute sense in our early days to deploy this as a single WAR file - it was the simplest thing that could possibly work, and work it did, for many years. But today, there's so much complexity in there, it's becoming more brittle over time. A simple tiny change (like a change in the VAT rate for a country) means a full rebuild (taking around 5 minutes) and then a complete reboot of the Java Server (Tomcat in our case).
A move to a Microservice based architecture would see us deploy multiple, small or tiny applications, each aligned to a specific area of the business.
In Part 2 of this series (next week), I'll describe our first steps in migrating this monolith to Microservices. In this blog, I'll talk about the general principles we should be adhering to.
In one sense, there's nothing exciting in Microservices - it's just good software engineering principles.

Loose Coupling / High Cohesion

At the core of Microservices is the same principle that is at the heart of any good software design. High cohesion in this context means that a single microservice must do ONE thing, and do that ONE thing well. What "ONE thing" means is vague and is down to judgement, but key to a microservice is that it should be aligned to a specific area of the business. I've said that once already, but it's of utmost importance. It's absolutely no use making your Microservices align along tiers - if you have a "Web" microservice and a "Middle Tier" microservice and a "Database" microservice, then you don't have microservices at all - you've got three monoliths with an enormous amount of coupling between them. Which brings me to...
...loose coupling, meaning that the dependencies between the services should be as minimal as possible. In development terms, a change to one microservice should have minimal impact on the other microservices in the overall system. In run-time operations terms, ideally, we should be able to take down an entire microservice with no degradation of the performance of the overall system.
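As a rough illustration of the run-time side of loose coupling, here is a minimal Java sketch in which a caller asks another service for data but carries on with a default if that service is down. The names `ResilientCall` and `callWithFallback`, and the "recommendations" service, are hypothetical and invented purely for this example.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper: wraps a call to another microservice so that a
// failure in that service degrades to a default instead of an error.
final class ResilientCall {

    static <T> T callWithFallback(Supplier<T> remoteCall, T fallback) {
        try {
            return remoteCall.get();
        } catch (RuntimeException e) {
            // The other service is down or misbehaving; keep serving.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a remote call that fails because the service is down.
        Supplier<List<String>> recommendations = () -> {
            throw new IllegalStateException("recommendations service unavailable");
        };
        List<String> shown = callWithFallback(recommendations, Collections.emptyList());
        System.out.println("Recommendations shown: " + shown);
    }
}
```

The point is only that the caller decides up front what "the other service is missing" looks like, rather than falling over with it.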

Code Repository Isolation

So at one level a microservice is a simple expression of good software engineering, but it leads to difficult architectural choices. If your system is now a hundred microservices, should you have a source code repository that contains all of the microservices (with the services being in subfolders)? Or would you have to maintain a hundred separate git/mercurial repositories?
The answer is separate repositories - if you go with one huge repository, the temptation will be to start doing mega-builds and this will lead to "lock step" deployments where all 100 microservices are deployed at the same time - you therefore have much of the unpleasantness of a monolith. There are in fact many successful Microservice projects that do keep a single repository, so separate repositories are something of an "ideal" goal and a single one is probably not a killer - but it makes sense that services which are deployed independently should also be developed independently.
In a similar vein, is it ok to deploy all of your microservices to a single server/VM instance/EC2 Instance/Azure Thingy?

Service Isolation

Ideally a microservice should be deployed onto its own standalone "instance". Many projects do deploy multiple microservices to a single machine, but again, this can lead to the temptation of coupling them together, leading again to "lock step" deployment.
This can be expensive, which is where container services such as Docker step in. A container can be thought of as much lighter than a Virtual Machine - a single VM can host multiple containers, each container responsible for a single microservice.

We love Docker at VirtualPairProgrammers, and yes, we will be doing a course on it!

Automate, Automate, Automate

You might be able to put up with the pain of manually deploying a monolith. Or manually spinning up a few Cloud Instances to host it. Or manually installing software onto those instances. Some people like pain, especially if there's a bit of drama involved. Scaling that up to 100 deployments, 1000 instances, forget it. Microservices absolutely depend upon the automation of deployment, of provisioning, of configuration management. Continuous Delivery (http://martinfowler.com/bliki/ContinuousDelivery.html) is a prerequisite.

No Integration Databases

This is my favourite principle of Microservices - there should be no Integration Databases - avoid them at all costs. (For a recap of integration vs application databases, you can watch a chapter from our NoSQL course here).
There will be much wailing and gnashing of teeth over this - the integration database is the most precious possession of many businesses. But a database into which anything can delve, reading and writing at will, is both incohesive (ie it captures many different parts of the business, by definition) AND tightly (not loosely) coupled (again by definition, as many disparate applications DEPEND upon it).
So it's unarguable really, that integration databases have no part in microservices, it's part of the definition. However, in the real world, expect to see many projects proudly proclaim that they use microservices, of which one "micro"service is the "database service". Which you can't change without the permission of the DBA. Oh look, there's a "Business Logic" Microservice!
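To make the alternative a little more concrete, here is a small Java sketch of one service owning its data and other services reaching it only through a narrow contract. Everything here (`CustomerDirectory`, `CustomerService`, `NewsletterService`) is hypothetical, and in a real system the contract would sit behind a network call rather than an in-process interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical contract: the only way other services may reach customer data.
interface CustomerDirectory {
    Optional<String> emailFor(String customerId);
}

// The customer service owns its store; nothing else can see the map.
final class CustomerService implements CustomerDirectory {
    private final Map<String, String> emailsById = new HashMap<>();

    void register(String customerId, String email) {
        emailsById.put(customerId, email);
    }

    @Override
    public Optional<String> emailFor(String customerId) {
        return Optional.ofNullable(emailsById.get(customerId));
    }
}

// The newsletter service depends on the contract, not on anyone's tables.
final class NewsletterService {
    private final CustomerDirectory customers;

    NewsletterService(CustomerDirectory customers) {
        this.customers = customers;
    }

    String addressLine(String customerId) {
        return customers.emailFor(customerId)
                .map(email -> "To: " + email)
                .orElse("To: (unknown customer)");
    }

    public static void main(String[] args) {
        CustomerService customers = new CustomerService();
        customers.register("c1", "ann@example.com");
        System.out.println(new NewsletterService(customers).addressLine("c1"));
    }
}
```

The customer service is free to change how it stores emails, because nobody else ever touched the storage.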
These are just my unstructured thoughts about the main ingredients of a Microservice - in part 2 (next week), I'll describe a concrete example of how we at VirtualPairProgrammers are slowly migrating our IT across to a Microservice architecture.


Monday, 18 July 2016

How (Not) to Design a REST API

On my last project I had to integrate our code with an external REST provider. The provider was a banking service (I'll call them "TheBank" to protect their identity) and we had to record financial transactions with them.

Check out the API documentation that we had to work with here (note: I've completely changed the API terminology so that the actual provider can't be identified, but the structure and the errors are still the same).

A good Java interview question would be - what have they done wrong in this REST API design? Have a read of the docs before reading on, see if you can come up with a list of what could be improved.




(Plug: our JavaEE with Wildfly and Spring Remoting courses both explore good REST API design.)

Ok, you're back and hopefully you're face-palming. There's a quick answer - it's all wrong. I can't find a single good decision in that entire API. Good going, TheBank.

Designing APIs is hard, admittedly, but whoever put this together hasn't even grasped the fundamentals of REST (or HTTP). But this isn't an isolated case - I've lost count of how many times I've had to integrate with similarly broken APIs - in fact it would be much quicker to count the ones which *are* well implemented (the figure is not far north of "zero"). This is probably the worst I've seen.

I'm not a REST zealot by any means, in fact on our course I openly admit that HATEOAS is a bit of a lofty goal and it's no disgrace if you don't go that far. I don't care about purity or satisfying some aesthetic goal. What I do care about is wasted time and development effort, and I care if I'm forced to write brittle and error prone code.

So let's run through TheBank's blunders and see why it matters:

No URIs or Representations

Leaving aside the dodgy looking "endpoint.shtml" (what does this even mean? SHTML was a server side include, some kind of Apache extension. Why do I as the client care about this?), they are routing every single API call in through a single URI. Thus they are immediately losing the expressiveness of URIs. The URIs *are* the API.

So rather than an API, we have a single method with a huge telescoping list of query parameters.

Even though they call their API "RESTful", there's no trace of any kind of representation. This means that all the data for every call has to be converted into a long series of query params, leading to a very ugly and unreadable construction.

[There's nothing wrong with query params - we use them on our REST courses. But only for constraints or extra information that doesn't belong in the representations. Example - if you only want the first 20 records in a query, then this would make a good query param.]

Why this matters: if done properly, I could have quickly coded up a "Transaction" class in my client and let my framework (I was using Spring) convert it to JSON. Instead I had to spend time string concatenating, always an error prone and tedious process.
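For illustration, this is roughly the kind of "Transaction" class I mean. It's a sketch only: the field names are invented and are not TheBank's, and the JSON is hand-written here purely to keep the example free of dependencies (a framework's JSON library would normally do the conversion).

```java
import java.math.BigDecimal;

// Hypothetical representation of a transaction; the field names are
// invented for illustration and are not TheBank's.
final class Transaction {
    private final String accountId;
    private final BigDecimal amount;
    private final String currency;

    Transaction(String accountId, BigDecimal amount, String currency) {
        this.accountId = accountId;
        this.amount = amount;
        this.currency = currency;
    }

    // Hand-rolled only to stay dependency-free; a JSON library would
    // normally do this. Assumes the values need no JSON escaping.
    String toJson() {
        return "{\"accountId\":\"" + accountId + "\","
                + "\"amount\":" + amount.toPlainString() + ","
                + "\"currency\":\"" + currency + "\"}";
    }

    public static void main(String[] args) {
        Transaction tx = new Transaction("ACC-1", new BigDecimal("25.00"), "GBP");
        // This body would be sent to a resource URI such as /transactions.
        System.out.println(tx.toJson());
    }
}
```

With a representation like this, the data travels in the request body and the URI only has to name the resource.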

Invalid use of GET. No use of HTTP verbs.

GET, by definition, is a "safe" and "idempotent" operation: it should not change state on the server, and repeating it should have the same effect as making it once. The "record" method is of course recording new transactions, so this has violated the contract of GET.

They're clearly unaware that other verbs exist. POST should have been used for this non-idempotent operation, and PUT and DELETE would also have been needed for updates and deletes, avoiding the ugly use of "method=record".

Why this matters: I had to be extremely careful to ensure that my GET requests were issued once and once only. Every call made to this API looked more or less identical because the very important "method=" is buried in an unreadable list of query params.
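Here's a sketch of how the same operations might look with one verb per operation, using the JDK's `java.net.http.HttpRequest` builder (Java 11 or later). The host and the `/transactions` paths are my own invention, not TheBank's real API, and the requests are only built, never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

// Builds (but does not send) one request per operation. The host and the
// /transactions paths are hypothetical, not TheBank's real API.
final class TransactionRequests {
    private static final String BASE = "https://api.thebank.example/transactions";

    // POST creates a new transaction; it is not idempotent.
    static HttpRequest record(String json) {
        return HttpRequest.newBuilder(URI.create(BASE))
                .header("Content-Type", "application/json")
                .POST(BodyPublishers.ofString(json))
                .build();
    }

    // GET reads an existing transaction and changes nothing.
    static HttpRequest fetch(String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + id)).GET().build();
    }

    // PUT replaces an existing transaction with the supplied representation.
    static HttpRequest update(String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + id))
                .header("Content-Type", "application/json")
                .PUT(BodyPublishers.ofString(json))
                .build();
    }

    // DELETE removes an existing transaction.
    static HttpRequest remove(String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + id)).DELETE().build();
    }

    public static void main(String[] args) {
        HttpRequest[] requests = {
            record("{}"), fetch("42"), update("42", "{}"), remove("42")
        };
        for (HttpRequest request : requests) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

The verb and the URI now say what each call does, so a retry policy can treat GET, PUT and DELETE differently from POST.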

Implementing their own authentication scheme

The very weird process of hashing your API Key and Secret is clearly an authentication mechanism that they've invented themselves. Why? HTTP has a specified and well understood form of authentication - Basic Authentication. Under the standard, all I would have to do is send an "Authorization" header like this:

Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l

The odd looking string there is the username (API key) and password (secret) separated by a colon and then base64 encoded.
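As a quick sketch of how little work that is, the header can be built with nothing but the JDK. The class and method names (`BasicAuth`, `header`) are mine; the credentials are the ones that the example string above decodes to.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuth {
    // Joins username and password with a colon, then Base64-encodes the result.
    static String header(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // prints "Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l"
        System.out.println("Authorization: " + header("Aladdin", "OpenSesame"));
    }
}
```

Most REST clients will do this for you, which is rather the point.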

Instead, they want me to SHA-256 my key and secret before sending in a weird custom header.

I imagine their reasoning here is that SHA-256 is a secure one way hash that can't be intercepted and reversed. (The Base64 encoding is definitely not secure). This is totally unnecessary - if they'd simply mandated HTTPS, then the traffic would be automatically encrypted, including the username and password. My guess is they've had a business directive asking them to not insist on HTTPS, and they've tried to fix that by rolling their own (almost certainly broken) security scheme.

A fundamental rule of security is that you should never roll your own security scheme, because it *will* be flawed.

Why this matters: I don't care that the bank might get hacked, but I do care about the wasted day I spent trying to comply with their weird hashing rules. If they'd done it properly, my REST Client would have been able to handle the key and secret through a simple method call.

Bad return codes

This one wins them the jackpot. Every one of their API calls returns "Success!" (HTTP 200), until you check the body string and find out it actually failed. So I'm forced to write client code like this:

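// "rest" stands in for whatever HTTP client you happen to be using
ResponseBody response = rest.get("big ugly uri");
// The status is always 200, so the body text is the only failure signal
if (response.getEntity().equals("Transaction Suceeded"))
{
 // continue
}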

YES - they have misspelled "Suceeded" (should be two c's). So - when they fix this typo my code will instantly break. Thanks, Bank.

Why this matters: I had to spend a long time probing their API to find out the strings they're returning. I now have brittle string checks which are very likely to break at any time they decide to change those strings. And they will.
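For contrast, here's a sketch of the client-side check I'd have liked to write if the API had used real HTTP status codes. `StatusCheck` and `Outcome` are hypothetical names; the ranges are just the standard 2xx/4xx/5xx classes.

```java
// Hypothetical client-side handling if the API used real HTTP status codes.
final class StatusCheck {
    enum Outcome { SUCCESS, CLIENT_ERROR, SERVER_ERROR, OTHER }

    // Classifies a response by its status code class; no body strings needed.
    static Outcome classify(int statusCode) {
        if (statusCode >= 200 && statusCode < 300) return Outcome.SUCCESS;
        if (statusCode >= 400 && statusCode < 500) return Outcome.CLIENT_ERROR;
        if (statusCode >= 500 && statusCode < 600) return Outcome.SERVER_ERROR;
        return Outcome.OTHER;
    }

    public static void main(String[] args) {
        int[] samples = {201, 400, 503};
        for (int code : samples) {
            System.out.println(code + " -> " + classify(code));
        }
    }
}
```

A check like this doesn't care how the provider words (or spells) its messages.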

Many REST textbooks get themselves excited about HATEOAS. Forget that - the basics of URIs, Representations, HTTP Verbs, HTTP Return Codes and Security are all fundamental. Not many get all of these right, and a huge number get them ALL wrong. Don't be like the bank.

On our Spring REST course, we set a programming challenge, part of which is to design a REST API - you can see that video here - it's a bit long but there are some interesting decisions to make. Subscribe and you get the full course!