(gentle instrumental music) - Hi, I'm Praveena. Just some logistics first: make sure you rate the session, and if you have any questions, please do put them in there as well. I'm really interested to hear what you have to say and what you have observed. I know Sam has already given me an introduction, but I thought I'd just do it again: I'm a software engineer at Neo Technology, or, as my Swedish colleagues would say, Neo4j.

I want to talk about microservices. But before I present my case study, I wanted to get a bearing on the audience so that I don't use a ton of jargon and don't explain things this audience already knows; it'll be easier to skip through the obvious parts, at least. How many of you have worked with microservices? Okay, that's good. How many of you are developers? Okay. How many of you work with agile methodologies? How many of you use... Good. That makes things really simple.

Microservices is a software architecture style where you compose complex applications out of small, independent processes. Almost all major conferences around the world now have a separate track dedicated to microservices, and not just in terms of software architecture style: even language-specific conferences have tracks on microservices. It's been gaining a ton of momentum in recent years. The places that use microservices are companies like Pinterest, Twitter, Halo, Uber, services which I already use, and I'm sure many of you are end customers of them too. Netflix, at least. Come on, who isn't a customer of Netflix?

When you look at these companies, you see that all of them have been around for a decade at most. They have a lot of young engineers working there, and in terms of legacy, they don't have much to carry around in some sense. The small gifts that come along with using microservices are the ability to do quick deployments (you can release something to production, look at the feedback, and adapt to it as quickly as possible), and the ability to scale services up and down however you like based on your user consumption. It's very, very easy to enhance features or add features. The path from when you start development to when it is deployed and used is so rapid. Those are the kinds of small gifts that come with microservices.

But what about age-old systems? I worked as a consultant in an environment which had Microsoft servers behind this huge, what do you call it, architecture, and we didn't know what was happening in there. What we did know was that the customers were seeing patterns where new startups came into their space, their publication domain, and were able to quickly add features, the ones they had been trying to add for years, and then eat into their user market. They were still the market pioneers, but they slowly saw their user base being eroded away by tons of new startups in their space. They really wanted to change how they work and how they deliver software, because ultimately that web page is the interface to their end users.
That is what they were serving to the end users. My team, not just me, was dealing with a 20-year-old system written in C, and the team did not have a business analyst at the beginning. We were given a part of the codebase, not the entire thing, and we were told, "This is it. You don't get any business analysis; you have to go ahead and reverse-engineer the code." We didn't know how things worked or why they worked, but the behaviour had to stay exactly the same, because they didn't want to lose any users over missing features that people were used to.

On top of that, they were moving their development centre across continents. They wanted to say, "We've got a new development centre and we are going to transform how this thing works, and we want to prove it a success. Remember the good old days when it took three months just to release, not develop, just release? We're done with that. What we want now is for you to write code and release it within a week at most." That was the target they were looking at. And, as with any business-critical application, we had tight deadlines as well.

When we looked at how people were tackling problems like these, we saw microservices as the future, this great world that answered every single problem we were having. We wanted to release quickly. We wanted to... Gauge? Gouge? - Gauge. - Gauge, yeah. Gauge customer feedback. When we looked at microservices, we felt it was the future. Yes, that's the answer; that's where we want to go.

But it also seemed like a Utopian mystery to me. Just to get some basic definitions: Utopia is an imagined place where everyone knows exactly what they're supposed to do and everyone has a purpose in that world. A mystery is something that is impossible to understand or explain. When I was given this project and was looking at microservices, that's exactly how I felt. All of this is great, but it's used by companies that have been there for 10 years, that basically know what they're doing and know their software well, whereas we were in that other place. It looked a lot like a Utopian mystery to us, but we still decided to press on, because we needed to start delivering: let's just do something and see how it works out.

We decided to use microservices for our application. We initially thought of our monolithic application as this beautiful multi-tier (tire? Tier. This is a great audience, thank you) cake that we were all balancing nicely, and that when we changed it into microservices it would look like the same tiers of architecture, but as smaller cupcakes, divided into tiny nuggets that you can just go ahead and bite into. Reality was (chuckles) much like this. Not literally, figuratively it is like this. It isn't an easy thing when you're dealing with one big pile of shit and now you have to deal with ten little piles of poo. It's not an easy problem to solve.
These are basically 10 lessons. I could talk at length about each one of these and how important it is, but I'm just going to give you a broad gloss over what I learned by using microservices.

The first lesson we learned from doing things wrong was: keep it small. What do you mean by small? What does the "micro" in microservices mean? Is it the number of services you have in production? No, it's not that. Why is it important to have small services? So that you can rewrite an entire service if you want to. You can measure the size of your service by answering this question: how long does it take to rewrite the entire service? The ideal answer should be on the order of two weeks.

You might say, "Hang on a minute, Praveena, I thought you said you were talking about a 20-year-old system, and what you're saying right now just sounds like hipster talk to me. Why would you ever have to rewrite an entire service? I think you're lying." But there is a reason you want a service to have the ability to be rewritten: it's so that you can work on a story and deploy it to production quickly.

So, how long does a story take from development to deployment? Ideally, when you pull a story in, you want it to be a smooth ride across the lanes. This was our card wall. When you pull a story from analysis into ready-for-dev, you want a smooth ride across all the lanes until it reaches ready-for-prod. That's the ideal situation: you should be able to glide along as if you're riding a bicycle on a really smooth surface. But if you're working with a service which is massive, it's going to feel like riding this bicycle instead. It's not a pleasant experience to take a story, do the development, push it into test, and then wait because some other story has a dependency on the one you're working on. You run into questions like: I worked on a story to improve query performance, and there is a UI bug. These are completely unrelated things, so why is my story blocked because of a UI bug? I don't see any point in blocking my story.

If you have a service that is small enough, has one responsibility, and does just one thing, it's very, very simple to make changes to that service and deploy it quickly. Remember, one of the big problems in our application was that it took months to see any change deployed to production, and that's the first thing we wanted to avoid: we wanted quick deployments to production. That's why it was very important for us to keep our services small. The important thing to realize here is that smaller codebases lead to a really small context for change, which helps you do autonomous delivery. Small isn't just beautiful; it's really practical for deploying small services in a very big environment. You have to remember why this is very, very important.

Lesson two is to focus on autonomy, from design to deployment. Autonomy is basically the right of self-governance.
What does autonomy look like in a microservices environment? Ideally, when you're doing a deployment, you say, "I'm done with development (I was pairing, or not pairing, that's fine) and testing is over on that story. Now I want to deploy." What you want is to just push a button and it's deployed, or it gets deployed automatically. You have to do no choreography at all: you don't have to talk to people, you don't have to coordinate with other services. That's what you want.

Here's where reality starts kicking, arms and legs. Earlier I was talking about how your service should be small. If your service is small enough, it will definitely have dependencies it needs to get its work done. In that case, how can you deploy your service without any choreography with its dependent services? Because more often than not, when you're working on a solid piece of work, you do have to touch multiple services. You could say this is one of those things where people say one thing but the reality is something else, which was true in our case. One thing we (chuckles) finally succumbed to was admitting, "There are always going to be dependencies, and we just have to deal with them." That was an important turning point for us, because we had tried really hard to insist, "No, no, you're not supposed to do choreography at all. We shouldn't have to do this. We're doing microservices wrong." The first step to (laughs) recovery is acceptance. We accepted that there are always going to be dependencies; now let's have a conversation about how to deal with them. What we wanted was no choreography in our deployments, but what we needed was easy choreography in deployments.

Which is why this is a user story card. I don't know whether it's really clear, but what you see on this sticky note is three different services with some numbers next to them. In my project we used RPMs: all our services get packaged as RPMs, which are then baked into an AMI and installed. When we developed this story, we knew that to implement it we had to touch three different services. Once the changes were made, those three services were built into RPMs with those version numbers. We marked those RPM versions on the story and said, "Fine: if these RPMs are deployed in an environment, it means the story's feature is already there in that environment." That was one easy way for us to track how services were dependent. When you start working on a story, you identify the services it's going to touch; once development is done, you mark the service RPM versions it produced and say, "If you deploy these three services, the story is done. It's there in that environment." So we made a conscious decision to think about our deployment strategy at the beginning of story development. During analysis, the first thing we did was identify the services a story was going to touch and put them on the card; when development was done, we just added the RPM versions there.
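To make that concrete, here is a minimal sketch of the check implied by marking RPM versions on a card. The service names, versions, and the whole class are invented for illustration; this is not our actual tooling, which was built around real RPM queries.

```java
import java.util.Map;

// A minimal sketch: given the RPM versions a story card was marked with,
// and the versions deployed in an environment, decide whether the story's
// feature has effectively arrived there. All names/versions are invented.
public class StoryDeploymentCheck {

    // RPM versions the story card was marked with after development.
    static final Map<String, String> REQUIRED = Map.of(
            "auth-service", "1.4.2",
            "web-app", "2.0.7",
            "search-service", "0.9.1");

    // True only if every service the story touched is deployed at the
    // marked version (a fuller version would compare versions numerically).
    static boolean storyIsInEnvironment(Map<String, String> deployed) {
        return REQUIRED.entrySet().stream()
                .allMatch(e -> e.getValue().equals(deployed.get(e.getKey())));
    }

    public static void main(String[] args) {
        // Versions reported by a hypothetical QA environment.
        Map<String, String> qa = Map.of(
                "auth-service", "1.4.2",
                "web-app", "2.0.7",
                "search-service", "0.9.1");
        System.out.println("Story deployed in QA: " + storyIsInEnvironment(qa));
    }
}
```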
When you do have easy, choreographed deployments, there are always going to be breaking changes, and the important thing is to plan for them. Sometimes, when we were thinking about the deployment strategy, we realized there were going to be breaking changes, so the next thing to think about was: how do we ensure the end user isn't affected when these breaking changes are deployed?

How do you avoid breaking changes? We had a set of tools that we used, sometimes in conjunction with each other and sometimes by themselves. We used semantic versioning. We ensured that we had a tolerant reader in our APIs. Sometimes we would do lock-step deployments. The one thing we used extensively was feature toggles, where a toggle's state depends on which environment a story is in. While a story is in development, it will be toggled on in the dev environment, but in QA, staging, and production it will be toggled off. Even when the RPMs actually reach those environments, the feature is not available to the end user until we've tested it and all the necessary checks are done. When you have feature toggles, and you ensure you have semantic versioning, a tolerant reader, and the rest, you can try to avoid breaking changes, and you can at least plan contingency measures.

When we did have breaking changes, like this one, which says it's blocked on some story that's still in analysis, we identified stories which, although independent, would take weeks to complete and weeks to test if we clamped them together in development. So we would divide that work into sensible parts and say, "This one is blocked: do not deploy it until the other story is ready." The developers could continue with their development, the testers could continue with theirs, and when both pieces were ready, you just turn the feature on. That's one of the important things: being able to plan for it at the start of development made a lot of sense to us.
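As an illustration, here is a minimal sketch of an environment-based feature toggle like the one described above. The toggle names, environment names, and lookup style are all assumptions for illustration, not our actual implementation.

```java
import java.util.Map;
import java.util.Set;

// A minimal sketch of an environment-driven feature toggle: a feature is on
// in the environments listed for it and off everywhere else.
public class FeatureToggles {

    // Which environments each (invented) feature is switched on in.
    private static final Map<String, Set<String>> TOGGLES = Map.of(
            "new-search-ui", Set.of("dev"),                    // still in development
            "faster-downloads", Set.of("dev", "qa", "staging") // tested, not yet live
    );

    private final String environment; // e.g. "dev", "qa", "staging", "prod"

    public FeatureToggles(String environment) {
        this.environment = environment;
    }

    public boolean isOn(String feature) {
        return TOGGLES.getOrDefault(feature, Set.of()).contains(environment);
    }

    public static void main(String[] args) {
        FeatureToggles prod = new FeatureToggles("prod");
        // The RPM carrying "new-search-ui" may already be deployed to prod,
        // but the code path stays dark until the toggle includes "prod".
        if (prod.isOn("new-search-ui")) {
            System.out.println("render new search UI");
        } else {
            System.out.println("render old search UI");
        }
    }
}
```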
Another important thing that microservices bring along is a heterogeneous architecture. This is one of the things people repeatedly say you can exploit to your benefit: you can choose the tools that work for you. But when you work with old systems, you have to be really careful about what you're doing. As a consultant, I have a responsibility to ensure I'm not choosing tools that my customers won't be able to maintain once I'm out of the door. A certain responsibility comes with deciding what to use. In our case, we chose things that we knew would add long-term benefits for the customer, done in really simple languages rather than, say, ML or, dare I say, Clojure. (laughs)

We used ChatOps. We used Qbot, a JavaScript ChatOps bot that you can add to HipChat, and to Slack as well I suppose, to keep an eye on our deployments. It will tell you the status of your deployment, and you can say, "Notify me when this service is deployed," then go off and do your own stuff, grab a coffee or something, and it tells you when your deployment is done. Other things we used were Kibana dashboards, which are built on the ELK stack. Everything is stored as JSON, so people were able to interface with it, and it's easy to learn as well.

We were able to fix things quickly. We had the ability to add really small cards which would go through all the lanes really quickly if we knew we were getting user feedback on a certain thing: "this is going to add long-term value, but it's a really short piece of work, can we just quickly do this?" And this was not just for the end user; it was also for things the AppSec people or the DevOps people needed. We were able to quickly prioritize stories on those fronts too.

Another thing we did was automate support scripts. ChatOps used JavaScript. For support scripts, well, I was working on a Java application, and Java and scripting don't really mix, so we decided to use Python, because our infrastructure was in Ansible and Python goes really well with Ansible. We wrote really small scripts for whenever systems went down: if a support engineer were to look at an incident, how would they triage it really quickly? Then the next step was automating those scripts, so that the minute something went down, the relevant status was gathered automatically, without even the intervention of a support engineer. These are just examples of things you can use even in age-old systems where it's really hard to incorporate new things.

Lesson four was to pay attention to the bounded context. I really do think no microservices talk is complete without mentioning bounded contexts, because it's such an important thing to consider and it's very, very easy to miss as well. This diagram is from Martin Fowler's blog, and Eric Evans talks extensively about this in his Domain-Driven Design book. In simple terms: in any application you have multiple contexts, and these contexts have models which are named exactly the same. For example, you have a customer in the support context, and you also have a customer in the sales context. Although these models are named the same and roughly indicate the same person, the operations on that person are completely different. This is actually a reflection of humans in reality: I am here as a speaker, but I'm also an employee at, what do you call it, Neo Technology. This happens quite a lot. People sometimes misinterpret this and try to share operations and models between two contexts. That's a big no; please don't do that. We were bitten really hard by it, and I'll show you how.

We had two services: the authentication service and the web app. When the web app tries to authenticate, it calls the authentication service, which returns a JSON response with the auth token, the username, and some user details, say age for instance. What the web app needed was a response like this one, which didn't need the auth token. But if I were to push that requirement into the authentication service, it would mean the authentication service had to return this web-app-shaped response. Which is what we did. Honestly, that's what we did. It meant that any time the web app had to change, we had to change what the authentication service returned, which was a big mistake on our part. What we should have done instead is make that transformation inside the web app and leave the authentication service alone, so that the authentication service always returns one response, and its consumers interpret it in different ways based on what they need.
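Here is a minimal sketch of that fix, assuming a recent Java version and invented field names: the authentication service keeps one stable response, and the web app translates it into its own model at its own boundary.

```java
// A sketch of the translation described above: the authentication service
// always returns one stable shape, and the web app maps it into its own
// model, instead of asking the authentication service to change per consumer.

// What the authentication service returns to everyone (fields invented).
record AuthResponse(String authToken, String username, int age) {}

// What the web app actually wants to work with internally.
record WebAppUser(String displayName, int age) {}

class AuthResponseTranslator {
    // The translation lives inside the web app's bounded context, so the
    // web app can change freely without touching the authentication service.
    static WebAppUser toWebAppUser(AuthResponse response) {
        return new WebAppUser(response.username(), response.age());
    }
}
```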
Lesson five was: choose what works for you, and document your reasons. I'm a software developer, and the last thing I want to do is write pages of Confluence documents. But it was very important for us to do this one step. Developers always like to improve the system they're working on, which meant we were having the same conversations over and over again: "why are we doing this?", "this shouldn't be this way", that kind of thing. Which is fair; I would do the same anywhere I worked. But it meant people were getting tired of explaining the same things again and again. So one of the things we did was: when we take a decision, capture the context in which it was taken and put it up somewhere, in an email or wherever, so that it's communicated to the entire team what we are doing and why we decided to do it, and re-evaluate it any time that context changes. That way you don't end up having the same discussion over and over, but you do have the discussion where it matters: when the context in which you made a decision has changed, the decision no longer applies.

An example of this was understanding where our constraints and our principles came from. There are certain things in a project, the constraints, that you can't change. For instance, I told you I was working on a Java application and we had Python scripts. When we came to write user journey tests, we started with Ruby. At that point our clients said, "No. We can't have one more language in the mix. Can you please pick either Python or Java for this?" So after starting the user journey tests in Ruby, we rewrote them in Java, because Cucumber has a Java API as well. It made a lot of sense at the time, because it came from a constraint we had no control over, and there's no point discussing over and over whether Ruby is better or Java is better when your client says to choose Java.

Now, about validations. An example of a principle we decided on was accepting that it is okay to duplicate validations: we decided we would validate at every service level, which meant the validations were duplicated, and we were fine with that; we had good reasons. When I first trained as a developer, the thing taught over and over was DRY: don't repeat yourself. So why would I want to duplicate validations, when that's one of the tenets of software development? It comes from a deeper understanding: yes, there are certain duplications you never want to do, but there are also certain duplications you should be doing. An example is validation. You have client-side validations in JavaScript: when a user is entering something, you match it against an email pattern and say, "Well, this is not a valid email address." That's something you do in JavaScript. Whereas when you're trying to save something in your database, that's a completely different kind of validation: does it contain SQL injection, does it have Bobby Tables in it, and so on. These validations serve two different purposes on the client side and the server side. It's a good thing to ensure you do validations wherever they're necessary.
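Here is a minimal sketch of that principle, assuming invented table and column names: the service re-checks the email format even though the client-side JavaScript may already have done so, and it guards against injection by never concatenating input into SQL.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.regex.Pattern;

// "Duplicate the validation where it serves a different purpose": the
// client side may reject malformed emails in JavaScript, but the service
// still validates for itself AND uses a parameterized query for storage.
public class CustomerEmailWriter {

    // Deliberately duplicated format check: the service must not trust
    // that the JavaScript validation ever ran.
    private static final Pattern EMAIL =
            Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");

    public void saveEmail(Connection db, long customerId, String email)
            throws Exception {
        if (!EMAIL.matcher(email).matches()) {
            throw new IllegalArgumentException("not a valid email address");
        }
        // Parameterized query: Bobby Tables stays a joke, not an incident.
        try (PreparedStatement stmt = db.prepareStatement(
                "UPDATE customer SET email = ? WHERE id = ?")) {
            stmt.setString(1, email);
            stmt.setLong(2, customerId);
            stmt.executeUpdate();
        }
    }
}
```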
In some cases where we had to duplicate validations, we needed the same checks over and over again, and there was tension between building shared libraries or shared clients versus just accepting everything that comes in. In our case, we decided to do both. One of the services I was taking care of published a client for any of its consumers to use when connecting to it. The client contained all the validations that the service itself would do, so if you used the client, you could be sure all the validations had been run. (I'll sketch that client pattern at the end of this lesson.)

There were times when you don't want a shared client but you do want a shared library. An example where we used a shared library as a very good layer of abstraction was managing negative TTL caching. We used ELBs, and when a service endpoint switch is made on an ELB, there are times when negative TTL caching affects service discovery. It was very important for all the services to handle that in exactly the same way; there was no reason for it to be done in two different ways, so abstracting it into a shared library was a very good fit.

An example of abstracting something we shouldn't have is abstracting models into shared libraries. This is what I spoke about earlier: we saw that the support context and the sales context both had customer and product models. Someone thought (in all fairness, it could have been me), "Oh, I know what to do here. We're doing the validations again and again. Let's just do the smart thing: extract them into a separate library and share those domain models between the two." What that did was remove the boundary completely and turn the whole thing into one big support-and-sales context. It was a mess. We couldn't add any features to one context without affecting the other. Which is why we wanted microservices in the first place; which is why we wanted to keep things small in the first place. What's the point of keeping services small if you share libraries between them? You might as well have them together. It was a costly lesson for us, because by the time we wanted the web app to do something else, it was very much tied to the authentication service, and we spent a month or so trying to strip features out of the web app, and it was still a mess. Which is a shame, really.
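To contrast the shared-model mistake with the pattern that did work for us, here is a minimal sketch of a service-published client. The service name, endpoint, and validation rule are invented: the point is that the client bundles the service's own input validations without leaking the service's domain model into its consumers.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// A sketch of a service-published client: it ships the service's own input
// validations so every consumer runs them before calling, but it exposes
// only plain inputs and the raw response - no shared domain models that
// would fuse two bounded contexts together. All names are invented.
public class AccountServiceClient {

    private final HttpClient http = HttpClient.newHttpClient();
    private final URI baseUri;

    public AccountServiceClient(URI baseUri) {
        this.baseUri = baseUri;
    }

    public String fetchAccount(String accountId) throws Exception {
        // The same validation the service applies, shipped with the client,
        // so bad requests fail fast on the consumer's side.
        if (accountId == null || !accountId.matches("[A-Z]{2}\\d{6}")) {
            throw new IllegalArgumentException("account id must look like AB123456");
        }
        HttpRequest request = HttpRequest
                .newBuilder(baseUri.resolve("/accounts/" + accountId))
                .GET().build();
        // Return the raw JSON; each consumer maps it into its OWN model.
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```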
Lesson six is to embrace Conway's law. This is one of those things that gets mentioned, along with bounded contexts, time and again in any microservices talk. Conway's law is the observation, thrown around again and again, that when you come up with a solution, that solution is going to mirror your organization's structure. So instead of fighting Conway's law, it's very, very useful to embrace it and make it work for you.

An example of that is deciding between these two team shapes. Initially we had a UI team, which is why we have a web app service. We had a platform team, which is why we had one AWS deployment architecture. And we had feature teams, which is why we had the features in between. But what this (laughs) meant was that any time you had to add a feature, it touched the UI team, the feature team, and the platform team, so you had to somehow coordinate splitting work across three different teams, and deployment across three different teams. It's a mess. Even as I say it, I can feel how difficult it was; it's like a trauma, it's all coming back to me. The last thing you want as a developer is to sit in meetings about how you should develop code or how you should deploy code. You're like, "This is what I wanted to avoid. Why am I here again?"

After trying the wrong way in that context, we decided to do it differently: make our feature teams have UI developers, have a platform person, and, importantly, have a product owner within the team. When we had a separate product owner team, somehow it made decisions really difficult: we had to send emails and wait a long time for business owner approval and things like that. The minute we actually had a product owner in our team, attending the everyday stand-ups, it was very easy to just ask, "What do you think about this story? We have tested it. Should we just deploy it?" And he'd go, "Yeah, sure, deploy it," and by the end of the stand-up it was deployed. That's how simple deployment became. What you want is vertical slicing of teams instead of horizontal slicing. Other ways we used Conway's law to our benefit were having self-contained systems, and just being straight with the product owner: "Look, you're the product owner, you give us the requirements; you're not the one stopping changes from happening, because we're in this together. We need to work out how we're going to work, but it's not me suggesting something and you saying 'no, that can't go in.' It's the other way around: you suggest something and we make sure that change actually goes in." That made tons of difference.

Lesson seven is to take monitoring seriously. I can't stress how important it is to monitor your systems, because things can go wrong easily. When you have one thing to take care of, you just sit and stare at that one thing; when you have hundreds of things to take care of, you'd go mental doing that. It's important to take monitoring seriously and ensure you have proper alerts, alerts on the important things, not on everything. A region going down is important; a single instance going down is maybe not that important, because you have ELBs and the like in front which can take care of it, and that one can be handled in a post-mortem. You need to understand the different levels at which you have alerts. We had our dashboards configured around three important things. The first was business metrics.
The second and third were somewhat mingled together: application metrics and system metrics. Business metrics were a huge win for us in winning our product owner's confidence, because he could see that anything he added reached production quickly, and how it affected the users' interaction with the system. We would show him, after a certain feature was added, how the downloads increased or decreased, or how many users stayed on which pages. Those are the kinds of metrics they like to see, and the kinds you should track as a developer on your team too, to understand how the software you put out there interacts with users.

We used a ton of these tools, whichever suited our needs, and we worked on our dashboards continuously as we went, because we'd realize, "We thought this one thing was really important; now we've actually fixed that problem, we automated it, it shouldn't happen anymore, so we can push it down the stack and look for some other alert to watch instead." It was continuous work, not a one-off thing.

An example of this: how many of you have used Hystrix? Netflix Hystrix. It's a great library; it gives you circuit breakers. If any of your dependent services isn't working, we decided we would show the content to the user anyway, because the user shouldn't be penalized for our problem of not being able to put software out there properly. That was an example of a very good fallback in our system, and we were tracking fallbacks in our application; Hystrix gives out metrics for you too. You can see here there is a huge, what do you call it, spike, which means something went wrong there at that point, maybe a region went down, but the user got the content anyway. I'll show a small sketch of such a fallback in a moment. Other things we tracked were availability zones, and whether any CloudWatch alarm was on. Not specific alarms: if any CloudWatch alarm fired, that panel would go red and a support engineer would go and look at it. That was an example of application metrics.

This is an example of a business metric. We tracked user interaction and how much time users spent in our application: we send an event out when the user enters the system and another when the user exits. We could see something happening here, around the time the fallbacks were being triggered, and we went, "Okay, there's definitely something wrong." Those were the ways we were able to show the product owner, "This application that we've put out there, this work we're doing on infrastructure, it's for you as well. You can add these dashboards." He was really happy.
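Here is that sketch: a minimal Hystrix command with an invented service call and fallback content, not our actual code. If the call fails, or the circuit breaker is open, Hystrix serves the fallback and records the failure in its metrics.

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

// A minimal Hystrix command: run() calls the dependency, and if it fails
// (or the circuit is open), getFallback() serves cached/default content
// so the user still gets a page. The fetch logic below is invented.
public class ContentCommand extends HystrixCommand<String> {

    public ContentCommand() {
        super(HystrixCommandGroupKey.Factory.asKey("ContentService"));
    }

    @Override
    protected String run() throws Exception {
        // Hypothetical call to the dependent content service.
        return fetchFromContentService();
    }

    @Override
    protected String getFallback() {
        // The user shouldn't be penalized for our outage:
        // serve stale or default content instead of an error page.
        return "<cached content>";
    }

    private String fetchFromContentService() throws Exception {
        throw new Exception("content service unavailable"); // simulate an outage
    }

    public static void main(String[] args) {
        // execute() runs synchronously; here the simulated failure makes
        // Hystrix return the fallback and count it in its metrics.
        System.out.println(new ContentCommand().execute());
    }
}
```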
Going on to the next lesson: an important thing for us was to make sure that we do testing. This shouldn't be a point of contention in this century. It's important that we test, and that we test at different levels. With microservices, people always say, "You've done your development properly, just release it into production, and if you see something go wrong, pull it back, roll back, or release a fix." But I was working in a system where our product owners were not used to change at a rapid pace. The idea that they could release software and potentially lose customers scared them; they weren't ready for it.

So in our case we decided: we will have a QA environment which mirrors exactly what production does, and we will run a soak test so there is always user-like interaction happening with the QA environment. Any time we finished a story, we'd check whether the QA environment was free and release it into QA. Because the soak test was running continuously, it gave us feedback if anything went wrong, and that covered 80% of the cases where things could go wrong. The product owner came around after that. He saw that when things went wrong in QA, developers were working to fix them, so it became very easy for us to say: once QA testing is done, it's ready for production, and we can release it in a matter of minutes. Once QA testing was done, we would talk about it at the stand-up: "We saw these problems, we fixed them. Can we release to prod?" It got to the point where the product owner would just say, "Yeah, sure, go ahead, release it. If it's done in QA and all the tests have passed, just go ahead and release it." It got there in a matter of months, and it was a great feeling to be able to do that.

This was basically how we thought about our testing. When you release something into production, it's like Schrödinger's cat: they say you never know whether the cat is dead or alive while it's in the box, and only when you open it do you find out. But no, you can actually test it. If you shake the box and the cat shouts, then it's probably alive. That's how we tested in our QA environment.

Apart from this, to ensure production stayed online, we had a ton of other measures beyond testing. Chaos Monkey is an important one: if you're working in a microservices environment, even if Chaos Monkey isn't automated in your setup, just try it as an exercise for the week and see what happens. Chaos Monkey takes care of bringing down systems, and you can track whether your infrastructure is resilient enough to handle the chaos.

One of the important things we did, which was a no-brainer and a really quick fix across all our services, was adding health checks. But the health check would not only check whether, say, the web app is up and running and able to accept requests; it would also check whether it could talk to the authentication service it depended on, or the five other services it depended on. We tied the health checks of a service to the health checks of its dependent services. That was a very, very easy win whenever we broke a contract between two services: when we deployed something in QA, or someone deployed a dependent service in QA, we'd see the health checks of the dependent services go down and go, "Wait, there's something wrong here. Maybe this shouldn't go to production. Maybe we should just stop and see what happens."
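Here is a minimal sketch of such a dependency-aware health check. The endpoints and service names are invented, and a real one would report which dependency failed rather than a bare boolean.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;

// A dependency-aware health check: the service is "healthy" only if it is
// up itself AND each dependency's /health endpoint answers 200.
public class HealthCheck {

    private static final List<URI> DEPENDENCIES = List.of(
            URI.create("http://auth-service.internal/health"),
            URI.create("http://content-service.internal/health"));

    private final HttpClient http = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(2))
            .build();

    public boolean isHealthy() {
        return selfChecksPass()
                && DEPENDENCIES.stream().allMatch(this::dependencyUp);
    }

    private boolean selfChecksPass() {
        return true; // e.g. database reachable, disk not full - omitted here
    }

    private boolean dependencyUp(URI healthEndpoint) {
        try {
            HttpRequest request = HttpRequest.newBuilder(healthEndpoint)
                    .timeout(Duration.ofSeconds(2)).GET().build();
            return http.send(request, HttpResponse.BodyHandlers.discarding())
                       .statusCode() == 200;
        } catch (Exception e) {
            return false; // an unreachable dependency makes us unhealthy too
        }
    }
}
```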
Just by adding those health checks, we got tons of insight into where we were doing things wrong, and all we had to do was go and fix the problems one by one. It was a very, very simple thing to do. If you can do one thing to your system right now which you haven't already done, please go and add health checks, and make sure those health checks also reflect the health of your dependent services.

In terms of testing, a test pyramid looks like this. You need a lot of unit tests underneath, and on top of them, integration tests and functional tests. You can't see it clearly, I'm really sorry; that one there is a contract test. You basically want something like that. Sorry, that top one is a journey test. In our case it actually became like this: we had unit tests, integration tests, functional tests, and in between we had something called a service test, which checked whether the service all by itself was working, by sending dummy requests to it and checking whether we got a proper response back. The service test ran any time a service was deployed. On top of that were the contract tests, which our dependent services gave us so we could add them to our pipeline and ask, "Now that this thing has been deployed and the service is working, is it still working the way my consumers expect it to?" That's what the contract tests are for. On top of that was the journey test, which replicated how a user interacts with the system. If we depended on just the journey tests and did none of the other layers, we wouldn't know, when something failed, what was actually failing. So it's important to work out how you ensure that your service by itself is working, that the service within its ecosystem is working, and that the services work for your user. That's very important to think about in your testing strategy. Note that tests are there to validate constraints; they shouldn't become constraints themselves. If every time you add a feature you find yourself editing tests all the way down the pyramid, you're probably not testing at the right level, and that's an indication you should take a look and fix it when you can.
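As an illustration of the service-test level, here is a minimal sketch that sends a dummy request to a freshly deployed service and checks the response. The endpoint, port, and expected payload shape are invented, and it assumes JUnit 5; a real suite would first put the service into a known state.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

// A "service test": run against a deployed instance of one service, send a
// dummy request, and check we get a sensible response back. Names invented.
class AccountServiceTest {

    private static final URI BASE = URI.create("http://localhost:8080");
    private final HttpClient http = HttpClient.newHttpClient();

    @Test
    void respondsToADummyAccountLookup() throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                BASE.resolve("/accounts/AB123456")).GET().build();
        HttpResponse<String> response =
                http.send(request, HttpResponse.BodyHandlers.ofString());

        // The service on its own is up and speaks its own protocol:
        assertEquals(200, response.statusCode());
        assertTrue(response.body().contains("\"accountId\""));
    }
}
```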
Lesson nine is to invest in infrastructure. Just so you know, this is an indicative chart of how our stories split; there was actual data behind it, but for obvious reasons I can't share that with you. When we started our application, the split between infrastructure and feature stories was one hundred percent infrastructure at first, and then it started going down. That's for one service. Then we added one more service, and this is how it looked: we started getting feature stories from the beginning, because some of the infrastructure work had already been done. And for the third service, this is how it looks. As we went on, the amount of infrastructure work that needed doing kept reducing. It can be very daunting when you start with infrastructure ("I have to reinvent the wheel again, why do I have to do it?"), but perseverance is key. You definitely need to invest in infrastructure. Why wouldn't you, in this day and age? That's an important thing.

Lesson ten: you need to embrace new technology. It's an evolving ecosystem out there, and you are losing out if you're not tapping into that potential. Digital disruption has already happened. Blockbuster used to be a big thing; now I don't own a TV, I don't own a DVD, and my MacBook doesn't even have a DVD slot. That's where things are. A lot of things now cut out the middleman. If you're not embracing new technology and adopting what's already out there, you're going to be phased out sooner rather than later. I think it's important for really old companies to understand this and not fight it just because they have the user base, because it's not going to work out that way. Maybe it works for now, but it won't stay like that for long. Every industry, banking, technology, other services, everyone is working in this space, and you need to tap into that potential to make things happen for you. There are new kids on the block; I would definitely encourage you to go and read about them, and if you're high up the ladder, please make sure you know what these things are.

The three factors people say contribute to your personal growth are autonomy, mastery, and purpose. I think your microservices also need these three things to work properly: make sure each one has proper autonomy, knows what it is doing, has the right tools, and has a purpose to exist, and if it doesn't have one, don't leave it out there. Microservices help you improve in iterations, which is very important in agile software development: improve in iterations, adapt to feedback. Having autonomy and taking control of the business as well as the technology in smaller teams helps you deliver software quickly. You have to ensure there is high cohesion within your services, but that they are loosely coupled enough that you can alter their state whenever you need. And you need to embrace Conway's law: if your team isn't structured for it, that's definitely going to get in the way of you adopting microservices properly.

Why do microservices? It's a lot of fun doing it out there. As a developer, it gives me great purpose when I work on something and don't have to wait six months for it to be released to the user. I enjoy it when I can point at something I worked on and show my mom: "You know that thing? I worked on it." I think I deserve bragging rights on the things I worked on. That's my summary. These are resources; I'm just reading my slides at this point. Image credits. Thank you. (audience applauding)