Using an architectural review for improving site reliability

Tuesday, June 16, 2015

I stumbled across another AWS blogger, Eric Hammond, who blogs at

One of the recent things Eric has built is his Unreliable Town Clock (UTC), which you can use to schedule the triggering of AWS Lambda functions. It's a cool idea.

Eric certainly knows what he is doing. He not only launched a service, he sat down and ensured "this service is as reliable as I can reasonably make it". No wonder he is an AWS Community Hero!

Of course, reliability is only one element of an architectural review of an AWS environment. You should also cover Security, Availability, Scalability and Cost Efficiency. Eric has covered some of this. Check out what he has done to ensure UTC is always up and running; there are some great tips in there.

What if you wanted to do an architectural review of your AWS environment? How would you go about that? What questions would you ask? What things require focus? Maybe post in the comments. Saying "I will call my friendly AWS Solution Architect" is cheating, although it's a great idea.

Two items that will really help you get started with a review are these whitepapers.

What would you do beyond this? Here are some small things I would investigate.

  • Auditing. Are CloudTrail, Config and VPC Flow Logs all turned on? It's hard to do debugging or forensics on something in the past when you were not capturing the data. Is all the activity from the instance logged to CloudWatch Logs?
  • Dependencies. What dependencies are there that might cause a deployment to fail? That Auto Scaling group may relaunch an instance if it fails. What AMI is it using? Is it your own AMI sitting in the account or are you launching from a public one? What if the public one goes away because a new one is released? How is the code deployed into that AMI? Is it baked in, does it come from S3, or does it need to download software from GitHub? What if it can't?
  • Monitoring. There are 4 metrics in CloudWatch for SNS. Are there any alarms that could be created to alert on failure? What if the number of published messages dropped below a certain rate? An alarm like that could replace what Eric is using for. You can even create those alarms with CloudFormation!
  • Turning on MFA is always a great idea.
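
To make the monitoring item a little more concrete, here is a minimal sketch of what such an alarm could look like as a CloudFormation template, generated from Python. This is my own illustration and not what Eric runs: the topic name, the alert ARN, the threshold and the 15 minute period are all made-up assumptions, and the function name is hypothetical. The resource type and property names are the standard ones for AWS::CloudWatch::Alarm.

```python
import json


def sns_publish_rate_alarm(topic_name, alert_topic_arn, min_messages, period_seconds=900):
    """Build a CloudFormation template with one alarm on an SNS topic's publish rate."""
    alarm = {
        "Type": "AWS::CloudWatch::Alarm",
        "Properties": {
            "AlarmDescription": f"{topic_name} published fewer than {min_messages} messages",
            "Namespace": "AWS/SNS",
            "MetricName": "NumberOfMessagesPublished",
            "Dimensions": [{"Name": "TopicName", "Value": topic_name}],
            "Statistic": "Sum",
            "Period": period_seconds,
            "EvaluationPeriods": 1,
            "Threshold": min_messages,
            "ComparisonOperator": "LessThanThreshold",
            # No data points at all also means nothing was published,
            # so missing data is treated as a breach.
            "TreatMissingData": "breaching",
            "AlarmActions": [alert_topic_arn],
        },
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {"PublishRateAlarm": alarm},
    }


if __name__ == "__main__":
    template = sns_publish_rate_alarm(
        topic_name="example-clock-topic",  # hypothetical topic name
        alert_topic_arn="arn:aws:sns:us-east-1:123456789012:example-alerts",  # hypothetical ARN
        min_messages=1,
    )
    print(json.dumps(template, indent=2))
```

The alert topic should be a different topic from the one being watched, otherwise the alarm depends on the thing it is monitoring.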

This is the simplest of examples. For your typical system there are hundreds of review items to assess. But you get the idea.

Doing an architectural review is something you should do periodically in your AWS environment. As AWS keeps releasing new features there are frequently new things you can do to improve your setup.

If only everyone was like Eric! Also, anyone who builds everything in CloudFormation is a winner in my book!


Shortcuts in the AWS Console

Monday, June 15, 2015

Here is something that for ages I did not know you could do: add shortcuts inside the AWS console that appear on the top bar.

See this animated GIF for how to add them and then use them. I think the Edit button used to be a lot less obvious.

It's very handy to have the links for your most frequently accessed services always there.



A quick first look at AWS VPC Flow Logs

Thursday, June 11, 2015

I woke up this morning to yet another new AWS feature, VPC Flow Logs, as described by Jeff Barr.

Jeff did a great job of providing an overview so make sure you read that before continuing.

It's really interesting to think about what you can do with network flow logs. A lot of Enterprise customers ask for this so they can perform various security activities. Many of those security activities are really not needed in the new world of Cloud. However, there are some valid ones that you may want to consider. There are also good reasons to have flows available so you can troubleshoot your Security Groups or NACLs.

I suggest people turn them on, capture the data and set a retention period on the destination CloudWatch Log Group, say anywhere from 3 days up to 6 months. The data is then there if you need it, just like CloudTrail data. It's too late after the fact!

A great little use case would be some general visualization of network flows on a dashboard. It's not real time but it's going to give you a general indication. You could analyze the amount of traffic by category, such as incoming, outgoing, cross AZ and within AZ (by reverse engineering the subnet ranges). You could even track it down to traffic to AWS regional services such as S3. You may want to track these patterns over time, looking for trends. You could also look at top talker hosts, internal or external. I suspect it will be of interest to people at first, and then it will be a colorful screen to show visitors. After all, AWS handles all that heavy lifting of operating and scaling the networking.
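
As a rough sketch of the "traffic by category" idea, the Python below buckets flows by comparing source and destination addresses against the subnet ranges. The subnet to AZ mapping, the AZ labels and the function names are all assumptions I made up for the illustration; you would substitute your own VPC layout.

```python
import ipaddress
from collections import Counter

# Hypothetical subnet-to-AZ layout; replace with your own VPC's ranges.
SUBNET_AZ = {
    ipaddress.ip_network("10.0.1.0/24"): "az-a",
    ipaddress.ip_network("10.0.2.0/24"): "az-b",
}


def az_of(address):
    """Return the AZ label for an address, or None if it is outside the VPC subnets."""
    ip = ipaddress.ip_address(address)
    for network, az in SUBNET_AZ.items():
        if ip in network:
            return az
    return None


def classify(srcaddr, dstaddr):
    """Label a flow as incoming, outgoing, within-az, cross-az or external."""
    src_az, dst_az = az_of(srcaddr), az_of(dstaddr)
    if src_az is None and dst_az is None:
        return "external"
    if src_az is None:
        return "incoming"
    if dst_az is None:
        return "outgoing"
    return "within-az" if src_az == dst_az else "cross-az"


def bytes_by_category(flows):
    """Sum bytes per category. flows: iterable of (srcaddr, dstaddr, byte_count)."""
    totals = Counter()
    for srcaddr, dstaddr, byte_count in flows:
        totals[classify(srcaddr, dstaddr)] += byte_count
    return totals


def top_talkers(flows, count=5):
    """Return the source addresses that sent the most bytes, largest first."""
    totals = Counter()
    for srcaddr, _dstaddr, byte_count in flows:
        totals[srcaddr] += byte_count
    return totals.most_common(count)
```

Feed it tuples pulled from the srcaddr, dstaddr and bytes fields of each flow record and you have the numbers for a simple dashboard.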

Many will be interested in monitoring rejected traffic and, if they see a lot of it starting, will wonder whether there is something else going on that they should look at or take precautions against. Generally you probably don't care; nothing to see here, it's just dropped traffic.

It will be great to see what AWS Partners do in the visualization space. I sense some eye candy coming.

I quickly turned VPC Flow Logs on in my account this morning.

Here is my CloudWatch console showing the Log Groups.

Notice I have set the events to expire after 6 months. You can see below that when I look at my Log Group each of my Elastic Network Interfaces (ENIs) is shown.

I have 4 ENIs. Some of those are for my WorkSpaces instances, which is cool.

If I look at the instance I launched this morning by clicking on eni-981db9fc-all, here is the data displayed.

Notice how I have applied a filter. Nice hey. Here is what that filter looks like in that text box.

[version, accountid, interfaceid, srcaddr, dstaddr, srcport, dstport=23, protocol, packets, bytes, start, end, action=REJECT, logstatus]

Notice that by putting the field names separated by commas and between brackets you can parse out the text. This is a general feature of CloudWatch Logs. The field list is in the VPC Flow Logs documentation.

There are lots of filters you can apply. Here you can see I am just checking for a destination port of 23 (telnet) where the action was to reject the packets. You can see all of those machines which have attempted to telnet into my little server. That's why it has a correctly configured Security Group!

There is documentation in CloudWatch for the filter pattern syntax. It supports both string and numeric conditional fields. For string fields, you can use the = or != operators, optionally with an asterisk (*) as a wildcard. For numeric fields, you can use the >, <, >=, <=, =, and != operators.
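
If you want to play with the same idea outside the console, here is a small Python sketch that splits a flow log record into named fields and applies the same kind of conditions. It is only an approximation of what CloudWatch Logs does, not its implementation: the sample record is made up, the helper names are my own, and the asterisk wildcard for strings is not handled.

```python
import operator

# Field order of a (version 2) VPC Flow Logs record, space separated.
FIELDS = ["version", "accountid", "interfaceid", "srcaddr", "dstaddr", "srcport",
          "dstport", "protocol", "packets", "bytes", "start", "end", "action", "logstatus"]
NUMERIC = {"version", "srcport", "dstport", "protocol", "packets", "bytes", "start", "end"}
OPS = {"=": operator.eq, "!=": operator.ne, ">": operator.gt,
       "<": operator.lt, ">=": operator.ge, "<=": operator.le}


def parse_record(line):
    """Split one flow log line into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    for name in NUMERIC:
        # Records with no data carry "-" in these fields, so leave those as strings.
        if record[name].isdigit():
            record[name] = int(record[name])
    return record


def matches(record, conditions):
    """Return True if every (field, operator, value) condition holds for the record."""
    for field, op, value in conditions:
        actual = record[field]
        # A "-" placeholder cannot be compared with a number; treat it as no match.
        if type(actual) is not type(value):
            return False
        if not OPS[op](actual, value):
            return False
    return True


# Made-up sample record for illustration only.
SAMPLE = ("2 123456789012 eni-abc123de 203.0.113.12 10.0.1.5 "
          "49152 23 6 1 40 1434000000 1434000060 REJECT OK")
REJECTED_TELNET = [("dstport", "=", 23), ("action", "=", "REJECT")]

if __name__ == "__main__":
    print(matches(parse_record(SAMPLE), REJECTED_TELNET))
```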

If someone asks you which hosts are communicating with the database at the moment, you can quickly jump into the console and answer it by looking at traffic to the right port.

The other nice thing you can do is create a metric on this filter to pull out the data. Here is one that creates a metric on the number of bytes accepted as SSH traffic into the ENI.

I created a few of these for my machine. Here is the metrics display after I pushed some data its way. I am using the sum function to get the sum of bytes.

During this time period there were a few rejected telnet sessions, some SSH traffic and lots of general traffic. If you can write a filter on it, you can graph it.
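
For those who prefer scripts to console clicks, here is a sketch of how the accepted SSH bytes metric could be described in Python. It only builds the parameters; as far as I know they match what the CloudWatch Logs PutMetricFilter call (put_metric_filter on the boto3 logs client) expects, but treat that as an assumption and check the API documentation. The filter name, metric name and namespace are made up.

```python
FLOW_LOG_FIELDS = ["version", "accountid", "interfaceid", "srcaddr", "dstaddr", "srcport",
                   "dstport", "protocol", "packets", "bytes", "start", "end",
                   "action", "logstatus"]


def ssh_bytes_metric_filter(log_group_name):
    """Build metric filter parameters that publish bytes of accepted SSH traffic."""
    conditions = {"dstport": "dstport=22", "action": "action=ACCEPT"}
    pattern = "[" + ", ".join(conditions.get(name, name) for name in FLOW_LOG_FIELDS) + "]"
    return {
        "logGroupName": log_group_name,
        "filterName": "AcceptedSshBytes",  # hypothetical name
        "filterPattern": pattern,
        "metricTransformations": [
            {
                "metricName": "AcceptedSshBytes",  # hypothetical name
                "metricNamespace": "FlowLogs/Example",  # hypothetical namespace
                # Publish the value of the bytes field rather than a count of 1,
                # so graphing with the Sum statistic gives total bytes.
                "metricValue": "$bytes",
            }
        ],
    }


if __name__ == "__main__":
    print(ssh_bytes_metric_filter("example-flow-log-group")["filterPattern"])
```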

Of course this only gets you so far. You have to know the ENI etc. 

You will probably want to extract all of the data into something easier to query. If you want to roll your own, a good way would be to create a Subscription on the whole Log Group and push all the data to a Kinesis stream (it will handle the scale). How do you get data out of Kinesis? Well, you use Lambda functions of course. Your Lambda function could dump it to S3 and from there you load it into Redshift (which can be automated too) or start writing some EMR jobs. Now that's the power of AWS.
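
Here is a minimal sketch of what the Lambda end of that pipeline might look like in Python. It assumes the usual shape of these events: each Kinesis record arrives base64 encoded, and a CloudWatch Logs subscription payload inside it is gzip compressed JSON with a logEvents list. The write to S3 is deliberately left out, and the handler name is just a placeholder.

```python
import base64
import gzip
import json


def decode_kinesis_record(record):
    """Decode one Kinesis record carrying a CloudWatch Logs subscription payload."""
    compressed = base64.b64decode(record["kinesis"]["data"])
    return json.loads(gzip.decompress(compressed))


def handler(event, context):
    """Collect the raw flow log lines from every record in the batch."""
    lines = []
    for record in event["Records"]:
        payload = decode_kinesis_record(record)
        # Control messages are connectivity checks and carry no log data.
        if payload.get("messageType") != "DATA_MESSAGE":
            continue
        for log_event in payload["logEvents"]:
            lines.append(log_event["message"])
    # Writing `lines` to S3 (for a later Redshift load) is left out of this sketch.
    return lines
```

Each returned line is one space separated flow record, ready to be written out as a file for loading.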

Hope that little bit of a first look helps you understand a bit more about VPC Flow Logs. I am really interested to see what people are going to do with it. The main uses will be those occasional operations or forensic events. 



P.S. Remember, I might work for AWS but these posts are my own ramblings late at night. It's the geek speaking.

How to rock re:Invent 2015, Rodos style

Wednesday, June 10, 2015

I confess I am a conference junkie. Not any conference, but the conference that fertilizes the roots of my current IT thinking.

Back in the day this was VMware. I was a VMworld junkie. I think I may have done six in a row. I collected the t-shirts year after year and even blogged about it. I would stay up till 3am recording nightly video summaries of the day's events.

Today Matt Wood (@mza) from AWS, a rock star, did a post on his personal blog on How to Rock re:Invent. He lists things like how to prepare, what gear to bring and what sessions to see. It was music to my ears. Of course you should go and read all of his post.

Matt asked for any other suggestions. Of course I responded with a career limiting move (CLM) and started tweet bombing him with my deep experience and insight, okay, random ideas. Maybe not my finest moment but he was in my wheelhouse!

So here are my random and not as well thought out additions to Matt's list.

Get there 
First stop, just get a ticket and a hotel room booked. This is the hard part. One year I was between two jobs and had to take annual leave, pay for my own flights from Australia to the US, scrounge a conference ticket and beg a spare bed in a friend's room. People do more to go and watch a Rugby game, so you can do this for something as amazing as re:Invent.
When it comes to hotels, try to stay close to the convention. I once stayed at the other end of the strip in Vegas and it was horrible having to walk back and forth each day. It's so great being able to quickly visit your room to drop something off or pick something up on the way to something else. This means getting your booking in early.
Try and arrive the day before or even two days before. If you want to play tourist, don't do it after the event, you will be exhausted and just want to sleep. Being adjusted to the timezone and having a bit of R&R before the week of full days and little sleep gives you the best conference experience.
Also register the day before the event. Registration always opens the day before at these things and there are fewer crowds. It's madness the morning of the first day, no matter how well organized it is.
There is lots you can do beforehand. As Matt says, go through the agenda and think about what equipment to bring. I would add business cards to his list (old school but useful when you are in a hurry and want to pass on details). I also don't recommend you bring two laptops!
Bring comfortable clothes and especially shoes, you will be walking a lot! Think about what bag you will carry. You may get a conference bag or you may not. I always prefer my own bag. Ensure you have enough spare room in your luggage for any swag, conference materials or shopping you pick up.
What evening activities will you go to? There is always the exhibition opening, which is usually packed and full of people trying to grab swag. If you don't like big crowds and mayhem you may need to skip this one. The conference party is always awesome and not to be missed (I did one year and regretted it). But what other parties are on and can you get an invite? Who are the cool vendors that will be having an event? The more you get into the community the harder this is, as there are often several on per night and you have to choose or jump from one to another.
Shameless plug for, but this is also something I did before I was an employee. If you live overseas place an order on for all those things you want and get them shipped to the hotel to bring home. I bank up my wish list all year and empty it each trip. You save a lot on shipping and it's a bit like Christmas when you arrive. However, check with the hotel for extra package handling charges as they can sting you. If you want to bring gifts home for your kids this is a great method, as you are not going to find a lot of gift items around Vegas unless you go to the outlet malls, and who wants to do that!
Extra activities
There are extra activities that people often miss. The day before the event starts there are Bootcamps. You pay extra for these but they are really worth it, IMHO. They run for either a half or a full day and are presented by the best subject matter experts from AWS. Bootcamps have a large hands-on component. I ran one my first year at re:Invent and I know people who have run them or will be this year. The instructors put a huge amount into making these relevant and worthwhile, so check them out.
Certifications can be done onsite. What a great time to get your first or that extra AWS Certification! Go and book in and get one out of the way. In past years there has been a certification lounge where you get a private space, a bit of swag, some snacks and power outlets. So it's worth being certified.
There are usually a number of Hackathons. I have some friends who went to them last year; I dropped in and they were amazing. If you want to have fun with others and test your skills these are something to check out.
Hands on labs are fun to do.  These are run by training and certification and are a great way to quickly get some experience with an AWS Service or a Partner product. 
Matt called it out as "The Corridor", meaning have those conversations with people that matter. To me this is one of the most valuable things to do. When you sit at a table for breakfast or lunch, talk to the people at the table. Introduce yourself, ask what people do. Ask why they came, ask what they have learnt. You will gather so many nuggets of information, tips and ideas from these conversations. Get out of your comfort zone and engage with people who are just as obsessed with this stuff as you are.
People have different approaches to sessions and I have evolved mine over the years. One thing I am positive on is don't miss the keynotes. You want to get in early and get into the main room, not be 15 minutes late and end up in the overflow. You want to experience the vibe, you want to sit with >10,000 other people. You want to live tweet it from in the room!
For the breakout sessions pick what is key to you. Yes, they will be recorded and available later online, but you probably will not find the time to watch them. Sitting in the sessions gives you time and permission to think about the topic, to digest, to ponder. So go to sessions rather than thinking this is something I can do later. When you are conflicted, go to the one that is furthest from your comfort zone or that has a speaker you want to meet. If the session is something you are really familiar with or passionate about you are more likely to watch and digest the recording afterwards. As Matt mentioned, keep an eye out for those secret ones for new services which may be announced in the keynotes.
Exhibition Hall  
This will be huge and take you a LONG time to get around. Plan to visit it each day and cover a portion. Go and see the small vendors, see the startups and not just the big guys. Engage with the vendors and find out how they help their customers and if they could help you. If there is no fit between you and them politely move on, but give them a chance. 
There should be a part of the AWS stand that has Solution Architects, Support and Training & Certification. Visit each of these. Ask the Solution Architects the hard questions that you have just not been able to figure out. Check in on a support case, or log one, or just say thanks to the support guys. Maybe ask the support people how best to utilize them or the types of cases you can log. Lastly discuss the training and certification options with the team, what should you consider doing?
Give Feedback
Amazon and AWS is a customer obsessed company. This is your chance to give feedback. If you see a staff member (check their badge and possibly lanyard color) give them feedback both good and bad. If they cover a particular speciality (training, sales, architecture, support etc) and you have interacted with that group tell them about your experience and how they can do better. If you are talking with someone from a service team or someone who knows something about a service (bootcamp instructor or assistant, lab assistant, breakout speaker) then give them feedback on that specific service. Approximately 95% of all those features that AWS releases are based on customer feedback. Staff will be super keen for this feedback and will really appreciate you taking the time. 
Hope that helps with some of your re:Invent preparation. If you have your own ideas you may want to leave a comment on this blog entry, hit me and Matt on Twitter or contact Matt (see his post).

AWS re:Invent 2015 is taking place at The Venetian in Las Vegas from the 6th to the 9th of October. Registration is now open. If you go, see if you can bump into me and say hello, that would be awesome!


Two books on changing the "how"

Tuesday, June 09, 2015

In my role I have the great privilege of speaking to a lot of really interesting people and companies that are on their Cloud journey. The bits and bytes of Cloud are not that hard for most technologists and companies; the hard part is the cultural change.

For a long time I have been recommending that people read the book, "The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win" by Gene Kim, Kevin Behr & George Spafford. You can get it on Kindle for under $10 and you can read it in a weekend.

This book can be a little confrontational to read if you have been in IT for quite some time. It's written as a narrative and you may recognize yourself, or others you have worked closely with, in the characters. But what this book does do is provide a great vision that will challenge your thinking about how IT can be done in today's world.

Since reading the book I have read quite a few others that have helped my thinking about how software and architecture should be done. For example "Release It!: Design and Deploy Production-Ready Software (Pragmatic Programmers)" by Michael T. Nygard. It's a little dated but you can tell how it is really sitting on the cusp of public Cloud.

But this last weekend I stumbled across a new book, "The Practice of Cloud System Administration: Designing and Operating Large Distributed Systems, Volume 2" by Thomas A. Limoncelli, Strata R. Chalup & Christina J. Hogan.

I view this book as a great follow up to get one thinking about the "how" of the Phoenix Project. One of the authors, Thomas, worked at Google for 7 years and I see many parallels in the operational aspects described and those which are in place inside and AWS. 

The book covers a lot of topics, from how to build large scale distributed systems to the important bit of how to operate them. Yes, there is a chapter on DevOps but this is not a DevOps book.

For people who are keen to understand the possible way to perform operations in the new world of Cloud this book is a great primer and full of great tips and examples from the real world. 

An interesting quote, "cloud or distributed computing was the inevitable result of the economics of hardware. DevOps is the inevitable result of needing to do efficient operations in such an environment." (p. 171). Again, let's not overdo the DevOps model but I think the premise is that cloud presents an amazing new way to architect systems. The parallel of that is we need new ways to operate these new architectures, and reading this book will give you insights into how people have been doing that.

If you have read the book, or once you have, I would love to hear what you think in the comments.

I have been thinking a lot about operations in the Cloud over the last few weeks so expect some more posts coming up on this topic. 

Till the next post.

