
Tuesday, June 16, 2015

Using an architectural review for improving site reliability

I stumbled across another AWS blogger, Eric Hammond, who blogs at https://alestic.com.

One of the recent things which Eric has done is his Unreliable Town Clock (UTC), which you can use to schedule triggering of AWS Lambda functions. It's a cool idea.

Eric certainly knows what he is doing; he not only launched a service, he sat down and ensured "this service is as reliable as I can reasonably make it". No wonder he is an AWS Community Hero!

Of course reliability is only one of the elements of an architectural review of an AWS environment. You should cover off such things as Security, Availability, Scalability and Cost Efficiency. Eric has covered some of this. Check out what he has done to ensure UTC is always up and running; there are some great tips in there.

What if you wanted to do an architectural review of your own AWS environment? How would you go about it? What questions would you ask? What things require focus? Maybe post in the comments. Saying "I will call my friendly AWS Solution Architect" is cheating, although it's a great idea.

Two items that will really help you get started with a review are these whitepapers.

What would you do beyond this? Here are some very small things I would investigate.

  • Auditing. Are CloudTrail, Config and VPC Flow Logs all turned on? It's hard to do debugging or forensics on something in the past when you were not capturing the data. Is all the activity from the instance logged to CloudWatch Logs?
  • Dependencies. What dependencies are there that might make a deployment or relaunch fail? That autoscaling group may relaunch an instance if it fails. What AMI is it using? Is it your own AMI sitting in the account or are you launching from a public one? What if the public one goes away because a new one is released? How is the code deployed into that AMI? Is it baked in, coming from S3, does it need to download software from GitHub, and what if it can't?
  • Monitoring. There are four metrics in CloudWatch for SNS. Are there any alarms that could be created to alert on failure? What if the number of published messages dropped below a certain rate? An alarm like that could replace what Eric is using Cronitor.io for. You can even create those alarms with CloudFormation! (See the sketch after this list.)
  • Turning on MFA is always a great idea.
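
To make the monitoring idea concrete, here is a minimal sketch using the Node.js aws-sdk (the same SDK used elsewhere on this blog). The topic name and alert topic ARN are placeholders of mine, not Eric's actual setup; it alarms if fewer than one message is published to the topic in an hour.

// Minimal sketch: alarm when an SNS topic goes quiet.
// All names and ARNs here are hypothetical.
var aws = require('aws-sdk');
aws.config.region = 'us-east-1';
var cloudwatch = new aws.CloudWatch();

cloudwatch.putMetricAlarm({
    AlarmName: 'town-clock-gone-quiet',
    Namespace: 'AWS/SNS',
    MetricName: 'NumberOfMessagesPublished',
    Dimensions: [{ Name: 'TopicName', Value: 'unreliable-town-clock-topic' }],
    Statistic: 'Sum',
    Period: 3600,                        // one hour, in seconds
    EvaluationPeriods: 1,
    Threshold: 1,
    ComparisonOperator: 'LessThanThreshold',
    AlarmActions: ['arn:aws:sns:us-east-1:123456789012:ops-alerts']
}, function (err) {
    if (err) console.log(err);
    else console.log('Alarm created');
});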

This is the simplest of examples. For your typical system there are hundreds of review items to assess. But you get the idea.

Doing an architectural review is something you should do periodically in your AWS environment. As AWS keeps releasing new features, there are frequently new things you can do to improve your setup.

If only everyone was like Eric! Also, anyone who builds everything in CloudFormation is a winner in my book!

Rodos

Monday, June 15, 2015

Shortcuts in the AWS Console

Here is something I did not know about for ages: shortcuts inside the AWS console that appear on the top bar.

See this animated GIF for how to add them and then use them. I think the Edit button used to be a lot less obvious.


It's very handy to have the links for your most frequently accessed services always there.

Enjoy.

Rodos

Thursday, June 11, 2015

A quick first look at AWS VPC Flow Logs

I woke up this morning to yet another new AWS feature, VPC Flow Logs, as described by Jeff Barr.

Jeff did a great job of providing an overview so make sure you read that before continuing.

It's really interesting to think about what you can do with network flow logs. A lot of Enterprise customers ask for this so they can perform various security activities. Many of those security activities are really not needed in the new world of Cloud; however, there are some valid ones that you may want to consider. There are also some good reasons to have flows available so you can troubleshoot your Security Groups or NACLs.

I suggest people turn them on, capture the data and set a retention period on the destination CloudWatch Log Group, say 3 days up to 6 months. The data is then there if you need it. Just like CloudTrail data, it's too late to start capturing after the fact!
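
If you want to script that retention period rather than set it in the console, a minimal sketch with the Node.js aws-sdk might look like the following; the log group name is a placeholder.

// Set a 180 day (roughly 6 month) retention on a flow log group.
// The log group name is hypothetical.
var aws = require('aws-sdk');
aws.config.region = 'us-east-1';
var logs = new aws.CloudWatchLogs();

logs.putRetentionPolicy({
    logGroupName: 'my-vpc-flow-logs',
    retentionInDays: 180    // must be one of the values CloudWatch Logs allows
}, function (err) {
    if (err) console.log(err);
    else console.log('Retention set');
});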

A great little use case would be some general visualization of network flows on a dashboard. It's not real time but it's going to give you a general indication. You could analyze the amount of traffic by category, such as incoming, outgoing, cross AZ and within AZ (by reverse engineering the subnet ranges). You could even track it down to traffic to AWS regional services such as S3. You may want to track these patterns over time, looking for trends. You could also look at top talker hosts, internally or externally. I suspect it will be of interest to people at first, and then it will be a colorful screen to show visitors. After all, AWS handles all that heavy lifting of operating and scaling the networking.

Many will be interested in monitoring rejected traffic, and if they see a lot of it appearing, wondering if there is something else going on they should look at or take precautions against. Generally you probably don't care; nothing to see here, it's just dropped traffic.

It will be great to see what AWS Partners do in the visualization space; I sense some eye candy coming.

I quickly turned VPC Flow Logs on in my account this morning.

Here is my CloudWatch console showing the Log Groups.



Notice I have set the expiry to 6 months. You can see below that when I look at my Log Group, each of my Elastic Network Interfaces (ENIs) is shown.


I have 4 ENIs. Some of those are for my WorkSpaces instances, which is cool.

If I look at the instance I launched this morning by clicking on eni-981db9fc-all, here is the data displayed.


Notice how I have applied a filter. Nice hey. Here is what that filter looks like in that text box.

[version, accountid, interfaceid, srcaddr, dstaddr, srcport, dstport=23, protocol, packets, bytes, start, end, action=REJECT, logstatus]

Notice that by putting the field names separated by commas and between brackets you can parse out the text. This is a general feature of CloudWatch Logs. The field list is in the VPC Flow Logs documentation.

There are lots of filters you can apply. Here you can see I am just checking for matching values of a destination port of 23 (telnet) and where the action was to reject the packets. You can see all of those machines which have attempted to telnet into my little server. That's why it has a correctly configured Security Group!

There is documentation in CloudWatch for the filter pattern syntax. It supports both string and numeric conditional fields. For string fields, you can use = or != operators with an asterisk (*). For numeric fields, you can use the >, <, >=, <=, =, and != operators.

If someone asks you which hosts are communicating with the database at the moment, you can quickly jump into the console and answer it by looking at traffic to the right port.

The other nice thing you can do is create a metric on this filter to pull out the data. Here is one that creates a metric on the number of bytes accepted as SSH traffic into the ENI.
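
You can create that metric filter from code as well as in the console. A minimal sketch, reusing the field naming convention from the filter above; the log group, filter and metric names are my own placeholders.

// Create a metric counting bytes of accepted SSH (port 22) traffic.
// Log group, filter and metric names are hypothetical.
var aws = require('aws-sdk');
aws.config.region = 'us-east-1';
var logs = new aws.CloudWatchLogs();

logs.putMetricFilter({
    logGroupName: 'my-vpc-flow-logs',
    filterName: 'ssh-accept-bytes',
    filterPattern: '[version, accountid, interfaceid, srcaddr, dstaddr, ' +
                   'srcport, dstport=22, protocol, packets, bytes, start, ' +
                   'end, action=ACCEPT, logstatus]',
    metricTransformations: [{
        metricName: 'SSHAcceptBytes',
        metricNamespace: 'VPCFlowLogs',
        metricValue: '$bytes'    // use the parsed bytes field as the value
    }]
}, function (err) {
    if (err) console.log(err);
    else console.log('Metric filter created');
});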

I created a few of these for my machine; here is the metrics display after I pushed some data its way. I am using the sum function to get the sum of bytes.


During this time period there were a few rejected telnet sessions, some SSH traffic and lots of general traffic. If you can write a filter on it, you can graph it.

Of course this only gets you so far. You have to know the ENI etc. 

You will probably want to extract all of the data into something easier to work with. If you want to roll your own, a good way would be to create a Subscription on the whole Log Group (see http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/Subscriptions.html) and push all the data to a Kinesis stream (it will handle the scale). How do you get data out of Kinesis? Well, you use Lambda functions of course, see http://docs.aws.amazon.com/lambda/latest/dg/walkthrough-kinesis-events-adminuser.html. Your Lambda function could dump it to S3, and from there you load it into Redshift (which can be automated too) or start writing some EMR jobs. Now that's the power of AWS.
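
The subscription itself is a single API call. A minimal sketch; the log group, stream ARN and IAM role ARN are all hypothetical, and the role must allow CloudWatch Logs to put records onto the stream.

// Subscribe a whole flow log group to a Kinesis stream.
// Every name and ARN here is hypothetical.
var aws = require('aws-sdk');
aws.config.region = 'us-east-1';
var logs = new aws.CloudWatchLogs();

logs.putSubscriptionFilter({
    logGroupName: 'my-vpc-flow-logs',
    filterName: 'all-flows-to-kinesis',
    filterPattern: '',    // an empty pattern matches every log event
    destinationArn: 'arn:aws:kinesis:us-east-1:123456789012:stream/flow-logs',
    roleArn: 'arn:aws:iam::123456789012:role/cwl-to-kinesis'
}, function (err) {
    if (err) console.log(err);
    else console.log('Subscription created');
});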

Hope that little bit of a first look helps you understand a bit more about VPC Flow Logs. I am really interested to see what people are going to do with it. The main uses will be those occasional operations or forensic events. 

Enjoy.

Rodos

P.S. Remember, I might work for AWS but these posts are my own ramblings late at night. It's the geek speaking.

Wednesday, June 10, 2015

How to rock re:Invent 2015, Rodos style

I confess I am a conference junkie. Not just any conference, but the conference that fertilizes the roots of my current IT thinking.

Back in the day this was VMware. I was a VMworld junkie. I think I may have done six in a row. I collected the t-shirts year after year and even blogged about it. I would stay up till 3am recording nightly video summaries of the day's events.


Today Matt Wood (@mza) from AWS, a rock star, did a post on his personal blog on How to Rock re:Invent. He lists things like how to prepare, what gear to bring, what sessions to see. It was music to my ears. Of course you should go and read all of his post.

Matt asked for any other suggestions. Of course I responded with a career limiting move (CLM) and started tweet bombing him with my deep experience and insight, okay, random ideas. Maybe not my finest moment but he was in my wheelhouse!

So here are my random and not as well thought out additions to Matt's list.

Get there 
First stop, just get a ticket and a hotel room booked. This is the hard part. One year I was between two jobs and had to take annual leave, pay for my own flights from Australia to the US, scrounge a conference ticket and beg a spare bed in a friend's room. People do more to go and watch a Rugby game, so you can do this for something as amazing as re:Invent.
When it comes to hotels, try and stay close to the convention. I once stayed at the other end of the strip in Vegas and it was horrible having to walk back and forwards each day. It's so great being able to quickly visit your room to drop something off or pick something up on the way to something else. This means getting your booking in early.
Try and arrive the day before or even two days before. If you want to play tourist, don't do it after the event, you will be exhausted and just want to sleep. Being adjusted to the timezone and having a bit of R&R before the week of full days and little sleep gives you the best conference experience.
Also register the day before the event. Registration always opens the day before at these things and there are smaller crowds. It's madness the morning of the first day, no matter how well organized it is.
Prepare 
There is lots you can do beforehand. As Matt says, go through the agenda and think about what equipment to bring. I would add business cards to his list (old school, but useful when you are in a hurry and want to pass on details). I also don't recommend you bring two laptops!
Bring comfortable clothes and especially comfortable shoes; you will be walking a lot! Think about what bag you will carry. You may get a conference bag or you may not; I always prefer my own bag. Ensure you have enough spare room in your luggage for any swag, conference materials or shopping you pick up.
What evening activities will you go to? There is always the exhibition opening, which is usually packed and full of people trying to grab swag. If you don't like big crowds and mayhem you may need to skip this one. The conference party is always awesome and not to be missed (I missed it one year and regretted it). But what other parties are on and can you get an invite? Who are the cool vendors that will be having an event? The more you get into the community the harder this is, as there are often multiple on per night and you have to choose or jump from one to another.
Shameless plug for Amazon.com, but this is also something I did before I was an employee. If you live overseas, place an order on Amazon.com for all those things you want and get them shipped to the hotel to bring home. I bank up my wish list all year and empty it each trip. You save a lot on shipping and it's a bit like Christmas when you arrive. However, check with the hotel for extra package handling charges as they can sting you. If you want to bring gifts home for your kids this is a great method, as you are not going to find a lot of gift items around Vegas unless you go to the outlet malls, and who wants to do that!
Extra activities
There are extra activities that people often miss. The day before the event starts there are Bootcamps. You pay extra for these but they are really worth it, IMHO. They are either half or a full day and are presented by the best subject matter experts from AWS. Bootcamps have a large hands on component. I ran one my first year at re:Invent and I know people who have run them or will be this year. The instructors put a huge amount into making these relevant and worthwhile, so check them out.
Certifications can be done onsite; what a great time to get your first or that extra AWS Certification. Go and book in and get one out of the way. In past years there has been a certification lounge where you get a private space, a bit of swag, some snacks and power outlets. So it's worth being certified.
There are usually a number of Hackathons. I have some friends who went to them last year; I dropped in and they were amazing. If you want to have fun with others and test your skills, these are something to check out.
Hands on labs are fun to do. These are run by Training and Certification and are a great way to quickly get some experience with an AWS Service or a Partner product.
Engage 
Matt called it out as "The Corridor", meaning have those conversations with people that matter. To me this is one of the most valuable things to do. When you sit at a table for breakfast or lunch, talk to the people at the table. Introduce yourself, ask what people do. Ask why they came, ask what they have learnt. You will gather so many nuggets of information, tips and ideas from these conversations. Get out of your comfort zone and engage with people who are just as obsessed with this stuff as you are.
Sessions
People have different approaches to sessions and I have evolved mine over the years. One thing I am positive on is don't miss the keynotes. You want to get in early and get into the main room and not be 15m late and end up in the overflow. You want to experience the vibe, you want to sit with >10,000 other people. You want to live tweet it from in the room!
For the breakout sessions pick what is key to you. Yes they will be recorded and available later online, but you probably will not find the time to do it. Sitting in the sessions gives you time and permission to think about the topic, to digest, to ponder. So go to sessions rather than thinking this is something I can do later. When you are conflicted, go to the one that is furthest from your comfort zone or that has a speaker you want to meet. If the session is something you are really familiar with or passionate about, you are more likely to watch and digest the recording afterwards. As Matt mentioned, keep an eye out for those secret ones for new services which may be announced in the keynotes.
Exhibition Hall  
This will be huge and take you a LONG time to get around. Plan to visit it each day and cover a portion. Go and see the small vendors, see the startups and not just the big guys. Engage with the vendors and find out how they help their customers and if they could help you. If there is no fit between you and them politely move on, but give them a chance. 
There should be a part of the AWS stand that has Solution Architects, Support and Training & Certification. Visit each of these. Ask the Solution Architects the hard questions that you have just not been able to figure out. Check in on a support case, or log one, or just say thanks to the support guys. Maybe ask the support people how best to utilize them or the types of cases you can log. Lastly discuss the training and certification options with the team, what should you consider doing?
Give Feedback
Amazon and AWS are customer obsessed. This is your chance to give feedback. If you see a staff member (check their badge and possibly lanyard color), give them feedback, both good and bad. If they cover a particular speciality (training, sales, architecture, support etc) and you have interacted with that group, tell them about your experience and how they can do better. If you are talking with someone from a service team or someone who knows something about a service (bootcamp instructor or assistant, lab assistant, breakout speaker) then give them feedback on that specific service. Approximately 95% of all those features that AWS releases are based on customer feedback. Staff will be super keen for this feedback and will really appreciate you taking the time.
Hope that helps with some of your re:Invent preparation. If you have your own ideas you may want to leave a comment on this blog entry, hit me and Matt on Twitter or contact Matt (see his post).

AWS re:Invent 2015 is taking place at The Venetian in Las Vegas from the 6th to the 9th of October. Registration is now open. If you go, see if you can bump into me and say hello, that would be awesome!

Rodos

Tuesday, June 09, 2015

Two books on changing the "how"

In my role I have the great privilege of getting to speak to a lot of really interesting people and companies that are on their Cloud journey. The bits and bytes of Cloud are not that hard for most technologists and companies; the hard part is the cultural change.

For a long time I have been recommending that people read the book, "The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win" by Gene Kim, Kevin Behr & George Spafford. You can get it on Kindle for under $10 and you can read it in a weekend.


This book can be a little confrontational to read if you have been in IT for quite some time. It's written as a narrative and you may recognize yourself, or others you have closely worked with, in the characters. But what this book does do is provide a great vision that will challenge your thinking about how IT can be done in today's world.

Since reading the book I have read quite a few others that have helped in my thinking about how software and architecture should be done. For example, "Release It!: Design and Deploy Production-Ready Software (Pragmatic Programmers)" by Michael T. Nygard. It's a little dated but you can tell it is really sitting on the cusp of public Cloud.

But this last weekend I stumbled across a new book, "The Practice of Cloud System Administration: Designing and Operating Large Distributed Systems, Volume 2" by Thomas A. Limoncelli, Strata R. Chalup & Christina J. Hogan.


I view this book as a great follow up to get one thinking about the "how" of the Phoenix Project. One of the authors, Thomas, worked at Google for 7 years, and I see many parallels between the operational aspects described and those in place inside Amazon.com and AWS.

The book covers a lot of topics, from how to build large scale distributed systems to the important bit of how to operate them. Yes, there is a chapter on DevOps but this is not a DevOps book.

For people who are keen to understand the possible way to perform operations in the new world of Cloud this book is a great primer and full of great tips and examples from the real world. 

An interesting quote: "cloud or distributed computing was the inevitable result of the economics of hardware. DevOps is the inevitable result of needing to do efficient operations in such an environment." (p. 171). Again, let's not overdo the DevOps model, but I think the premise is that cloud presents an amazing new way to architect systems. The parallel is that we need new ways to operate these new architectures, and reading this book will give you insights into how people have been doing that.

If you have read the book, or once you have, I would love to hear what you think in the comments.

I have been thinking a lot about operations in the Cloud over the last few weeks so expect some more posts coming up on this topic. 

Till the next post.

Rodos

Monday, June 08, 2015

Deep Linking into the AWS Console

Last year Jeff Barr did a blog post on improvements to the AWS Console. One of these was about "Deep Linking Across EC2 Resources".

"The new deep linking feature lets you easily locate and work with resources that are associated with one another. For example, you can move from an instance to one of its security groups with a single click."

This deep linking is something you can use yourself when creating operational dashboards or your own interfaces. For example here is a little dashboard that I have which displays a quick interactive console for an Auto Scaling Group. You can see that the instance IDs are links, which take you to the specific instance in the EC2 console.


If you look in the AWS Console documentation you will not find any documentation for the deep linking URLs.

Here is a list of what I have used which works, your mileage may vary. (There is a little helper sketch after the list.)
  • Instance : https://console.aws.amazon.com/ec2/home?region=<region>#Instances:search=<instance id>
  • Volume : https://console.aws.amazon.com/ec2/home?region=<region>#Volumes:search=<volume id>
  • Load Balancer : https://console.aws.amazon.com/ec2/home?region=<region>#LoadBalancers:search=<lb name>
  • Auto Scaling Group : https://console.aws.amazon.com/ec2/autoscaling/home?#AutoScalingGroups:id=<autoscaling group name>
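
If you are generating these links from code for a dashboard, a tiny helper keeps the patterns in one place. This is my own hypothetical helper, not an AWS API; it just builds the URL strings above.

// Hypothetical helper to build EC2 console deep links.
function consoleDeepLink(kind, region, value) {
    var base = 'https://console.aws.amazon.com/ec2/home?region=' + region;
    var paths = {
        instance:     '#Instances:search=',
        volume:       '#Volumes:search=',
        loadBalancer: '#LoadBalancers:search='
    };
    return base + paths[kind] + value;
}

// For example, a link to an instance for the dashboard above.
console.log(consoleDeepLink('instance', 'us-east-1', 'i-1a2b3c4d'));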

There are lots of others. To reverse engineer one you can use the Tag Editor, which shows them, see below.


If you are keen to find out about a specific deep link but can't find it, post in the comments.

Rodos

Friday, June 05, 2015

2 minutes and counting

Do you find that you clean at home the fastest when you know you have an imminent visit by a friend? It's amazing how fast you can whip around the house and get things ready in time for the knock on the door.

Spot instances are one of my many favorite things about AWS. They are for people who really want to cost optimize and have a workload that matches the model. The main element to be aware of with Spot is that it is a market where you place a bid for how much you are willing to pay for the resource. If the market changes and the amount you are willing to pay is no longer high enough, your instance will be taken away (or reclaimed, if you want a nicer way of saying it).

How fast is the instance reclaimed? Well, before the start of the year it was reclaimed fast, essentially with no notice. Now you get a two minute warning before they are terminated. Enough time to tidy the house before that knock on the door. Many people have not noticed this change as it occurred very early in January, whilst many people were away on their Christmas holidays. Who reads the AWS blog on their holidays?

The notice period provides you the opportunity to automate tasks such as saving state, copying off data such as logs, or numerous other things which may be part of stopping processing.

How do you know when your two minute notice period starts? You can monitor for the existence of the metadata field http://169.254.169.254/latest/meta-data/spot/termination-time. It's recommended you check every 5 seconds, but that's really up to you.
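
A minimal polling sketch in Node.js (the same language as the Lambda code elsewhere on this blog); what you do when the notice appears is up to your workload.

// Poll the instance metadata every 5 seconds. The field returns a 404
// until a termination is scheduled, then contains the termination time.
var http = require('http');

function checkTermination() {
    http.get('http://169.254.169.254/latest/meta-data/spot/termination-time',
        function (res) {
            if (res.statusCode === 200) {
                // Two minutes and counting: save state, copy off logs, etc.
                console.log('Termination notice received!');
            }
        }
    ).on('error', function (err) {
        console.log('Metadata check failed: ' + err.message);
    });
}

setInterval(checkTermination, 5000);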

If you need to be aware of the status from outside of the instance, each spot request has a bid status which will change to marked-for-termination during the notice period. See all the details at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-bid-status.html
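
A sketch of that external check with the Node.js aws-sdk; the spot request ID is a placeholder.

// Check the bid status of a spot request from outside the instance.
// The spot request ID is hypothetical.
var aws = require('aws-sdk');
aws.config.region = 'us-east-1';
var ec2 = new aws.EC2();

ec2.describeSpotInstanceRequests({
    SpotInstanceRequestIds: ['sir-12345678']
}, function (err, data) {
    if (err) return console.log(err);
    var status = data.SpotInstanceRequests[0].Status;
    console.log(status.Code);    // e.g. 'marked-for-termination'
});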

You can read more about all of this on the blog announcement. For details on Spot see the documentation at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html.

Rodos


Using AML to predict weather

You can't really predict the weather, can you? Well, I have been thinking quite a bit about Lambda and Amazon Machine Learning lately, and just yesterday I posted about AWS bloggers. Today's post combines two of those topics.

Arne Sund at http://arnesund.com just did a post on "Using Amazon Machine Learning to Predict the Weather". It's a good read about how you can get started with AML. I have no idea if this is a good model for weather prediction, but could it be any worse? I will let our data scientist friends weigh in on that one. Certainly for the simple machine learning I have been doing it's been working great.

You may want to follow Arne's AWS feed. This is his first post on AWS; nice work, and it would be great to see some more.

Rodos

Wednesday, June 03, 2015

AWS Bloggers

I love reading blogs as I believe a great way to learn is to listen to people who have spent lots of time investigating something or experiencing it. This is what good bloggers do: take their hours of learning and share it with you in a digestible format.

Back in 2008 and onwards I was really into learning about the new world of server virtualization and a great way to do so was through bloggers (see http://vlp.vsphere-land.com for how the space has grown). Since I have been in the AWS world for the last few years I have not seen a lot of individual bloggers out there diving into AWS. Maybe I am just looking in the wrong places.

One day I would love to collate a feed of the AWS specific bloggers that people can follow. However, here are two blogs I do know of that cover interesting stuff that is usually related to AWS. You may want to subscribe to their feeds.

Of course there are the large scale blogs that you are probably already following.


If you know of others, please post in the comments. Even better, if you are using AWS, why not start your own blog and share your experiences.

Cheers

Rodos

Tuesday, June 02, 2015

Remember to make your Lambda functions idempotent

Today's post is about an AWS service I have been having some fun with, Lambda.

Essentially, Lambda is a service which executes your code within milliseconds of an "event" happening. An event may be your own action or it can be triggered by actions in other AWS services such as S3, DynamoDB or Kinesis. The great thing is there is no infrastructure to build or run, and you pay only for the requests served and the compute time required to run your code. Billing is metered in increments of 100 milliseconds! It's "way cool". You can read all about it on the product page if you need an introduction. But this post is not about what's so cool about Lambda.

What I wanted to cover was that you need to make sure the functions you write are idempotent. Idempotency in software "describes an operation that will produce the same results if executed once or multiple times". "It means that an operation can be repeated or retried as often as necessary without causing unintended effects."

Why is this important to remember with Lambda? Well there is some text in the documentation and FAQ that sort of explains why.

From the documentation. [highlight is mine]
Your Lambda function code must be written in a stateless style, and have no affinity with the underlying compute infrastructure. Your code should expect local file system access, child processes, and similar artifacts to be limited to the lifetime of the request, and store any persistent state in Amazon S3, Amazon DynamoDB, or another cloud storage service. Requiring functions to be stateless enables AWS Lambda to launch as many copies of a function as needed to scale to the incoming rate of events and requests. These functions may not always run on the same compute instance from request to request, and a given instance of your Lambda function may be used more than once by AWS Lambda.
Also from the FAQ.
Q: Will AWS Lambda reuse function instances?
To improve performance, AWS Lambda may choose to retain an instance of your function and reuse it to serve a subsequent request, rather than creating a new copy. Your code should not assume that this will always happen.
Today Lambda functions are written in Node.js. Here is my Lambda function, which returns Twitter data combined with Amazon Machine Learning predictions to tell me if those tweets are on topic or just spam. My use case was creating a tweet board that filtered junk messages based on machine learning. It actually worked really well. But back to our code; you will want to jump right to the end, no need to read it all.

var getTweetsError = function (err, response, body) {
    console.log('ERROR [%s]', err);
};

function retrieveATweetPrediction(tweet) {

    // This is an async operation and we are going to have lots. Therefore we
    // will use a promise which we will return for our caller to track. When
    // we do our actual work we will mark our little promise as resolved.

    var deferred = Q.defer();

    var req = aml.predict({
        MLModelId: '',
        PredictEndpoint: 'https://realtime.machinelearning.us-east-1.amazonaws.com',
        Record: {
            text: tweet['text'].toString(),
            id: tweet['id'].toString(),
            followers: tweet['user']['followers_count'].toString(),
            favourites: tweet['favorite_count'].toString(),
            friends: tweet['user']['friends_count'].toString(),
            lists: tweet['user']['listed_count'].toString(),
            retweets: tweet['retweet_count'].toString(),
            tweets: tweet['user']['statuses_count'].toString(),
            user: tweet['user']['screen_name'].toString(),
            source: tweet['source'].toString()
        }
    });

    // We did not pass a function to predict so we can call the .on function
    // and get access to the complete response data. This allows us to look up
    // the original request and tie this async call back to our original data.
    // If we call it the normal way we don't have access to that, just the
    // response, and can't tie it back!
    req.on('success', function(response) {
        if (response.error) {
            console.log(response.error);
        } else {
            var t = "";
            if (response.data.Prediction.predictedLabel == "0") {
                t += 'ON';
            } else {
                t += 'OFF';
            }
            returnData[response.request.params.Record.id]['prediction'] = t;

            var val = response.data.Prediction.predictedScores[response.data.Prediction.predictedLabel];
            if (val < 0.5) {
                val = 1 - val;
            }
            returnData[response.request.params.Record.id]['probability'] = Math.round(val*100000)/1000;
            deferred.resolve(); // This task can now be marked as done
        }
    });
    req.send();
    return deferred.promise;
};

function extractTweets() {

    var deferred = Q.defer();

    twitter.getSearch({'q': '#aws', 'count': 15}, getTweetsError,
        function (data) {

            var tweets = JSON.parse(data)['statuses'];

            // We need to create a list of tasks as we are going to fire off a bunch of async calls to 
            // do a prediction for each tweet.
            var tasks = [];

            for (i in tweets) {

                var id = tweets[i]['id'];
                returnData[id] = {}; 
                returnData[id]['text']       = tweets[i]['text'];
                returnData[id]['name']       = tweets[i]['user']['name'];
                returnData[id]['screen_name']= tweets[i]['user']['screen_name'];
                returnData[id]['followers']  = tweets[i]['user']['followers_count'];
                returnData[id]['friends']    = tweets[i]['user']['friends_count'];
                returnData[id]['listed']     = tweets[i]['user']['listed_count'];
                returnData[id]['statuses']   = tweets[i]['user']['statuses_count'];
                returnData[id]['retweets']   = tweets[i]['retweet_count'];
                returnData[id]['favourites'] = tweets[i]['favorite_count'];
                returnData[id]['source']     = tweets[i]['source'];
                returnData[id]['image_url']  = tweets[i]['user']['profile_image_url'];

                // The prediction return a promise which we will push into our list of tasks.
                // When the prediction is returned it will mark its little task as resolved.
                tasks.push(retrieveATweetPrediction(tweets[i]));
            }

            // We have a list of tasks which are happening. Lets wait till ALL of them are done.
            Q.all(tasks).then(function(result) { 
                // Woot woot, all predictions are returned and we have our data!
                // We are therefore resolved ourselves now. Whoever is waiting on us is going to 
                // now get some further stuff done.
                deferred.resolve();
            });
        }
    );
    return deferred.promise;
};

// End of functions, let's look at our main bit of code.

// Setup AWS SDK
var aws = require('aws-sdk');
aws.config.region = 'us-east-1';
var aml = new aws.MachineLearning();

// Setup Twitter SDK
var Twitter = require('twitter-node-client').Twitter;
var twitter = new Twitter({
    "consumerKey": "",
    "consumerSecret": "",
    "accessToken": "",
    "accessTokenSecret": "",
    "callBackUrl": ""
});

// Setup Q for our promises, we have lots of calls to make and we need to track when they are all done!
var Q = require('q');

var returnData = {};

// This is the function required by Lambda
exports.handler = function(event, context) {

    returnData = {}; // We may be reincarnated so ensure we are idempotent 
    
    Q.allSettled([extractTweets()]).then(
        function(result){
            // Return our data an end the Lambda function
            context.succeed(returnData);
        },
        function(reason){
            console.log("Opps : " + reason);
        });

};


See how there are lots of functions, then some code which sets up some variables, Q and returnData, and then the main function which Lambda will call when an event occurs, exports.handler. Notice how I am not a great coder and used a global variable to store data which is used by all of the functions. Well, if exports.handler gets called over and over again in the same environment, those global variables will not be re-created or cleared. I did not quite realize this at first and wondered why I was sometimes getting weird data back from Lambda. Not always, just sometimes.

To fix my problem I simply ensured that I cleared the key variable each time the handler function was called, so you can see that the first thing it does above is "returnData = {}; // We may be reincarnated so ensure we are idempotent". Fixed. Of course I know I could just code better, but this was my first ever time writing Node.js. You can tell me how to improve my function in the comments.
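
To see the effect in isolation, here is a minimal contrived sketch (my own example, not from my tweet board) showing module scope surviving container reuse.

// 'invocations' lives at module scope, so it survives between calls
// whenever Lambda reuses the same function instance.
var invocations = 0;

exports.handler = function (event, context) {
    invocations += 1;         // greater than 1 means this container was reused
    var perRequest = {};      // re-created every invocation, so always clean
    perRequest['count'] = invocations;
    context.succeed(perRequest);
};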

I will probably do another writeup on my Amazon Machine Learning experiment and how I trained it to filter tweets. It was really easy and I have no servers involved; thanks to Lambda executing my application logic, I just have S3, Lambda and AML Live Prediction for a highly scalable site.

Hopefully you won't get caught by the same mistake.

Rodos

Monday, June 01, 2015

Interviews

Wow, it's been so long since I did my last blog post. Over the last weeks I have felt that I really miss the days when I was blogging frequently. Hence I decided I would do a month of blogging and force myself to get something small out more often. Let's see how it goes.

Today's topic is interviews.

I see a lot of interview tips on sites like Lifehacker (http://www.lifehacker.com.au/tags/interviews/), such as killer questions, why not to humblebrag, or how to answer questions like why you want a role or what motivates you. I find these interesting to read and sometimes there is some good insight.

As someone who has done close to 300 interviews at Amazon, I thought I would share my very non-official quick list of tips for an interview. Some of them may slant to how Amazon interviews or my personal preferences. I generally interview for technical roles but I also do lots for sales staff, operations and so on.

Here is what I think is important when it comes to interviews.
  • Be yourself - you may have a perception of what the company is looking for but there is little use putting on a show. You may assume wrong, and you probably won't be able to maintain the facade for the duration of your employment. If you never intend to wear a suit, don't wear one to the interview. People say "Dress for the job you want"; I say "Be who you are." I am not really talking about dress code here, although that is one element. Show your personality and what you will be like to work with, what you will be like with customers. The interviewers are thinking, "Is this someone I want to spend my days with?", so be yourself.
  • Be articulate - the interview is a key circumstance where you want to be on your game when it comes to communication. This means body language, pace of speech, active listening and providing short clear answers. Try to ascertain early on the style of conversation the interviewer is using and match it. Is it a friendly conversation, or a series of quick fire question and answer rounds? Also note that the style may change through the interview. Many people talk way too fast in an interview. If you do this normally then practice slowing down, as this can be hard on people who are listening to you for the first time. Nerves are no excuse IMHO. If you think an interview is stressful, try having a conversation with a senior executive at a customer when you have a senior executive from the new company with you; that's stress. Listen to cues. If the interviewer says "So tell me about your high level career background. But let's cover this in less than 5 minutes in order to get onto other topics", then you really should answer within 5 minutes. If after 10 minutes you are still going through the subjects you did in high school, there is a problem. Listen to the question and provide enough information and colour to answer it, that's all. Don't keep talking on and on until the interviewer needs to interrupt you. If more information is required the interviewer will ask a follow-up question. Very long answers to one question do not add a lot of value and remove time for answering other questions which could give greater insight into you and your skills.
  • Use examples and stories - this may be influenced by my time at Amazon, but try to use examples and stories (short ones) in answers. It not only provides interesting colour and is easier to remember, it also provides great insight into what you have actually done and achieved rather than a general assertion. For example, if asked "So how do you learn new things?" you might answer "I like to read books, I love reading. I don't find classes that effective as they move slowly." Compare that to "I usually learn through reading. Last year I had to learn Ruby so I read the O'Reilly book on Ruby and then hacked away. After a few months I wanted to go further so I read Eloquent Ruby, which really helped me understand the nuances of the language." The second version really demonstrates how you applied or practiced whatever the question is about. However, don't be tempted to make something up; a good interviewer will ask you a detailed follow-up question which may just catch you out.
  • Do your research and improve during the process - do some research on the company and understand who they are and what they do. As you pick things up during your interviews, do more research, dive in more. If you don't know the answer to a question in one interview, remember what it was and do some research; you never know, you may get a similar question from another interviewer.
  • Have some good questions to ask - you will often get asked if you have your own questions. In my opinion, unless you are now convinced you are not going to take the job, there has to be something you want to ask. You can ask questions about the role, the company, the culture. You will be spending a lot of time working for this company and with these people; surely you want to know more about them. Also, avoid common questions if they are not really that meaningful. I started getting a few "What's the most frustrating part of working here?" questions after the Lifehacker post. It's a fine question if you really are interested in the answer, but avoid just asking filler questions.
I can't say I have been the interviewee many times in my career, but I have survived two rounds of Amazon interviews (first externally and a second for an internal role change). What I did find was that if you are a good fit for the role and the company (which is really what you and the employer want), then the interview should not be like a visit to the dentist. It should be like a first date: a little nerve-wracking, some fun, a chance to learn more about someone else and yourself, and a good start to what you hope could be a long and rewarding relationship. If the interview is like that dentist visit, maybe you are not made for each other, and that's okay.

There you go, first post. Let's hope I can throw out some more random ones this month!

Rodos

P.S. Shameless plug. Remember, Amazon is always hiring. See amazon.jobs for open roles in Australia. If you apply for a role in Solution Architecture you may end up having an interview with me! Wouldn't that be fun!