Holy on Dev

Archive for the ‘[Dev]Ops’ Category

Clojure – comparison of gnuplot, Incanter, oz/vega-lite for plotting usage data

Posted by Jakub Holý on November 4, 2018

What is the best way to plot memory and CPU usage data (mainly) in Clojure? I will compare gnuplot, Incanter with JFreeChart, and vega-lite (via Oz). (Spoiler: I like Oz/vega-lite best but still use Incanter to prepare the data.)

The data looks like this:

;; sec.ns | memory | CPU %
1541052937.882172509 59m 0.0
1541052981.122419892 78m 58.0
1541052981.625876498 199m 85.9
1541053011.489811184 1.2g 101.8

The data has been produced by monitor-usage.sh.
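
Whichever plotting tool is used, the memory column first has to be turned into plain numbers. Here is a minimal sketch of that step in Python (not the Clojure/Incanter code from the post). The names `Sample`, `parse_memory`, `parse_line` and `parse_lines` are made up for this illustration. Treating the `m` and `g` suffixes as MiB and GiB, and a bare number as KiB, is an assumption about how the data was produced.

```python
from typing import Iterable, List, NamedTuple

# Assumption: suffixes are binary units (k = KiB, m = MiB, g = GiB).
_UNIT_TO_MIB = {"k": 1 / 1024, "m": 1.0, "g": 1024.0}


class Sample(NamedTuple):
    timestamp: float    # seconds since the epoch, with a fractional part
    memory_mib: float   # memory normalised to MiB
    cpu_percent: float  # may exceed 100 on multi-core machines


def parse_memory(text: str) -> float:
    """Convert a value such as '59m' or '1.2g' to MiB."""
    suffix = text[-1].lower()
    if suffix in _UNIT_TO_MIB:
        return float(text[:-1]) * _UNIT_TO_MIB[suffix]
    # Assumption: a value without a suffix is in KiB.
    return float(text) / 1024


def parse_line(line: str) -> Sample:
    """Parse one 'sec.ns memory cpu' line into a Sample."""
    timestamp, memory, cpu = line.split()
    return Sample(float(timestamp), parse_memory(memory), float(cpu))


def parse_lines(lines: Iterable[str]) -> List[Sample]:
    """Parse many lines, skipping blank lines and ';;' comment lines."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;"):
            continue
        samples.append(parse_line(line))
    return samples
```

Normalising everything to one unit up front means the plotting code only has to deal with a single numeric column per series.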

The tools

Gnuplot 5

Gnuplot is the simplest, with a lot available out of the box. But it is also somewhat archaic and not very flexible.

Read the rest of this entry »

Posted in Languages, [Dev]Ops

How good monitoring saved our ass … again

Posted by Jakub Holý on November 1, 2018

You know how it goes – suddenly people complain your app does not work, you are getting plenty of timeouts or other errors in your error tracking tool, you find the backend app that is misbehaving and finally “fix” the problem by restarting it. Phew!

But why? What caused the downtime? A glitch in an upstream system? A sudden overload due to a spike in concurrent users? Trolls?

You know that it helps sometimes to zoom out, to get the right perspective. Here the perspective was 7 days:

It was enough to look at this chart with the right zoom to see at once that something happened on October 23rd that caused a significant change in the behavior of the application. A quick search and indeed, the change in CPU usage corresponded with a deployment. A quick revert to the previous version soon confirmed the culprit. (It would have been even easier if we showed deployments on these charts.)

This is not the first time good monitoring saved us. A while ago we struggled with the application regularly becoming sluggish and had to restart it again and again. A graph of the Node.js event loop lag showed it increasing over time. Once it was on the same dashboard as Node’s heap usage, we could see at once that it correlated with increasing memory usage – indicating a memory leak. A few hours of experimenting and heap dump analysis later, the problem was fixed.

So good monitoring is paramount.

Of course the trick is to know what to monitor and to display all relevant metrics in such a way that you can spot important relations. I am still working on improving that…


Monitoring process memory/CPU usage with top and plotting it with gnuplot

Posted by Jakub Holý on October 17, 2018


If you want to monitor the memory and CPU usage of a particular Linux process for a few minutes, perhaps during a performance test, you can capture the data with top and plot them with gnuplot. Here is how:
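
As a rough illustration of the capture side (not the author's script, which is in the full post), here is a sketch in Python that pulls the resident memory and CPU columns out of a single process line from top's batch output. It assumes top's default column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND). The function name `parse_top_line` and the sample line in the usage note are invented for this example.

```python
from typing import NamedTuple


class TopRow(NamedTuple):
    pid: int
    res: str            # resident memory as printed by top, e.g. "199m"
    cpu_percent: float
    command: str


def parse_top_line(line: str) -> TopRow:
    """Parse one process line of batch-mode top output.

    Assumes the default column order:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    # Split into at most 12 fields so a command containing spaces stays whole.
    fields = line.split(None, 11)
    if len(fields) < 12:
        raise ValueError(f"not a top process line: {line!r}")
    return TopRow(
        pid=int(fields[0]),
        res=fields[5],
        cpu_percent=float(fields[8]),
        command=fields[11].strip(),
    )
```

For a made-up line such as `1234 app 20 0 2345678 199m 12345 S 85.9 2.4 0:12.34 java`, this yields the RES value `199m` and a CPU figure of `85.9`, which can then be written out next to a timestamp for plotting.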

Read the rest of this entry »


Why we love AWS Beanstalk but are leaving it anyway

Posted by Jakub Holý on March 14, 2018

Cross-posted from Telia’s Tech Blog.

We have had our mission-critical webapp running on AWS Elastic Beanstalk for three years and have been extremely happy with it. However, we have now outgrown it and are moving to a manually managed infrastructure and CodeDeploy.

AWS Beanstalk provides you with a lot of bang for the buck and enables you to get up and running in no time:

  • Simple, no-downtime deployment and automatic roll-back based on a user-provided health check (either one subset of nodes at a time or blue-green deployment)
  • Autoscaling
  • Managed updates – security fixes and other improvements installed automatically
  • Built-in HTTP Proxy with caching in front of your application
  • Monitoring dashboard with alerting and access to logs without the need for SSH
  • A list of past versions and the ability to roll back
  • Support for many runtimes (Java, Node.js, Docker to name just a few)

So if you need a solid, state-of-the-art infrastructure for a web-scale application and you don’t have a lot of time and/or skill to build one on AWS on your own, I absolutely recommend Beanstalk.

Read the rest of this entry »


Pains with Terraform (perhaps use Sceptre next time?)

Posted by Jakub Holý on March 14, 2018

Cross-posted from Telia’s Tech Blog

We use Amazon Web Services (AWS) heavily and are in the process of migrating towards infrastructure-as-code, i.e. creating a textual description of the desired infrastructure in a Domain-Specific Language and letting the tool create and update the infrastructure.

We are lucky enough to have some of the leading Terraform experts in our organisation, so they lay out the path and we follow it. We are at an initial stage and everything is thus “work in progress” and far from perfect, so it is important to judge leniently. Yet I think I have gained enough experience trying to apply Terraform, both now and in the past, to speak about some of the (current?) limitations and disadvantages and to consider alternatives.

Read the rest of this entry »


Nginx: Protecting upstream from overload on cache miss

Posted by Jakub Holý on October 1, 2015

These 2 magical lines will protect your upstream server from possible overload when many users try to access the same uncached or expired content:

proxy_cache_use_stale updating timeout; # Serve the cached version even when outdated while refreshing it
proxy_cache_lock on; # Only one req is allowed to load/refresh the item, others wait / get the stale one 

You can verify this using Shopify’s Toxiproxy. 
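
To make the behaviour of the two directives concrete, here is a toy model of it in Python. It is only a sketch of the idea (one loader per key at a time; others either wait or get the stale copy), not how Nginx is implemented, and it ignores details such as lock timeouts. The class name `StaleWhileUpdatingCache` and its parameters are invented for this illustration.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class StaleWhileUpdatingCache:
    """Toy cache: at most one upstream load per key at any moment."""

    def __init__(self, load: Callable[[str], str], ttl: float) -> None:
        self._load = load                                  # the "upstream" call
        self._ttl = ttl                                    # freshness in seconds
        self._entries: Dict[str, Tuple[str, float]] = {}   # key -> (value, expires_at)
        self._locks: Dict[str, threading.Lock] = {}        # one lock per key
        self._guard = threading.Lock()                     # protects both dicts

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def _fresh(self, key: str):
        """Return the entry if it exists and is still fresh, else None."""
        with self._guard:
            entry = self._entries.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry
        return None

    def get(self, key: str) -> str:
        fresh = self._fresh(key)
        if fresh is not None:
            return fresh[0]
        with self._guard:
            stale = self._entries.get(key)
        lock = self._lock_for(key)
        if stale is not None:
            # Expired item: if someone else is already refreshing it,
            # serve the stale copy instead of hitting the upstream.
            if not lock.acquire(blocking=False):
                return stale[0]
        else:
            # Nothing cached yet: wait for whoever is loading it.
            lock.acquire()
        try:
            # Another request may have filled the entry while we waited.
            fresh = self._fresh(key)
            if fresh is not None:
                return fresh[0]
            value = self._load(key)
            with self._guard:
                self._entries[key] = (value, time.monotonic() + self._ttl)
            return value
        finally:
            lock.release()
```

With a slow `load` function and several threads asking for the same cold key, the upstream is called once and all callers receive the same value. With an expired key, callers arriving during the refresh get the old value straight away.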

❤ Nginx


Running Gor, the HTTP traffic replayer, as a service on AWS Elastic Beanstalk

Posted by Jakub Holý on July 30, 2015

Gor is a great utility for replicating (a subset of) production traffic to a staging/test environment. Running it on AWS Elastic Beanstalk (EB) has some challenges, mainly that it doesn’t support running as a daemon and that there isn’t any documentation/examples for doing this. Well, here is a solution:

Read the rest of this entry »


AWS ebextensions: Avoiding “Could not enable service” (or .. disable ..)

Posted by Jakub Holý on July 30, 2015

If you are adding a service entry to your .ebextensions/ config to run a service in AWS Elastic Beanstalk and it fails with either “Could not enable service [..]” or “Could not disable service [..]” (based on the value of ensureRunning), make sure that the service init.d file supports chkconfig, i.e. contains the comments it looks for.
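
A quick way to check an init script before deploying is to look for those comments. The sketch below is in Python and assumes the two header lines described in the chkconfig man page: a `# chkconfig: <runlevels or -> <start priority> <stop priority>` line and a `# description:` line. The function name `missing_chkconfig_headers` is made up for this example, and LSB-style `### BEGIN INIT INFO` blocks are deliberately not handled.

```python
import re
from typing import List

# Assumed header format, per the chkconfig man page:
#   # chkconfig: 2345 80 20      (runlevels, start priority, stop priority)
#   # description: some text
_CHKCONFIG_RE = re.compile(
    r"^#[ \t]*chkconfig:[ \t]*(-|\d+)[ \t]+\d+[ \t]+\d+[ \t]*$", re.MULTILINE
)
_DESCRIPTION_RE = re.compile(r"^#[ \t]*description:[ \t]*\S", re.MULTILINE)


def missing_chkconfig_headers(script_text: str) -> List[str]:
    """Return the names of the header comments that the script lacks."""
    missing = []
    if not _CHKCONFIG_RE.search(script_text):
        missing.append("chkconfig")
    if not _DESCRIPTION_RE.search(script_text):
        missing.append("description")
    return missing
```

An empty result means both comments were found. Anything else names what to add to the top of the init.d file.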


Fixing a mysterious .ebextensions command time out (AWS Elastic Beanstalk)

Posted by Jakub Holý on July 29, 2015

Our webshop, nettbutikk.netcom.no, runs on AWS Elastic Beanstalk and we use .ebextensions/ to customize the environment. I have just been trying to get Gor running on our leader production instance to replay some traffic to our staging environment so that we get much richer feedback from it. However, the container_command I used caused the instance to time out and trash the environment, against all reason. The documentation doesn’t help, and troubleshooting this is hard and time-consuming due to the lack of feedback. Luckily I have arrived at a solution.

Read the rest of this entry »


AWS: Passing private configuration to a Docker container (via S3)

Posted by Jakub Holý on July 29, 2015

Philipp Garbe describes how to pass environment variables that you want to keep private to a public Docker instance run on Amazon Web Services (Beanstalk or ECS) in his post How to Run HuBot in Docker on AWS EC2 Container Services – Part 3. The trick is:

  1. Put them into an env.sh file that you can source on S3 (and allow the appropriate EC2 IAM role to access it)
  2. As a part of your startup CMD, run aws s3 cp to fetch and then source it

Here is his example of the CMD from a Dockerfile:

CMD ["/bin/sh", "-c", "aws s3 cp --region eu-west-1 s3://your-bucket/env.sh .; . ./env.sh; bin/hubot --adapter slack"]

See the full source code in his GitHub repo. Thanks for sharing, Philipp!
