jakub holý

building the right thing, building it right, fast

Running Gor, the HTTP traffic replayer, as a service on AWS Elastic Beanstalk

2015-07-30 [Dev]Ops

Gor is a great utility for replicating (a subset of) production traffic to a staging/test environment. Running it on AWS Elastic Beanstalk (EB) has some challenges, mainly that Gor itself cannot run as a daemon and that there is no documentation or example of how to do this on EB. Well, here is a solution:

# File: .ebextensions/10gor.config
# Configure Gor to copy a sample of production HTTP traffic to staging
files:
  # Utility for daemonizing binaries such as gor; see http://libslack.org/daemon
  /opt/daemon.rpm:
    source: "https://s3-eu-west-1.amazonaws.com/elasticbeanstalk-eu-west-1-<our id>/our_fileserver/daemon-0.6.4-1.x86_64.rpm"
    authentication: S3Access # See AWS::CloudFormation::Authentication below
    owner: root
    group: root
  # daemon config so that we don't need to repeat these command line options and
  # can just use the service's name ("gor")
  # We need to intercept port 8080, not 80 (which iptables redirects to 8080)
  /etc/daemon.conf:
    # Troubleshooting tips:
    # 1) Send stderr/stdout output of both daemon and the service to a file - see the
    #    first two commented-out lines (it should also be possible to send it to syslog
    #    but that did not work for me)
    # 2) Add "foreground" to the options and start the service manually on the server
    # The last commented-out line is the syslog variant, meant to log stats to
    # /var/log/messages every 5s for troubleshooting
    content: |
      gor respawn,command=/opt/gor --input-raw :8080 --output-http 'https://our-staging.example.com|1'
      #gor output=/var/log/gor.log,respawn,command=/opt/gor --stats --output-http-stats --input-raw :8080 --output-http 'https://our-staging.example.com|1'
      #gor errlog=/var/log/daemonapp.log,dbglog=/var/log/daemonapp.log,output=/var/log/gor.log,verbose,debug,respawn,command=/opt/gor --verbose --stats --output-http-stats --input-raw :8080 --output-http 'https://our-staging.example.com|1'
      #gor respawn,output=gor.info,command=/opt/gor --stats --output-http-stats --input-raw :8080 --output-http 'https://our-staging.example.com|1'
  # HTTP traffic replicator; see https://github.com/buger/gor
  /opt/gor:
    source: "https://s3-eu-west-1.amazonaws.com/elasticbeanstalk-eu-west-1-<our id>/our_fileserver/gor"
    authentication: S3Access
    mode: "000755"
    owner: root
    group: root
  # System V service that supports chkconfig
  /etc/init.d/gor:
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/bin/sh
      ## The chkconfig <levels> <startup order> <stop order> + descr. needed to
      ## support ensureRunning
      # chkconfig: 345 92 08
      # description: Gor copies traffic to staging
      ### BEGIN INIT INFO
      # Provides: gor
      # Short-Description: Start Gor to copy traffic to staging
      ### END INIT INFO
      # See how we were called.
      case "$1" in
        start)
          /usr/local/bin/daemon --name gor
          ;;
        stop)
          /usr/local/bin/daemon --name gor --stop
          ;;
        status)
          if ! /usr/local/bin/daemon --name gor --running; then
            echo "gor is stopped"; exit 3
          fi
          ;;
        restart)
          /usr/local/bin/daemon --name gor --stop
          /usr/local/bin/daemon --name gor
          ;;
        *)
          echo "Usage: $0 {start|stop|status|restart}"
          exit 2
      esac
      exit 0
commands:
  "Install daemon":
    command: rpm -i /opt/daemon.rpm
    test: /bin/sh -c "! rpm -q daemon" # It seems 'test: ! rpm -q daemon' ignores the !
services:
  sysvinit:
    gor:
      enabled: true
      ensureRunning: true
# Allow files: to access our S3 bucket
# See https://forums.aws.amazon.com/thread.jspa?messageID=557993
# BEWARE: We have to explicitly allow access to any subdirectory by
# editing the S3 bucket's policy and adding the subdir to the allowed Resources
Resources:
  AWSEBAutoScalingGroup:
    Metadata:
      AWS::CloudFormation::Authentication:
        "S3Access": # reference this in the "authentication" property
          type: S3
          roleName: aws-elasticbeanstalk-ec2-role
          buckets: elasticbeanstalk-eu-west-1-<our id>
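
To make the exit codes of the status branch in the init script above easier to see, here is a minimal sketch that can be run anywhere. It is not the init script itself: the real /usr/local/bin/daemon binary is replaced by a stub function, and both daemon_running and the GOR_ALIVE toggle are made-up names for this demo.

```shell
# Stub (assumption): replaces "/usr/local/bin/daemon --name gor --running",
# which exits with 0 only while the named daemon is alive.
# GOR_ALIVE is a made-up toggle for this demo.
daemon_running() { [ "$GOR_ALIVE" = "1" ]; }

# Same logic as the status) branch of /etc/init.d/gor, wrapped in a function.
gor_status() {
  if ! daemon_running; then
    echo "gor is stopped"
    return 3   # LSB init-script convention: 3 = program is not running
  fi
  return 0
}

GOR_ALIVE=0
gor_status || echo "status exit code: $?"   # prints "gor is stopped", then "status exit code: 3"

GOR_ALIVE=1
gor_status && echo "status exit code: $?"   # prints "status exit code: 0"
```

On a real instance the equivalent check would be running the script's status action (for example via the service command) and looking at its exit code.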


Highlights

  1. We want to run Gor as a service (instead of just a background + nohup command) because that is the only way to ensure it will keep running even as EB adds and removes nodes.
  2. Use the daemon utility to run Gor as a daemon (which it does not support out of the box). Daemon is small and works well. It will ignore gor's output and automatically restart it if it dies.
  3. Create an init.d script for gor. To support the ensureRunning setting of ebextensions, the script has to support chkconfig.
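  4. The test for whether daemon is installed cannot be just ! rpm -q daemon but needs to be /bin/sh -c "! rpm -q daemon"; the test property seems to require a single command to execute (another possible cause is that YAML parses a leading ! in an unquoted value as a tag indicator rather than as part of the command)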
  5. The files are downloaded from a private S3 bucket (which needs to be accessible by the EC2 role used and have the policy to allow access to the files in question)
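
Highlight 4 comes down to exit statuses: the command only runs when its test exits with 0. As a rough sketch that runs on any machine, the snippet below uses false and true as stand-ins (an assumption for illustration) for rpm -q daemon when the package is missing or installed; would_install is a hypothetical helper name, not part of ebextensions.

```shell
# would_install CMD: print "yes" if a test of the form /bin/sh -c "! CMD"
# exits with 0 (so the guarded command would run), otherwise print "no".
# Stand-ins (assumptions): `false` mimics `rpm -q daemon` failing because the
# package is missing; `true` mimics it succeeding because it is installed.
would_install() {
  if /bin/sh -c "! $1"; then
    echo yes
  else
    echo no
  fi
}

would_install false   # package missing   -> prints "yes"
would_install true    # package installed -> prints "no"
```

Wrapping the negation in /bin/sh -c keeps the ! inside a quoted string, so it reaches the shell unchanged.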


Side note

I originally wanted to run Gor on a single node only, using a container_command with leader_only to enable it there. However, that does not work because container commands run only when the app is deployed, not when autoscaling adds new nodes (for example after killing some old ones, typically starting with the leader). The new nodes are more or less cloned from the existing ones, so they have the package, service, etc., but the command does not run there. And there is no "leader" concept outside of the EB deployment process. So the only option is to run Gor on all the nodes.