No Site VPN for Cloud Data Centers

A site-to-site VPN is the standard solution for connecting several physical data center locations. When moving to the Cloud, the first idea that comes to mind is to also connect the Cloud "data center" to the existing physical data centers with a site VPN. All Cloud providers offer such a feature.

But is such a VPN infrastructure also a "good idea"? Will it help us or hinder us in the future?

I actually believe that with many data centers a site VPN infrastructure is a dangerous tool. On the plus side it is very convenient to have and to set up, and it simplifies a lot of things. On the other side it makes it very easy to build a world-wide mesh of dependencies where a VPN failure can severely inhibit data center operations or even take down services. It also lures everybody into creating undocumented backend connections between services.

The core problem is, in my opinion, one of scale. Having a small number (3 to 5) of locations is fundamentally different from having 50 or more. With 3 locations it is still feasible to build a mesh layout, each location talking to every other. With 5 locations a mesh already needs 10 connections, which starts to be "much", and with 50 locations it needs 1225 connections. A star layout avoids that growth but always has a central bottleneck.
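The mesh arithmetic above is just n·(n−1)/2 connections for n locations; a quick sketch to check the numbers:

```python
def mesh_connections(n: int) -> int:
    """Connections needed for a full mesh of n sites: each pair once."""
    return n * (n - 1) // 2

for sites in (3, 5, 50):
    print(sites, "locations need", mesh_connections(sites), "connections")
# 3 locations need 3, 5 need 10, 50 need 1225
```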

With the move from the world of (few) physical data centers to cloud operations it is quite common to have many data centers in the cloud. For example, it is "best practice" to use many accounts to separate development and production or to isolate applications or teams (see Yavor Atanasov from the BBC explaining this at Velocity Europe 2014). What is not so obvious from afar is that each cloud account is actually a separate data center of its own! Having many teams can therefore quickly lead to having many accounts; I have talked to many companies who have between 30 and 100 cloud accounts!

Another problem is that all the data centers need different IP ranges, plus another (small) IP range for each connection, and so on. All of this adds up to a lot of address bookkeeping.
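To illustrate the bookkeeping, here is a minimal sketch using Python's `ipaddress` module. The supernets (10.0.0.0/8, 172.16.0.0/12) and subnet sizes (/16 per site, /30 per tunnel) are assumed example values, not a recommendation:

```python
from itertools import islice
import ipaddress

sites = 50
tunnels = sites * (sites - 1) // 2  # 1225 point-to-point VPN links in a full mesh

# One distinct /16 per data center, carved out of 10.0.0.0/8 ...
site_nets = list(islice(
    ipaddress.ip_network("10.0.0.0/8").subnets(new_prefix=16), sites))

# ... plus a tiny /30 transfer network for every single VPN tunnel.
link_nets = list(islice(
    ipaddress.ip_network("172.16.0.0/12").subnets(new_prefix=30), tunnels))

print(len(site_nets), "site ranges,", len(link_nets), "tunnel ranges to track")
```

Even this toy plan already leaves 1275 ranges to document and keep collision-free.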

As an alternative I suggest not using site VPNs at all. Instead, each data center should be treated as a fully independent data center. To make this fact transparent, I would even suggest using the same IP range in all data centers!

I see many advantages to such a setup:

  • All connections between services in different data centers must use the public network.
    As a result, all connections have to be properly secured, audited and supervised. They have to be part of the production environment and will be fully documented. If a connection for one application fails, other applications are not impaired.
  • Standard resources can be offered under standard IPs to simplify bootstrapping (e.g. a fixed well-known IP is always the outgoing proxy or DNS server or some other vital service).
  • If development and production are in separate "data centers" (e.g. cloud accounts), then they can be much more similar.
  • A security breach in one account does not easily spill over into other accounts.
  • Operations and setup of the cloud accounts and physical data centers become much simpler (fewer external dependencies).
  • It will be easy to build up systemic resilience as a failure in one data center or account does not easily affect other data centers or accounts.
  • Admins will be treated as road warriors and connect to each data center independently as needed.

Currently I am looking for arguments for and against this idea, please share your thoughts!


PPD - Pimp your Printer Driver

I recently got myself a new printer, the HP Officejet Pro X476dw. A very nice and powerful machine: it can not only print double-sided but also scan, copy and send faxes.

And of course it has very good Linux support, thanks to the HP Linux Printing and Imaging Open Source project. On my Ubuntu 14.10 desktop everything needed to use the printer is already included.

However, the first printouts were very disappointing. They looked coarse and ugly, much worse than prints from my old HP LaserJet 6 printer. After overcoming the initial shock I realized that only prints from my Ubuntu desktop were bad, while prints via Google Cloud Print were crisp and good looking.

So obviously something had to be wrong with the printer driver on Ubuntu!

After some debugging I was able to trace this down to the fact that by default CUPS converts the print job to 300 dpi PostScript before handing it to the HP driver, as the CUPS logs show:

D [Job 261] Printer make and model: HP HP Officejet Pro X476dw MFP
D [Job 261] Running command line for pstops: pstops 261 schlomo hebrew-test.pdf 1 'finishings=3 media=iso_a4_210x297mm output-bin=face-down print-color-mode=color print-quality=4 sides=one-sided job-uuid=urn:uuid:c1da9224-d10b-3c2f-6a99-487121b8864c job-originating-host-name=localhost time-at-creation=1414128121 time-at-processing=1414128121 Duplex=None PageSize=A4'
D [Job 261] No resolution information found in the PPD file.
D [Job 261] Using image rendering resolution 300 dpi
D [Job 261] Running command line for gs: gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=ps2write -sOUTPUTFILE=%stdout -dLanguageLevel=3 -r300 -dCompressFonts=false -dNoT3CCITT -dNOINTERPOLATE -c 'save pop' -f /var/spool/cups/tmp/066575456f596

I was able to fix the problem by adding this resolution setting to the PostScript Printer Definitions (PPD):

*DefaultResolution: 600x600dpi

As a result the print job is converted at 600 dpi instead of 300 dpi which leads to the expected crisp result:

D [Job 262] Printer make and model: HP HP Officejet Pro X476dw MFP
D [Job 262] Running command line for pstops: pstops 262 schlomo hebrew-test.pdf 1 'Duplex=None finishings=3 media=iso_a4_210x297mm output-bin=face-down print-color-mode=color print-quality=4 sides=two-sided-long-edge job-uuid=urn:uuid:83e69459-c350-37e5-417d-9ca00f8c6bd9 job-originating-host-name=localhost time-at-creation=1414128153 time-at-processing=1414128153 PageSize=A4'
D [Job 262] Using image rendering resolution 600 dpi
D [Job 262] Running command line for gs: gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=ps2write -sOUTPUTFILE=%stdout -dLanguageLevel=3 -r600 -dCompressFonts=false -dNoT3CCITT -dNOINTERPOLATE -c 'save pop' -f /var/spool/cups/tmp/0666d544aec68

Isn't it really nice that one only needs a text editor to fix printer driver problems on Linux (and Mac)?
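In fact the fix is simple enough to script. Below is a rough sketch; the `set_default_resolution` helper is my own invention, and appending the keyword at the end of the file is a simplification (a real PPD edit might place it next to the other `*Default` entries):

```python
import pathlib

def set_default_resolution(ppd_path: str, resolution: str = "600x600dpi") -> None:
    """Append a *DefaultResolution entry to a PPD file if it has none."""
    ppd = pathlib.Path(ppd_path)
    lines = ppd.read_text().splitlines()
    if not any(line.startswith("*DefaultResolution:") for line in lines):
        lines.append(f"*DefaultResolution: {resolution}")
        ppd.write_text("\n".join(lines) + "\n")
```

Running it twice is harmless: the check makes the edit idempotent.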

On github.com/schlomo/HP_Officejet_Pro_X476dw I maintain an improved version of the PPD file with the following features:
  • Set printing resolution to 600dpi
  • Use printer for multiple copies, not CUPS
  • Default to duplex printing
The corresponding Launchpad bug is still open and unresolved. Apparently it is not simple to submit improvements upstream.


Comparing Amazon Linux

Since ImmobilienScout24 decided to migrate to a public cloud I have been busy looking at various cloud offerings in detail. Amazon Web Services (AWS) has an interesting special feature: Amazon Linux is a fully supported, "RHEL-like", RPM-based Linux distribution.

While not being a true Red Hat Enterprise Linux clone like CentOS or Scientific Linux (the standard OS in the ImmobilienScout24 data centers), it is derived from some Fedora version and comes with a nice choice of current software. To me it feels like "RHEL+" because so far all our internal stuff has worked well, yet a lot of software packages are much newer than on RHEL 6 or RHEL 7. The 2014.09 release updated many components to very recent versions.

On the other hand, we also found packages missing from Amazon Linux, most notably desktop-file-utils. This package is required to install Oracle Java RPMs. I found a thread about this on the AWS Forums and added a request for desktop-file-utils in September 2014. In the meantime the package was added to Amazon Linux, although the forum thread does not mention it (yet).

To find out in advance if there are any other surprises waiting for us on Amazon Linux, I created a little tool to collect RPM Provides lists from different Linux distros on AWS. github.com/ImmobilienScout24/aws-distro-rpm-comparison takes a VPC and one or several AMI IDs and spins up an EC2 instance for each to collect the list of all the RPM provides from all available YUM repositories.

$ ./aws-distro-rpm-comparison.py -h
Create EC2 instances with different Linux distros and compare
the available RPMs on them.

  aws-distro-rpm-comparison.py [options] VPC_ID USER@AMI_ID...

  VPC_ID        VPC_ID to use
  USER@AMI_ID   AMI IDs to use with their respective SSH user

  -h --help            show this help message and exit
  --version            show version and exit
  --region=REGION      use this region [default: eu-west-1]
  --type=TYPE          EC2 instance type [default: t2.micro]
  --defaultuser=USER   Default user to use for USER@AMI_ID [default: ec2-user]
  --verbose            Verbose logging
  --debug              Debug logging
  --interactive        Dump SSH key and IPs and wait before removing EC2 instances


* The AMI_IDs and the EC2 instance type must match (HVM or PV)

This list can then be used to compare against the RPM Requires in our data centers. To get a better picture of Amazon Linux I created such lists for Red Hat Enterprise Linux 6 and 7, CentOS 6 and Amazon Linux in the GitHub project under results. For online viewing I created a Google Spreadsheet with these lists; you can copy and modify it for your own needs.

Open the results in Google Drive Sheets.
At first glance it is very difficult to say how compatible Amazon Linux really is, as a lot of RPM Provides are missing on both sides. But these lists should prove useful for analyzing our existing servers and understanding whether they would also work on Amazon Linux. The tools can also be used for any kind of RPM distro comparison.

In any case, Amazon Linux is exactly what RHEL cannot be: a stable RPM-based distribution with a lot of recent software and regular updates.