2015-02-27

mod_remoteip backport for Apache HTTPD 2.2

Apache HTTPD 2.4 has a very useful new feature for large deployments: replacing the remote IP of a request with an address taken from a request header, e.g. one set by a load balancer or reverse proxy. Users of Apache HTTPD 2.2, as found on RHEL 6, can now use the backport found at https://github.com/ImmobilienScout24/mod_remoteip-httpd22.

I pulled this backport together from various sources found on the Internet and "it seems to work". Working with C code (which I had not done for 14 years!) taught me again the value of test-driven development and modern programming languages. Unfortunately I still can't explain some of the changes without a lot of thinking.
You can easily build an RPM from the code on GitHub; the commit history shows the steps I had to undertake to get there. Configuration is as simple as this:


LoadModule remoteip_module modules/mod_remoteip.so
RemoteIPHeader X-Forwarded-For
RemoteIPInternalProxy 10.100.15.33

with the result that a reverse proxy at 10.100.15.33 may set the X-Forwarded-For header. Apache configuration such as Allow from then uses the real client IP even though the client does not talk directly to the web server.
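
As an illustration (the location and network below are made up), an access rule like this one now matches the client IP taken from X-Forwarded-For instead of the IP of the reverse proxy:

<Location /intranet>
    Order deny,allow
    Deny from all
    # with mod_remoteip the client IP from X-Forwarded-For is checked here
    Allow from 192.168.0.0/16
</Location>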

2015-02-18

Simplified DEB Repository



Two years ago I wrote about creating a repository for DEB packages with the help of reprepro. Since then I have suffered from the complexity of the process and the cumbersome reprepro usage:
  • Complicated to add support for a new Ubuntu version, which happens every 6 months
  • Need to specifically handle new architectures
  • I actually don't need most of the features that reprepro supports, e.g. managing multiple repos in one or package staging
This week I realized that there is a much simpler solution for my needs: apt-ftparchive. This tool creates a trivial repo with just enough information to make apt happy. For my purposes that is enough. All I actually want from a DEB repo is:
  • Work well with 50-500 packages
  • Easy to add new Debian/Ubuntu/Raspbian versions or architectures
  • Simple enough for me to understand
  • GPG signatures
It turns out that the trivial repo format is enough for that. It even makes adding new distro versions simpler, because the repo does not contain any information about distro versions at all: the repo does not change for new distros, and I don't need to change the sources.list line after upgrades.

The following script maintains the repo. It will copy DEB packages given as command-line parameters into the repository or simply recreate the metadata. The script uses gpg to sign the repo with your default GPG key. If you want to maintain a GPG key as part of the repo then you can create a key sub-directory which will be used as the GPG home directory.
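
A minimal sketch of such a script (the use of apt-ftparchive -c together with the config_for_release file mentioned below is my assumption) could look like this:

#!/bin/bash
# Sketch of a trivial-repo update script, to be run in the top-level repository directory.
# Usage: ./update-repo.sh [package.deb ...]
set -e

# use a GPG key kept inside the repo if a "key" sub-directory exists
if [ -d key ]; then
    export GNUPGHOME="$PWD/key"
fi

# copy new DEB packages into the repository
if [ "$#" -gt 0 ]; then
    cp -v "$@" .
fi

# recreate the trivial repo metadata
apt-ftparchive packages . > Packages
gzip -9c Packages > Packages.gz
apt-ftparchive -c=config_for_release release . > Release

# sign the repository with the (default) GPG key
gpg --yes --armor --detach-sign --output Release.gpg Release
gpg --yes --clearsign --output InRelease Release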


The script expects a config_for_release file in the repo that contains some extra information:
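
The exact content of that file is not reproduced here; assuming it is an apt configuration fragment passed to apt-ftparchive with -c (as in the sketch above), it could look roughly like this:

APT::FTPArchive::Release {
  Origin "my-repo";
  Label "my-repo";
  Description "Simple trivial DEB repository";
};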

To add this repo to your system add something like this to your /etc/apt/sources.list:
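
For a trivial (flat) repository the entry simply points at the repo directory, for example (the URL is a placeholder):

deb http://repo.example.com/deb ./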

2015-02-07

Ubuntu Guest Session Lockdown

The guest session is a very important feature of Ubuntu Linux. It makes it very simple to give other people temporary computer or Internet access without compromising the permanent users of the computer.

Unfortunately the separation is not perfect: the guest user can actually modify critical configuration settings on the computer and even access the files of other users if they don't take precautions.

The following scripts and files help to lock down the guest session so that no harm can be done.

How It Works

The guest session is actually a feature of the LightDM display manager that is used in Ubuntu and Xubuntu. It is enabled by default.

When a user chooses a guest session the following happens:
  1. LightDM uses the /usr/sbin/guest-account script to set up a temporary guest account. The home directory is created in memory (via tmpfs) and can occupy at most half the RAM of the computer.
    Optionally, /etc/guest-session/prefs.sh is included as root to further customize the guest account.
  2. LightDM starts a new session as this account.
  3. The guest account runs the /usr/lib/lightdm/guest-session-auto.sh start script.
    Optionally, /etc/guest-session/auto.sh is included to run other start tasks as the guest user.
The guest session behaves like a regular Ubuntu session including full access to removable storage media, printers etc.

Upon session termination LightDM uses the /usr/sbin/guest-account script to remove the temporary account and its home directory.

Securing Other Users

The default umask is 022 so that by default all user files are world-readable, which also makes them readable by the guest user. For more privacy the umask should be set to 007, for example in /etc/profile.d/secure-umask.sh:
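
The file itself only needs a single line:

# /etc/profile.d/secure-umask.sh: keep files readable for the owner and group only
umask 007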

Preventing System Modifications

Much worse is the fact that by default every user - including the guest account - can modify a lot of system settings like the network configuration. The following PolicyKit Local Authority policy prevents the guest account from changing anything system related. It still permits the user to handle removable media and even to use encrypted disks. It should be installed as /etc/polkit-1/localauthority/90-mandatory.d/guest-lockdown.pkla:
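
The original policy file is not reproduced here; a rough sketch (the guest account glob, the action names and the entry ordering are assumptions, see pklocalauthority(8)) could look like this:

[Deny system-wide changes for guest accounts]
Identity=unix-user:guest*
Action=org.freedesktop.*
ResultAny=no
ResultInactive=no
ResultActive=no

[Still allow guests to use removable and encrypted media]
Identity=unix-user:guest*
Action=org.freedesktop.udisks2.filesystem-mount;org.freedesktop.udisks2.encrypted-unlock;org.freedesktop.udisks2.eject-media
ResultAny=no
ResultInactive=no
ResultActive=yes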

Preventing Suspend and Hibernate

LightDM has IMHO a bug: if one opens a guest session and then locks it, it is impossible to get back to that guest session. Choosing "guest session" from the LightDM unlock screen will actually create a new guest session instead of taking the user back to the existing one. For this reason the customization below disables session locking for the guest altogether.

The same happens after a system suspend or hibernate because that also locks all sessions and shows the LightDM unlock login screen. The only "safe" solution is to disable suspend and hibernate for all users with this policy. It should go to /etc/polkit-1/localauthority/90-mandatory.d/disable-suspend-and-hibernate.pkla:
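
A sketch of such a policy, covering both the older UPower and the newer logind action names, could look like this:

[Disable suspend and hibernate for all users]
Identity=unix-user:*
Action=org.freedesktop.upower.suspend;org.freedesktop.upower.hibernate;org.freedesktop.login1.suspend;org.freedesktop.login1.hibernate;org.freedesktop.login1.suspend-multiple-sessions;org.freedesktop.login1.hibernate-multiple-sessions
ResultAny=no
ResultInactive=no
ResultActive=no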

Customizing the Guest Session

To customize the guest session the following /etc/guest-session/prefs.sh and /etc/guest-session/auto.sh scripts are recommended. The prefs.sh script is run as root before switching to the guest account. It creates a custom Google Chrome icon without Gnome Keyring integration (which would otherwise ask for a login password that is not set) and prevents various autostart programs from running, as they are not needed for non-admin users.
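
The original prefs.sh is not reproduced here; a hypothetical sketch (the autostart program names are examples, and it assumes that the guest home directory is available as $HOME when /usr/sbin/guest-account sources the script - check the script on your release) could look like this:

#!/bin/sh
# /etc/guest-session/prefs.sh - sourced as root while the guest account is being set up

# Google Chrome launcher that skips GNOME Keyring (no login password is set for the guest)
mkdir -p "$HOME/.local/share/applications"
cat > "$HOME/.local/share/applications/google-chrome.desktop" <<EOF
[Desktop Entry]
Type=Application
Name=Google Chrome
Exec=/usr/bin/google-chrome-stable --password-store=basic %U
Icon=google-chrome
Categories=Network;WebBrowser;
EOF

# prevent some autostart programs from running, they are not needed for non-admin users
mkdir -p "$HOME/.config/autostart"
for app in update-notifier deja-dup-monitor; do
    if [ -f "/etc/xdg/autostart/$app.desktop" ]; then
        cp "/etc/xdg/autostart/$app.desktop" "$HOME/.config/autostart/"
        echo "X-GNOME-Autostart-enabled=false" >> "$HOME/.config/autostart/$app.desktop"
    fi
done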

This auto.sh script is run at the start of the guest session under the guest account and configures the session behaviour. Most important is disabling screen locking, because it is impossible to return to a locked guest session. I decided to also completely disable the screen saver since I expect guest users to terminate their session and not let it run for a long time. (A combined sketch of such a script follows after the list below.)

The other customizations are mostly for convenience or to set useful defaults:

  • Disable the shutdown and restart menu items.
  • Configure the keyboard layout with 2 languages (you probably want to adjust that as needed).
  • Show full date and time in the top panel.
  • Set the order of launcher icons in the Unity launcher.
It is fairly easy to find out other settings with gsettings list-recursively if required.
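
A combined sketch of such an auto.sh (the gsettings schemas and keys are taken from Ubuntu 14.x with Unity and may differ on other releases; the launcher favorites are just an example) could look like this:

#!/bin/sh
# /etc/guest-session/auto.sh - run as the guest user at session start

# never lock the guest session, a locked guest session cannot be re-entered
gsettings set org.gnome.desktop.lockdown disable-lock-screen true
gsettings set org.gnome.desktop.screensaver lock-enabled false
# disable the screen saver and idle blanking
gsettings set org.gnome.desktop.screensaver idle-activation-enabled false
gsettings set org.gnome.desktop.session idle-delay 0

# hide the shutdown and restart menu entries
gsettings set com.canonical.indicator.session suppress-shutdown-menuitem true
gsettings set com.canonical.indicator.session suppress-restart-menuitem true

# keyboard layout with 2 languages (adjust as needed)
gsettings set org.gnome.desktop.input-sources sources "[('xkb', 'us'), ('xkb', 'de')]"

# show full date and time in the top panel
gsettings set com.canonical.indicator.datetime show-date true
gsettings set com.canonical.indicator.datetime show-day true

# order of the icons in the Unity launcher
gsettings set com.canonical.Unity.Launcher favorites \
    "['application://firefox.desktop', 'application://nautilus.desktop', 'unity://devices']"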

With these additions the guest session can be a very useful productivity tool. At our school the students use only the guest session on the Ubuntu computers. They quickly got used to storing everything on their USB thumb drives, and the teachers enjoy "unbreakable" computers that work reliably.

2015-01-29

No Site VPN for Cloud Data Centers

A site to site VPN is the standard solution for connecting several physical data center locations. Going to the Cloud, the first idea that comes to mind is to also connect the Cloud "data center" with a site VPN to the existing physical data centers. All Cloud providers offer such a feature.

But is such a VPN infrastructure also a "good idea"? Will it help us or hinder us in the future?

I actually believe that with many data centers a site VPN infrastructure becomes a dangerous tool. On the one hand it is very convenient to have and to set up, and it simplifies a lot of things. On the other hand it is also very easy to build a world-wide mesh of dependencies where a VPN failure can severely inhibit data center operations or even take down services. It also lures everybody into creating undocumented backend connections between services.

The core problem is in my opinion one of scale. Having a small number (3 to 5) of locations is fundamentally different from having 50 or more. With 3 locations it is still feasible to build a mesh layout with each location talking to every other one. With 5 locations a full mesh already needs 10 connections (n×(n-1)/2 connections for n locations), which starts to be "a lot". But a star layout always has a central bottleneck. With 50 locations a full mesh already needs 1225 connections.

With the move from the world of (few) physical data centers to the world of cloud operations it is quite common to have many data centers in the cloud. For example, it is "best practice" to use many accounts to separate development and production or to isolate applications and teams (see Yavor Atanasov from the BBC explaining this at Velocity Europe 2014). What is not so obvious from afar is that each cloud account is actually a separate data center of its own! So having many teams can quickly lead to having many accounts; I have talked to many companies that have between 30 and 100 cloud accounts!

Another problem is that all the data centers would need different IP ranges, plus another (small) IP range for each VPN connection, and so on. All of this is a lot to handle.

As an alternative I suggest not using site VPNs at all. Instead, each data center should be treated as a fully independent data center. To make this fact transparent, I would even suggest using the same IP range in all data centers!

I see many advantages to such a setup:

  • All connections between services in different data centers must use the public network.
    As a result all connections have to be properly secured, audited and supervised. They have to be part of the production environment and will be fully documented. If a connection for one application fails, other applications are not impaired.
  • Standard resources can be offered under standard IPs to simplify bootstrapping (e.g. 192.168.10.10 is always the outgoing proxy, the DNS server or some other vital service).
  • If development and production are in separate "data centers" (e.g. cloud accounts), then they can be much more similar.
  • A security breach in one account does not easily spill over into other accounts.
  • Operations and setup of the cloud accounts and physical data centers become much simpler (fewer external dependencies).
  • It will be easy to build up systemic resilience as a failure in one data center or account does not easily affect other data centers or accounts.
  • Admins will be treated as road warriors and connect to each data center independently as needed.
Currently I am looking for arguments for and against this idea, so please share your thoughts!



2015-01-22

PPD - Pimp your Printer Driver

I recently got myself a new printer, the HP Officejet Pro X476dw. A very nice and powerful machine: it can not only print double-sided but also scan, copy and send faxes.

And of course it has very good Linux support, thanks to the HP Linux Printing and Imaging Open Source project. On my Ubuntu 14.10 desktop everything needed to use the printer is already included.

However, the first printouts were very disappointing. They looked coarse and ugly, much worse than prints from my old HP LaserJet 6 printer. After overcoming the initial shock I realized that only prints from my Ubuntu desktop were bad, while prints via Google Cloud Print were crisp and good looking.

So obviously something had to be wrong with the printer driver on Ubuntu!

After some debugging I was able to trace this down to the fact that by default CUPS converts the print job to 300 dpi PostScript before handing it to the hp driver, as the CUPS logs show:

D [Job 261] Printer make and model: HP HP Officejet Pro X476dw MFP
D [Job 261] Running command line for pstops: pstops 261 schlomo hebrew-test.pdf 1 'finishings=3 media=iso_a4_210x297mm output-bin=face-down print-color-mode=color print-quality=4 sides=one-sided job-uuid=urn:uuid:c1da9224-d10b-3c2f-6a99-487121b8864c job-originating-host-name=localhost time-at-creation=1414128121 time-at-processing=1414128121 Duplex=None PageSize=A4'
D [Job 261] No resolution information found in the PPD file.
D [Job 261] Using image rendering resolution 300 dpi
D [Job 261] Running command line for gs: gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=ps2write -sOUTPUTFILE=%stdout -dLanguageLevel=3 -r300 -dCompressFonts=false -dNoT3CCITT -dNOINTERPOLATE -c 'save pop' -f /var/spool/cups/tmp/066575456f596

I was able to fix the problem by adding this resolution setting to the PostScript Printer Definitions (PPD):

*DefaultResolution: 600x600dpi

As a result the print job is converted at 600 dpi instead of 300 dpi which leads to the expected crisp result:

D [Job 262] Printer make and model: HP HP Officejet Pro X476dw MFP
D [Job 262] Running command line for pstops: pstops 262 schlomo hebrew-test.pdf 1 'Duplex=None finishings=3 media=iso_a4_210x297mm output-bin=face-down print-color-mode=color print-quality=4 sides=two-sided-long-edge job-uuid=urn:uuid:83e69459-c350-37e5-417d-9ca00f8c6bd9 job-originating-host-name=localhost time-at-creation=1414128153 time-at-processing=1414128153 PageSize=A4'
D [Job 262] Using image rendering resolution 600 dpi
D [Job 262] Running command line for gs: gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=ps2write -sOUTPUTFILE=%stdout -dLanguageLevel=3 -r600 -dCompressFonts=false -dNoT3CCITT -dNOINTERPOLATE -c 'save pop' -f /var/spool/cups/tmp/0666d544aec68

Isn't it really nice that one only needs a text editor to fix printer driver problems on Linux (and Mac)?
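
For reference, the PPD of an installed queue lives under /etc/cups/ppd; applying the fix there could look like this (the queue name is an assumption, use the name of your own print queue):

sudo nano /etc/cups/ppd/HP_Officejet_Pro_X476dw.ppd   # add the *DefaultResolution line
sudo service cups restart                             # make CUPS re-read the changed PPD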

On github.com/schlomo/HP_Officejet_Pro_X476dw I maintain an improved version of the PPD file with the following features:
  • Set printing resolution to 600dpi
  • Use printer for multiple copies, not CUPS
  • Default to duplex printing
The corresponding Launchpad bug is still open and unresolved. Apparently it is not simple to submit improvements upstream.

2015-01-16

Comparing Amazon Linux

Since ImmobilienScout24 decided to migrate to a public cloud I have been busy looking at various cloud offerings in detail. Amazon Web Services (AWS) has an interesting special feature: Amazon Linux is a fully supported, "RHEL-like", RPM-based Linux distribution.

While not being a true Red Hat Enterprise Linux clone like CentOS or Scientific Linux (which is the standard OS in the ImmobilienScout24 data centers), it is derived from a Fedora version and comes with a nice selection of current software. To me it feels like "RHEL+" because so far all our internal stuff has worked well, but many software packages are much newer than on RHEL 6 or RHEL 7. The 2014.09 release updated a lot of components to very recent versions.

On the other hand, we also found packages missing from Amazon Linux, most notably desktop-file-utils. This package is required to install Oracle Java RPMs. I found a thread about this on the AWS Forums and added a request for desktop-file-utils in September 2014. In the meantime the package has been added to Amazon Linux, although the forum thread does not mention it (yet).

To find out in advance if there are any other surprises waiting for us on Amazon Linux, I created a little tool to collect RPM Provides lists from different Linux distros on AWS. github.com/ImmobilienScout24/aws-distro-rpm-comparison takes a VPC and one or several AMI IDs and spins up an EC2 instance for each of them to collect the list of all RPM Provides from all available YUM repositories.

$ ./aws-distro-rpm-comparison.py -h
Create EC2 instances with different Linux distros and compare
the available RPMs on them.

Usage:
  aws-distro-rpm-comparions.py [options] VPC_ID USER@AMI_ID...

Arguments:
  VPC_ID        VPC_ID to use
  USER@AMI_ID   AMI IDs to use with their respective SSH user

Options:
  -h --help            show this help message and exit
  --version            show version and exit
  --region=REGION      use this region [default: eu-west-1]
  --type=TYPE          EC2 instance type [default: t2.micro]
  --defaultuser=USER   Default user to use for USER@AMI_ID [default: ec2-user]
  --verbose            Verbose logging
  --debug              Debug logging
  --interactive        Dump SSH Key and IPs and wait for before removing EC2 instances

Notes:

* The AMI_IDs and the EC2 instance type must match (HVM or PV)
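
A hypothetical invocation (the VPC and AMI IDs are placeholders) could look like this:

./aws-distro-rpm-comparison.py --region eu-west-1 --type t2.micro \
    vpc-12345678 ec2-user@ami-11111111 ec2-user@ami-22222222 root@ami-33333333
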
This list can then be compared with the RPM Requires in our data centers. To get a better picture of Amazon Linux I created such lists for Red Hat Enterprise Linux 6 and 7, CentOS 6 and Amazon Linux in the GitHub project under results. For online viewing I created a Google Spreadsheet with these lists; you can copy it and modify it for your own needs.

Open the results in Google Drive Sheets.
At first glance it seems very difficult to say how compatible Amazon Linux really is, as there are a lot of RPM Provides missing on both sides. But these lists should prove useful for analyzing our existing servers and understanding whether they would also work on Amazon Linux. The tools can also be used for any other kind of RPM distro comparison.

In any case, Amazon Linux is exactly what RHEL cannot be: a stable RPM-based distribution with a lot of recent software and regular updates.

2014-10-26

DevOpsDays Berlin 2014

Update: Read my (German) conference report on heise developer.

Last week I was at the DevOps Days Berlin 2014, this time at the Kalkscheune, a much better location than the Urania from last year. With 250 people the conference was not too full, and the location was well equipped to handle that number of attendees.

Proving DevOps to be more about people and culture, most talks were not very technical but emphasized the need to take all the people along on the journey to DevOps.

A technical bonus was the talk by Simon Eskildsen about "Docker at Shopify", which was the first time I heard about a successful Docker implementation in production.

Always good to know is the difference between effective and efficient, as explained by Alex Schwartz in "DevOps means effectiveness first". DevOps is actually a way to optimize for effectiveness before optimizing for efficiency.

Microsoft and SAP gave talks about DevOps in their world - quite impressive to see DevOps becoming mainstream.

My own contribution was an ignite talk about ImmobilienScout24 and the Cloud:


And I am also a certified DevOps now:

2014-07-27

EuroPython 2014

One full week of Python power is almost more than one can take, but missing it would have been even worse.

This was my first EuroPython, and with 1200 participants it was a big upgrade compared to the two previous PyCon.DE events in which I participated. The location (Berlin Congress Center) deserves kudos, along with the perfect organization.

The WiFi worked really well (except for a WAN problem on Tuesday, which was fixed quickly) and everybody loved the catering. They even had kosher, halal and vegan food (preordered), which is highly unusual for German conferences. Most amazing was the video crew, who managed to upload all videos within about one hour of a talk being given.

I managed to give three talks:

  • DevOps Risk Mitigation
    How we use Test Driven Infrastructure at ImmobilienScout24 as part of our general automation to reduce the risk of giving everybody access everywhere. (Access Slides or Watch Video)
  • YAML Reader
    Lightning Talk about the yamlreader Python library, which provides a wrapper for the yaml.safe_load function that merges several YAML files. yamlreader is the base for most of the modularized configuration in our Python software. (Access Slides or Watch Video)
  • Open Source Sponsoring
    About why your company should invest into Open Source projects instead of into proprietary software. I did not plan this talk, but a speaker did not show up and I jumped in. (Access Slides or Watch Video)
I very much enjoyed the international audience at the conference and hope to be able to attend next year's event as well.




2014-07-01

iPXE - The Versatile Boot Loader

iPXE is a lesser-known Open Source PXE boot loader which offers many interesting features, for example booting over HTTP, embedded scripting and interactive boot menus.

Talk & Article

Since iPXE plays a role in the ImmobilienScout24 boot automation, I gave a talk about it at LinuxTag 2014. The talk is half an hour long and gives a quick introduction to iPXE. It covers building, configuration & scripting and shows how to develop iPXE boot scripts with a very short feedback cycle.



Download the slides of the talk and the audio recording as a podcast.

At the conference the German Linux Magazin became interested in the topic and asked me to write an article about iPXE:

Der vielseitige Netzwerk-Bootloader I-PXE
Linux Magazin 08/2014


Demo Scripts

For the article I created a bunch of demo scripts that are available on Gist. To try them out follow these steps:
  1. Install QEMU, usually part of your Linux distro but also available for other platforms.
  2. Download my pre-built iPXE boot kernel ipxe.lkrn
  3. Start QEMU with ipxe.lkrn and the URL to the demo script:
    qemu -kernel ipxe.lkrn -append 'dhcp && chain http://goo.gl/j8MbXI'
  4. Try out the various options. The login will accept any password that is the reverse of the username.
This demo script looks like this:
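
The original demo script is on Gist; a minimal sketch of an iPXE menu script in the same spirit (without the username/password game from the demo) could look like this:

#!ipxe
# bring up the network and show a simple boot menu
dhcp
menu iPXE demo menu
item shell   Drop into the iPXE shell
item config  Show the iPXE configuration UI
item reboot  Reboot the machine
choose target && goto ${target}

:shell
shell

:config
config

:reboot
reboot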

And the QEMU boot looks like this:
[Screenshot: ipxe-qemu-demo2-menu.png - the iPXE demo menu in QEMU]

Try it out

Anybody struggling with PXELINUX should most definitely check out iPXE to see if it provides a better alternative for their needs.

2014-06-26

automirror - Automate Linux Screen Mirroring


I do a lot of pair working and many times I connect a large TV or projector to my laptop for others to see what I am doing.

Unfortunately the display resolution of my laptop never matches that of the other display, and Linux tends to choose 1024x768 as the highest compatible resolution. This is of course totally useless for doing any real work.

My preferred solution for this problem is to use X scaling to bridge the resolution gap between the different screens.

Since none of the regular display configuration tools support scaling, I ended up typing this line very often:

xrandr --output LVDS1 --mode 1600x900 --output HDMI3 --mode 1920x1080 --scale-from 1600x900

Eventually I got fed up and decided to automate the process. The result is automirror, a little Bash script that automatically configures all attached displays in a mirror configuration. automirror is available at https://github.com/schlomo/automirror.

Typical Use Cases

Connecting a Full HD 1920x1080 display via HDMI to my 1600x900 laptop. In this case automirror will simply configure the HDMI device with 1920x1080 and scale the 1600x900 laptop display. As a result I stay with the full resolution on my laptop display and it also looks nice on the projector.

Another case is where I work with a 1920x1200 computer monitor and add the 1920x1080 projector as a second display. Again the highest common resolution offered by both devices is 1024x768. automirror recognizes my 1920x1200 display as the primary display and scales it to 1920x1080 on the secondary display, which is barely noticeable.

It is recommended to configure a hot key to run automirror so that one can run it even if the display configuration is heavily messed up. In rare cases it might be necessary to run automirror more than once so that xrandr configures the displays correctly.
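
One way to set up such a hot key (assuming the xbindkeys package is installed and automirror is in the PATH) is an entry in ~/.xbindkeysrc that binds the script to the display-switch key found on most laptops; xbindkeys then needs to be started with the session:

# ~/.xbindkeysrc: run automirror when the display-switch key is pressed
"automirror"
    XF86Display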