2014-02-25

SSH with Personal Environment

A colleague, Eric Grehm, raised an interesting challenge:

How can he maintain his personal work environment (Vim settings, .bashrc, ...) on all servers?

Our first thought was to put this somehow into our software distribution, but we quickly realized that this would trigger needless updates on hundreds of servers. The benefit would have been that the personal work environment is already present on every server upon first access.

The next idea was to switch from a pre-installed personal environment to an on-demand solution in which the personal environment is transferred each time a remote connection (over SSH) is established.

A simple implementation would just do an scp before the ssh, but that entails two connections, which takes more time and might also bother the user with two password prompts.

Side-channel data transfer

An alternative is to piggyback the file transfer onto the regular SSH connection so that the personal environment is transferred in a side channel:
  1. On the client create a directory with the files of the personal environment that need to be distributed:
    $ tree -a personal_environment
    personal_environment
    ├── .bashrc
    └── .vimrc
  2. On the client create a TAR.GZ archive of the files that need to be transferred and store this archive base64-encoded in an environment variable (environment variables can hold up to about 127 kB):
    $ export USERHOME_DATA=$(tar -C ~/personal_environment -cz . | base64)

    I put this into a function in my .bash_profile so that the variable is refreshed on each login.
  3. Configure the SSH client to transmit this environment variable (SendEnv):
    $ cat .ssh/config
    Host dev*
            SendEnv USERHOME_DATA
  4. Configure the SSH server to accept this environment variable (AcceptEnv):
    $ sudo grep USERHOME /etc/ssh/sshd_config
    AcceptEnv USERHOME_DATA
  5. Create an sshrc script on the server that unpacks the archive from the environment variable upon login.

    (Only the last part is relevant to this topic, but if an sshrc script is provided it must also take care of xauth).
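A minimal sketch of such an sshrc script (e.g. /etc/ssh/sshrc), assuming the USERHOME_DATA variable from the steps above; the xauth handling follows the example from the sshd(8) man page, since providing an sshrc disables sshd's built-in xauth handling:

```shell
#!/bin/sh
# sshrc replaces sshd's default xauth handling, so forward the X11 cookie
# first (sshd passes "proto cookie" on stdin when X11 forwarding is used).
if [ -n "$DISPLAY" ] && read proto cookie; then
    if [ "$(echo "$DISPLAY" | cut -c1-10)" = 'localhost:' ]; then
        # X11UseLocalhost=yes: register the cookie for the unix socket display
        echo add "unix:$(echo "$DISPLAY" | cut -c11-)" "$proto" "$cookie"
    else
        echo add "$DISPLAY" "$proto" "$cookie"
    fi | xauth -q -
fi

# Unpack the personal environment into $HOME if the client sent one.
unpack_userhome_data() {
    if [ -n "$USERHOME_DATA" ]; then
        echo "$USERHOME_DATA" | base64 -d | tar -C "$HOME" -xz
    fi
}
unpack_userhome_data
```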

Benefits

This approach has several benefits:
  • True on-demand solution, personal environment is updated on each connection.
  • No extra connection required to transfer data.
  • sshrc is executed before the login shell, so that even .bashrc itself can be transferred this way.
  • Scales well with an arbitrary number of users.
  • Scales well with frequent changes to the personal work environment.
The disadvantages are that the SSH configuration must be extended and that the transferable data is limited to about 127 kB compressed and encoded, which I actually see as a benefit because it prevents abuse.

For me the benefits far outweigh the drawbacks, and I don't need to transfer that many files anyway. This solution fulfills all my needs without putting an extra load on the servers or on our deployment infrastructure.

2014-02-13

Rough Measurement for HTTP Client Download Speed

Ever wonder if your website is slow because of the server or because of the clients?
Do you want to know how fast your clients' connections to the Internet are?
Don't want to use external tracking services, inject JavaScript, etc.?

Why not simply measure how long it takes to deliver the content from your webserver to your users? Apache and nginx both support logging the total processing time of a request with suitably high precision. That measures the time from the first byte received from the client to the last byte sent back to the client.

To try out this idea I added %D to the log format for access.log of my Apache server and wrote a little Python script to calculate the transfer speeds. With the help of the apachelog Python module parsing the Apache access.log is really simple. This module takes a log format definition as configuration and automatically breaks down a log line into the corresponding values.
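For reference, extending the Apache log format with %D (microseconds taken to serve the request) might look like this; the format name combined_time is just an illustrative choice:

```
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" %D" combined_time
CustomLog /var/log/apache2/access.log combined_time
```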

The script can be found together with a little explanation on my GitHub page: github.com/schlomo/apacheclientspeed.

It is rather rough; I am now collecting data to see how useful this information actually is. One thing I can already say: small files give very erratic results, while bigger files yield more trustworthy numbers.
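The calculation itself boils down to bytes sent divided by the %D time. A rough awk sketch of the same idea, assuming a combined log format with %D appended as the last field (the field positions, bytes sent in $10 and request path in $7, are assumptions about the log layout; the sample line stands in for a real access.log):

```shell
# One sample log line: 1 MiB sent in 1 second (1000000 microseconds).
line='1.2.3.4 - - [25/Feb/2014:10:00:00 +0100] "GET /f.mp4 HTTP/1.1" 200 1048576 "-" "-" 1000000'

# Speed per request: (bytes sent / 1024) / (microseconds / 1000000) = KiloByte/s
speed=$(echo "$line" | awk '$NF > 0 { printf "%s %s %d KiloByte/s", $1, $7, ($10/1024)/($NF/1000000) }')
echo "$speed"
```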

Videos are a good example; for them the output of the script looks like this:

$ grep mp4 /var/log/apache2/access.log | tail -n 20 | python -m apacheclientspeed
217.111.70.208 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 1409 KiloByte/s
217.111.70.208 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 119 KiloByte/s
217.111.70.208 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 1936 KiloByte/s
217.111.70.208 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 90 KiloByte/s
217.111.70.208 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 159 KiloByte/s
217.111.70.208 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 83 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 2067 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 43226 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 4 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 33491 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 28 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 0 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 0 KiloByte/s
89.204.139.1 GET /c/g2/GANGNAM_320_Rf_28.mp4 HTTP/1.1 0 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 0 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 0 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 0 KiloByte/s
89.204.139.1 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 0 KiloByte/s
217.111.70.208 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 507 KiloByte/s
80.226.1.16 GET /c/g1/GANGNAM_320_Rf_30.mp4 HTTP/1.1 46 KiloByte/s

The many requests with 0 KB/s are made by iOS devices. They do a lot of smaller Range requests when streaming a movie over HTTP.