2017-06-23

Eliminating the Password of Shared Accounts

Following up on "Lifting the Curse of Static Credentials", everybody should look closely at how they handle shared accounts, robot users or technical logins. Do you really rotate passwords, tokens and keys each time somebody who had access to the account leaves your team or the company? Do you know who has access? How do you know that they didn't pass on those credentials or put them in an unsafe place?

For all intents and purposes, a shared account is like anonymous access for your employees. If something bad happens, the perpetrator can point to the group and deny everything. As an employer you will find it nearly impossible to prove who actually used the password that was known to so many. Or even to prove that it was one of your own employees and not an outside attacker who "somehow" stole the credentials.

Thanks to identity federation and federated login protocols like SAML2 and OpenID Connect it is now much easier to completely eliminate passwords for shared accounts. Without passwords the shared accounts are much less risky. You can actually be sure that only active and authorized users can use the shared accounts, both now and in the future.
The concept is fairly simple. It is based on the federated identity provider decoupling the login identity used for authentication from the account identity that results from the authorization.

In the following I use the specific case of shared accounts for GitHub Enterprise (GHE) as an application and G Suite (Google) as the federated identity provider to illustrate the idea. The concepts are however universal and can easily be adapted for other applications and identity providers.

A typical challenge in GitHub (both the public service and on-premise) is the fact that GitHub as an application only deals with users. There is no concept of services or service accounts. If you want to build sustainable automation then you must, according to the GitHub documentation, create a "machine user", which is a regular user re-purposed as a shared account. GitHub even points out that this is the recommended solution, although GitHub users otherwise must be real people according to their Terms of Service.

Normal Logins for Real and Shared Users

Before we deal with shared accounts we first look at the normal federated login process in figure 1. GHE uses SAML2 to delegate user authentication to Google.


Fig 1: Normal Federated User Login
User John Doe wants to use the GHE web interface to work with GHE. He points ➊ his browser to GHE which does not have a valid session for him. GHE redirects ➋ John to sign in at his company's Google account. If he is not signed in already, he authenticates as his own identity ➌ john@doe.com. Google redirects him back to GHE and signals ➍ to GHE that this user is John Doe. With the authorization from Google John is now logged in ➎ as @jdoe in the GHE application.

As users sign in to GHE, their respective user account is created if it does not exist. This "Just In Time Provisioning" is a feature of SAML2 that greatly simplifies the integration of 3rd party applications.

The traditional way to introduce shared accounts in this scenario is creating regular users within the identity provider (here Google) and handing out the login credentials (username & password) to the groups of people who need access to the shared account. They can then login with those credentials instead of their own and thereby become the machine user in the application. For all the involved systems there is no technical difference between a real user and a shared account user, the difference comes only from how we treat them.

The downsides of this approach include significant inconvenience for the users, who have to sign out globally from the identity provider before they can switch users, or use an independent browser window just for that purpose. The biggest threats come from the way the users manage the password and 2-factor tokens of the shared user and from the organization's (in-)ability to rotate these after every personnel change.

Logging in Shared Users without a Password

Many applications (GHE really only serves as an example here) do not have a good concept for service accounts and rely on regular user accounts to be used for automation. As we cannot change all those applications we must accommodate their needs and give them such service users.

The good news is that in the world of federated logins only the user authentication (the actual login) is federated. Each application maintains its own user database. This database is filled and updated through the federated logins, but it is still an independent user database. That means that while all users in the identity provider will have a corresponding user in the application (if they used it), there can be additional users in the application's user database without a matching user in the identity provider. Of course the only way to access (and create) such users is through the federated login. The identity provider must authorize somebody to be such a "local" user when signing in to the application.

To introduce shared accounts in the application without adding them as real users in the identity provider we have to introduce two changes to the standard setup:
  1. The identity provider must be able to signal different usernames to the application during the login process.
  2. The real user must be able to choose which user to act as in the application after the next login.
Figure 2 shows this extended login process. For our example use case the user John Doe is a member of the team Alpha. John wants to access the team's account in GHE to connect their team's git repositories with the company's continuous delivery (CD) automation.
Fig 2: Login to GHE as Team User
For regular logins as himself John would "just use" GHE as described above. To log in as the team account, John first goes to the GHE User Chooser, a custom-built web application where John can select ➊ which GHE user he wants to be logged in as at GHE. Access to the chooser is of course also protected with the same federated login; the figure omits this detail for clarity.

John selects the team user for his team Alpha. The chooser stores ➋ John's selection (team-alpha) in a custom attribute in John's user data within the identity provider.
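How the chooser stores the selection depends on the identity provider. With G Suite, one possible way is to write it into a custom user schema via the Admin SDK Directory API, roughly like this sketch (the schema name GHE, the field username and the $TOKEN OAuth2 access token with user management scope are assumptions, not part of the original setup):
$ curl -s -X PATCH \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"customSchemas": {"GHE": {"username": "team-alpha"}}}' \
    "https://admin.googleapis.com/admin/directory/v1/users/john@doe.com"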

Next, John goes to GHE as before. If he still has an active session at GHE he needs to sign out from GHE, but this does not sign him out at the identity provider or any other company application.

Then John accesses GHE again ➌, which redirects ➍ him to the identity provider, in this example Google. There John signs in ➎ with his own identity john@doe.com. Most likely John still has an active session with Google, so he won't actually see this part; Google will confirm his identity without user interaction.

The identity provider reads ➏ the username to send to the application from the custom attribute. When the identity provider redirects ➐ John back to the application, it also sets the GHE user from this custom attribute. In this case the custom attribute contains team-alpha and not jdoe as it would for a personal login. This redirect is the place where the identity switch actually happens. As a result, John retains his personal identity in Google and is signed in to GHE as his team account ➑ @team-alpha.

The main benefit of this approach is the radical removal of shared account passwords and the solid audit trail for the shared accounts. It applies the idea of service roles to the domain of standard applications that do not support such roles on their own. So far only a few applications have a concept of service roles, most notably Amazon AWS with its IAM Roles. This approach brings the same convenience and security to all other applications.

Outlook

Unfortunately this concept only protects the access to the shared account itself, not the access to tokens and keys that belong to such an account. Improving the general security is a step-by-step process, and the user chooser takes us a major step towards truly securing applications like GHE.

The next step would be addressing the static credentials that are generated within the application. In the case of GHE these are the personal access tokens and SSH keys that are the only way for external tools to use the shared account. These tokens and keys are perpetual shared secrets that can easily be copied without anybody noticing.

To get rid of all of this we will have to create an identity mapping proxy that sits in front of the application to translate the authentication of API calls. To the outside (the side that is used by the users and other services) the proxy uses the company authentication scheme, e.g. OAuth2. To the inside (towards the application) it uses the static credentials that the application supports. In order to fully automate this mapping, the proxy also has to maintain those static credentials on behalf of the users so that the users do not need to deal with them at all.

In this scenario there is also no need for a user account chooser as described above: users will have no need to act on behalf of the service accounts, and the most interaction will be granting those service accounts permission to access shared resources.

Figure 3 shows how such a proxy for GitHub Enterprise and the company's OAuth2 identity provider, e.g. Google, could be built. It is surely a much larger engineering effort than the user account chooser, but it solves the entire problem of static credentials, not only the problem of shared account passwords.
Fig 3: Identity Mapping Proxy to remove static credentials from API authentication

It really is possible to get rid of static credentials, even for applications where the vendor does not support such ideas. While these concepts can be adapted for any kind of application, the account chooser and identity mapping proxy will be somewhat custom tailored. Depending on the threat model and risk assessment in your own organisation, the development effort might be very cheap compared to the alternative of continuing to live with the risks.

I hope that both application vendors and identity providers will eventually understand that static credentials are the source of a lot of trouble and that it is their job to provide us users with good integrations based on centrally managed identities, especially for the integration of different services.

2017-06-16

Using Kubernetes with Multiple Containers for Initialization and Maintenance

Kubernetes is a great way to run applications because it allows us to manage single Linux processes with a real cluster manager. A computer with multiple services is typically implemented as a pod with multiple containers sharing communication and storage.
Ideally every container runs only a single process. On Linux, most applications have three phases, typically implemented by different programs or scripts:
  1. The initialization phase, typically an init script or a systemd unit file.
  2. The run phase, typically a binary or a script that runs a daemon.
  3. The maintenance phase, typically a script run as a CRON job.
While it is possible to put the initialization phase into a Docker container as part of the ENTRYPOINT script, that approach gives much less control over the entire process and makes it impossible to use different security contexts for each phase, e.g. to prevent the main application from directly accessing the backup storage.

Initialization Containers

Kubernetes offers initContainers to solve this problem: Regular containers that run before the main containers within the same pod. They can mount the data volumes of the main application container and "lay the ground" for the application. Furthermore they share the network configuration with the application container. Such an initContainer can also contain completely different software or use credentials not available to the main application.

Typical use cases for initContainers are
  • Performing an automated restore from backup in case the data volume is empty (initial setup after a major outage) or contains corrupt data (automatically recover the last working version).
  • Doing database schema updates and other data maintenance tasks that depend on the main application not running.
  • Ensuring that the application data is in a consistent and usable state, repairing it if necessary.
The same logic also applies to maintenance tasks that need to happen repeatedly during the run time of an application. Traditionally CRON jobs are used to schedule such tasks. Kubernetes does not (yet) offer a mechanism to start a container periodically on an existing pod. Kubernetes Cron Jobs are independent pods that cannot share data volumes with running pods.

One widespread solution is running a CRON daemon together with the application in a shared container. This not only breaks the Kubernetes concept but also adds a lot of complexity, as now you also have to take care of managing multiple processes within one container.

Maintenance Containers

A more elegant solution is using a sidecar container that runs alongside the application container within the same pod. Like the initContainer, such a container shares the network environment and can also access data volumes from the pod. A typical application with init phase, run phase and maintenance phase looks like this on Kubernetes:
This example also shows an S3 bucket that is used for backups. The initContainer has exclusive access before the main application starts. It checks the data volume and restores data from backup if needed. Then both the main application container and the maintenance container are started and run in parallel. The maintenance container waits for the appropriate time and performs its maintenance task. Then it again waits for the next maintenance time and so on.
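A minimal sketch of such a pod could look like this (the names, images, command names and the RUNAT variable are made up for illustration; a real setup would normally use a Deployment instead of a bare Pod):
$ kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  volumes:
    - name: data
      emptyDir: {}
  initContainers:
    - name: restore                  # runs to completion before the app starts
      image: example/backup-tool
      command: ["restore-if-empty", "/data"]
      volumeMounts:
        - { name: data, mountPath: /data }
  containers:
    - name: app                      # the main application daemon
      image: example/myapp
      volumeMounts:
        - { name: data, mountPath: /data }
    - name: maintenance              # sidecar that waits and performs the periodic backup
      image: example/backup-tool
      command: ["backup-daily", "/data"]
      env:
        - { name: RUNAT, value: "03:00" }
      volumeMounts:
        - { name: data, mountPath: /data }
EOF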

Simple CRON In Bash

The maintenance container can now contain a CRON daemon (for example Alpine Linux ships with dcron) that runs one or several jobs. If you have just a single job that needs to run once a day you can also get by with this simple Bash script. It takes the maintenance time in the RUNAT environment variable.
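A minimal sketch of such a script could look like this (it assumes GNU date and a placeholder job script /usr/local/bin/backup.sh):
#!/bin/bash
# Poor man's CRON: run a single maintenance job once a day at $RUNAT (HH:MM).
RUNAT="${RUNAT:-03:00}"
while true ; do
    now=$(date +%s)
    next=$(date -d "$RUNAT" +%s)                  # today at $RUNAT
    if [ "$next" -le "$now" ] ; then
        next=$(date -d "tomorrow $RUNAT" +%s)     # already passed, use tomorrow
    fi
    sleep $(( next - now ))
    /usr/local/bin/backup.sh                      # placeholder for the actual maintenance job
done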


All that also holds true for other Docker cluster environments like Docker Swarm mode. However you package your software, Kubernetes offers a great way to simplify our Linux servers by dealing directly with the relevant Linux processes from a cluster perspective. Everything else of a traditional Linux system is now obsolete for applications running on Kubernetes or other Docker environments.

2017-06-12

Working with IAM Roles in Amazon AWS

Last week I wrote about understanding IAM Roles, let's follow up with some practical aspects. The following examples and scripts all use the aws-cli which you should have already installed. The scripts work on Mac and Linux and probably on Windows under Cygwin.

To illustrate the examples I use the case of an S3 backup bucket in another AWS account. For that scenario it is recommended to use a dedicated access role in the target AWS account to avoid troubles with S3 object ownership.

AWS Who Am I?

Sometimes the most important thing is to ascertain your current identity. Luckily the aws-cli provides a command for that:
$ aws sts get-caller-identity
{
    "Account": "123456789",
    "UserId": "ABCDEFG22L2KWYE5WQ:sschapiro",
    "Arn": "arn:aws:sts::123456789:assumed-role/PowerUser/sschapiro"
}
From this we can learn our AWS account and the IAM Role that we currently use, if any.

AWS Assume Role Script

The following Bash script is my personal tool for jumping IAM Roles on the command line:
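The script below is only a simplified sketch with the same behaviour rather than the original tool; it assumes a configured aws-cli and prints the INFO messages on stderr so that stdout stays usable for eval and --env-file:
#!/bin/bash
# aws-assume-role (sketch): hop through one or more IAM Roles.
# Arguments are role names (in the current account) or full role ARNs.
set -e -u

for role in "$@" ; do
    if [[ "$role" != arn:aws:iam::* ]] ; then
        # expand a bare role name into an ARN within the current account
        account=$(aws sts get-caller-identity --query Account --output text)
        role="arn:aws:iam::$account:role/$role"
    fi
    read -r AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN < <(
        aws sts assume-role --role-arn "$role" --role-session-name "$USER" \
            --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' --output text
    )
    export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN
    echo "INFO: Switched to role $role" >&2
done

# emit the final credentials for eval or docker --env-file
echo "AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY"
echo "AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN"
echo "AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID"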
It takes any number of arguments, each a role name in the current account or a role ARN. It goes from role to role and returns the temporary AWS credentials of the last role as environment variables:
$ aws-assume-role ec2-worker arn:aws:iam::987654321:role/backup-role
INFO: Switched to role arn:aws:iam::123456789:role/ec2-worker
INFO: Switched to role arn:aws:iam::987654321:role/backup-role
AWS_SECRET_ACCESS_KEY=DyVFtB63Om+uihwuieufzud/w5vm7Lhp3lx
AWS_SESSION_TOKEN=FQoDYXdzEHgaDAgVN…✂…tyHZrYSibmLbJBQ==
AWS_ACCESS_KEY_ID=ABCDEFGFWEIRFJSD6PQ
The first role ec2-worker is in the same account as the credentials with which we start. Therefore we can specify it just by its name. The second role is in another account and must be fully specified. If we then switched to a third role in the same account we could again use the short form.

Single aws-cli Command

To run a single aws-cli or other command as a different role we can simply prefix it like this:
$ eval $(aws-assume-role \
    ec2-worker \
    arn:aws:iam::987654321:role/backup-role \
  ) aws sts get-caller-identity
INFO: Switched to role arn:aws:iam::123456789:role/ec2-worker
INFO: Switched to role arn:aws:iam::987654321:role/backup-role
{
    "Arn": "arn:aws:sts::987654321:assumed-role/backup-role/sschapiro",
    "UserId": "ABCDEFGEDJW4AZKZE:sschapiro",
    "Account": "987654321"
}
Similarly you can start an interactive Bash by giving bash -i as the command. aws-cli also supports switching IAM Roles via configuration profiles. This is a recommended way to permanently switch to another IAM Role, e.g. on EC2.
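For example, a profile for the backup role could look like this in ~/.aws/config (the profile name and ARN are placeholders; credential_source = Ec2InstanceMetadata makes sense on EC2, elsewhere use source_profile instead):
$ cat >> ~/.aws/config <<'EOF'
[profile backup]
role_arn = arn:aws:iam::987654321:role/backup-role
credential_source = Ec2InstanceMetadata
EOF
$ aws --profile backup sts get-caller-identity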

Docker Container with IAM Role

The same script also helps us to run a Docker container with AWS credentials for the target role injected:
$ docker run --rm -it \
  --env-file <(
    aws-assume-role \
      ec2-worker \
      arn:aws:iam::987654321:role/backup-role \
    ) \
  mesosphere/aws-cli sts get-caller-identity
INFO: Switched to role arn:aws:iam::123456789:role/ec2-worker
INFO: Switched to role arn:aws:iam::987654321:role/backup-role
{
    "Arn": "arn:aws:sts::987654321:assumed-role/backup-role/sschapiro",
    "UserId": "ABCDEFGRWEDJW4AZKZE:sschapiro",
    "Account": "987654321"
}
This example just calls aws-cli within Docker. The main trick is to feed the output of aws-assume-role into Docker via the --env-file parameter.

I hope that these tools also help you to work with IAM Roles. Please add your own tips and tricks as comments.

2017-06-08

Understanding IAM Roles in Amazon AWS

One of the most important security features of Amazon AWS is IAM Roles. They provide a security umbrella that can be adjusted to an application's needs in great detail. As I keep forgetting the details, I summarize here everything that helps me, along with some useful tricks for working with IAM Roles. This is part one of two.

Understanding IAM Roles

From a conceptual perspective an IAM Role is a sentence like Alice may eat apples: It grants or denies permissions (in the form of an access policy) on specific resources to principals. Alice is the principal, may is the grant, eat is the permission (to eat, but not to look at) and apples is the resource, in this case any kind of apples.
IAM Roles can be much more complex, for example this rather complex sentence is still a very easy-to-read IAM Role: Alice and Bob from Hamburg may find, look at, smell, eat and dispose of apples № 5 and bananas. Here we grant permissions to our Alice and to some Bob from another AWS account, we permit a whole bunch of useful actions and we allow them on one specific type of apples and on all bananas.

On AWS, Alice and Bob will be Principals - either AWS services that run code, like EC2, or IAM Users and Roles; find, look at, smell … are specific API calls on AWS services like s3:GetObject and s3:PutObject; and apples and bananas are AWS resource identifiers like arn:aws:s3:::my-backup-bucket.

IAM Roles as JSON

IAM Roles and Policies are typically shown as JSON data structures. There are two main components:
  1. IAM Role with RoleName and AssumeRolePolicyDocument
  2. IAM Policy Document with one or several policy Statements
The AssumeRolePolicyDocument defines who can use this role and the Statements in the Policy Document define what this role can do.

A typical IAM Role definition looks like this:
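As an illustration only (not the exact Role from this article; the account ID and role names are made up), such a definition could be created like this:
$ aws iam create-role --role-name backup-role \
    --assume-role-policy-document '{
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": {
          "AWS": [
            "arn:aws:iam::123456789:role/ec2-worker",
            "arn:aws:iam::123456789:role/PowerUser"
          ]
        },
        "Action": "sts:AssumeRole"
      }]
    }'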
The really important part here is the AssumeRolePolicyDocument which defines who can actually use the role. In this case there are two other IAM roles that can make use of this role. AWS allows specifying all kinds of Principals from the same or other AWS accounts. So far this Role does not yet allow anything, but it already provides an AWS identity. To fill the Role with life you have to attach one or more Policy Documents to the role. They can either be inline and stored within the Role or they can be separate IAM Policies that can be attached to a Role. AWS also provides a large number of predefined policies for common jobs.

A PolicyDocument definition looks like this:
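Again only as a sketch (bucket and policy names are placeholders), such a Policy Document could be attached to the role as an inline policy:
$ aws iam put-role-policy --role-name backup-role \
    --policy-name backup-bucket-access \
    --policy-document '{
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
        "Resource": [
          "arn:aws:s3:::my-backup-bucket",
          "arn:aws:s3:::my-backup-bucket/*"
        ]
      }]
    }'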
Here we have one Statement (there could be several) that gives read and write access to a single S3 bucket. Note that it does not allow deleting objects from the bucket as this example is for a backup bucket that automatically expunges old files.

Creating IAM Roles with CloudFormation

We typically create AWS resources through CloudFormation. This example creates an S3 bucket for backups together with a matching IAM Role that grants access to the bucket:
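The following is only a rough sketch of such a template (resource and stack names are made up, and an instance profile for EC2 usage is omitted for brevity):
$ cat > backup.yaml <<'EOF'
AWSTemplateFormatVersion: '2010-09-09'
Description: Backup bucket with matching access role (sketch)
Resources:
  BackupBucket:
    Type: AWS::S3::Bucket
  BackupRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: ec2.amazonaws.com
            Action: sts:AssumeRole
          - Effect: Allow
            Principal:
              AWS: !Sub 'arn:aws:iam::${AWS::AccountId}:role/PowerUser'
            Action: sts:AssumeRole
      Policies:
        - PolicyName: backup-bucket-access
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action: [ 's3:ListBucket', 's3:GetObject', 's3:PutObject' ]
                Resource:
                  - !GetAtt BackupBucket.Arn
                  - !Sub '${BackupBucket.Arn}/*'
EOF
$ aws cloudformation deploy --template-file backup.yaml \
    --stack-name backup-example --capabilities CAPABILITY_IAM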
The role can be used either by EC2 instances or by the PowerUser role which our people typically have. This allows me to test the role from my desktop during development and for troubleshooting.

Read also Working with IAM Roles in Amazon AWS for the second part of this article with practical aspects and some tips and tricks.

2017-06-02

Root for All - A DevOps Measure?

Who has root access in your IT organization? Do you "do" DevOps? Even though getting root access was once my personal motivation for pushing DevOps, I never considered the relationship between the two until my last conference visit triggered the question.

Last week I attended the 10th Secure Linux Administration Conference - a small but cherished German event catering to Linux admins - and there were two DevOps talks: DevOps in der Praxis (Practical DevOps) by Matthias Klein and my own DevOps for Everybody talk. I found it very interesting that we both talked about DevOps from a "been there, done it" perspective, although with a very different message.

DevOps ≠ DevOps

For me DevOps is most of all a story of Dev and Ops being equal, sitting in the same boat and working together on shared automation to tackle all problems. My favourite image replaces humans as gateway to the servers with tooling that all humans use to collaboratively deliver changes to the servers. In my world the question of root access is not one of Dev or Ops but one of areas of responsibility without discrimination.

For Matthias DevOps is much more about getting Dev and Ops to work together on the system landscape and on bigger changes, about learning to use really useful tools from the other (e.g. continuous delivery) and about developing a common understanding of the platform that both feel responsible for.

We both agree in general on the cultural aspects of DevOps: it is not a tool you can buy but rather a way of putting the emphasis on people - how they work together with respect and trust, how they put the project/product/company before personal interests, and how they ascribe success or failure to a team and not to individuals.

Demystifying Root Access

So why is root access such a big deal? It only lets you do anything on a server. Anything means not only doing your regular job but also means all sorts of blunders or even mischief. I suspect that the main reason for organisations to restrict root access is the question of trust. Whom to trust to not mess things up or whom to trust to not do bad things?

So root access is considered a risk, and one of the simplest risk avoidance strategies is limiting the number of people with root access. If a company could blindly trust all its employees with root access without significantly increasing the overall risk, then this might not be such a big topic.

Interestingly root access to servers is handled differently than database access with credentials that can read, write and delete any database entry. I find this very surprising as in most cases a problem with the master database has a much bigger impact than a problem with a server.

Root = Trust

If root access is a question of trust then that gives us the direct connection to DevOps. DevOps is all about people working together and sharing responsibilities. It is therefore most of all about building up trust between all people involved and between the organisation and the teams.

The other question is, do we place more trust into our people or into the automation we build? Traditional trust models are entirely people based. DevOps thinking strongly advocates building up automation that takes over the tedious and the risky tasks. 

If there is a lot of trust then the question of root access becomes meaningless. In an ideal DevOps world all the people only work on the automation that runs the environment. The few manual interactions that are still required can be handled by everyone on the team.

As both trusting people and building trustworthy automation lead to a situation where it is acceptable to grant root access to everyone, you can actually use root access as a simple and clear measure of your organisation's progress on the DevOps journey.

Current Status

To find out the current status I did a small poll on Twitter:

As I did not ask about the status of DevOps adoption we cannot correlate the answers. What I do interpret from the result is that there is enough difference in root access in different companies to try to learn something from that. And I am very happy to see that 34% of answers give broad root access.

Embrace the Challenge

You will hear a lot of worries or meet the belief that in your organisation giving root to everybody is impossible, e.g. due to regulatory requirements. These are valid fears and one should take them seriously. However, the solution is not to stop automating but rather to incorporate the requirements into the automation and make it so good that it can also solve those challenges.

The question of root access is still very provocative. Use that fact to start the discussion that will lead to real DevOps-style automation, build a dashboard to show root access as a KPI and start to build the trust and the automation that you need to give root access to everyone.

I'll be happy if you share your own example for the DevOps - root correlation (or the opposite) in the comments.

2017-05-26

Is Cloud Native the new Linux?

The CloudNativeCon + KubeCon North Europe 2017 in Berlin was sold out with 1500 participants. I really learned a lot about Kubernetes and the other new and shiny tools that are starting to become mainstream.

To get an introduction into Cloud Native, watch Alexis Richardson in the keynote on "What is Cloud Native and Why Should I care" (slides, video at 12:27). He explained the goal of the Cloud Native Computing Foundation (CNCF) as avoiding cloud lock-in, which is much more to the point than the official charter (which talks about "the adoption of a new computing paradigm"). Alexis chairs the Technical Oversight Committee (TOC) of the CNCF. The Foundation is "projects first", set up similar to the Linux Foundation and already sponsors various Open Source projects.

Linux Lock-In

His remarks got me thinking about the question, especially in comparison with Linux. To me it seems that modern IT in the data center already has a pretty strong "lock-in" with Linux. It seems like most public servers on the Internet already run Linux. So what is bad about this lock-in with Linux? Apparently nothing much, really. But do we really have a lock-in with Linux or actually with a specific Linux distribution? I know very few people who changed their distro, say from Red Hat to Debian. If Red Hat becomes too expensive then people switch to free CentOS instead, but they don't want to (or cannot afford to) change all their tooling and system setup.

So even though Linux is - at its core - always Linux, in practice there is a big difference between running an application on Debian, Red Hat, SUSE, Gentoo, Archlinux or others. There are even relevant differences between closely related distributions like Debian and Ubuntu or between Red Hat and CentOS.

So while we talk about the freedom of choice with Linux we very seldom make use of it. When dealing with commercial software on Linux we also don't require our software vendors to support "our" Linux stack. Instead, we typically accept the Linux distro a software vendor prescribes and feel happy that Linux is supported at all.

Cloud Platforms

So far the cloud landscape is indeed very different from the Linux landscape. With cloud vendors today, we have completely different and totally incompatible ecosystems. Code written for one cloud, e.g. Amazon AWS, actually does not work at all on another cloud, e.g. Google GCP. I mean here the code that deals with cloud features like deployment automation or that uses cloud services like object storage. Of course the code that runs your own application is always the same, except for the parts interfacing with the cloud platform.

All Linux distributions will give you exactly the same PostgreSQL as relational database or exactly the same Redis as key-value store. Clouds on the other hand give you different and incompatible implementations of similar concepts, for example AWS DynamoDB and Google Cloud Datastore. That would be as if every Linux distribution shipped a different and incompatible database.

With public clouds we - maybe for the first time - come to a situation where it is possible to build complex and very advanced IT environments without building and operating all of the building blocks on our own. It is a fact that many companies move from self-hosted data centers into the public cloud in order to benefit from the ready services found there. Cloud providers easily out-innovate everybody else with regard to infrastructure automation and service reliability while offering pay-per-use models that avoid costly upfront investments or long commitments.

Cloud Lock-In

However, anyone using a public cloud nowadays will face a very tough choice: Use all the features and services that the cloud provider offers or restrict oneself to the common functions found in every public cloud. One comes with a fairly deep technological buy-in while the other comes with a promise of easily replacing the cloud vendor with another one.

I don't think that this hope holds true. In my opinion the effort spent on operational automation, monitoring and other peripheral topics leads to a similarly deep buy-in with any given cloud vendor. Switching to another vendor will be a disruptive operation that companies will undertake only in case of real need, just like switching Linux distributions.

The same holds true for an environment that utilises all possible services of a cloud vendor. Switching platforms will be costly, painful and only done based on real need. I think that the difference in "lock-in" between using all cloud services and using only basic infrastructure services is only a gradual one and not a difference in principle. Whenever we use any kind of public - or even private - cloud platform there is a smaller or most likely larger amount of lock-in involved.

Cloud Native

If Cloud Native hopes to break the cloud lock-in then the goal must be to develop advanced services that become a factual standard for cloud services. Once enough vendors pick up on those services public and private clouds will indeed be as portable and compatible as Linux distributions.

So far Cloud Native is focused mostly on basic infrastructure software and not on advanced services. My hope is that over time this will change and that there will be a standard environment with advanced services, similar to how the Linux and Open Source world gives us a very rich tool chest for almost every problem.

Furthermore, I am not worried about the technical lock-in with today's large cloud vendors like Amazon or Google. While their back-end software is proprietary, the interfaces are public or even Open Source, and they don't prescribe which OS or client we have to use to access their services. This is much more than we ever had with traditional commercial vendors who forced us to use outdated software in order to be "supported".

Embracing Change

If we understand the development of our IT environments as an iterative process then it becomes clear that we can always build the next environment on the next cloud platform and migrate our existing environments if there is a real benefit or return on investment. And if there is none then we can simply keep running it as it is. With the current fashion to build microservices each environment is in any case much smaller than the systems we built 10 or 20 years ago. Therefore the cost of lock-in and of a migration is equally much smaller compared to a migration of an entire data center.

In today's fast paced and competitive world the savings and benefits of quickly developing new environments with advanced services outweigh the risk of lock-in, especially as we know that every migration will make our systems better.

2017-05-18

Embedding SSH Key in SSH URL

SSH keys are considered to be a security feature, but sometimes they make things more complicated than necessary.

Especially in automation contexts we use SSH keys without a passphrase, which degrades the security of the SSH keys to the security level of a plain text password. The only benefit of the SSH keys is the fact that an attacker who gains access to the server won't be able to use the keys found there to login somewhere else. As such, SSH keys are still better and more secure than having a regular plain text password.

In automation contexts we sometimes have to handle lots of SSH keys, for example with GitHub Deploy Keys. GitHub mandates using a different SSH key for every repository to ensure that a leaked private key will not lead to a breach of other repositories.

I recently had to configure a Go Continuous Delivery server and it turned out that it does not support managing SSH keys at all (like Jenkins or TeamCity do). In order to still be able to use GitHub Deploy Keys with Go CD I created a small SSH wrapper that allows placing the SSH key directly in the git URL like this:

git~LS0tLS1CRUdJTiBP....SDFWENF324DS=@github.com:user/repo.git

(The URL is much longer, depending on the size of your SSH key). The format is

user~key@host

I use the ~ character as separator because git tries to interpret a : in this place. The SSH wrapper is installed for git with the help of the GIT_SSH environment variable like this:
# clone GitHub repo with Deploy Keys
$ GIT_SSH=ssh-url-with-ssh-key git clone git~LS0tLS1CRUdJTiBP....SDFWENF324DS=@github.com:user/repo.git

# connect to remote SSH server
$ ssh-url-with-ssh-key user~LS0tLS1CRUdJTiBP....SDFWENF324DS=@host

# create new SSH key pair
$ ./ssh-url-with-ssh-key --create schlomo test
Append this base64-encoded private key to the username:
~LS0tLS1CRUdJTiBFQyBQUklWQVRFIEtFWS0tLS0tCk1IY0NBUUVFSUFTZUVqcDRJcFVubGhkTDVEU0VuVkc2aVM0U21Qd3NWR1hNVDhFbDFVZlBvQW9HQ0NxR1NNNDkKQXdFSG9VUURRZ0FFbHNRYnZaKzhMLzR3enhYMDlEdGZnZGFTaDVzSFpHUHVUcnVtWXd0UW4yb0txMFVNRmZjaQo4bWFqWWRqclF1YU8vdGN6aCtOWjJ3ZVZiZmY3WE5kQ01RPT0KLS0tLS1FTkQgRUMgUFJJVkFURSBLRVktLS0tLQo=
Public Key:
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBJbEG72fvC/+MM8V9PQ7X4HWkoebB2Rj7k67pmMLUJ9qCqtFDBX3IvJmo2HY60Lmjv7XM4fjWdsHlW33+1zXQjE= schlomo test
See the GitHub repo at https://github.com/schlomo/ssh-url-with-ssh-key for the source code.
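Conceptually the wrapper boils down to something like the following simplified sketch (the real implementation is the one in the repository):
#!/bin/bash
# ssh-url-with-ssh-key (sketch): extract the base64-encoded private key
# from the user part of "user~key@host" and use it for the ssh connection.
destination="$1" ; shift
userpart="${destination%%@*}"      # user~base64key
host="${destination#*@}"
user="${userpart%%~*}"
key="${userpart#*~}"

keyfile=$(mktemp)                  # mktemp creates the file with mode 0600
trap 'rm -f "$keyfile"' EXIT
echo "$key" | base64 -d > "$keyfile"

ssh -i "$keyfile" -o IdentitiesOnly=yes "$user@$host" "$@"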

See also my other SSH related blog articles:

2017-03-24

GUUG-Frühjahrsfachgespräch 2017

I had the honor to attend a new (for me) conference: The spring meeting of the German Unix User Group, this time hosted by the Cybersecurity department of the Darmstadt Technical University.

The conference had about 115 participants and orients itself mostly towards admins. The former emphasis on Unix is long gone; all talks except one (about Solaris) were about Linux and Linux-based technologies. Two days of tutorials were followed by two days of talks in two parallel tracks.

Noteworthy talks were the keynote about Jailbreaking WiFi Firmware by Matthias Schulz, Architecture Pattern for Container and Kubernetes by Thomas Fricke and several talks about software-defined storage. Especially the ensuing discussion between the speakers representing competing approaches helped many attendees to sharpen their own opinions.

Jailbreaking WiFi Firmware

Impressive walk-through of the effort it took to turn Broadcom Wifi chips into WiFi monitors suitable for WiFi hacking. The code is available on seemoo-lab/nexmon and backs the Nexmon Android app which turns a Nexus 5 or Nexus 6P into a WiFi hacking tool.

Besides hacking, custom WiFi firmware also enables useful applications, for example adaptive video quality streaming over WiFi so that nearby clients receive a higher quality video signal while far-off clients - which have a lower WiFi signal rate - still receive a lower quality video signal. See APP and PHY in Harmony.

Architecture Pattern for Container and Kubernetes

DevOps for Everybody

My own contribution was a new talk DevOps for Everybody - How the entire company can benefit from DevOps about bringing DevOps ideas to all employees in the company. The main idea is to see DevOps as a technique to use technology to change culture.

Together with the talk I also presented a paper A Workplace Strategy for the Digital Age which can serve as an IT strategy for corporate IT departments that want to apply my ideas.

2016-05-29

Ubuntu on Dell Latitude E6420 with NVidia and Broadcom

My company sold old laptops to employees and I decided to use the chance to get an affordable and legally licensed Windows 10 system - a Dell Latitude E6420. Unfortunately the system has a Broadcom Wifi card and also ships with an NVidia graphics card which require extra work on Ubuntu 16.04 Xenial Xerus.

After some manual configuration the system works quite well with a power consumption of about 10-15W while writing this blog article. Switching between the Intel and the NVidia graphics card is simple (via a GUI program, followed by a logout and login); for most use cases I don't need the NVidia card anyway.

Windows 10 also works well, although it does not support all devices. However, the combined NVidia / Intel graphics system works better on Windows than on Linux.

In detail, I took the following steps to install an Ubuntu 16.04 and Windows 10 dual boot system.

Step-by-Step Installation

Requirements

  • Either a wired network connection or a USB wifi dongle that works in Ubuntu without additional drivers.
  • 4GB USB thumb drive or 2 empty DVDs or 1 re-writable DVD
  • 2 hours time

Install Windows

  1. Update the firmware to version A23 (use the preinstalled Windows 7 for this task).
  2. Go through the BIOS setup. Make sure to switch the system to UEFI mode and enable booting off USB or DVD. This really simplifies the multi-OS setup as all operating systems share the same EFI system partition.
  3. Download the Windows 10 media creator tool and use it to create a USB drive or DVD.
  4. Insert the installation media and start the laptop. Press F12 to open the BIOS menu and select the installation media in the UEFI section.
  5. Install Windows 10. In the hard disk setup simply delete all partitions so that Windows 10 will create its default layout.
  6. Let Windows 10 do its job, rebooting several times. Use the provided Windows 7 product key for Windows 10 and let it activate over the Internet.
  7. All basic drivers will install automatically; some question marks remain in the device manager. Dell does not provide official Windows 10 drivers, so one would have to search the internet for specific solutions. However, Dell provides an overview page for Windows 10 on the E6420.

Install Ubuntu

  1. Create the Ubuntu installation media.
  2. Boot the laptop. Press F12 when it starts and select the installation media in the UEFI section of the BIOS menu.
  3. Select "Install Ubuntu" in the boot menu. Choose to install Ubuntu together with Windows. In the disk partitioning dialog reduce the size of the Windows partition to make room for Ubuntu. Leave Windows at least 50GB, otherwise you won't be able to do much with it.
  4. Let Ubuntu finish its installation and boot into Ubuntu.

Optimize and Configure Ubuntu

The default installation needs some additional packages to work well. Make sure that Ubuntu has an internet connection (wired or via a supported USB wifi dongle).

Note: For the Broadcom WiFi adapter there are several possible drivers in Ubuntu. By default it will install the wl driver which was not working well for me and caused crashes. The b43 driver works for me, although the Wifi performance is rather low.

Note: The HDMI output of the laptop is connected to the NVidia graphics chip. Therefore you can use it only when the system uses the NVidia card.
  1. Update Ubuntu and reboot:
     sudo apt update
     sudo apt full-upgrade
     sudo reboot
  2. Install the following packages and reboot:
     sudo apt install firmware-b43-installer \
         nvidia-361 nvidia-prime bbswitch-dkms \
         vdpauinfo libvdpau-va-gl1 \
         mesa-utils powertop
  3. Confirm that the builtin WiFi works now.
  4. Add the following line to /etc/rc.local before the exit 0 line:
     powertop --auto-tune
  5. Reboot.
  6. Check that 3D acceleration works with NVidia:
     glxinfo | grep renderer\ string
     OpenGL renderer string: NVS 4200M/PCIe/SSE2
  7. Check that VDPAU acceleration works with NVidia:
     vdpauinfo | grep string
     Information string: NVIDIA VDPAU Driver Shared Library  361.42  Tue Mar 22 17:29:16 PDT 2016
  8. Open nvidia-settings and switch to the Intel GPU (you will have to confirm with your password).
  9. Logout and log back in. Confirm that 3D acceleration works now:
     glxinfo | grep renderer\ string
     OpenGL renderer string: Mesa DRI Intel(R) Sandybridge Mobile
  10. Confirm that the NVidia graphics card is actually switched off:
     cat /proc/acpi/bbswitch
     0000:01:00.0 OFF
  11. Confirm that VDPAU acceleration works:
     vdpauinfo | grep string
     libva info: VA-API version 0.39.0
     libva info: va_getDriverName() returns 0
     libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
     libva info: Found init function __vaDriverInit_0_39
     libva info: va_openDriver() returns 0
     Information string: OpenGL/VAAPI/libswscale backend for VDPAU
  12. Check that the power consumption is somewhere between 10W and 15W.

Resources

PCI Devices (lspci)

Screen Configuration (NVidia)

Screen Configuration (Intel)



2016-05-19

Lifting the Curse of Static Credentials

Summary: Use digital identities, trust relationships and access control lists instead of passwords. In the cloud, this is really easy.

I strongly believe that static credentials are one of the biggest hazards in modern IT environments. Most information security incidents are somehow related to lost, leaked or guessed static credentials; Instagram's Million Dollar Bug is just one very nice example. Static credentials

  • can be used by anyone who has them - friend or foe
  • are typically very short and can even be brute forced or guessed
  • for machine or service users have to be stored in configuration files from where they can be leaked
  • are hard to remember for humans so that they will write them down somewhere or store them in files
  • typically stay the same over a long period of time
  • don't include any information about the identity of the bearer or user
  • are hard to rotate on a regular basis because the change has to happen in several places at the same time
All those problems disappear if we use digital identities and trust relationships instead of static credentials. Unfortunately static credentials are incredibly easy to use, which makes them hard to eradicate.

Static credentials are from the medieval ages

Source: Dr. Pepper ad from 1963 / Johnny Hart
Back in time, passwords or watchwords were state of the art. For example, membership in a secret club or belonging to a certain town could be proven with a "secret" password. Improvements were "password of the day" (nice for watchtower situations) or even "challenge response" methods like completing a secret poem or providing a specific answer to a common question.

Basically everything we do with static credentials, for example a website login, follows exactly those early patterns - even though the real world moved on to identity-based access control in the 19th and 20th centuries. Passports and ID cards certify the identity of the bearer, and the border control checks if the passport is valid and if the person presents his/her own passport, and then decides if the person is allowed passage. Nobody would even think about granting access to anything in the real world in exchange for just a password.

So why is IT security stuck in the medieval ages? IMHO convenience and the lack of simple and widespread standards. We see static credentials almost everywhere in our daily business:
  • Website logins - who does not use a password manager? Only very few websites manage without passwords
  • Database credentials - are probably the least rotated credentials of all
  • Work account login - your phone stores that for you
  • SSH keys - key passphrases don't add much security, SSH key security is much underestimated
  • ...
Sadly, agreeing upon static credentials and manually managing them is still the only universally applicable, compatible and standardized method of access protection that we know in IT.

Modern IT

Luckily in professional environments there is a way out. In a fully controlled environment there should be no need for static credentials. Every user and every machine can be equipped with a digital identity whose static parts are stored in secure hardware storage (e.g. YubiKey and TPM). Beyond that, all communication and access can be granted based on those digital identities. Temporary grants by a granting authority and access control lists give access to resources. The same identity can be used everywhere, thereby eliminating the need for static credentials.

Kerberos and TLS certificates are well known and established implementations of such concepts. Sadly many popular software solutions still don't support them or make their use unnecessarily complicated. As the need to use certain software typically wins over the wish to have tight security, we users end up dealing with lots of static credentials. The security risk is deemed acceptable as those systems are mostly accessible from inside only. Instagram's Million Dollar Bug of course proves the folly of this thought. A chain of static AWS credentials found in various configuration files allowed exploiting everything:
Source: Instagram's Million Dollar Bug (Internet Archive) / Wesley
Facebook obviously did not think about the fact that static AWS credentials can be used by everyone from everywhere.

The Cloud Challenge

As we move more and more IT functions into the Cloud, the problem of static credentials gains a new dimension: Most of our resources and services are "out there somewhere" and not part of our internal network. There is absolutely no additional layer of security! Anybody who has the static credentials can use them and you won't even notice it.

Luckily Cloud providers like Amazon Web Services (AWS) also have a solution for that problem: AWS Identity and Access Management (IAM) provides the security backbone for all communication between machines and users on one side and Amazon APIs on the other. Any code that runs on AWS is assigned a digital identity (EC2 Instance Role, Lambda Execution Role) which provides temporary credentials via the EC2 instance metadata interface. Those credentials are then used to authenticate API calls to AWS APIs.
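On an EC2 instance with an attached role this mechanism can be observed directly (the role name backup-role is just a placeholder):
# shows the name of the role attached to this instance
$ curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/
# returns temporary AccessKeyId, SecretAccessKey, Token and Expiration as JSON
$ curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/backup-role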

As a result it is possible to run an entire data center on AWS without the need for static credentials for internal communication. Attackers who gain internal access will not be able to access more resources than the exploited service had access to. Eradicating internal static credentials would therefore have prevented Instagram's Million Dollar Bug.

Avoid Static Credentials

In a world of automation, static credentials are often a nuisance. They have to be added to all kinds of configuration files while protecting them from as many eyes as possible. In the end, many secrets management solutions only protect the secrets from the admins and casual observers but do not prevent leaked secrets in general. Identity-based security actually helps in automated environments. The problem of static credentials is reduced to just one set for the digital identity. All other communication just uses that identity for authentication.

Eradicating static credentials and using digital identities not only significantly improves security but also assists in automating everything.

If you use AWS, start today to replace static AWS credentials with IAM roles. Use the AWS Federation Proxy to provide IAM roles to containers and on-premise servers in order to remove static AWS credentials from both your public and your private cloud environments.

For your local environment, use Kerberos pass-through authentication instead of service passwords.

For websites, try to use federated logins (e.g. OpenID Connect) and favor those that don't need a password.

For your own stuff, be sure to enable 2-factor authentication and to store certificates and private keys in hardware tokens like YubiKey.