Glue, stitch, cobble: Weighing DIY container management

You’ve been tasked with helping your company stay competitive by modernizing your IT organization’s application delivery. Your company has already embraced virtualization and perhaps dabbled in the public cloud. Containers look like the next big thing for you, so you’re considering how to bring container technology to your organization. Something needs to create containers on compute resources and network them together, so you head to the drawing board to sketch out the general components.

You start doing the research. You soon discover that cloud management platforms, PaaS, and container management platforms are all readily available as prepackaged software and services. Even the individual components that make up those packages are available in Open Source Land. “Hmm,” you think, “Why pay anyone for a platform when the parts are there to do this myself?”

For a brief moment, you’re taken all the way back to kindergarten. The teacher starts crafting class and opens the drawer to an array of fun-looking parts. Pastel paper, glitter, and bows! You’re ready to craft that masterpiece. All you need is a bottle of glue!

After a blink, you’re back to the IT drawing board, laying out the parts for your future container management platform in greater detail:

  • Build tools
  • Servers/OSes
  • Container runtime
  • Container-to-container networking
  • Ingress networking
  • Firewall
  • Load balancer
  • Database
  • Storage
  • DNS

All you need is the “glue” to bind these parts together.

Naturally, connecting those different parts requires varying degrees of development effort. We’ve simplified this spectrum into four general “glue levels” of effort.

Glue level 1: Direct component-to-component bridging

In this case, a component has the capability to interface directly with the next logical component in the application deployment workflow.

Let’s assume you have a Jenkins server and an instance of Docker Engine. Jenkins builds the code, then creates a Docker image. Better yet, have Jenkins call Docker Engine itself and point it at your newly created image.
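
To make this concrete, here’s a minimal sketch of what such a Jenkins build step might run, using the Docker SDK for Python (docker-py). The image name, build path, and port are placeholders, not anything prescribed by this article.

# A post-build step a Jenkins job might invoke, sketched with the Docker SDK
# for Python (docker-py). Image name, build path, and port are placeholders.
import docker

client = docker.from_env()  # connects to the local Docker Engine

# Build an image from the freshly checked-out workspace.
image, build_logs = client.images.build(path=".", tag="myapp:latest")

# Point Docker Engine at the newly created image and run it.
container = client.containers.run("myapp:latest", detach=True, ports={"8080/tcp": 8080})
print(f"Started container {container.short_id} from image {image.tags[0]}")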

Glue level 2: Basic scripting to bridge components

In this case, a component does not have the capability to interface with the next logical component in the application deployment workflow.

For example, in a Docker Swarm, if a deployed service publishes port 80, then every node in the cluster locks down port 80 for that service, whether or not that particular node is running an instance of the container.

Let’s say you have another application that also needs to listen on port 80. Because the whole Docker Swarm has already locked down port 80, you’ll have to use an external load balancer tied into DNS so that, for example, appA.mycluster.com and appB.mycluster.com can both listen on port 80 at the ingress side of the load balancer.

[Figure: Docker Swarm ports (image: Apcera)]

After the containers have been deployed by an external script, you’ll have to interface with the load balancer and configure it to listen for each app’s hostname and forward traffic to the appropriate nodes.
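
As a rough sketch of that glue, the script below registers two virtual hosts with a load balancer. The admin API endpoint, node addresses, and ingress ports are hypothetical; a real load balancer will expose a different API with different payloads.

# Glue script that runs after the containers are deployed. The load balancer
# admin API, node addresses, and ingress ports below are hypothetical.
import requests

LB_ADMIN_API = "https://lb.example.com/api/v1"           # hypothetical endpoint
SWARM_NODES = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]    # placeholder node IPs

def publish_app(hostname, swarm_port):
    """Listen for `hostname` on port 80 and forward to the Swarm ingress port
    on every node, since Swarm publishes the port cluster-wide."""
    payload = {
        "frontend": {"host": hostname, "port": 80},
        "backends": [{"address": n, "port": swarm_port} for n in SWARM_NODES],
    }
    resp = requests.post(f"{LB_ADMIN_API}/virtual-hosts", json=payload, timeout=10)
    resp.raise_for_status()

# Both apps present port 80 to users; the Swarm publishes them on different
# ingress ports behind the load balancer.
publish_app("appA.mycluster.com", swarm_port=30080)
publish_app("appB.mycluster.com", swarm_port=30081)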

Glue level 3: Scripting to manage components

In this case, your workflow finishes in one component and must transition to multiple separate components. At this point, you’re creating a middle-tier component that needs to maintain state and possibly coordinate workflows. You may have access to component automation (like HashiCorp’s Terraform or Red Hat CloudForms), but you still need a controlling entity that understands the application workflow and state.

Let’s say you have multiple Cloud Foundry instances and an application consisting of a web front-end container, a logic-processing container, and an email-generation container. You want each of those containers to run on a separate Cloud Foundry instance. Even if you don’t need to create a cloud-spanning application, you may want to run applications in different clouds or move applications between clouds. Either way, this requires coordination outside of those platform instances.

Assuming you’ve already laid the networking groundwork to connect those Cloud Foundry instances, your custom platform will have to interface with each instance, ship and run the containers, and network those containers appropriately.
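
Here’s a sketch of what that controlling entity might look like at its simplest, driving the standard cf CLI from Python. The API endpoints, orgs, spaces, and app names are placeholders, and credential handling, state tracking, and cross-instance networking are left out.

# A controlling entity that pushes each piece of the application to a
# different Cloud Foundry instance via the standard cf CLI. The endpoints,
# orgs, spaces, and app names are placeholders; credential handling, state
# tracking, and cross-instance networking are omitted.
import subprocess

DEPLOYMENTS = [
    {"api": "https://api.cf-east.example.com", "org": "retail", "space": "prod", "app": "web-frontend"},
    {"api": "https://api.cf-west.example.com", "org": "retail", "space": "prod", "app": "logic-processor"},
    {"api": "https://api.cf-batch.example.com", "org": "retail", "space": "prod", "app": "email-generator"},
]

def cf(*args):
    subprocess.run(["cf", *args], check=True)

for d in DEPLOYMENTS:
    cf("api", d["api"])                            # point the CLI at this instance
    cf("target", "-o", d["org"], "-s", d["space"])
    cf("push", d["app"])                           # ship and run this piece of the app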

Glue level 4: Your own enterprise automation at a level above the deployment workflow

In this case, you have enough glue for a basic start-to-finish workflow from source to deployment, but now you are considering enterprise-level features, including:

  • Automated provisioning and updating of the cluster control software, across multiple or hybrid clouds
  • Advanced container scheduling (for example, affinity between application containers)
  • Establishing role or attribute-based access controls that map organizational structures to rights on the platform
  • Resource quotas
  • Automatic firewall/IPtables management
  • Governance via a policy framework

Here is a simple example of one of the possibilities from a non-DIY alternative, the Apcera Platform. Let’s say your company has these business rules:

  1. Applications from development must run in AWS
  2. Applications in production must run in OpenStack (on-premises)

In the Apcera Platform, these business rules are translated by the admin as:

on job::/dev {
  schedulingTag.hard aws
}

on job::/production {
  schedulingTag.hard openstack
}

When a user (or automation) is identified as part of the /dev or /production namespace in the Apcera Platform, any applications deployed by that user (or automation) will be automatically deployed on the runtime components labeled with aws or openstack, appropriately. Users can either specify a tag when deploying applications (which will be checked by the policy system against a list of allowable tags) or not specify a tag and let the platform choose a runtime component automatically. Because Apcera labels can be arbitrarily defined, admins can create deployment policy for things like requiring “ssd” performance or “special-CPUs.”

Once you have built a platform that spans both AWS and OpenStack (as a single “cluster” or multiple clusters glued together), allowing operators to choose a location may be straightforward. With Docker Swarm, for example, you can do it with service constraints:

$ docker service create \
  --name redis_2 \
  --constraint 'node.id == 2ivku8v2gvtg4' \
  redis:3.0.6

In this example, an operator chooses to deploy Redis via Docker Swarm to a specific Docker Engine node. That gives the operator choice, but the choice is not enforced. How do you enforce the rule that a production job may deploy only to the on-premises OpenStack instance (per the company policy above)?

How long are you willing to wait for the community (in this specific case, brokered through Docker Inc.) to implement this type of enforcement?

Let’s assume you’re left with coding this simple placement-policy enforcement yourself. Consider the planning for this effort (a sketch of the rule-check step follows the list):

  • You’d have to lock out all general access to Docker except for your enforcement automation.
  • Your enforcement automation has to be some kind of server that can proxy requests from clients.
  • You’d need to identify clients as individual users or members of some group. Do you want to run your own user management or create an interface to your enterprise LDAP server?
  • You’d need to associate the user/group with “production.”
  • You’d need to create a rule framework that permits an entry that translates to “jobs from production can only deploy to OpenStack Docker nodes.”
  • You’d need to create a way to track the node.ids of the Docker Swarm nodes that run on OpenStack.
  • You’d need to keep track of the resources available on each node to see if they can handle the resource requirements of your Docker image.
  • You’d need to understand the resource requirements of your Docker image.
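
To give a feel for the work involved, here’s a sketch of just the rule-check step from that list. The group lookup, rule table, and node inventory are stand-ins for what would really come from LDAP, your rule framework, and the Docker Swarm API.

# The rule-check step of a DIY enforcement proxy. The group lookup, rule
# table, and node inventory are stand-ins for what would really come from
# LDAP, your rule framework, and the Docker Swarm API.
USER_GROUPS = {"alice": "production", "bob": "dev"}        # from LDAP in real life

# "Jobs from production can only deploy to OpenStack Docker nodes."
PLACEMENT_RULES = {"production": "openstack", "dev": "aws"}

# node.id -> label, tracked yourself because Swarm won't enforce this for you
NODE_LABELS = {"2ivku8v2gvtg4": "openstack", "node-id-on-aws": "aws"}

def allowed_nodes(user):
    """Return the Swarm node IDs this user's deployments may land on."""
    required_label = PLACEMENT_RULES.get(USER_GROUPS.get(user))
    return [nid for nid, label in NODE_LABELS.items() if label == required_label]

def check_request(user, requested_node):
    """Called by the proxy before forwarding a service create to the Swarm."""
    return requested_node in allowed_nodes(user)

assert check_request("alice", "2ivku8v2gvtg4")         # production -> openstack: allowed
assert not check_request("alice", "node-id-on-aws")    # production -> aws: rejected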

What if, instead of a hard requirement that applications run on specific nodes, you’re sometimes OK with a soft requirement? That is, make a best effort to deploy on the specified nodes, but failing that, deploy on other nodes. Do you really want to write your own scheduler code to fill in the gaps between what Docker offers and what you need? Apcera already does all of this container-management scheduling (via hard and soft tags, and more).
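
To illustrate what filling in those gaps means, here’s a sketch of the hard-versus-soft placement decision a DIY scheduler would have to make; the node inventory and the memory requirement are placeholders.

# The hard-versus-soft placement decision a DIY scheduler would have to make.
# The node inventory and the 512 MB memory requirement are placeholders.
def pick_node(nodes, preferred_label, hard):
    """Prefer nodes carrying `preferred_label`; if the requirement is soft,
    fall back to any node with enough free memory."""
    fits = {n: meta for n, meta in nodes.items() if meta["free_mem_mb"] >= 512}
    preferred = [n for n, meta in fits.items() if meta["label"] == preferred_label]
    if preferred:
        return preferred[0]
    if hard:
        raise RuntimeError(f"no capacity on nodes labeled {preferred_label!r}")
    return next(iter(fits), None)        # soft: best effort, then anywhere that fits

NODES = {
    "openstack-node-1": {"label": "openstack", "free_mem_mb": 256},
    "aws-node-1":       {"label": "aws",       "free_mem_mb": 2048},
}

print(pick_node(NODES, "openstack", hard=False))   # soft: falls back to aws-node-1
# pick_node(NODES, "openstack", hard=True) would raise instead of falling back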

That’s all glue code you’d have to create yourself, simply to solve the problem of enforcing where your applications can run. What about enforcing build dependencies as policy? Or a resource-quota policy? Or virtual-networking policy? Do you really want to write your own policy engine? Apcera was created not only to automate these tasks, but also to provide a unified policy framework that governs all of them.

[Source: JavaWorld]