A production-level Everyware Cloud deployment requires planning. Take some time to think about several aspects of the deployment, such as securing communications, controlling access to services, and the availability and resiliency of the instance. This page covers topics to consider when setting up a production-ready Everyware Cloud instance.
When the services are exposed to the public, we recommend disabling HTTP and switching to HTTPS, which guarantees a higher level of security. The Everyware Cloud Admin Console and API services do support plain HTTP connections; however, plain HTTP should be used only during the initial setup of the instance or for self-evaluation and development instances. Moving to HTTPS requires creating and configuring proper certificates; see the Certificate section in the Installation with Helm Charts chapter.
The same applies to MQTT connections. The MQTT protocol is implemented on top of TCP/IP and is not encrypted by itself; use MQTT over TLS (also known as MQTTS) to secure communications between your devices and the EC instance.
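As a minimal sketch of what a TLS-protected device connection involves, the following Python snippet builds a client-side TLS context that verifies the broker certificate and wraps a TCP socket with it. The host name is the placeholder used elsewhere on this page, and port 8883 is the IANA-registered port for MQTT over TLS; both are assumptions that may not match your instance. The function names are hypothetical, and a real device would run an MQTT client library on top of the secured channel.

```python
import socket
import ssl

# Assumptions for illustration only: placeholder host name and the
# IANA-registered MQTT-over-TLS port. Your instance may use other values.
BROKER_HOST = "mqtt-broker.example.com"
BROKER_PORT = 8883


def build_tls_context(ca_file=None):
    """Return a client-side TLS context that verifies the broker certificate."""
    # create_default_context() enables certificate and host name verification.
    context = ssl.create_default_context(cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_secure_channel(host=BROKER_HOST, port=BROKER_PORT, ca_file=None):
    """Open a TCP connection and wrap it in TLS (hypothetical helper)."""
    context = build_tls_context(ca_file)
    raw_socket = socket.create_connection((host, port), timeout=10)
    # server_hostname drives both SNI and host name verification.
    return context.wrap_socket(raw_socket, server_hostname=host)
```

Because host name verification is enabled, the certificate presented by the broker has to match the name the device connects to.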
While it is technically possible to access the EC front-end services through their IP addresses, we recommend using meaningful DNS Names such as mqtt-broker.example.com, console.example.com, and api.example.com.
DNS Names work well especially when used in combination with the certificates mentioned in the section Encrypted communications. Moreover, having a DNS Name for your services:
- simplifies the management of the infrastructure when the IP address needs to be changed for any reason
- solves the issue of updating your (possibly unattended) clients when the IP address is changed
- solves the issue of changing (and re-issuing) the TLS certificates every time the IP changes
Whether to use DNS Names depends on your application domain. If you are confident that your IPs will remain static throughout the lifetime of your instance, you can go with plain IPs. Even in this case, however, opting for DNS Names is a much cleaner solution that is aligned with state-of-the-art practices and can save you a lot of work later while running the instance.
We recommend reserving the EC instance root account, ec-sys, for platform administration tasks only. Avoid connecting devices to it, and strictly control which users are allowed to access it as well as the permissions assigned to those users.
We recommend using EC application integration tools and services to access EC data.
Avoid reading and writing data through direct connections to the data engines used by EC (i.e. the relational DB and the message store). By using application integration tools and services, you ensure that the business logic is properly enforced over the data, that performance can be monitored, and that security is managed centrally and uniformly.
Moreover, a new Everyware Cloud version may require database changes that, in turn, may break integrations built on direct connections to the data engines. The product team strives to minimize the impact of such changes on the application integration tools and services and to keep them as stable as possible, so that integrations built on top of them require less maintenance; when changes are unavoidable, integrators can rely on the EC documentation for details regarding the migration path.
Everyware Cloud supports Resiliency and High Availability. Both are achieved through a combination of product and infrastructure features.
Everyware Cloud is a distributed application composed of several connected service components running in Docker containers (check the list of components). Container orchestrators are used to support the configuration, deployment, scaling and lifecycle management of the components. Resiliency and High Availability are achieved by leveraging the functionalities of the orchestrator and those of the underlying infrastructure. See the high availability and scaling topics in the prerequisites page.
Supported EC deployments are Kubernetes based; however, Kubernetes comes in different flavors and with different levels of integration with the underlying infrastructure (i.e. the data center technology). Check the list of supported orchestration platforms and versions.
The Everyware Cloud software distribution comes with a set of Helm Charts that are used to deploy and configure the application within the supported orchestration platforms. See the installation guide for details.
Each EC service component runs within a Kubernetes Pod. The pod is configured with the Always Restart policy. When a pod fails for some reason, Kubernetes control plane services will restart it so that the service component will be up and running again as soon as possible.
With just the restart policy in place, when the node where a pod runs fails, the pod and ultimately the service components in it are no longer available. Pod replication addresses this by setting a desired number of pod replicas across the cluster. If a node fails, the Kubernetes control plane will start its pods elsewhere in the cluster, wherever there are enough resources to run them. If too many replicas are present for some reason, the control plane will terminate the excess replicas to keep the overall number consistent with the desired target.
The default number of replicas for EC service components is one. See the next section, Horizontal Scaling, for more details about which service components can be scaled to more than one replica.
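The replication behavior described above can be pictured with a toy model. The sketch below is not how Kubernetes is implemented; it is a simplified illustration with a hypothetical `reconcile` function and a hypothetical node dictionary layout. It shows the two corrective actions: starting replicas on healthy nodes when too few are running, and removing replicas when too many are running.

```python
def reconcile(desired, nodes):
    """Toy model of one replica reconciliation pass.

    nodes maps a node name to {"healthy": bool, "capacity": int, "pods": int}.
    Returns a new mapping; the input is left untouched.
    """
    state = {name: dict(info) for name, info in nodes.items()}

    # Replicas that were running on failed nodes are lost.
    for info in state.values():
        if not info["healthy"]:
            info["pods"] = 0

    running = sum(info["pods"] for info in state.values())

    # Too few replicas: start new ones on healthy nodes with free capacity.
    for info in state.values():
        while running < desired and info["healthy"] and info["pods"] < info["capacity"]:
            info["pods"] += 1
            running += 1

    # Too many replicas: remove the excess.
    for info in state.values():
        while running > desired and info["pods"] > 0:
            info["pods"] -= 1
            running -= 1

    return state


cluster = {
    "node-a": {"healthy": False, "capacity": 2, "pods": 1},  # failed node
    "node-b": {"healthy": True, "capacity": 2, "pods": 1},
}
after = reconcile(desired=2, nodes=cluster)
```

In this example the replica lost on the failed node is started again on the surviving node, provided that node has spare capacity. If no healthy node has room, the desired number cannot be reached, which is why worker node redundancy matters.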
The workload may grow beyond the currently deployed capacity, so that requests from EC clients cannot be fulfilled as desired. Pod Replication helps to cope with these situations: the capacity of a service can be increased by increasing the number of replicas of a pod across the cluster.
EC supports horizontal scaling for two of its service components: the Messaging Service (MQTT) and the RESTful API Service. Check how to replicate these two services here.
Increasing and decreasing the number of replicas is a manual operation. EC administrators should quantify the current workload, estimate the workload profile in the short and medium term, and plan the number of replicas required and, as a consequence, the number of nodes. Administrators should also take into account average and peak workloads in order to provide the desired service level.
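The planning step can be reduced to simple arithmetic once you have an estimate of the peak workload and of what a single replica can sustain. The sketch below is only an illustration: the function name and all figures are hypothetical, and the per-replica capacity has to come from your own measurements.

```python
import math


def replicas_needed(peak_load, capacity_per_replica, spare_replicas=0):
    """Replicas required to serve the peak load, plus optional spares."""
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    # Round up: a partially used replica is still a whole replica.
    base = max(1, math.ceil(peak_load / capacity_per_replica))
    return base + spare_replicas


# Hypothetical figures: 25,000 device connections expected at peak,
# 10,000 connections sustained by one replica, one spare for continuity.
planned = replicas_needed(25_000, 10_000, spare_replicas=1)
```

With these hypothetical figures the plan is three replicas to carry the peak plus one spare, four in total.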
If a data center becomes totally or partially unavailable, the cluster and the application hosted in it may become unavailable as well. To avoid this situation, some infrastructure providers can partition a data center into multiple Failure Zones so that if a major failure happens in one zone, the others are not affected. With such a setup in place, one way to keep the cluster and the application from being significantly impacted by a failure is to spread the nodes of the same Kubernetes cluster across multiple failure zones. EC doesn’t provide spreading of nodes as a feature in its software distribution; however, if the feature is available, an EC deployment may benefit from it.
Resiliency and High Availability also depend on the number of worker nodes. If the Kubernetes cluster has just one big worker node and that node crashes, there’s no other node on which the control plane services can start the pods again. In order to leverage the replication feature, it is good practice to plan for some degree of redundancy in the number of worker nodes. Administrators should plan the number of worker nodes in accordance with the required service level.
Pod restart and replication are features controlled by the services in the Kubernetes Control Plane, so the control plane has to be at least as available as the application. A highly available control plane requires multiple dedicated Control Plane nodes; check the documentation of the orchestration platform used to verify how High Availability of the Control Plane is implemented.
Autoscaling is a functionality that allows new nodes to be automatically provisioned to a Kubernetes cluster based on some conditions. For example, a worker node that failed can be replaced by a new one, allowing Pod Replication to start pods on it. Autoscaling is infrastructure specific; some orchestration platforms (e.g. Amazon EKS and Azure AKS) support autoscaling functionalities, while others do not. EC doesn’t provide this feature in its software distribution; however, if the feature is available, an EC deployment may benefit from it.
EC relies on MariaDB and, optionally, on Elasticsearch. The use of a Redis cache is also recommended to increase data access performance and reduce the workload on the relational database. All these components should be deployed in a configuration appropriate to the availability targets of the overall system.
All the named services natively support highly available configurations; however, the available options depend on the infrastructure provider. Check with the provider of your services which configurations are supported.
The purpose of this feature is to support the creation of dedicated resource pools that can be assigned to distinct accounts, so that all the devices of an account interact only with the resource pool assigned to that account. Everyware Cloud supports partitioning of workloads through the creation of multiple instances of a service. The services supporting this feature are:
- Messaging Service (MQTT connections)
- Remote Access Service (on-demand VPN connection)
Workload partitioning works well in scenarios where, from a device perspective, the service level of an account needs to be decoupled from the service level of another account. Since the account has its own dedicated resource pools, even if the resource pools assigned to other accounts have some issue, the resource pool assigned to the account will continue working (and vice versa).
Resource pool assignment is only possible for the EC root account and the level-one accounts. All the child accounts that descend from the same level-one account also share the same resource pool.
For more information on how to set up multiple service instances of one type, see the section Installation using Helm Charts.
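The assignment rule for child accounts can be sketched as a walk up the account tree until a level-one account is reached. Everything in the snippet below is hypothetical except the ec-sys root account name: the account names, the pool names, and the two dictionaries are made up for illustration and do not reflect how EC stores this information.

```python
ROOT_ACCOUNT = "ec-sys"

# Hypothetical account tree, expressed as child -> parent.
PARENTS = {
    "acme": "ec-sys",
    "acme-eu": "acme",
    "acme-eu-plant1": "acme-eu",
    "globex": "ec-sys",
}

# Hypothetical pool assignments: only the root and level-one accounts appear.
POOLS = {
    "ec-sys": "pool-default",
    "acme": "pool-a",
    "globex": "pool-b",
}


def level_one_ancestor(account):
    """Return the level-one account an account descends from (or the root)."""
    current = account
    while current != ROOT_ACCOUNT and PARENTS[current] != ROOT_ACCOUNT:
        current = PARENTS[current]
    return current


def resource_pool(account):
    """Resolve the resource pool that serves the devices of an account."""
    return POOLS[level_one_ancestor(account)]
```

In this made-up tree, devices of acme-eu-plant1 are served by the pool assigned to acme, while devices of globex use a separate pool and are therefore unaffected by issues in the acme pool.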
In this section we’ll discuss some concepts regarding the sizing of the Kubernetes cluster that hosts an EC instance. We will focus on worker nodes; Control Plane nodes and other prerequisite services, such as the databases, are out of the scope of this discussion.
Take some time to tune the size of your infrastructure. You can contact the Eurotech team for more guidance on this topic.
When sizing a cluster you want to determine the computational resources assigned to your EC instance. These resources have to be at least adequate to sustain the workload and service level required by your solution. Some dimensions that you should consider for the estimation of the workload are:
- the number of device connections
- the type of interaction with the device (management, log and diagnostic, telemetry)
- the number of message routes to external endpoints
- the number of REST clients and the amount of requests
- the number of remote device access connections (VPN service)
- projected changes of the workload profile over time
- potential workload peaks
Next, evaluate whether your solution needs redundancy at the application level (e.g. whether the MQTT service or the RESTful API service needs extra replicas to provide service continuity in case of failures).
With this information at hand, you should determine the size and the number of EC pods that need to be deployed, with the goal of finding the minimum set that satisfies all the requirements.
Pod size is mainly determined by RAM and CPU. Of course these resources are limited by the characteristics of the worker node that hosts the pod.
The Helm Charts distributed with Everyware Cloud come with resource limit defaults that comply with the minimum requirements of the services they deploy. System administrators can tune the resource limits by changing the defaults.
When you choose the machine size and number for the Worker nodes you must ensure that each pod replica has at least one node with enough resources available. Consider that a Worker node can host multiple pods at the same time so that the number of nodes may be less than the number of pods. Assignment of pods to nodes is done automatically by Kubernetes according to configured policies.
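A quick way to sanity-check a candidate setup is to verify that every pod replica fits on some node. The sketch below uses a simple first-fit placement over CPU (in millicores) and RAM (in MiB). It is not the Kubernetes scheduling algorithm, which considers many more factors, and all pod names, node names, and sizes are hypothetical.

```python
def place_pods(pods, nodes):
    """Assign each pod to the first node with enough free CPU and RAM.

    pods and nodes are lists of (name, cpu_millicores, ram_mib) tuples.
    Returns a dict pod name -> node name; raises ValueError if a pod
    does not fit anywhere.
    """
    free = {name: [cpu, ram] for name, cpu, ram in nodes}
    placement = {}
    # Place the largest pods first so they are not starved by small ones.
    for pod_name, cpu, ram in sorted(pods, key=lambda p: (p[1], p[2]), reverse=True):
        for node_name, (free_cpu, free_ram) in free.items():
            if cpu <= free_cpu and ram <= free_ram:
                free[node_name][0] -= cpu
                free[node_name][1] -= ram
                placement[pod_name] = node_name
                break
        else:
            raise ValueError(f"no node has room for pod {pod_name}")
    return placement


# Hypothetical sizes: three pods on two identical worker nodes.
pods = [("broker", 2000, 4096), ("api", 1000, 2048), ("console", 1000, 2048)]
nodes = [("node-1", 2000, 4096), ("node-2", 2000, 4096)]
placement = place_pods(pods, nodes)
```

Here three pods fit on two nodes because the two smaller pods share one node. A real check should also leave headroom for system components running on each node.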
When defining the machines for the Worker nodes, you should take into account:
- the capacity requirements of the pods
- the machine sizes available from the infrastructure provider
This often results in several possible machine setups. Typically, you have to decide whether to stack multiple pods on a few high-capacity nodes or to deploy one or a few pods on each of many lower-capacity nodes.
Let’s suppose that the cost of the machines relative to their size (CPU/RAM) is not a deciding factor, since it can be considered linear (at least for the main infrastructure providers like AWS, Azure and Google Cloud). Suppose also that the system overhead is negligible. Each of the options above has pros and cons:
- Few nodes - High capacity
  - Less maintenance activities
  - Lower system overhead
  - High availability is more difficult to implement
  - When a node goes down a big fraction of the total capacity is lost
  - Redundancy may lower resource utilization
  - Redundancy may be more expensive
- Many nodes - Low capacity
  - High availability is easier to implement
  - When a node goes down a small fraction of the total capacity is lost
  - Higher resource utilization
  - More maintenance activities
  - Higher system overhead
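The capacity and utilization points in the comparison above can be made concrete with a small calculation. The sketch assumes identical nodes, a workload expressed in CPU cores, and one spare node to tolerate a single node failure; the function names and all figures are hypothetical.

```python
import math


def nodes_with_spare(required_capacity, node_capacity, tolerated_failures=1):
    """Nodes needed to carry the workload even after some nodes fail."""
    return math.ceil(required_capacity / node_capacity) + tolerated_failures


def capacity_lost_fraction(node_count):
    """Share of total capacity lost when one of the identical nodes fails."""
    return 1 / node_count


def utilization(required_capacity, node_capacity, tolerated_failures=1):
    """Share of the provisioned capacity the workload actually uses."""
    count = nodes_with_spare(required_capacity, node_capacity, tolerated_failures)
    return required_capacity / (count * node_capacity)


# Hypothetical workload of 16 CPU cores, sized two different ways.
few_big = nodes_with_spare(16, 16)     # 16-core nodes
many_small = nodes_with_spare(16, 4)   # 4-core nodes
```

With 16-core nodes, two nodes are needed, a single failure removes half of the total capacity, and utilization is 50%. With 4-core nodes, five nodes are needed, a single failure removes a fifth of the capacity, and utilization is 80%.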
Where high availability is paramount, create a cluster with smaller nodes and few pods per node, with a proper amount of redundancy.
Depending on your needs, you may consider sizing nodes uniformly so that Kubernetes can reuse nodes for different pods when needed.
As for the machine type, on average EC containers are balanced between RAM and CPU usage, so general-purpose machines should fit.
Consider that machines can be added to or removed from the cluster along the way, so the cluster size can adapt to changes in the workload. For example, a cluster can start small and grow as new device connections and integrations with third-party applications are added to the solution.