
Commit

Merge pull request #541 from flavio/aws-docs
Add AWS docs
nkoranova authored Feb 27, 2020
2 parents bf455df + 8d4d47f commit c49c20e
Showing 6 changed files with 295 additions and 129 deletions.
5 changes: 4 additions & 1 deletion adoc/architecture-description.adoc
@@ -80,6 +80,8 @@
Versioning scheme: `x.y.z`
* SUSE OpenStack Cloud 8
* VMware ESXi {vmware_version}
* Bare Metal
* Amazon Web Services (technological preview)


== Supported Architectures

@@ -329,9 +331,10 @@
it is necessary to have a local RMT server mirroring the CaaSP
repositories, a mirror of the SUSE container registry and a mirror of
the SUSE helm chart repository.


=== Control plane node certificates

Certificates are stored under `/etc/kubernetes/pki` on the control plane nodes.

==== CA certificates

2 changes: 1 addition & 1 deletion adoc/book_deployment.adoc
@@ -54,7 +54,7 @@
include::deployment-preparation.adoc[Deployment Preparations, leveloffset=+1]

include::deployment-ecp.adoc[SUSE OpenStack Cloud Instructions, leveloffset=+1]

include::deployment-aws.adoc[Amazon AWS Cloud Instructions, leveloffset=+1]

include::deployment-vmware.adoc[VMware Deployment Instructions, leveloffset=+1]

209 changes: 154 additions & 55 deletions adoc/deployment-aws.adoc
@@ -1,98 +1,197 @@
== Deployment on Amazon AWS

Deployment on Amazon Web Services (AWS) is currently a tech preview.

.Preparation Required
[NOTE]
====
You must have completed <<deployment.preparations>> to proceed.
====

You will use {tf} to deploy the whole infrastructure described in
<<architecture-aws>>. Then you will use the `skuba` tool to bootstrap the
{kube} cluster on top of it.


[[architecture-aws]]
=== AWS Deployment

The AWS deployment created by our {tf} template files leads to the
creation of the infrastructure described in the next paragraphs.

==== Network

All of the infrastructure is created inside of a user-specified AWS region.
The resources are currently all located inside of the same availability
zone.

The {tf} template files create a dedicated Amazon Virtual Private Cloud (link:https://aws.amazon.com/vpc/[VPC])
with *two subnets*: "public" and "private". Instances inside of the *public subnet* have
link:https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html[Elastic IP addresses]
associated, hence they are reachable from the internet. Instances inside of the *private subnet* are not reachable from the internet.
However, they can still reach external resources; for example, they can still
perform operations like downloading updates and pulling container images from
external container registries. Communication between the public and the private subnet is allowed.
All the control plane instances are currently located inside of the public
subnet. Worker instances are inside of the private subnet.

Both control plane and worker nodes have tailored
link:https://docs.aws.amazon.com/vpc/latest/userguide/VPC_SecurityGroups.html[Security Groups]
assigned to them. These are based on the networking requirements described
in <<sysreq-networking>>.

==== Load Balancer

The {tf} template files take care of creating a
link:https://aws.amazon.com/elasticloadbalancing/[Classic Load Balancer]
which exposes the Kubernetes API service deployed on the control plane
nodes.

The load balancer exposes the following ports:

* `6443`: Kubernetes API server
* `32000`: Dex (OpenID Connect)
* `32001`: Gangway (RBAC Authentication)
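
If you want to check that the load balancer is reachable once it has been
created, a quick TCP probe of these ports from the management machine is
enough. This is only an illustrative sketch: the DNS name below is a
placeholder for the value shown in the {tf} output or the AWS console, and
before the cluster is bootstrapped the services behind ports `32000` and
`32001` will not answer yet.

[source,bash]
----
# Placeholder DNS name; use the "DNS name" of your Classic Load Balancer.
LB="my-cluster-lb-123456789.eu-central-1.elb.amazonaws.com"

# Check that the listeners accept TCP connections (5 second timeout each).
for port in 6443 32000 32001; do
    nc -zv -w 5 "$LB" "$port"
done
----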

[[architecture-aws-vpc-peering]]
==== Join Already Existing VPCs

The {tf} template files allow the user to have the
{productname} VPC join one or more existing VPCs.

This is achieved by the creation of
link:https://docs.aws.amazon.com/vpc/latest/peering/what-is-vpc-peering.html[VPC peering links]
and dedicated
link:https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Route_Tables.html[Route tables].

This feature allows {productname} to access and be accessed by resources defined
inside of other VPCs. For example, this capability can be used to register all
the {productname} instances against a {susemgr} server running inside of a
private VPC.

Current limitations:

* The VPCs must belong to the same AWS region.
* The VPCs must be owned by the same user who is creating the {productname}
infrastructure via {tf}.
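
As an illustration, the peering links created by {tf} can be inspected with
the AWS CLI, assuming it is installed and configured with the same
credentials and region used for the deployment (the region below is only an
example):

[source,bash]
----
# List VPC peering connections and their state in the deployment region.
aws ec2 describe-vpc-peering-connections \
    --region eu-central-1 \
    --query 'VpcPeeringConnections[].{Id:VpcPeeringConnectionId,State:Status.Code}'
----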

==== IAM Profiles

The
link:https://kubernetes.io/docs/concepts/cluster-administration/cloud-providers/#aws[AWS Cloud Provider]
integration for {kube} requires special
link:https://aws.amazon.com/iam/[IAM] profiles to be associated with the control
plane and worker instances. {tf} can create these profiles or leverage
existing ones, depending on the rights of the user invoking {tf}.

The {tf} link:https://www.terraform.io/docs/providers/aws/index.html[AWS provider]
requires your credentials. These can be obtained by following these steps:

* Log in to the AWS console.
* Click on your username in the upper right hand corner to reveal the drop-down menu.
* Click on menu:My Security Credentials[].
* Click menu:Create Access Key[] on the "Security Credentials" tab.
* Note down the newly created _Access_ and _Secret_ keys.
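
One common way to make these credentials available to the {tf}
link:https://www.terraform.io/docs/providers/aws/index.html[AWS provider] is
to export them as environment variables in the shell from which you will run
{tf}. This is only a sketch; the values are placeholders for the keys created
above, and the region is an example:

[source,bash]
----
# Placeholders for the Access and Secret keys noted above.
export AWS_ACCESS_KEY_ID="<ACCESS_KEY>"
export AWS_SECRET_ACCESS_KEY="<SECRET_KEY>"
# Example region; pick the region you want to deploy into.
export AWS_DEFAULT_REGION="eu-central-1"
----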

=== Deploying the Infrastructure

On the management machine, find the {tf} template files for AWS in
`/usr/share/caasp/terraform/aws`. These files have been installed as part of
the management pattern (`sudo zypper in -t pattern SUSE-CaaSP-Management`).

Copy this folder to a location of your choice as the files need adjustment.

----
mkdir -p ~/caasp/deployment/
cp -r /usr/share/caasp/terraform/aws/ ~/caasp/deployment/
cd ~/caasp/deployment/aws/
----

Once the files are copied, create a `terraform.tfvars` file from the
provided `terraform.tfvars.example`:

----
cp terraform.tfvars.example terraform.tfvars
----

Edit the `terraform.tfvars` file and add/modify the following variables:

include::deployment-terraform-example.adoc[tags=tf_aws]

[TIP]
====
You can set the timezone and other parameters before deploying the nodes
by modifying the cloud-init template:

* `~/caasp/deployment/aws/cloud-init/cloud-init.yaml.tpl`
====

You can enter the registration code for your nodes in
`~/caasp/deployment/aws/registration.auto.tfvars` instead of the
`terraform.tfvars` file.

Replace `<CAASP_REGISTRATION_CODE>` with the code from <<registration_code>>.

[source,json]
----
# SUSE CaaSP Product Key
caasp_registry_code = "<CAASP_REGISTRATION_CODE>"
----

This is required so all the deployed nodes can automatically register
with {scc} and retrieve packages.

Now you can deploy the nodes by running:

----
terraform init
terraform plan
terraform apply
----

Check the output for the actions to be taken. Type "yes" and confirm with
kbd:[Enter] when ready.
{tf} will now provision all the cluster infrastructure.

.Public IPs for Nodes
[IMPORTANT]
====
`skuba` currently cannot access nodes through a bastion host, so all
the nodes in the cluster must be directly reachable from the machine where
`skuba` is being run.
`skuba` could be run from one of the master nodes or from a pre-existing bastion
host located inside of a joined VPC as described in
<<architecture-aws-vpc-peering>>.
====

.Note Down IP/FQDN for the Nodes
[IMPORTANT]
====
The IP addresses and FQDNs of the generated machines will be displayed in the
{tf} output during the cluster node deployment. You need this information
later to deploy {productname}.
This information can be obtained at any time by executing the
`terraform output` command within the directory from which you executed
{tf}.
====
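
For example, to display the information again later, run `terraform output`
from the deployment directory. The output variable names shown below are
purely illustrative; the actual names are defined by the {tf} template files:

[source,bash]
----
cd ~/caasp/deployment/aws/
terraform output

# Illustrative output only; the real variable names come from the templates:
#   control_plane_public_ip   = ["3.121.10.11"]
#   control_plane_private_dns = ["ip-172-28-1-10.eu-central-1.compute.internal"]
#   elb_address               = "my-cluster-lb-123456789.eu-central-1.elb.amazonaws.com"
----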

=== Logging into the Cluster Nodes

Connecting to the cluster nodes can be accomplished only via SSH key-based
authentication, thanks to the SSH public key injection done earlier via
`cloud-init`. You can use the predefined `ec2-user` user to log in.

If the ssh-agent is running in the background, run:

----
ssh ec2-user@<node-ip-address>
----

Without the ssh-agent running, run:

----
ssh ec2-user@<node-ip-address> -i <path-to-your-ssh-private-key>
----

Once connected, you can execute commands using password-less `sudo`.
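
If you do not have an agent running yet, a minimal setup looks like the
following. The key path is a placeholder for the private key matching the
public key injected via `cloud-init`; having the key loaded in an agent is
also convenient when running `skuba`, which connects to the nodes over SSH.

[source,bash]
----
# Start an agent in the current shell and add your private key to it.
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa

# The nodes can then be reached without specifying the key explicitly:
ssh ec2-user@<node-ip-address>
----
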
51 changes: 49 additions & 2 deletions adoc/deployment-bootstrap.adoc
@@ -151,8 +151,28 @@
and automatically manage resources like the Load Balancer, Nodes (Instances), Networking
and Storage services.

If you want to enable cloud provider integration with different cloud platforms,
initialize the cluster with the flag `--cloud-provider <CLOUD PROVIDER>`.
The only currently available options are `openstack` and `aws`,
but more options are planned.

.Cleanup
[IMPORTANT]
====
By enabling CPI providers, your Kubernetes cluster will be able to
provision cloud resources on its own (e.g. Load Balancers, Persistent Volumes).
You will have to manually clean these resources before you destroy the cluster
with {tf}.
Not removing resources like Load Balancers created by the CPI will result in
{tf} timing out during `destroy` operations.
Persistent volumes created with the `retain` policy will exist inside of
the external cloud infrastructure even after the cluster is removed.
====
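
As a rough sketch of such a cleanup, you can list the cluster resources that
are typically backed by cloud infrastructure and delete them explicitly
before running `terraform destroy`. The object names in the comments are
placeholders:

[source,bash]
----
# Services of type LoadBalancer are backed by cloud load balancers.
kubectl get services --all-namespaces | grep LoadBalancer

# Persistent volumes may be backed by cloud block storage.
kubectl get persistentvolumeclaims --all-namespaces
kubectl get persistentvolumes

# Delete the matching objects explicitly, for example:
#   kubectl delete service <name> --namespace <namespace>
#   kubectl delete persistentvolumeclaim <name> --namespace <namespace>
----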

====== OpenStack CPI

Define the cluster using the following command:

[source,bash]
----
# ...
----

@@ -237,6 +257,31 @@
When cloud provider integration is enabled, it's very important to bootstrap and join nodes with the same node names that they have inside `OpenStack`, as
these names will be used by the `Openstack` cloud controller manager to reconcile node metadata.
====

====== Amazon Web Services (AWS) CPI

Define the cluster using the following command:

[source,bash]
----
skuba cluster init --control-plane <LB IP/FQDN> --cloud-provider aws my-cluster
----

Running the above command will create a directory `my-cluster/cloud/aws` with a
`README.md` file in it. No further configuration files are needed.

The supported format and content can be found in the
link:https://kubernetes.io/docs/concepts/cluster-administration/cloud-providers/#aws[official Kubernetes documentation].


[IMPORTANT]
====
When cloud provider integration is enabled, it's very important to bootstrap and join nodes with the same node names that they have inside `AWS`, as
these names will be used by the `AWS` cloud controller manager to reconcile node metadata.
You can use the "private dns" values provided by the {tf} output.
====
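
A hedged sketch of what this looks like in practice, run from within the
cluster definition directory created by `skuba cluster init`, with
placeholder addresses and EC2 private DNS names taken from the {tf} output:

[source,bash]
----
# Bootstrap the first control plane node, using its private DNS name as the
# node name and a reachable IP/FQDN as --target (all values are placeholders).
skuba node bootstrap --user ec2-user --sudo \
    --target <control-plane-ip> ip-172-28-1-10.eu-central-1.compute.internal

# Join a worker node in the same way.
skuba node join --role worker --user ec2-user --sudo \
    --target <worker-ip> ip-172-28-2-20.eu-central-1.compute.internal
----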


===== Integrate External LDAP TLS

. Open the `Dex` `ConfigMap` in `my-cluster/addons/dex/dex.yaml`
@@ -465,6 +510,7 @@
To talk to your cluster, you must be in the `my-cluster` directory when running
[TIP]
====
To make usage of {kube} tools easier, you can store a copy of the `admin.conf` file as link:{kubedoc}concepts/configuration/organize-cluster-access-kubeconfig/[kubeconfig].
====

[source,bash]
----
cp admin.conf ~/.kube/config
----

[WARNING]
====
The configuration file contains sensitive information and must be handled in a secure fashion. Copying it to a shared user directory might grant access to unwanted users.
====

5 changes: 4 additions & 1 deletion adoc/deployment-sysreqs.adoc
@@ -7,8 +7,9 @@
Currently we support:
* SUSE OpenStack Cloud 8
* VMware ESXi {vmware_version}
* Bare Metal x86_64
* Amazon Web Services (technological preview)

=== Nodes

You will need at least two machines:

@@ -206,6 +207,8 @@
Let's see two different examples:
----
<1> Here we get a value of 28705usec (28ms) so the storage clearly does not meet the requirements.


[[sysreq-networking]]
=== Networking

The management workstation needs at least the following networking permissions:
