HSCloud Clusters
================

Current cluster: `k0.hswaw.net`

Accessing via kubectl
---------------------

    prodaccess # get a short-lived certificate for your use via SSO
    kubectl version
    kubectl top nodes

Every user gets a `personal-$username` namespace. Feel free to use it for your own purposes, but watch out for resource usage!
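
To avoid passing `-n` on every command, the personal namespace can be made the default for the current kubectl context. The helpers below are a minimal sketch: the function names are hypothetical, and they assume your cluster username matches your local login name unless you pass one explicitly.

```shell
# Print the personal namespace for a user.
# Assumption: the cluster username equals the local login name ($USER)
# unless one is passed as the first argument.
personal_namespace() {
    printf 'personal-%s\n' "${1:-$USER}"
}

# Make that namespace the default for the current kubectl context.
use_personal_namespace() {
    kubectl config set-context --current --namespace="$(personal_namespace "$@")"
}
```

After `prodaccess`, running `use_personal_namespace` (or `use_personal_namespace alice` for an explicit username) should let plain `kubectl get pods` operate on your own namespace.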

Persistent Storage
------------------

HDDs on bc01n0{1-3}. 3TB total capacity.

The following storage classes use this cluster:

 - `waw-hdd-paranoid-1` - 3 replicas
 - `waw-hdd-redundant-1` - erasure coded 2.1
 - `waw-hdd-yolo-1` - unreplicated (you _will_ lose your data)
 - `waw-hdd-redundant-1-object` - erasure coded 2.1 object store
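
As an illustration of how one of the block storage classes above might be requested, the sketch below prints a PersistentVolumeClaim manifest. The function name, the claim name, the size and the `ReadWriteOnce` access mode are assumptions made for the example, not settings taken from this cluster.

```shell
# Print a PersistentVolumeClaim manifest.
#   $1: claim name (hypothetical, pick your own)
#   $2: storage class, e.g. waw-hdd-redundant-1
#   $3: requested size, e.g. 1Gi
# Assumption: ReadWriteOnce is an acceptable access mode for your workload.
pvc_manifest() {
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
spec:
  storageClassName: $2
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: $3
EOF
}
```

The output can be piped straight into kubectl, for example `pvc_manifest scratch waw-hdd-redundant-1 1Gi | kubectl -n personal-$USER apply -f -`.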

Rados Gateway (S3) is available at https://object.ceph-waw2.hswaw.net/. To create a user, ask an admin.

PersistentVolumes currently bound to PVCs get automatically backed up (hourly for the next 48 hours, then once every 4 weeks, then once every month for a year).

Administration
==============

Provisioning nodes
------------------

 - bring up a new node with NixOS, running the `configuration.nix` from bootstrap (to be documented)
 - `bazel run //cluster/clustercfg nodestrap bc01nXX.hswaw.net`

Ceph - Debugging
----------------

We run Ceph via Rook. The Rook operator is running in the `ceph-rook-system` namespace. To debug Ceph issues, start by looking at its logs.
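
One way to pull those operator logs is sketched below. The function name is hypothetical, and the label selector `app=rook-ceph-operator` is Rook's upstream default rather than something confirmed for this cluster; adjust it if the operator pods are labelled differently.

```shell
# Show recent Rook operator logs.
#   $1: number of lines to show (defaults to 200)
# Assumption: operator pods carry the label app=rook-ceph-operator.
rook_operator_logs() {
    kubectl -n ceph-rook-system logs -l app=rook-ceph-operator --tail="${1:-200}"
}
```

If the selector matches nothing, `kubectl -n ceph-rook-system get pods --show-labels` shows which labels are actually in use.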

A dashboard is available at https://ceph-waw2.hswaw.net/. To get the admin password, run:

    kubectl -n ceph-waw2 get secret rook-ceph-dashboard-password -o yaml | grep "password:" | awk '{print $2}' | base64 --decode ; echo


Ceph - Backups
--------------

Kubernetes PVs backed by Ceph RBDs get backed up using Benji. An hourly cronjob runs in every Ceph cluster. You can also manually trigger a run by doing:

    kubectl -n ceph-waw2 create job --from=cronjob/ceph-waw2-benji ceph-waw2-benji-manual-$(date +%s)
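
To follow a manual run through to completion, the trigger above can be wrapped as in the sketch below. The function name and the default one-hour timeout are assumptions; a real backup may need longer.

```shell
# Trigger a manual Benji run, wait for it to complete, then print its logs.
#   $1: how long to wait (defaults to 1h, an arbitrary assumption)
run_benji_backup() {
    job="ceph-waw2-benji-manual-$(date +%s)"
    kubectl -n ceph-waw2 create job --from=cronjob/ceph-waw2-benji "$job" &&
    kubectl -n ceph-waw2 wait --for=condition=complete --timeout="${1:-1h}" "job/$job" &&
    kubectl -n ceph-waw2 logs "job/$job"
}
```

Note that `kubectl wait --for=condition=complete` does not return early when a job fails; it only gives up once the timeout expires, so check `kubectl -n ceph-waw2 get jobs` if a run seems stuck.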

Ceph ObjectStorage pools (RADOSGW) are _not_ backed up yet!

Ceph - Object Storage
---------------------

To create an object store user, consult the rook.io manual (https://rook.io/docs/rook/v0.9/ceph-object-store-user-crd.html).
The user authentication secret is generated in the Ceph cluster namespace (`ceph-waw2`),
and thus may need to be manually copied into the application namespace (see the comment in
`app/registry/prod.jsonnet`).

`tools/rook-s3cmd-config` can be used to generate a test configuration file for s3cmd.
Remember to append `:default-placement` to your region name (e.g. `waw-hdd-redundant-1-object:default-placement`).
