
Overview


Computer security safeguards computer systems and data against theft, unauthorized access, and other disasters; it involves both preventing and detecting unauthorized access to your computer system. Data security refers to protecting data from unauthorized access or loss. This process includes the following:

  • Data Backup
  • Data Restore

Data Backup

Backup refers to making copies of data or data files to use in the event the original data or data files are lost or destroyed.

In information technology, a data backup is a copy of computer data taken and stored elsewhere to restore the original after a data loss event. A backup system contains at least one copy of all data considered worth saving. The data storage requirements can be extensive. An information repository model may be used to provide structure to this storage. Typically, backup data includes documents, media files, configuration and registry files, and machine images.

Data Restore

Data restore is the process of copying backup data from secondary storage and restoring it to its original location or a new location. A restoring process is carried out to return lost, stolen, or damaged data to its original condition or move it to a new location.

Chef Automate High Availability (HA) Backups

You can manually back up the OpenSearch, Postgres, and Chef Automate Server data and configurations. The built-in Chef Automate CLI has no automated backup procedure that periodically backs up the data.

Backup Types

You can back up the Chef Automate HA data using either an external file system (EFS) or an Amazon S3 bucket.

What is an EFS System?

External file system refers to any non-volatile storage device external to the computer. It can be any storage device that serves as an addition to the computer’s primary storage, RAM, and cache memory. EFS aids in backing up the data used for future restores or disaster recovery, long-term archiving of data that is not frequently accessed, and storage of non-critical data in lower-performing, less expensive drives. These systems do not directly interact with the computer’s CPU (Central Processing Unit).

External file systems include devices such as solid-state drives (SSDs), hard disk drives (HDDs), cloud storage, CD-ROM drives, DVD drives, Blu-ray drives, USB flash drives, SD cards, and tape drives.

What is Amazon’s S3 Bucket?

An Amazon S3 bucket is a public cloud storage resource available in Amazon Web Services (AWS) Simple Storage Service (S3), an object storage offering. Amazon S3 buckets, similar to file folders, store objects consisting of data and its descriptive metadata. Amazon S3 is a service built to store, protect, and retrieve data from buckets at any time, from anywhere, on any device. Common uses include websites, mobile apps, archiving, data backups and restorations, IoT devices, enterprise application storage, and the underlying storage layer for a data lake.

With the AWS Free Usage Tier, you can get started with Amazon S3 for free in all regions except the AWS GovCloud Regions. See the AWS Free Tier documentation for more information.

Taking a Backup with an Amazon S3 Bucket

This section explains how to back up External Elasticsearch (ES) and PostgreSQL data to an Amazon S3 bucket.

Note

Ensure you perform the backup configuration before deploying the Chef Automate High Availability (HA) cluster.

Pre-Backup Configurations

  • Configure an Amazon S3 storage bucket.
  • Configure External Elasticsearch.
  • Configure AWS credentials and generate access key ID and secret access key.
  • Create an IAM user in your AWS account to access the S3 bucket.
  • Provide AdministratorAccess, APIGatewayAdministrator (for AWS, AmazonAPIGatewayAdministrator), and S3FullAccess (for AWS, AmazonS3FullAccess) permissions to the IAM user.
  • Add the IAM role to the IAM user.
  • Create an IAM policy to be associated with the IAM role. In the Elasticsearch access policy, add the ARN of your bucket to the resource section.
  • Ensure Chef Automate has the basic permissions to run the backup operation.
  • Ensure all Chef Automate services are up and running. You can check the status by typing the command, sudo chef-automate status (a verification sketch follows this list).
  • Create a .toml file.
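
Before moving on, it can help to confirm the prerequisites from the command line. The sketch below is illustrative only: it assumes the AWS CLI is installed and uses bucket-name as a stand-in for your bucket; neither assumption comes from this guide.

# Confirm all Chef Automate services are up and running
sudo chef-automate status

# Confirm which IAM identity your credentials resolve to (requires the AWS CLI)
aws sts get-caller-identity

# Confirm that identity can list the backup bucket ("bucket-name" is a placeholder)
aws s3 ls s3://bucket-name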

Backup Procedure

  1. Navigate to your deploy workspace. For example, cd /hab/a2_deploy_workspace.
  2. Create the configs directory by typing the command, mkdir configs.
  3. Create a .toml file by typing the command, vi configs/automate.toml.
  4. Copy the following TOML configuration into this file, replacing the values of bucket and name with your bucket name.
[global.v1.external.elasticsearch.backup]
  enable = true
  location = "s3"

[global.v1.external.elasticsearch.backup.s3]
  # bucket (required): The name of the bucket
  bucket = "bucket-name"

  # base_path (optional): The path within the bucket where backups should be stored.
  # If base_path is not set, backups will be stored at the root of the bucket.
  base_path = "elasticsearch"

  # name of an s3 client configuration you create in your elasticsearch.yml
  # see https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-s3-client.html
  # for full documentation on how to configure client settings on your
  # Elasticsearch nodes
  client = "default"

[global.v1.external.elasticsearch.backup.s3.settings]
  ## The meaning of these settings is documented in the S3 Repository Plugin
  ## documentation. See the following links:
  ## https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-s3-repository.html

  ## Backup repo settings
  # compress = false
  # server_side_encryption = false
  # buffer_size = "100mb"
  # canned_acl = "private"
  # storage_class = "standard"

  ## Snapshot settings
  # max_snapshot_bytes_per_sec = "40mb"
  # max_restore_bytes_per_sec = "40mb"
  # chunk_size = "null"

  ## S3 client settings
  # read_timeout = "50s"
  # max_retries = 3
  # use_throttle_retries = true
  # protocol = "https"

[global.v1.backups]
  location = "s3"

[global.v1.backups.s3.bucket]
  # name (required): The name of the bucket
  name = "bucket-name"

  # endpoint (required): The endpoint for the region the bucket lives in.
  # See https://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
  endpoint = "https://s3.amazonaws.com"

  # base_path (optional): The path within the bucket where backups should be stored.
  # If base_path is not set, backups will be stored at the root of the bucket.
  base_path = "automate"

[global.v1.backups.s3.credentials]
  # Optionally, AWS credentials may be provided. If these are not provided, IAM instance
  # credentials will be used. It's also possible for these to be read through the standard
  # AWS environment variables or through the shared AWS config files.
  # Use the credentials obtained from [AWS-Credential](https://github.com/chef/automate-as-saas/wiki/Bastion-Setup#aws-credentials)
  access_key = "AKIARUQHMSKHGYTUJ&UI"
  secret_key = "s3kQ4Idyf9WjAgRXyv9tLYCQgYTRESDFRFV"
  5. Save the .toml file and exit the vi editor.

  6. Execute the command, ./chef-automate config patch configs/automate.toml. This command triggers the deployment.

  7. Assign the created IAM role to all the Elasticsearch instances.

  8. SSH into the Chef Automate instance by typing the command, sudo automate-cluster-ctl ssh automate.

  9. Execute the command, chef-automate backup create, from the Chef Automate front-end node to create the backup; the full sequence is consolidated in the sketch below.
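
For orientation, the steps above condense into roughly the following sequence. It is a sketch, not a script: the vi step is interactive, and the final command runs inside the SSH session on the front-end node.

# On the provisioning (bastion) host
cd /hab/a2_deploy_workspace
mkdir configs
vi configs/automate.toml                            # paste the S3 backup configuration shown above
./chef-automate config patch configs/automate.toml

# SSH to the Chef Automate front-end node, then create the backup there
sudo automate-cluster-ctl ssh automate
chef-automate backup create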

Restoring the S3 Backed-up Data

This section includes the procedure to restore backed-up data of the Chef Automate High Availability (HA) from an Amazon S3 bucket.

  1. Check the status of all Chef Automate and Chef Infra Server front-end nodes by executing the command, chef-automate status.

  2. Shut down the Chef Automate service on all front-end nodes by executing the command, sudo systemctl stop chef-automate.

  3. Log in to the same Chef Automate front-end node instance from which the backup was taken.

  4. Execute the restore command, chef-automate backup restore s3://bucket_name/path/to/backups/BACKUP_ID --skip-preflight --s3-access-key "Access_Key" --s3-secret-key "Secret_Key".

  5. Start all Chef Automate and Chef Infra Server front-end nodes by executing the command, sudo systemctl start chef-automate. A consolidated sketch of the restore run follows these steps.
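
Put together, a restore run looks roughly like the sketch below. The bucket name, base path, backup ID, and keys are placeholders; substitute the values from your own backup configuration.

# Stop the Chef Automate service on every front-end node first
sudo systemctl stop chef-automate

# On the front-end node the backup was taken from, find the BACKUP_ID (a timestamp-style ID)
chef-automate backup list

# Restore from S3 ("bucket-name", "automate", and the ID below are placeholders)
chef-automate backup restore s3://bucket-name/automate/20220101000000 \
  --skip-preflight \
  --s3-access-key "Access_Key" \
  --s3-secret-key "Secret_Key"

# Start the Chef Automate service on every front-end node again
sudo systemctl start chef-automate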

Backup with EFS System

This section explains how to back up and restore External Elasticsearch (ES) and PostgreSQL data on an External File-System (EFS). You can perform the backup and restore on an EFS system through a DNS name or an IP address.

Creating Elasticsearch snapshots requires a shared file system. To register these snapshot repositories with Elasticsearch, the same shared filesystem must be mounted to the same location on all master and data nodes.

You must register this location (or one of its parent directories) in the path.repo setting on all master and data nodes.

Note

Ensure you perform the backup configuration before deploying the Chef Automate High Availability (HA) cluster.

Pre-Backup Configurations

  • Create the EFS file system in AWS.
  • Open port 2049 (NFS protocol) in the EFS security group (a mount sketch follows this list).
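
As an illustration only, mounting the EFS file system over NFS on each node might look like the following; the file-system DNS name and region are placeholders, and the NFS client package (nfs-utils or nfs-common) must be installed.

# Mount the EFS file system at the path used throughout this procedure
sudo mkdir -p /mnt/automate_backups
sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
  fs-abc12345.efs.us-west-2.amazonaws.com:/ /mnt/automate_backups

Add a matching entry to /etc/fstab if the mount should persist across reboots.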

Backup Procedure

Let us assume that the shared filesystem is mounted at /mnt/automate_backups. Follow these steps to configure Chef Automate High Availability (HA) to register the snapshot locations with Elasticsearch:

  1. Enter the mount /mnt/automate_backups command to ensure the shared file system exists on all Elasticsearch servers.

  2. Create Elasticsearch sub-directory and set permissions by executing the following commands:

sudo mkdir /mnt/automate_backups/elasticsearch
sudo chown hab:hab /mnt/automate_backups/elasticsearch/

Note

If the network mount is configured correctly, you only need to perform this step on a single OpenSearch server.

  3. Export the current OpenSearch configuration from the Habitat supervisor.

  4. Log in as a root user.

  5. SSH to a single OpenSearch server and configure the OpenSearch path.repo setting by executing the following commands:

source /hab/sup/default/SystemdEnvironmentFile.sh
# Write the currently applied OpenSearch configuration (skipping the header line) to es_config.toml
automate-backend-ctl applied --svc=automate-ha-opensearch | tail -n +2 > es_config.toml
  6. Edit es_config.toml to add the following settings at the end of the file:
[path]
   # Replace /mnt/automate_backups with the backup_mount config found on the provisioning host in /hab/a2_deploy_workspace/a2ha.rb
   repo = "/mnt/automate_backups/elasticsearch"

Note

This file may be empty if credentials have never been rotated.

  7. Apply the updated es_config.toml configuration to Elasticsearch by executing the following commands:
hab config apply automate-ha-opensearch.default $(date '+%s') es_config.toml
hab svc status    # check that the opensearch service is up
curl -k -X GET "https://localhost:9200/_cat/indices/*?v=true&s=index&pretty" -u admin:admin
# Watch for a message about Elasticsearch going from RED to GREEN
journalctl -u hab-sup -f | grep 'automate-ha-opensearch'

This configuration only needs to be applied once; applying it triggers a restart of the Elasticsearch services on each server.
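
To confirm the cluster has recovered, checks along these lines can help. The admin:admin credentials match the example used earlier in this section; adjust them for your environment.

# Cluster health should report "green" once the services finish restarting
curl -k -u admin:admin "https://localhost:9200/_cluster/health?pretty"

# Snapshot repositories appear here after Chef Automate registers them for backups
curl -k -u admin:admin "https://localhost:9200/_snapshot/_all?pretty"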

  8. Configure Chef Automate HA to handle external Elasticsearch backups by adding the following configuration to /hab/a2_deploy_workspace/config/automate.toml on the provisioning host or the bastion host:
[global.v1.external.elasticsearch.backup]
   enable = true
   location = "fs"

   [global.v1.external.elasticsearch.backup.fs]
   # The `path.repo` setting you've configured on your Elasticsearch nodes must be
   # a parent directory of the setting you configure here:
   path = "/mnt/automate_backups/elasticsearch"

   [global.v1.backups.filesystem]
   path = "/mnt/automate_backups/backups"
  9. Enter the ./chef-automate config patch automate.toml command to apply the patched configuration to the Chef Automate HA servers. This command also triggers the deployment.

  10. Enter the chef-automate backup create command from a Chef Automate front-end node to create a backup; a short verification sketch follows.
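
Once the backup completes, a quick way to confirm that data landed in both configured locations is a check along these lines; the paths match the example configuration above.

# List backups known to Chef Automate
chef-automate backup list

# Automate service data and Elasticsearch snapshots on the shared filesystem
ls /mnt/automate_backups/backups
ls /mnt/automate_backups/elasticsearch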

