
Paxata Backup Basics

_In this article, you will learn about the basics of Paxata backup tasks._

#Overview

Three components require backup to protect against data loss on the running servers:

1. Metadata Storage (MongoDB)
2. Data Library Storage (HDFS)
3. Properties Files (particularly

Notably, Pipeline cache files on executors do not need to be backed up, because a lost cache is recovered automatically through cache retrieval.

#Basic Tools

Many tools can back up each component. The ones recommended here are the most basic tools that can each complete the backup task on their own. More advanced tools may offer better reliability and manageability.

##Metadata Storage (MongoDB)

`mongodump --out /tmp/mongobackup_$(date +"%m-%d-%y")`

##Data Library Storage (HDFS)

DistCp lets you copy a directory from HDFS to another cluster or an S3 bucket.

`hadoop distcp hdfs://CDH5-nameservice/user/paxata/library s3a://bucket/librarybackup`

Cloudera BDR is an enterprise solution based on DistCp.

##Properties Files (particularly

Upload the files from the server's local file system to an S3 bucket:

`cd /usr/local/paxata/server/config && aws s3 sync . s3://bucket/propertybackup`
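
The three backups described in this article could be chained in one script. The following is only a minimal sketch, not an official Paxata tool: the `BACKUP_BUCKET` and `DRY_RUN` variables and the function names are hypothetical, and it assumes `mongodump`, `hadoop`, and `aws` are on the `PATH` of the machine running it. It defaults to a dry run that only prints each command, so the paths and bucket can be checked before anything is copied.

```shell
#!/usr/bin/env bash
# Sketch only: runs the three backups from this article in sequence.
# BACKUP_BUCKET, DRY_RUN, and the function names are hypothetical;
# the commands and paths are the ones used in this article.
set -eu

BACKUP_BUCKET="${BACKUP_BUCKET:-bucket}"   # placeholder bucket name
DRY_RUN="${DRY_RUN:-1}"                    # 1 = print commands only, 0 = execute
STAMP="$(date +"%m-%d-%y")"                # same date suffix as the mongodump example

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# 1. Metadata Storage (MongoDB)
backup_metadata() {
  run mongodump --out "/tmp/mongobackup_${STAMP}"
}

# 2. Data Library Storage (HDFS)
backup_library() {
  run hadoop distcp hdfs://CDH5-nameservice/user/paxata/library "s3a://${BACKUP_BUCKET}/librarybackup"
}

# 3. Properties Files
backup_properties() {
  run aws s3 sync /usr/local/paxata/server/config "s3://${BACKUP_BUCKET}/propertybackup"
}

backup_metadata
backup_library
backup_properties
```

Running the script with `DRY_RUN=0 BACKUP_BUCKET=my-bucket` would execute the commands instead of printing them. Because of `set -e`, the script stops at the first backup step that fails.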