May 08, 2019

MongoDB Data Migration, Backup and Restore



MongoDB is one of the most popular NoSQL database engines. Managing a MongoDB production environment requires backing it up, restoring data, and similar tasks. When converting MongoDB to SSL or moving data from MongoDB on one server to another, we can use either import/export or backup/restore.

Importing and exporting a database means dealing with data in a human-readable format that is compatible with other software products. In contrast, the backup and restore operations create or use MongoDB-specific binary data, which preserves not only the consistency and integrity of your data but also its MongoDB-specific attributes. Thus, for migration it's usually preferable to use backup and restore, as long as the source and target systems are compatible.

Import/export:
MongoDB uses the JSON and BSON (binary JSON) formats for its information. JSON is the human-readable format, which makes it well suited to exporting and, eventually, importing your data. However, JSON does not support all the data types available in BSON, so an export can suffer a so-called 'loss of fidelity' of the information.

Export:
sudo mongoexport --db mydb -c collections --out newdbexport.json
2019-05-06T15:47:30.931-0700    connected to: localhost
2019-05-06T15:47:31.931-0700    [........................]  mydb.collections  0/10234  (0.0%)
2019-05-06T15:47:32.932-0700    [#######.................]  mydb.collections  6100/10234  (31.5%)
2019-05-06T15:47:33.827-0700    [########################]  mydb.collections  10234/10234  (100.0%)
2019-05-06T15:47:33.828-0700    exported 10234 records

Import:

sudo mongoimport --db mydb --collection collections --file newdbexport.json
When importing a JSON file, you don't have to worry about explicitly creating a MongoDB database. If the database you specify for import doesn't already exist, it is created automatically. In MongoDB the structure is created automatically upon the first document (the equivalent of a database row) insert.

Backup/Restore:
Moving data through an exported JSON file risks a 'loss of fidelity' of the information, because JSON does not support all the data types available in BSON. It is therefore advisable to use mongodump and mongorestore to take (and restore) a full binary backup of your MongoDB database.

Backup:

Dump a collection to a BSON file.
mongodump -h hostname -d dbname -c collectionname -o outputdir

If you want to dump all collections in one go, simply omit the "-c collectionname" argument from the invocation above.
mongodump -h hostname -d dbname -o outputdir
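For regular backups it can help to wrap the dump in a small script that writes each run into a dated directory, similar to the /var/backups/mongobackups/01-20-16/ path used in the restore example. The following is only a minimal sketch: the function name backup_db, the BACKUP_ROOT variable, and the DRY_RUN switch are illustrative assumptions, not part of MongoDB.

```shell
#!/bin/sh
# Sketch: dump every collection of one database into a dated directory.
# backup_db, BACKUP_ROOT and DRY_RUN are hypothetical names for this example.

backup_db() {
    host="$1"
    db="$2"
    root="${BACKUP_ROOT:-/var/backups/mongobackups}"
    out="$root/$(date +%m-%d-%y)"   # e.g. .../01-20-16

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "mongodump --host $host --db $db --out $out"
        return 0
    fi

    # Create the target directory, then take the binary dump.
    mkdir -p "$out" && mongodump --host "$host" --db "$db" --out "$out"
}

# Example (needs a reachable mongod):
#   backup_db localhost mydb
```

Setting DRY_RUN=1 before calling the function prints the mongodump command without touching the server, which is a cheap way to check the paths first.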

Restore:
To restore MongoDB we'll use the mongorestore command, which works with the binary backup produced by mongodump.

mongorestore -d mydb  /root/dump/mydb/collections.bson

Add --drop to make sure each collection is dropped from the target database before it is restored, so that the backup is restored into a clean collection.

sudo mongorestore --db newdb --drop /var/backups/mongobackups/01-20-16/newdb/

Sample output from restoring a single collection:
mongorestore -d mydb  /root/dump/mydb/collections.bson
2019-05-06T15:43:26.403-0700    checking for collection data in /root/dump/mydb/collections.bson
2019-05-06T15:43:26.435-0700    reading metadata for mydb.collections from /root/dump/mydb/collections.metadata.json
2019-05-06T15:43:26.452-0700    restoring mydb.collections from /root/dump/mydb/collections.bson
2019-05-06T15:43:27.280-0700    restoring indexes for collection mydb.collections from metadata
2019-05-06T15:43:27.284-0700    finished restoring mydb.collections (10234 documents)
2019-05-06T15:43:27.284-0700    done
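For the server-to-server move mentioned in the introduction, the dump and restore steps can also be chained so that no intermediate dump directory is needed. The sketch below assumes MongoDB tools version 3.2 or later, where mongodump and mongorestore accept --archive and use standard output and standard input when no file name is given. The function name migrate_db, the DRY_RUN switch, and the host names are hypothetical.

```shell
#!/bin/sh
# Sketch: stream one database from a source server into a target server.
# migrate_db and DRY_RUN are hypothetical names for this example.
# Assumes mongodump/mongorestore 3.2+ (support for --archive).

migrate_db() {
    src_host="$1"
    dst_host="$2"
    db="$3"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the pipeline instead of running it.
        echo "mongodump --host $src_host --db $db --archive | mongorestore --host $dst_host --archive --drop"
        return 0
    fi

    # mongodump writes the archive to stdout; mongorestore reads it from stdin.
    mongodump --host "$src_host" --db "$db" --archive \
        | mongorestore --host "$dst_host" --archive --drop
}

# Example (needs both servers to be reachable):
#   migrate_db old-server new-server mydb
```

Because --drop removes existing collections on the target before restoring them, this is best tried against a test target first.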

Verify the Collection Exists:
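One way to check the restore is to count the documents in the restored collection and compare the result with the number reported by mongorestore (10234 documents in the log above). This is a rough sketch that assumes the legacy mongo shell is on the PATH. The function name verify_count and the MONGO_SHELL override (for example, to point at mongosh) are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: compare a collection's document count with an expected value.
# verify_count and MONGO_SHELL are hypothetical names for this example.
# Assumes the legacy `mongo` shell unless MONGO_SHELL is set.

verify_count() {
    db="$1"
    coll="$2"
    expected="$3"

    # --quiet suppresses the banner so only the count is printed.
    actual=$("${MONGO_SHELL:-mongo}" "$db" --quiet \
        --eval "db.getCollection('$coll').count()")

    if [ "$actual" = "$expected" ]; then
        echo "OK: $db.$coll has $actual documents"
    else
        echo "MISMATCH: $db.$coll has $actual documents, expected $expected" >&2
        return 1
    fi
}

# Example (needs a reachable mongod):
#   verify_count mydb collections 10234
```

The function returns a non-zero status on a mismatch, so it can be used as a final step in a migration script.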



