
May 08, 2019

MongoDB Data Migration, Backup and Restore



MongoDB is one of the most popular NoSQL database engines. Managing a MongoDB production environment involves backing it up, restoring data, and similar tasks. When converting a MongoDB deployment to SSL, or moving data from MongoDB on one server to another, we can use import/export or backup/restore.

Importing and exporting a database means dealing with data in a human-readable format that is compatible with other software products. In contrast, the backup and restore operations create or use MongoDB-specific binary data, which preserves not only the consistency and integrity of your data but also its MongoDB-specific attributes. Thus, for migration it's usually preferable to use backup and restore, as long as the source and target systems are compatible.

Import/export:
MongoDB uses JSON and BSON (binary JSON) formats for storing its information. JSON is the human-readable format, which makes it well suited for exporting and, eventually, importing your data. However, JSON does not support all the data types available in BSON, so there will be a so-called 'loss of fidelity' in the information.
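
To see what this loss of fidelity looks like, here is a minimal PyMongo sketch (pip install pymongo; no running server is needed, and the document contents are illustrative). bson.json_util renders BSON-specific types such as datetimes as "$"-prefixed extended-JSON objects that a plain JSON consumer will not understand as dates:

from datetime import datetime, timezone
from bson import json_util

doc = {"created": datetime.now(timezone.utc), "count": 42}

# BSON-specific types survive only as "$"-wrapped extended-JSON objects;
# a plain JSON consumer sees a nested dict here, not a date.
print(json_util.dumps(doc))
# -> {"created": {"$date": ...}, "count": 42}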

Export:
sudo mongoexport --db mydb -c collections --out newdbexport.json
2019-05-06T15:47:30.931-0700    connected to: localhost
2019-05-06T15:47:31.931-0700    [........................]  mydb.collections  0/10234  (0.0%)
2019-05-06T15:47:32.932-0700    [#######.................]  mydb.collections  6100/10234  (31.5%)
2019-05-06T15:47:33.827-0700    [########################]  mydb.collections  10234/10234  (100.0%)
2019-05-06T15:47:33.828-0700    exported 10234 records

Import:

sudo mongoimport --db mydb --collection collections --file newdbexport.json
While importing a JSON file, you don't have to worry about explicitly creating the MongoDB database. If the database you specify for import doesn't already exist, it is created automatically. In MongoDB, the structure is created upon the first document (database row) insert.
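
A quick PyMongo sketch of this behavior, with illustrative names and assuming a local mongod on the default port: neither the database nor the collection needs to exist before the first insert.

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db = client["mydb"]                       # created lazily on first write
db["collections"].insert_one({"probe": 1})
print(client.list_database_names())       # "mydb" now appears in the list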

Backup/Restore:
As noted above, exporting and importing JSON files carries the possibility of a 'loss of fidelity' in the information, because JSON does not support all the data types available in BSON. It is therefore advised to use mongodump and mongorestore to take (and restore) a full binary backup of your MongoDB database.

Backup:

Dump a collection to a BSON file:
mongodump -h hostname -d dbname -c collectionname -o /root/dump

If you want to dump all collections in one go, simply omit the "-c collectionname" argument:
mongodump -h hostname -d dbname -o /root/dump

Restore:
For restoring MongoDB we'll use the mongorestore command, which works with the binary backup produced by mongodump.

mongorestore -d mydb /root/dump/mydb/collections.bson

Use --drop to make sure that the target database is dropped first, so that the backup is restored into a clean database.

sudo mongorestore --db newdb --drop /var/backups/mongobackups/01-20-16/newdb/
mongorestore -d mydb  /root/dump/mydb/collections.bson
2019-05-06T15:43:26.403-0700    checking for collection data in /root/dump/mydb/collections.bson
2019-05-06T15:43:26.435-0700    reading metadata for mydb.collections from /root/dump/mydb/collections.metadata.json
2019-05-06T15:43:26.452-0700    restoring mydb.collections from /root/dump/mydb/collections.bson
2019-05-06T15:43:27.280-0700    restoring indexes for collection mydb.collections from metadata
2019-05-06T15:43:27.284-0700    finished restoring mydb.collections (10234 documents)
2019-05-06T15:43:27.284-0700    done

Verify that the collections exist:
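
One quick check, sketched with PyMongo against the database restored above (the names and the document count come from the example run):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db = client["mydb"]
print(db.list_collection_names())             # should include "collections"
print(db["collections"].count_documents({}))  # expect 10234, per the run above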




March 02, 2013

Storing Log Data using MongoDB

This post outlines the basic patterns and principles for using MongoDB as a persistent storage engine for log data from servers and other machine data. Servers generate a large number of events (i.e. logging) that contain useful information about their operation, including errors, warnings, and user behavior. By default, most servers store these data in plain-text log files on their local file systems. While plain-text logs are accessible and human-readable, they are difficult to use, reference, and analyze without a holistic system for aggregating and storing these data.

1. Schema Design
The schema for storing log data in MongoDB depends on the format of the event data that you're storing. The preferred approach is to extract the relevant information from the log data into individual fields in a MongoDB document.

When you extract data from the log into fields, pay attention to the data types you use to render the log data into MongoDB. Using proper types for your data also increases query flexibility: if you store a date as a timestamp you can make date-range queries, whereas it's very difficult to compare two strings that represent dates. The same issue holds for numeric fields; storing numbers as strings requires more space and is difficult to query.

When extracting data from logs and designing a schema, also consider what information you can omit from your log-tracking system. In most cases there's no need to track all data from an event log, and you can omit the other fields. A sketch of this extraction follows.
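
For example, a hedged Python sketch that parses one Apache-style access-log line into typed fields before inserting (the regex, log line, and field names are illustrative):

import re
from datetime import datetime

LINE = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d+) (?P<size>\d+)')

m = PATTERN.match(LINE)
event = {
    "host": m.group("host"),
    "user": m.group("user"),
    # store a real datetime, not a string, so date-range queries work
    "time": datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
    "request": m.group("request"),
    # store numbers as numbers, not strings
    "status": int(m.group("status")),
    "size": int(m.group("size")),
}
# db.events.insert_one(event)   # once connected, as in the next section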

2. System Architecture
Insertion speed is the primary performance concern for an event-logging system. At the same time, the system must be able to support flexible queries so that you can return data from the system efficiently.
MongoDB has a configurable write concern. This capability allows you to balance the importance of guaranteeing that all writes are fully recorded in the database against the speed of the insert.
For example, if you issue writes to MongoDB and do not require that the database issue any response, the write operations will return very fast (i.e. asynchronously), but you cannot be certain that all writes succeeded.
The following command will insert the event object into the events collection.
>>> db.events.insert(event, w=0)
By setting w=0, you do not require that MongoDB acknowledge receipt of the insert. Although very fast, this is risky because the application cannot detect network and server failures. See write-concern for more information.

Conversely, if you require that MongoDB acknowledge every write operation, the database will not return as quickly, but you can be certain that every item is present in the database.
In this case, pass the w=1 argument as follows:
>>> db.events.insert(event, w=1)

Finally, if you have extremely low tolerance for event data loss, you can require that MongoDB replicate the data to multiple secondary replica set members before returning:
>>> db.events.insert(event, w="majority")
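
Note that these examples use the legacy insert() form, which recent PyMongo releases have removed; today the write concern is attached to the collection object instead. A sketch under that assumption, with illustrative names:

from pymongo import MongoClient, WriteConcern

client = MongoClient("mongodb://localhost:27017/")
events = client["mydb"]["events"]

# The same three trade-offs, expressed with the modern API:
fire_and_forget = events.with_options(write_concern=WriteConcern(w=0))
acknowledged = events.with_options(write_concern=WriteConcern(w=1))
replicated = events.with_options(write_concern=WriteConcern(w="majority"))

replicated.insert_one({"type": "error", "msg": "disk full"})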

Sharding
Eventually your system’s events will exceed the capacity of a single event logging database instance. In these situations you will want to use a sharded cluster, which takes advantage of MongoDB’s sharding functionality.
In a sharded environment, the maximum insertion rate is limited by:
• the number of shards in the cluster.
• the shard key you choose.
Because MongoDB distributes data using “ranges” (i.e. chunks) of keys, the choice of shard key controls how MongoDB distributes data and the resulting system's capacity for writes and queries.
Shard key choices:
  • Shard by Time
  • Shard by a Semi-Random Key
  • Shard by an Evenly-Distributed Key in the Data Set
  • Shard by Combining a Natural and Synthetic Key (see the sketch below)
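
As an illustration of the last option, here is a hedged PyMongo sketch that shards an events collection on a compound key via the shardCollection admin command. It assumes a running sharded cluster reachable through mongos on the default port; the database and field names are illustrative:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")   # connect to mongos
client.admin.command("enableSharding", "mydb")
client.admin.command(
    "shardCollection", "mydb.events",
    key={"host": 1, "time": 1},   # a natural key plus time, to spread writes
)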
