index creation: MongoDB

Lets learn to create index and to optimize the database in MongoDB.

Creating “Database”: “temp”, “Collection”: “no”, and inserting 10 Million documents inside it

1
2
3
4
5
use temp
switched to db temp
 
for(i=0; i< = 10000000; i++)
db.no.insert({"student_id": i, "name": "Satish"});

Since Mongo Shell is built out of JavaScript, you can pass in any valid Javascript code to it. So we write a for loop and insert 10 Million documents inside “no” collection.

creating-index-mongodb

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
MongoDB shell version: 2.6.1
connecting to: test
> show dbs
admin    (empty)
daily    0.078GB
local    0.078GB
nesting  0.078GB
school   0.078GB
temp     3.952GB
test     0.078GB
> use temp
switched to db temp
> show collections
no
system.indexes
> db.no.find().pretty()
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cda"),
        "student_id" : 0,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdb"),
        "student_id" : 1,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdc"),
        "student_id" : 2,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdd"),
        "student_id" : 3,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cde"),
        "student_id" : 4,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdf"),
        "student_id" : 5,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce0"),
        "student_id" : 6,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce1"),
        "student_id" : 7,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce2"),
        "student_id" : 8,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce3"),
        "student_id" : 9,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce4"),
        "student_id" : 10,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce5"),
        "student_id" : 11,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce6"),
        "student_id" : 12,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce7"),
        "student_id" : 13,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce8"),
        "student_id" : 14,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce9"),
        "student_id" : 15,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cea"),
        "student_id" : 16,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ceb"),
        "student_id" : 17,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cec"),
        "student_id" : 18,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ced"),
        "student_id" : 19,
        "name" : "Satish"
}
Type "it" for more
 
> it

“no” collection has 10 Million record, but it won’t fetch you all records at once, as it would take a lot of time and resources of your computer! So it only fetches 20 records at a time. You can iterate through next 20 documents by using command “it“.

index creation: MongoDB


[youtube https://www.youtube.com/watch?v=zK_mRyiNs-I]

YouTube Link: https://www.youtube.com/watch?v=zK_mRyiNs-I [Watch the Video In Full Screen.]



1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
> db.no.find({"student_id": 5}).pretty()
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdf"),
        "student_id" : 5,
        "name" : "Satish"
}
 
 
> db.no.findOne({"student_id": 5});
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdf"),
        "student_id" : 5,
        "name" : "Satish"
}
> db.no.find({"student_id": 5000000}).pretty()
{
        "_id" : ObjectId("53c90ca6bcdd1ea7fbcf881a"),
        "student_id" : 5000000,
        "name" : "Satish"
}

find() method scans through all the documents present in the collection to find multiple matches for the condition. So in above case, find() method scans through 10 Million documents, hence returns the result slowly. Where as findOne() method stops scanning the collection as soon as it finds the first matching document, so findOne() returns result faster than find() method.

Related Read:
Multi-key Index: MongoDB
index / key: MongoDB

Creating index

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
> show collections
no
system.indexes
 
> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "temp.no" }
 
> db.no.ensureIndex({"student_id": 1});
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
}
 
> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, 
                     "name" : "_id_", "ns" : "temp.no" }
{ "v" : 1, "key" : { "student_id" : 1 }, 
                     "name" : "student_id_1", "ns" : "temp.no" }

We create index on “student_id”. It takes little time to create the index, as we have 10 Million documents inside “no” collection.

After creating index on “student_id”, run the same command and you’ll get the results instantly – maybe it takes 0.01 ms, but the delay can’t be noticed.
Why does it return results faster after creating index on “student_id”? Watch this short video lesson to know it: index / key: MongoDB

1
2
3
4
5
6
7
8
9
10
11
12
13
> db.no.find({"student_id": 5000000}).pretty()
{
        "_id" : ObjectId("53c90ca6bcdd1ea7fbcf881a"),
        "student_id" : 5000000,
        "name" : "Satish"
}
> db.no.find({"student_id": 10000000}).pretty()
{
        "_id" : ObjectId("53c914adbcdd1ea7fb1bd35a"),
        "student_id" : 10000000,
        "name" : "Satish"
}
>

So the querys/commands can be optimized by creating indexes on frequently accessed fields.

SELECT Columns/Fields: MongoDB

In this video tutorial we shall see selecting of key: value in MongoDB.

In RDBMSs like MySQL, we do the same using SELECT statement.
In MongoDB, we use findOne() and find() methods.

key-value-pair-mongoDB

Documents in ‘info’ collection

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
> db.info.find().forEach(printjson)
{
        "_id" : ObjectId("517e829d005b19f1f0d96b25"),
        "name" : "Apple",
        "product" : "iPhone5S",
        "emp_no" : "100"
}
{
        "_id" : ObjectId("517e8377005b19f1f0d96b26"),
        "name" : "Technotip",
        "product" : "Video Tutorials - Educational",
        "emp" : [
                "Satish",
                "Kiran"
        ],
        "videos" : {
                "mongo" : "MongoDB Videos",
                "php" : "PHP Video Tutorials"
        }
}

Learn Create and Insert Documents: MongoDB.

Note:
find() returns cursor objects.
findOne() returns single object.

We can use findOne() method to select and retrieve only one record at a time.

findOne() Method

1
2
3
4
5
6
7
8
> db.info.findOne({name: 'Technotip'}).product
Video Tutorials - Educational
 
> db.info.findOne({name: 'Technotip'}).videos
{ "mongo" : "MongoDB Videos", "php" : "PHP Video Tutorials" }
 
> db.info.findOne({name: 'Technotip'}).emp;
[ "Satish", "Kiran" ]

Syntax for retrieving normal {key: value} pair, sub-object {key: value} pair and array {key: value} pair is same.

Limitation
Using findOne() method, we could select and return only 1 {key: value} pair.

To select and return more than 1 {key: value} pair, we can make use of find() method, with 2 parameters.

SELECT Columns or Fields ( { KEY: VALUE } ): MongoDB


[youtube https://www.youtube.com/watch?v=u0WGmvJFFy4]

YouTube Link: https://www.youtube.com/watch?v=u0WGmvJFFy4 [Watch the Video In Full Screen.]



find() Method, with two parameters

1
2
> db.info.find({name: 'Apple'}, {product: 1}).forEach(printjson);
{ "_id" : ObjectId("517e829d005b19f1f0d96b25"), "product" : "iPhone5S" }

First parameter is the condition, second parameter specifies the {key: value} pairs.

We can pass more than 1 key in second parameter; that would returns multiple {key: value} pairs.

1
2
3
4
5
6
> db.info.find({name: 'Apple'}, {product: 1, emp_no: 1}).forEach(printjson);
{
        "_id" : ObjectId("517e829d005b19f1f0d96b25"),
        "product" : "iPhone5S",
        "emp_no" : "100"
}

1 or true means, those {key: value} pairs need to be returned.
0 or false means, excluding those {key: value} pairs, all other {key: value} pairs(records) needs to be returned.

true(0)

1
2
3
4
5
6
7
8
> db.info.find({name: 'Apple'}, {product: true}).forEach(printjson);
{ "_id" : ObjectId("517e829d005b19f1f0d96b25"), "product" : "iPhone5S" }
> db.info.find({name: 'Apple'}, {product: true, emp_no: true}).forEach(printjson);
{
        "_id" : ObjectId("517e829d005b19f1f0d96b25"),
        "product" : "iPhone5S",
        "emp_no" : "100"
}

False(0): Exclusion

1
2
> db.info.find({name: 'Apple'}, {product: 0, emp_no: 0}).forEach(printjson);
{ "_id" : ObjectId("517e829d005b19f1f0d96b25"), "name" : "Apple" }

Note:
True and False combination doesn’t work

1
2
3
4
5
6
7
8
9
10
11
12
> db.info.find({name: 'Apple'}, {product: true, emp_no: false}).forEach(printjson);
Wed May 08 12:11:30.184 JavaScript execution failed: error: {
        "$err" : "You cannot currently mix including and excluding fields. 
                  Contact us if this is an issue.",
        "code" : 10053
} at src/mongo/shell/query.js:L128
> db.info.find({name: 'Apple'}, {product: 0, emp_no: 1}).forEach(printjson);
Wed May 08 12:11:40.840 JavaScript execution failed: error: {
        "$err" : "You cannot currently mix including and excluding fields. 
                  Contact us if this is an issue.",
        "code" : 10053
} at src/mongo/shell/query.js:L128

We can not combine true(1) and false(0) together in the second parameter.

Special Provision!
But we can do it with _id(ObjectId)

1
2
3
> db.info.find({name: 'Apple'}, 
               {product: 1, emp_no: 1, _id: 0}).forEach(printjson);
{ "product" : "iPhone5S", "emp_no" : "100" }

Operation of second document/record in ‘info’ collection.
This illustrates that the syntax for sub-objects, array and the normal key/value pair is same.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
> db.info.find({name: 'Technotip'}, {product: 1}).forEach(printjson);
{
        "_id" : ObjectId("517e8377005b19f1f0d96b26"),
        "product" : "Video Tutorials - Educational"
}
{ "_id" : ObjectId("518363e2d73694e289255486") }
> db.info.find({name: 'Technotip'}, {product: 1, videos: 1}).forEach(printjson);
 
{
        "_id" : ObjectId("517e8377005b19f1f0d96b26"),
        "product" : "Video Tutorials - Educational",
        "videos" : {
                "mongo" : "MongoDB Videos",
                "php" : "PHP Video Tutorials"
        }
}
 
> db.info.find({name: 'Technotip'}, 
               {product: 1, videos: 1, emp: 1}).forEach(printjson);
{
        "_id" : ObjectId("517e8377005b19f1f0d96b26"),
        "product" : "Video Tutorials - Educational",
        "emp" : [
                "Satish",
                "Kiran"
        ],
        "videos" : {
                "mongo" : "MongoDB Videos",
                "php" : "PHP Video Tutorials"
        }
}
 
> db.info.find({name: 'Technotip'}, 
               {product: 1, videos: 1, emp: 1, _id: 0}).forEach(printjson);
{
        "product" : "Video Tutorials - Educational",
        "emp" : [
                "Satish",
                "Kiran"
        ],
        "videos" : {
                "mongo" : "MongoDB Videos",
                "php" : "PHP Video Tutorials"
        }
}

Note:
Two documents with same name

1
2
3
4
5
6
7
8
9
10
11
12
13
14
{
        "_id" : ObjectId("517e8377005b19f1f0d96b26"),
        "name" : "Technotip",
        "product" : "Video Tutorials - Educational",
        "emp" : [
                "Satish",
                "Kiran"
        ],
        "videos" : {
                "mongo" : "MongoDB Videos",
                "php" : "PHP Video Tutorials"
        }
}
{ "_id" : ObjectId("518363e2d73694e289255486"), "name" : "Technotip" }
1
2
> db.info.findOne({name: 'Technotip'})._id;
ObjectId("517e8377005b19f1f0d96b26")

If we use findOne() method on this collection, and the condition being the name key, then the oldest document will be returned.
Only one key: value pair is returned, as findOne() returns only single object.