index creation: MongoDB

Lets learn to create index and to optimize the database in MongoDB.

Creating “Database”: “temp”, “Collection”: “no”, and inserting 10 Million documents inside it

1
2
3
4
5
use temp
switched to db temp
 
for(i=0; i< = 10000000; i++)
db.no.insert({"student_id": i, "name": "Satish"});

Since Mongo Shell is built out of JavaScript, you can pass in any valid Javascript code to it. So we write a for loop and insert 10 Million documents inside “no” collection.

creating-index-mongodb

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
MongoDB shell version: 2.6.1
connecting to: test
> show dbs
admin    (empty)
daily    0.078GB
local    0.078GB
nesting  0.078GB
school   0.078GB
temp     3.952GB
test     0.078GB
> use temp
switched to db temp
> show collections
no
system.indexes
> db.no.find().pretty()
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cda"),
        "student_id" : 0,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdb"),
        "student_id" : 1,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdc"),
        "student_id" : 2,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdd"),
        "student_id" : 3,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cde"),
        "student_id" : 4,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdf"),
        "student_id" : 5,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce0"),
        "student_id" : 6,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce1"),
        "student_id" : 7,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce2"),
        "student_id" : 8,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce3"),
        "student_id" : 9,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce4"),
        "student_id" : 10,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce5"),
        "student_id" : 11,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce6"),
        "student_id" : 12,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce7"),
        "student_id" : 13,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce8"),
        "student_id" : 14,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ce9"),
        "student_id" : 15,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cea"),
        "student_id" : 16,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ceb"),
        "student_id" : 17,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cec"),
        "student_id" : 18,
        "name" : "Satish"
}
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833ced"),
        "student_id" : 19,
        "name" : "Satish"
}
Type "it" for more
 
> it

“no” collection has 10 Million record, but it won’t fetch you all records at once, as it would take a lot of time and resources of your computer! So it only fetches 20 records at a time. You can iterate through next 20 documents by using command “it“.

index creation: MongoDB


[youtube https://www.youtube.com/watch?v=zK_mRyiNs-I]

YouTube Link: https://www.youtube.com/watch?v=zK_mRyiNs-I [Watch the Video In Full Screen.]



1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
> db.no.find({"student_id": 5}).pretty()
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdf"),
        "student_id" : 5,
        "name" : "Satish"
}
 
 
> db.no.findOne({"student_id": 5});
{
        "_id" : ObjectId("53c9020abcdd1ea7fb833cdf"),
        "student_id" : 5,
        "name" : "Satish"
}
> db.no.find({"student_id": 5000000}).pretty()
{
        "_id" : ObjectId("53c90ca6bcdd1ea7fbcf881a"),
        "student_id" : 5000000,
        "name" : "Satish"
}

find() method scans through all the documents present in the collection to find multiple matches for the condition. So in above case, find() method scans through 10 Million documents, hence returns the result slowly. Where as findOne() method stops scanning the collection as soon as it finds the first matching document, so findOne() returns result faster than find() method.

Related Read:
Multi-key Index: MongoDB
index / key: MongoDB

Creating index

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
> show collections
no
system.indexes
 
> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "temp.no" }
 
> db.no.ensureIndex({"student_id": 1});
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
}
 
> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, 
                     "name" : "_id_", "ns" : "temp.no" }
{ "v" : 1, "key" : { "student_id" : 1 }, 
                     "name" : "student_id_1", "ns" : "temp.no" }

We create index on “student_id”. It takes little time to create the index, as we have 10 Million documents inside “no” collection.

After creating index on “student_id”, run the same command and you’ll get the results instantly – maybe it takes 0.01 ms, but the delay can’t be noticed.
Why does it return results faster after creating index on “student_id”? Watch this short video lesson to know it: index / key: MongoDB

1
2
3
4
5
6
7
8
9
10
11
12
13
> db.no.find({"student_id": 5000000}).pretty()
{
        "_id" : ObjectId("53c90ca6bcdd1ea7fbcf881a"),
        "student_id" : 5000000,
        "name" : "Satish"
}
> db.no.find({"student_id": 10000000}).pretty()
{
        "_id" : ObjectId("53c914adbcdd1ea7fb1bd35a"),
        "student_id" : 10000000,
        "name" : "Satish"
}
>

So the querys/commands can be optimized by creating indexes on frequently accessed fields.

Multi-key Index: MongoDB

Lets learn about multikey indexes and how they are efficient in MongoDB.

multi key index mongodb

database name: daily
collections: gadgets, users

Insert documents into gadgets collection

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
> use daily
switched to db daily
 
> db.gadgets.insert({_id: 1, gadget: "Nexus 5"});
WriteResult({ "nInserted" : 1 })
 
> db.gadgets.insert({_id: 2, gadget: "iPhone"});
WriteResult({ "nInserted" : 1 })
 
> db.gadgets.insert({_id: 3, gadget: "iPad"});
WriteResult({ "nInserted" : 1 })
 
> db.gadgets.insert({_id: 4, gadget: "Nexus 7"});
WriteResult({ "nInserted" : 1 })
 
> db.gadgets.find()
{ "_id" : 1, "gadget" : "Nexus 5" }
{ "_id" : 2, "gadget" : "iPhone" }
{ "_id" : 3, "gadget" : "iPad" }
{ "_id" : 4, "gadget" : "Nexus 7" }

Here we have 4 documents with “_id” as 1, 2, 3, 4.

Insert documents into users collection

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
> db.users.insert({name: "Satish", 
                   products: [db.gadgets.find()[0]._id,
                              db.gadgets.find()[3]._id]});
WriteResult({ "nInserted" : 1 })
 
> db.users.insert({name: "Kiran", 
                   products: [db.gadgets.find()[0]._id, 
                              db.gadgets.find()[3]._id, 
                              db.gadgets.find()[2]._id]});
WriteResult({ "nInserted" : 1 })
 
> db.users.insert({name: "Jyothi", products: [1, 2]});
WriteResult({ "nInserted" : 1 })
 
> db.users.find().pretty()
{
        "_id" : ObjectId("53c7a30efd7d3c9597ca2593"),
        "name" : "Satish",
        "products" : [
                1,
                4
        ]
}
{
        "_id" : ObjectId("53c7a337fd7d3c9597ca2594"),
        "name" : "Kiran",
        "products" : [
                1,
                4,
                3
        ]
}
{
        "_id" : ObjectId("53c7a34efd7d3c9597ca2595"),
        "name" : "Jyothi",
        "products" : [
                1,
                2
        ]
}

Here we insert documents into “users” collection and embed the “_id” value of “gadgets” collection as array elements of “products” key.

Multi-key Index: MongoDB


[youtube https://www.youtube.com/watch?v=9C1x-xrN_qQ]

YouTube Link: https://www.youtube.com/watch?v=9C1x-xrN_qQ [Watch the Video In Full Screen.]



Command to fetch documents with array element 1 and 4

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
> db.users.find({products: {$all: [1, 4]}}).pretty();
{
        "_id" : ObjectId("53c7a30efd7d3c9597ca2593"),
        "name" : "Satish",
        "products" : [
                1,
                4
        ]
}
{
        "_id" : ObjectId("53c7a337fd7d3c9597ca2594"),
        "name" : "Kiran",
        "products" : [
                1,
                4,
                3
        ]
}

Both these documents have array values 1 and 4 in them.

Related Read: Comparison Operators: MongoDB

system.indexes content

1
2
3
4
5
6
7
8
> show collections
gadgets
system.indexes
users
 
> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "daily.gadgets" }
{ "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "daily.users" }

Here we have only 1 key and it’s on “_id”.

Basic Cursor

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
> db.users.find({products: {$all: [1, 4]}}).explain();
{
        "cursor" : "BasicCursor",
        "isMultiKey" : false,
        "n" : 2,
        "nscannedObjects" : 3,
        "nscanned" : 3,
        "nscannedObjectsAllPlans" : 3,
        "nscannedAllPlans" : 3,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 0,
        "server" : "Satish-PC:27017",
        "filterSet" : false
}

If we chain explain() method to our command, we can know some details about the command and also evaluate about its efficiency. The above command doesn’t have multi-key enabled and it’s a Basic Cursor.

Lets create index on products field

1
2
3
4
5
6
7
> db.users.ensureIndex({"products": 1});
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
}

This creates index on field “products”.

Now the system.indexes content

1
2
3
4
5
6
> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "daily.gadgets" }
{ "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "daily.users" }
{ "v" : 1, "key" : { "products" : 1 }, 
                     "name" : "products_1", 
                     "ns" : "daily.users" }

So now we have 2 keys, “_id” and “products”.

BTree Cursor

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
> db.users.find({products: {$all: [1, 4]}}).explain();
{
        "cursor" : "BtreeCursor products_1",
        "isMultiKey" : true,
        "n" : 2,
        "nscannedObjects" : 3,
        "nscanned" : 3,
        "nscannedObjectsAllPlans" : 3,
        "nscannedAllPlans" : 8,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 0,
        "indexBounds" : {
                "products" : [
                        [
                                1,
                                1
                        ]
                ]
        },
        "server" : "Satish-PC:27017",
        "filterSet" : false
}

Now if we run the same command chained with explain() method, we could see that its a Btree Cursor and multi-key index is true.

Note: Btree Cursors are faster and efficient than Basic Cursors.