MongoDB – Part 5 – Beginning of Indexes

By | April 9, 2018

Indexing is a way to optimize performance of a database by minimizing the number of disk accesses required when a query is processed.

Indexing helps to optimize the performance of a database by reducing the iterations of accesses when a query is executed. Said that creating too many unnecessary indexes and not creating the required indexes are the common issues We have seen.

Hence understanding Indexes is very important for any successful project.

An index is nothing but a pointer helping us to quickly identify and access the data in a database.

Let’s list it out all the possible indexes we can create in MongoDB

  1. Default _id Index
  2. Single Field Index
  3. Compound Index
  4. Multikey Index
  5. Special Index [ Geospatial, Text]
  6. Hashed Indexes

Important Points:

# Indexes in MongoDB are stored in RAM not on ROM.

# Indexes are created at collection levels.

# Indexing in MongoDB follows B-Tree structure.

Before jumping to discuss on each of the above mentioned indexes, Let’s find out How to see the queries performance and certain insights based on what we will initiate the index creation.

Query 1:

>> db.Category.explain().find()

{

“queryPlanner” : {

“plannerVersion” : 1,

“namespace” : “m3.Category”,

“indexFilterSet” : false,

“parsedQuery” : {

},

“winningPlan” : {

“stage” : “COLLSCAN”,

“direction” : “forward”

},

“rejectedPlans” : [ ]

},

“serverInfo” : {

“host” : “LAPTOP-55CTOQHT”,

“port” : 30000,

“version” : “3.6.3”,

“gitVersion” : “9586e557d54ef70f9ca4b43c26892cd55257e1a5”

},

“ok” : 1

Query 2:

> db.Category.explain().find({“Name”:”Chemistry”})

{

“queryPlanner” : {

“plannerVersion” : 1,

“namespace” : “m3.Category”,

“indexFilterSet” : false,

“parsedQuery” : {

“Name” : {

“$eq” : “Chemistry”

}

},

“winningPlan” : {

“stage” : “COLLSCAN”,

“filter” : {

“Name” : {

“$eq” : “Chemistry”

}

},

“direction” : “forward”

},

“rejectedPlans” : [ ]

},

“serverInfo” : {

“host” : “LAPTOP-55CTOQHT”,

“port” : 30000,

“version” : “3.6.3”,

“gitVersion” : “9586e557d54ef70f9ca4b43c26892cd55257e1a5”

},

“ok” : 1

}

Explain tells us 1. What is the winning plan 2. What are the boundaries taken 3. What type of scan it has used ,4. statistics and directions ( Say : forward -> started with first rec and end with last rec ).

Server Info is the least priority at this juncture. Let’s keep our focus on queryplanner.

If you notice the samples [ Query 1, Query 2 ], When we added the condition “Name” eq “Chemistry” the same was reflecting in the ParsedQuery and Winning Plan.

“Indexfilterset ; false” tells us that currently MongoDB has not applied any index filter.

Stage : COLLSCAN means Collection SCAN indicating that since there is no index made available for Index Scan, Mongo DB scans the entire collection.

COLLSCAN could occur on indexed field during aggregation framework pipeline execution – when pipeline request cannot use indexes at execution phase.

Winning Plan is the important piece of information which tells us the plan MongoDB has chosen to execute the query.

Let’s Create an Index to deep dive

> db.Category.createIndex({CatName:1})

1 -> Forward , 0 -> Backward

Following command helps us to get the list of indexes.

> db.Category.getIndexes()

[

{

“v” : 2,

“key” : {

“_id” : 1

},

“name” : “_id_”,

“ns” : “m3.Category”

},

{

“v” : 2,

“key” : {

“CatName” : 1

},

“name” : “CatName_1”,

“ns” : “m3.Category”

}

]

Lets get the explanation after creating the Index in the previous step

> db.Category.find({CatName:”Chemistry”}).explain()

{

“queryPlanner” : {

“plannerVersion” : 1,

“namespace” : “m3.Category”,

“indexFilterSet” : false,

“parsedQuery” : {

“CatName” : {

“$eq” : “Chemistry”

}

},

“winningPlan” : {

                        “stage” : “FETCH”,

“inputStage” : {

“stage” : “IXSCAN”,

“keyPattern” : {

“CatName” : 1

},

“indexName” : “CatName_1”,

“isMultiKey” : false,

“multiKeyPaths” : {

“CatName” : [ ]

},

“isUnique” : false,

“isSparse” : false,

“isPartial” : false,

“indexVersion” : 2,

“direction” : “forward”,

“indexBounds” : {

“CatName” : [

“[\”Chemistry\”, \”Chemistry\”]”

]

}

}

},

“rejectedPlans” : [ ]

},

“serverInfo” : {

“host” : “LAPTOP-55CTOQHT”,

“port” : 30000,

“version” : “3.6.3”,

“gitVersion” : “9586e557d54ef70f9ca4b43c26892cd55257e1a5”

},

“ok” : 1

}

Since we have indexes now, Stage has been set to “Fetch” and COLLSCAN replaced by IXSCAN.

Similarly If we run a distinct on a indexed scan then stage is “Projection” and input stage becomes  “Distinct_Scan”.

Lets execute and see.

> db.Category.explain().distinct(‘CatName’)

{

“queryPlanner” : {

“plannerVersion” : 1,

“namespace” : “m3.Category”,

“indexFilterSet” : false,Si

“parsedQuery” : {

},

“winningPlan” : {

“stage” : “PROJECTION”,

“transformBy” : {

“_id” : 0,

“CatName” : 1

},

“inputStage” : {

“stage” : “DISTINCT_SCAN”,

“keyPattern” : {

“CatName” : 1

},

“indexName” : “CatName_1”,

“isMultiKey” : false,

“multiKeyPaths” : {

“CatName” : [ ]

},

“isUnique” : false,

“isSparse” : false,

“isPartial” : false,

“indexVersion” : 2,

“direction” : “forward”,

“indexBounds” : {

“CatName” : [

“[MinKey, MaxKey]”

]

}

}

},

“rejectedPlans” : [ ]

},

“serverInfo” : {

“host” : “LAPTOP-55CTOQHT”,

“port” : 30000,

“version” : “3.6.3”,

“gitVersion” : “9586e557d54ef70f9ca4b43c26892cd55257e1a5”

},

“ok” : 1

}

We will further discuss on important aspects of explain() which goes hand in hand with Indexing before deep diving into different types of Indexes.

Leave a Reply

Your email address will not be published. Required fields are marked *