MongoDB - Indexes

admin

Many of the following examples are taken from the MongoDB Documentation

A few things we missed

getSiblingDB()

Used to return another database without modifying the db variable in the shell environment.

Atlas>
Atlas> dbm = db.getSiblingDB('sample_mflix')
sample_mflix
Atlas> dbm.movies.findOne()
{
  _id: ObjectId("573a1390f29313caabcd4135"),
  plot: 'Three men hammer on an anvil and pass a bottle of beer around.',
  genres: [ 'Short' ],
  runtime: 1,
  ...
}
Atlas> dbm.movies.countDocuments()
23530
Atlas> dbt=db.getSiblingDB("test")
test
Atlas> dbt.cData.findOne({"abv":"NH"})
{
  _id: ObjectId("54e23c7b28099359f566152a"),
  stateNum: 33,
  abv: 'NH',
  state: '33',
  data: [
...

Explain()

db.collection.explain().find({'department'})

Why Indexes

Data doesn’t always appear in sorted order, and if it does it might not be sorted on the fields you might want to query.

scanning every document in a DB is really tedious and slow.

Single field (e.g., _id) or composite (e.g., lname)

Suppose we define an index on department, gradyear we could quickly find the gradyear of a students in a particular department. However, the opposite would not be true.

draw index boxes as a list

Similarly, an index on department, gradyear, GPA would speed up queries like these

department;
department, gradyear;
department, gradyear, GPA;

but that doesn’t help these queries

gradYear;
GPA;
gradYear, GPA;

It would help some on a department, GPA query, but that would require some serial searches.

Indexes aren’t free… writes get slower, but reads are much faster.

Hint from MongoDB University: When adding a bunch of data , best to not do it with an existing index. Do the index after the inserts saves the overhead of updating the index with every insert.

Does my db have indexes?

➜ mongosh --quiet
test> use demo
switched to db demo
demo> show collections
cData
movies
moviesScratch
solarSystem
demo> db.cData.getIndexes()
[
  { v: 2, key: { _id: 1 }, name: '_id_' },
  {
    v: 2,
    key: { center: '2dsphere' },
    name: 'center_2dsphere',
    '2dsphereIndexVersion': 2
  }
]
demo>

demo> db.cData.aggregate([
  {
    $geoNear:
        {
        near: {
          type: "Point",
          coordinates: [-103.459825, 43.8789472]
        },
        distanceField: "dist.calculated",
        key: "center"
      }
  },
  {
    $project:
        {
        _id: 0,
        name: 1,
        "dist.calculated": 1
      }
  },
  {
    $limit:
        5
  }
])

[
  { name: 'Alaska', dist: { calculated: 6288909.298964596 } },
  { name: 'Washington', dist: { calculated: 9015390.527937755 } },
  { name: 'Hawaii', dist: { calculated: 9126091.478865134 } },
  { name: 'Oregon', dist: { calculated: 9320108.1786801 } },
  { name: 'Montana', dist: { calculated: 9407938.766049838 } }
]
demo>

Now get some explanations:

demo> db.cData.explain().aggregate([ ...
... lots of information! here it found an index to use
...
          winningPlan: {
            isCached: false,
            stage: 'GEO_NEAR_2DSPHERE',
            keyPattern: { center: '2dsphere' },
            indexName: 'center_2dsphere',
            indexVersion: 2
          },
          rejectedPlans: []

If we were searching the sample_mflix database :

demo> use sample_mflix
sample_mflix> db.movies.explain().find({"cast.0": "Henry Fonda"})
{
  explainVersion: '1',
  queryPlanner: {
    namespace: 'sample_mflix.movies',
    parsedQuery: { 'cast.0': { '$eq': 'Henry Fonda' } },
    indexFilterSet: false,
    planCacheShapeHash: '1CA69E8E',
    planCacheKey: '3915FA0E',
    optimizationTimeMillis: 0,
    maxIndexedOrSolutionsReached: false,
    maxIndexedAndSolutionsReached: false,
    maxScansToExplodeReached: false,
    prunedSimilarIndexes: false,
    winningPlan: {
      isCached: false,
      stage: 'COLLSCAN',
      filter: { 'cast.0': { '$eq': 'Henry Fonda' } },
      direction: 'forward'
    },
    rejectedPlans: []
    ...

Index benefit demo

> show dbs
blog       0.078GB
fruit      0.078GB
local      0.078GB
m101       0.078GB
mongomart  0.078GB
reddit     0.078GB
school     3.952GB
students   7.950GB
test       0.078GB
video      0.078GB
> use students
switched to db students
> show collections
grades
system.indexes
> db.grades.find().count()
40000000
> db.grades.findOne()
{
	"_id" : ObjectId("590dcd28d3cd31bc1397af82"),
	"student_id" : 0,
	"type" : "exam",
	"score" : 95.12849459888332
}

### ADD explain() to show COLSCAN as method

> db.grades.find({"student_id": 42})

### about 60 seconds later
{ "_id" : ObjectId("590dcd28d3cd31bc1397b02a"), "student_id" : 42, "type" : "exam", "score" : 53.74351917459479 }
{ "_id" : ObjectId("590dcd28d3cd31bc1397b02b"), "student_id" : 42, "type" : "quiz", "score" : 35.72446619192635 }
{ "_id" : ObjectId("590dcd28d3cd31bc1397b02c"), "student_id" : 42, "type" : "homework", "score" : 95.87285258690225 }
{ "_id" : ObjectId("590dcd28d3cd31bc1397b02d"), "student_id" : 42, "type" : "homework", "score" : 24.863383426921626 }

> db.grades.findOne({"student_id": 42})
### very fast since it can stop when it finds the first one.
{
	"_id" : ObjectId("590dcd28d3cd31bc1397b02a"),
	"student_id" : 42,
	"type" : "exam",
	"score" : 53.74351917459479
}

> db.grades.find({"_id" : ObjectId("590dcd28d3cd31bc1397b02d")})
{ "_id" : ObjectId("590dcd28d3cd31bc1397b02d"), "student_id" : 42, "type" : "homework", "score" : 24.863383426921626 }
> db.grades.explain().find({"_id" : ObjectId("590dcd28d3cd31bc1397b02d")})
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "students.grades",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"_id" : {
				"$eq" : ObjectId("590dcd28d3cd31bc1397b02d")
			}
		},
		"winningPlan" : {
			"stage" : "IDHACK"
		},
		"rejectedPlans" : [ ]
	},
	"serverInfo" : {
		"host" : "Swamp.local",
		"port" : 27017,
		"version" : "3.4.1",
		"gitVersion" : "5e103c4f5583e2566a45d740225dc250baacfbd7"
	},
	"ok" : 1
}
> db.grades.createIndex({"student_id":1})
### 3 minutes later
{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}
db.grades.explain().find({"student_id": 42})
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "students.grades",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"student_id" : {
				"$eq" : 42
			}
		},
		"winningPlan" : {
			"stage" : "FETCH",
			"inputStage" : {
				"stage" : "IXSCAN",
				"keyPattern" : {
					"student_id" : 1
				},
				"indexName" : "student_id_1",
				"isMultiKey" : false,
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 1,
				"direction" : "forward",
				"indexBounds" : {
					"student_id" : [
						"[42.0, 42.0]"
					]
				}
			}
		},
		"rejectedPlans" : [ ]
	},
	"serverInfo" : {
		"host" : "Swamp.local",
		"port" : 27017,
		"version" : "3.4.1",
		"gitVersion" : "5e103c4f5583e2566a45d740225dc250baacfbd7"
	},
	"ok" : 1
}
> db.grades.find({"student_id": 42})
### 0.35 seconds later
{ "_id" : ObjectId("590dcd28d3cd31bc1397b02a"), "student_id" : 42, "type" : "exam", "score" : 53.74351917459479 }
{ "_id" : ObjectId("590dcd28d3cd31bc1397b02b"), "student_id" : 42, "type" : "quiz", "score" : 35.72446619192635 }
{ "_id" : ObjectId("590dcd28d3cd31bc1397b02c"), "student_id" : 42, "type" : "homework", "score" : 95.87285258690225 }
{ "_id" : ObjectId("590dcd28d3cd31bc1397b02d"), "student_id" : 42, "type" : "homework", "score" : 24.863383426921626 }
>

I did this on a MongoDB on my laptop.

The original find had this load on the system:

The index generation had this load on the system:

Calling db.grades.explain("executionStats").find({"student_id": 42}) will tell us statistics of the execution itself.

Compound indices

db.grades.createIndex({"student_id":1, "score":-1})

Index direction

Sorting on one field, direction doesn’t matter, but it can matter when sorting on two or more fields.

Consider how the index will be used to decide whether to choose 1 or -1.

If we have a table of past students with indexes on gradYear and GPA, both ascending, we get this:

Student_Id	gradYear	GPA
1	2017	3.1
2	2017	3.2
3	2017	3.3
4	2018	2.9
5	2018	3.0
6	2019	2.9
7	2019	3.3
8	2020	3.0

This may not be the typical access pattern, causing lots of extra wandering about in the B-Tree index. If, instead, the GPA was in decreasing order, then the typical queries would simply be traversing the B-tree which is very fast.

Finding & deleting Indexes

db.values.getIndexes()

Deleting it uses the same description used when creating it. If you didn’t create it or forgot, just look at the key field shown by getIndexes()

> db.values.getIndexes()
[
  { v: 2, key: { _id: 1 }, name: '_id_' },
  { v: 2, key: { stock_symbol: 1 }, name: 'stock_symbol_1' }
]

> db.values.dropIndex('stock_symbol_1')
{ nIndexesWas: 2, ok: 1 }

Foreground vs. Background index creation

Foreground locks the collection (no R or W) until index is created. Fastest. Background does not lock the collection, but runs slower.

Kinds of Indexes

Unique

A unique index ensures that the indexed fields do not store duplicate values. A unique index on a single field ensures that a value appears at most once for a given field. A unique compound index ensures that any given combination of the index key values only appears at most once.

> use school
switched to db school
> show collections
students
> db.students.findOne()
{
	"_id" : 0,
	"scores" : [
		{
			"score" : 1.463179736705023,
			"type" : "exam"
		},
		{
			"score" : 11.78273309957772,
			"type" : "quiz"
		}
	]
}
> db.students.find().count()
??????
> db.students.getIndexes()
[
	{
		"v" : 1,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "school.students"
	}
]

> db.scores.find().count()
300
> db.scores.findOne()
{
	"_id" : ObjectId("5919d4a56be6f1d0629a06b5"),
	"student_id" : 0,
	"type" : "exam",
	"score" : 16.24664549774898
}
> db.scores.createIndex({"student_id":1})
{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}
> db.scores.getIndexes()
[
	{
		"v" : 1,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "school.scores"
	},
	{
		"v" : 1,
		"key" : {
			"student_id" : 1
		},
		"name" : "student_id_1",
		"ns" : "school.scores"
	}
]
> db.scores.find({"student_id": 1})
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b8"), "student_id" : 1, "type" : "exam", "score" : 23.039604476685106 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b9"), "student_id" : 1, "type" : "homework", "score" : 17.88094175983336 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06ba"), "student_id" : 1, "type" : "quiz", "score" : 83.93632816457051 }

> db.scores.insertOne({"student_id": 1, "type": "exam", "score": 99.0 })
{
	"acknowledged" : true,
	"insertedId" : ObjectId("591ee209a50b06d79c52aaef")
}
> db.scores.find().limit(8)
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b5"), "student_id" : 0, "type" : "exam", "score" : 16.24664549774898 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b6"), "student_id" : 0, "type" : "homework", "score" : 21.69372101356738 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b7"), "student_id" : 0, "type" : "quiz", "score" : 22.519566714788407 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b8"), "student_id" : 1, "type" : "exam", "score" : 23.039604476685106 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b9"), "student_id" : 1, "type" : "homework", "score" : 17.88094175983336 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06ba"), "student_id" : 1, "type" : "quiz", "score" : 83.93632816457051 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06bb"), "student_id" : 2, "type" : "exam", "score" : 79.35322861744226 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06bc"), "student_id" : 2, "type" : "homework", "score" : 23.300833708020473 }
> // where is it?
>
> db.scores.find({"student_id": 1})
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b8"), "student_id" : 1, "type" : "exam", "score" : 23.039604476685106 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b9"), "student_id" : 1, "type" : "homework", "score" : 17.88094175983336 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06ba"), "student_id" : 1, "type" : "quiz", "score" : 83.93632816457051 }
{ "_id" : ObjectId("591ee209a50b06d79c52aaef"), "student_id" : 1, "type" : "exam", "score" : 99 }

> db.scores.dropIndex({"student_id":1})
{ "nIndexesWas" : 2, "ok" : 1 }
> db.scores.createIndex({"student_id":1},{unique: true})
{
	"ok" : 0,
	"errmsg" : "E11000 duplicate key error index: school.scores.$student_id_1 dup key: { : 0.0 }",
	"code" : 11000,
	"codeName" : "DuplicateKey"
}
> db.scores.getIndexes()
[
	{
		"v" : 1,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "school.scores"
	}
]
> db.scores.deleteOne({"_id" : ObjectId("591ee209a50b06d79c52aaef")})
{ "acknowledged" : true, "deletedCount" : 1 }
> db.scores.createIndex({"student_id":1},{unique: true})
{
	"ok" : 0,
	"errmsg" : "E11000 duplicate key error index: school.scores.$student_id_1 dup key: { : 0.0 }",
	"code" : 11000,
	"codeName" : "DuplicateKey"
}
> db.scores.find({"student_id": 1})
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b8"), "student_id" : 1, "type" : "exam", "score" : 23.039604476685106 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06b9"), "student_id" : 1, "type" : "homework", "score" : 17.88094175983336 }
{ "_id" : ObjectId("5919d4a56be6f1d0629a06ba"), "student_id" : 1, "type" : "quiz", "score" : 83.93632816457051 }
// Can't have an index on student_id !>

You can always keep deleting the offending documents, but that could be slow!

Sparse

Indexes tend to be dense or continuous, without holes or missing keys. However, sometimes the your data just doesn’t cooperate. Sometimes the field you wish to index on simply doesn’t exist for all the data, other times it may not have been determined yet, and still other times some documents may never get a value for the field (e.g., And email address might be blank and blog comments by authors who wish to remain anonymous). Either way, trying to create an index on a field that does not occur in every document poses a problem.

Sparse indices come to the rescue! In a sparse index only those documents having a value for the indexed field will appear in the index.

db.scores.createIndex({“userid”: 1}, {unique: 1, sparse: 1})

HOWEVER, you cannot sort on a sparse key.

Multikey Indexes

If you think about the blog application, and author will often include one or more tags to describe a post. These tags are often stored in an array for convenience. Suppose you wanted to index the blog posts on these tags. You would create the index the same way you would only need non-array field and the result will have each entry of the array appear in the index. That is, multiple index entries and referencing the same document.

You don’t have to specify anything special to identify a multi-key index. Just create the index on the array name and MongoDB be take care of the rest.

Note: you cannot create a compound index using two arrays. MongoDB will not index parallel arrays since that could cause a combinatorial explosion of index entries. Specifically, if there is a compound index on 2 or more fields, you can not insert any document that has arrays in more than one of the fields of that index.

> db.foo.drop()
true
> db.foo.insertOne({"a":1, "b":2})
{
	"acknowledged" : true,
	"insertedId" : ObjectId("591ef231a50b06d79c52aaf4")
}
> db.foo.createIndex({"a":1, "b":1})
{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}
> db.foo.explain().find({"a":1, "b": 1})
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "school.foo",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"$and" : [
				{
					"a" : {
						"$eq" : 1
					}
				},
				{
					"b" : {
						"$eq" : 1
					}
				}
			]
		},
		"winningPlan" : {
			"stage" : "FETCH",
			"inputStage" : {
				"stage" : "IXSCAN",
				"keyPattern" : {
					"a" : 1,
					"b" : 1
				},
				"indexName" : "a_1_b_1",
				"isMultiKey" : false,
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 1,
				"direction" : "forward",
				"indexBounds" : {
					"a" : [
						"[1.0, 1.0]"
					],
					"b" : [
						"[1.0, 1.0]"
					]
				}
			}
		},
		"rejectedPlans" : [ ]
	},
	"serverInfo" : {
		"host" : "Swamp.local",
		"port" : 27017,
		"version" : "3.4.4",
		"gitVersion" : "888390515874a9debd1b6c5d36559ca86b44babd"
	},
	"ok" : 1
}
> db.foo.insertOne({"a":2, "b": [3,4,5]})
{
	"acknowledged" : true,
	"insertedId" : ObjectId("591ef2f1a50b06d79c52aaf5")
}
> db.foo.explain().find({"a":1, "b": 1})
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "school.foo",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"$and" : [
				{
					"a" : {
						"$eq" : 1
					}
				},
				{
					"b" : {
						"$eq" : 1
					}
				}
			]
		},
		"winningPlan" : {
			"stage" : "FETCH",
			"inputStage" : {
				"stage" : "IXSCAN",
				"keyPattern" : {
					"a" : 1,
					"b" : 1
				},
				"indexName" : "a_1_b_1",
				"isMultiKey" : true,
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 1,
				"direction" : "forward",
				"indexBounds" : {
					"a" : [
						"[1.0, 1.0]"
					],
					"b" : [
						"[1.0, 1.0]"
					]
				}
			}
		},
		"rejectedPlans" : [ ]
	},
	"serverInfo" : {
		"host" : "Swamp.local",
		"port" : 27017,
		"version" : "3.4.4",
		"gitVersion" : "888390515874a9debd1b6c5d36559ca86b44babd"
	},
	"ok" : 1
}
> db.foo.find({"a":1, "b": 1})
> db.foo.find()
{ "_id" : ObjectId("591ef231a50b06d79c52aaf4"), "a" : 1, "b" : 2 }
{ "_id" : ObjectId("591ef2f1a50b06d79c52aaf5"), "a" : 2, "b" : [ 3, 4, 5 ] }
> db.foo.find({"a":2, "b":5})
{ "_id" : ObjectId("591ef2f1a50b06d79c52aaf5"), "a" : 2, "b" : [ 3, 4, 5 ] }
> db.foo.explain().find({"a":2, "b":5})
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "school.foo",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"$and" : [
				{
					"a" : {
						"$eq" : 2
					}
				},
				{
					"b" : {
						"$eq" : 5
					}
				}
			]
		},
		"winningPlan" : {
			"stage" : "FETCH",
			"inputStage" : {
				"stage" : "IXSCAN",
				"keyPattern" : {
					"a" : 1,
					"b" : 1
				},
				"indexName" : "a_1_b_1",
				"isMultiKey" : true,
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 1,
				"direction" : "forward",
				"indexBounds" : {
					"a" : [
						"[2.0, 2.0]"
					],
					"b" : [
						"[5.0, 5.0]"
					]
				}
			}
		},
		"rejectedPlans" : [ ]
	},
	"serverInfo" : {
		"host" : "Swamp.local",
		"port" : 27017,
		"version" : "3.4.4",
		"gitVersion" : "888390515874a9debd1b6c5d36559ca86b44babd"
	},
	"ok" : 1
}
> db.foo.getIndexes()
[
	{
		"v" : 1,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "school.foo"
	},
	{
		"v" : 1,
		"key" : {
			"a" : 1,
			"b" : 1
		},
		"name" : "a_1_b_1",
		"ns" : "school.foo"
	}
]
> db.foo.insertOne({"a":[4,3,2], "b": [11,12,13]})
2017-05-19T09:32:19.870-0400 E QUERY    [thread1] WriteError: cannot index parallel arrays [b] [a] :
WriteError({
	"index" : 0,
	"code" : 171,
	"errmsg" : "cannot index parallel arrays [b] [a]",
	"op" : {
		"_id" : ObjectId("591ef3e3a50b06d79c52aaf6"),
		"a" : [
			4,
			3,
			2
		],
		"b" : [
			11,
			12,
			13
		]
	}
})
WriteError@src/mongo/shell/bulk_api.js:469:48
Bulk/mergeBatchResults@src/mongo/shell/bulk_api.js:836:49
Bulk/executeBatch@src/mongo/shell/bulk_api.js:906:13
Bulk/this.execute@src/mongo/shell/bulk_api.js:1150:21
DBCollection.prototype.insertOne@src/mongo/shell/crud_api.js:242:9
@(shell):1:1
> db.foo.insertOne({"a":[4,3,2], "b": 11})
{
	"acknowledged" : true,
	"insertedId" : ObjectId("591ef3eea50b06d79c52aaf7")
}
> db.foo.find()
{ "_id" : ObjectId("591ef231a50b06d79c52aaf4"), "a" : 1, "b" : 2 }
{ "_id" : ObjectId("591ef2f1a50b06d79c52aaf5"), "a" : 2, "b" : [ 3, 4, 5 ] }
{ "_id" : ObjectId("591ef3eea50b06d79c52aaf7"), "a" : [ 4, 3, 2 ], "b" : 11 }
>

Dot notation and indexes

You can also create indexes on array contents.

use sample_school
switched to db sample_school
> show collections
scores
students
> db.students.find().limit(4)
{ "_id" : 0, "scores" : [ { "score" : 1.463179736705023, "type" : "exam" }, { "score" : 11.78273309957772, "type" : "quiz" } ] }
{ "_id" : 1, "scores" : [ { "score" : 60.06045071030959, "type" : "exam" }, { "score" : 52.79790691903873, "type" : "quiz" } ] }
{ "_id" : 2, "scores" : [ { "score" : 67.03077096065002, "type" : "exam" }, { "score" : 6.301851677835235, "type" : "quiz" } ] }
{ "_id" : 3, "scores" : [ { "score" : 71.64343899778332, "type" : "exam" }, { "score" : 24.80221293650313, "type" : "quiz" } ] }
> db.students.getIndexes()
[ { v: 2, key: { _id: 1 }, name: '_id_' } ]

> db.students.createIndex({"scores.score":1})
scores.score_1

> db.students.getIndexes()
[
  { v: 2, key: { _id: 1 }, name: '_id_' },
  { v: 2, key: { 'scores.score': 1 }, name: 'scores.score_1' }
]

> db.students.find({"scores.score": {"$gt": 90}})

[
  {
    _id: 186,
    name: 'Leonida Lafond',
    scores: [
      { score: 8.125073097960179, type: 'exam' },
      { score: 0.2017888852605676, type: 'quiz' },
      { score: 90.13081857264544, type: 'homework' }
    ]
  },
... lots

> db.students.countDocuments({"scores.score": {"$gt": 90}})
73

> db.students.explain("executionStats").find({"scores.score": {"$gt": 90}})
{
  
  ...
  winningPlan: {
      isCached: false,
      stage: 'FETCH',
      inputStage: {
        stage: 'IXSCAN',
        keyPattern: { 'scores.score': 1 },
  ... 
  executionStats: {
    executionSuccess: true,
    nReturned: 73,
    executionTimeMillis: 0,
    totalKeysExamined: 82,
    totalDocsExamined: 73,
  ...
}

> db.students.findOne({"scores.score": {"$gt": 90}})
{
  _id: 186,
  name: 'Leonida Lafond',
  scores: [
    { score: 8.125073097960179, type: 'exam' },
    { score: 0.2017888852605676, type: 'quiz' },
    { score: 90.13081857264544, type: 'homework' }
  ]
}

> db.students.find({"scores": { $elemMatch: {"score": {"$gt": 90},"type": "exam"}}})
[
  {
    _id: 13,
    name: 'Jessika Dagenais',
    scores: [
      { score: 90.47179954427436, type: 'exam' },
      { score: 90.3001402468489, type: 'quiz' },
      { score: 95.17753772405909, type: 'homework' }
    ]
  },
  {
 ...
> db.students.countDocuments({"scores": { $elemMatch: {"score": {"$gt": 90},"type": "exam"}}})
18
>

Hashed

Hash tables are useful when storing items of variable sizes that don’t happen to match an even distribution. The simplistic hash table idea commonly used at voting precincts is to have several lines each for some contiguous portion of the alphabet. Voters whose last name begins with the letters ‘A’ thru ‘F’ are sent to one line, ‘G’-‘P’ to another, and ‘Q’-‘Z’ to another, for example.

It’s difficult to get these buckets sizes optimal especially when you don’t really know the distribution of the “keys” you will be using. If you get it wrong, one line will be much longer than the others.

We face a similar challenge when it comes to distributing a MongoDB database across shards. You want the distribution of keys to be relatively uniform, and it’s less important that entries with similar keys be near each other. For this reason we have hashed indexes.

db.scores.createIndex({“lastName”: hashed})

Geospatial

the convenient capability is to be able to locate documents representing locations that are near another location. If you crank up your favorite coffee shop application, it might offer you the option of requesting a list of shops that are nearest your current location. Such a query requires the database to be able to do geographic distance calculations. Geospatial indexes can handle this and similar types of queries.

Full text searches

Text indexes support text search queries on fields containing string content. Text indexes improve performance when searching for specific words or multi-word strings within string content.

use full text search
create “… Words:”dog cat rat”
searches have to fully match

sample_school> db.students.findOne()
{
  _id: 0,
  name: 'aimee Zank',
  scores: [
    { score: 1.463179736705023, type: 'exam' },
    { score: 11.78273309957772, type: 'quiz' },
    { score: 35.8740349954354, type: 'homework' }
  ]
}
sample_school> db.students.createIndex({"name": "text"})
name_text
sample_school> db.students.getIndexes()
[
  { v: 2, key: { _id: 1 }, name: '_id_' },
  { v: 2, key: { 'scores.score': 1 }, name: 'scores.score_1' },
  {
    v: 2,
    key: { _fts: 'text', _ftsx: 1 },
    name: 'name_text',
    weights: { name: 1 },
    default_language: 'english',
    language_override: 'language',
    textIndexVersion: 3
  }
]

You can create compound text indexes too:

> use blog_db

> db.blog.createIndex(
   {
      "about": "text",
      "keywords": "text"
   }
)