We learnt how to create a unique key/index using the {unique: true} option with the ensureIndex() method. Now let's see how to create a unique key when duplicate entries/documents are already present in the collection.
Insert documents
    > db.system.indexes.find()
    { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "foo.test" }

    > db.test.insert({name: "Satish", age: 27});
    WriteResult({ "nInserted" : 1 })

    > db.test.insert({name: "Kiran", age: 28});
    WriteResult({ "nInserted" : 1 })

    > db.test.insert({name: "Satish", age: 27});
    WriteResult({ "nInserted" : 1 })

Here we have 3 documents. The first and last documents have the same values for the “name” and “age” fields.
dropDups() To Remove Duplicate Documents: MongoDB
[youtube https://www.youtube.com/watch?v=aQXdtDWKBiU]
Creating unique key on field “name”
    > db.test.ensureIndex({name: 1}, {unique: true});
    {
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "ok" : 0,
        "errmsg" : "E11000 duplicate key error index: foo.test.$name_1 dup key: { : \"Satish\" }",
        "code" : 11000
    }

This fails with an E11000 duplicate key error because the collection “test” already contains two documents with the same value ("Satish") in the “name” field, so a unique index cannot be built on it.
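Before deciding how to deal with the duplicates, it can help to see exactly which values are repeated. The following is a minimal sketch, not part of the original walkthrough: it groups the documents by the indexed field and keeps only the groups with more than one member. The helper name duplicatePipeline is hypothetical, and the collection name test and field name are taken from the example above.

```javascript
// Hypothetical helper: builds an aggregation pipeline that lists every
// value of `field` that occurs in more than one document.
function duplicatePipeline(field) {
  return [
    // Group by the field value, counting documents and collecting their _ids.
    { $group: { _id: "$" + field, count: { $sum: 1 }, ids: { $push: "$_id" } } },
    // Keep only the values that appear more than once.
    { $match: { count: { $gt: 1 } } }
  ];
}

// `db` only exists inside the mongo shell; outside it this part is skipped.
if (typeof db !== "undefined") {
  db.test.aggregate(duplicatePipeline("name")).forEach(printjson);
}
```

With the three documents inserted above, this should report a single group for "Satish" with a count of 2.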
Create Unique Key by dropping random duplicate entries
    > db.test.ensureIndex({name: 1}, {unique: true, dropDups: true});
    {
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
    }

    > db.test.find();
    { "_id" : ObjectId("53d8f1268019dce2ce61eb86"), "name" : "Satish", "age" : 27 }
    { "_id" : ObjectId("53d8f12f8019dce2ce61eb87"), "name" : "Kiran", "age" : 28 }

The dropDups option retains only one document for each duplicated key value and permanently deletes all the other duplicate entries/documents. You do not get to choose which document is kept.
Note: Since you cannot control which documents are deleted and they cannot be restored, be very careful when using the dropDups option.
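If you would rather decide for yourself which document survives, you can remove the duplicates manually and then create the unique index without dropDups. This is also the route to take on MongoDB 3.0 and later, where the dropDups option is no longer available. The sketch below is an illustration under assumptions, not the method used in this post: the helper idsToRemove is hypothetical, it keeps the first document in _id order, and it loads the whole collection into memory, so it only suits small collections.

```javascript
// Hypothetical helper: returns the _id of every document whose `field`
// value has already been seen, so the first document in the given order
// is the one that is kept.
function idsToRemove(docs, field) {
  var seen = {};
  var remove = [];
  docs.forEach(function (doc) {
    // A missing field is treated like null, as a unique index would.
    var value = doc[field] === undefined ? null : doc[field];
    var key = JSON.stringify(value);
    if (seen[key]) {
      remove.push(doc._id);
    } else {
      seen[key] = true;
    }
  });
  return remove;
}

// `db` only exists inside the mongo shell; outside it this part is skipped.
if (typeof db !== "undefined") {
  var docs = db.test.find().sort({ _id: 1 }).toArray();
  var extra = idsToRemove(docs, "name");
  printjson(extra); // review this list before deleting anything
  // Once you are happy with the list, uncomment the next two lines:
  // db.test.remove({ _id: { $in: extra } });
  // db.test.ensureIndex({ name: 1 }, { unique: true });
}
```

The delete and index steps are left commented out on purpose, so that the list of documents to be removed can be checked (and the collection backed up) first.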