MongoDB - Finding number of inserted documents in a bulk insert with duplicate keys
I'm doing a bulk insert into a MongoDB database. I know that 99% of the records inserted will fail because of a duplicate key error. After the insert, I would like to print how many new records were inserted into the database. All of this is being done in Python through the Tornado Motor MongoDB driver, but that probably doesn't matter much.
    try:
        bulk_write_result = yield db.collections.probe.insert(dataArray, continue_on_error=True)
        nr_inserts = bulk_write_result["nInserted"]
    except pymongo.errors.DuplicateKeyError as e:
        nr_inserts = ????  <--- what should I put here?
Since an exception was thrown, bulk_write_result is empty. I can of course (concurrency issues aside) count the full collection before and after the insert, but I don't like the extra round trips to the database just for a line in the logfile. Is there any way I can discover how many records were actually inserted?
A regular insert with continue_on_error can't report the info you want. If you're on MongoDB 2.6 or later, however, there is a high-performance solution with good error reporting. Here's a complete example using Motor's BulkOperationBuilder:
    import pymongo.errors
    from tornado import gen
    from tornado.ioloop import IOLoop
    from motor import MotorClient

    db = MotorClient()
    dataArray = [{'_id': 0},
                 {'_id': 0},  # Duplicate.
                 {'_id': 1}]


    @gen.coroutine
    def my_insert():
        try:
            bulk = db.collections.probe.initialize_unordered_bulk_op()

            # Prepare the operation on the client.
            for doc in dataArray:
                bulk.insert(doc)

            # Send it to the server all at once.
            bulk_write_result = yield bulk.execute()
            nr_inserts = bulk_write_result["nInserted"]
        except pymongo.errors.BulkWriteError as e:
            # Some writes failed; the error details still carry the success count.
            print(e)
            nr_inserts = e.details['nInserted']

        print('nr_inserts: %d' % nr_inserts)


    IOLoop.instance().run_sync(my_insert)
Full documentation: http://motor.readthedocs.org/en/stable/examples/bulk.html
Heed the warning about poor bulk insert performance on MongoDB before 2.6! It will still work, but it requires a separate round trip per document. In 2.6+, the driver sends the whole operation to the server in one round trip, and the server reports back how many writes succeeded and how many failed.
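To illustrate what "how many succeeded and how many failed" can look like in practice, here is a minimal sketch that summarizes a bulk-write details dictionary. It assumes the dictionary has the shape of `BulkWriteError.details` in PyMongo (an `nInserted` count plus a `writeErrors` list whose entries carry a `code`), and that code 11000 marks a duplicate key. The helper name `summarize_bulk_details` and the sample data are hypothetical and written by hand, so the sketch runs without a database.

```python
DUPLICATE_KEY_CODE = 11000  # MongoDB's error code for a duplicate key


def summarize_bulk_details(details):
    """Return (inserted, duplicates, other_failures) from a details dict.

    `details` is assumed to look like BulkWriteError.details:
    {'nInserted': int, 'writeErrors': [{'index': ..., 'code': ...}, ...]}
    """
    write_errors = details.get("writeErrors", [])
    duplicates = sum(
        1 for err in write_errors if err.get("code") == DUPLICATE_KEY_CODE
    )
    other_failures = len(write_errors) - duplicates
    return details.get("nInserted", 0), duplicates, other_failures


# Hand-written sample mimicking the three-document example above:
# two documents inserted, one rejected as a duplicate.
sample_details = {
    "nInserted": 2,
    "writeErrors": [
        {"index": 1, "code": 11000, "errmsg": "E11000 duplicate key error"},
    ],
}

inserted, duplicates, other_failures = summarize_bulk_details(sample_details)
print("inserted=%d duplicates=%d other_failures=%d"
      % (inserted, duplicates, other_failures))
```

In the `except pymongo.errors.BulkWriteError as e:` branch, one could pass `e.details` to such a helper to log duplicates separately from unexpected failures.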