python 3.4 - What is the preferred way to add many fields to all documents in a MongoDB collection?


I have a Python application that iterates through every document in a MongoDB (3.0.2) collection (typically between 10K and 1M documents) and adds new fields, probably doubling or tripling the number of fields in each document.

My initial thought was to upsert the entirety of the revised documents (using pymongo), but I'm now questioning that:

  • Given that the revised documents are significantly bigger, should I be inserting only the new fields, or just replacing the document?
  • Also, is it better to perform the write to the collection on a document-by-document basis or in bulk?

This is a great question that can be solved a few different ways, depending on how you are managing your data.

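If you are upserting additional fields, does that mean you are appending fields to your data at a later point in time, with the only changes being the addition of those fields? If so, you could insert the revised documents as new records and set a TTL on your documents so that the old ones drop off over time. Keep in mind that if you do this, you will want to set an index that sorts your results by descending _id, so that the most recent additions are selected before the older ones.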

The benefit of doing it this way is that you are continually writing data, as opposed to seeking and updating data, which is faster.
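A minimal sketch of the index setup this approach would need, assuming a pymongo `Collection` is passed in. The field names `doc_key` and `created_at` are hypothetical and not from the original post. A TTL index only works on a date-valued field, so the sketch stamps each new version with a creation date rather than relying on `_id`.

```python
import datetime

# Hypothetical field names, chosen for illustration only.
DOC_KEY = "doc_key"        # identifies the logical document across versions
CREATED_AT = "created_at"  # date field that the TTL index watches


def ensure_indexes(collection, ttl_seconds):
    """Create a TTL index plus an index for newest-first lookups."""
    # TTL indexes are single-field indexes on a date-valued field.
    collection.create_index(CREATED_AT, expireAfterSeconds=ttl_seconds)
    # 1 is ascending and -1 is descending
    # (the values of pymongo.ASCENDING and pymongo.DESCENDING).
    collection.create_index([(DOC_KEY, 1), ("_id", -1)])


def new_version(doc_key, fields, now=None):
    """Build a fresh version of a document instead of updating the old one."""
    version = dict(fields)  # copy so the caller's dict is left untouched
    version[DOC_KEY] = doc_key
    version[CREATED_AT] = now or datetime.datetime.now(datetime.timezone.utc)
    return version
```

Note that a TTL index applies to every document that carries the date field, so the newest version would eventually expire as well unless it is rewritten or stored without that field.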

With regard to upserts vs. bulk inserts: bulk inserts are faster than upserts, since an upsert has to find the original document first.
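To make the difference concrete, here is a rough sketch of the two write paths, assuming `collection` is a pymongo `Collection`. The function names are hypothetical. The insert path assumes the revised documents do not reuse an existing `_id`, since that would raise a duplicate key error.

```python
def write_as_inserts(collection, revised_docs):
    """Append the revised documents as new records in one bulk call."""
    docs = list(revised_docs)
    if docs:  # insert_many rejects an empty list
        # ordered=False lets the server keep going past a single failed insert.
        collection.insert_many(docs, ordered=False)
    return len(docs)


def write_as_upserts(collection, revised_docs):
    """Replace each document in place, one call per document."""
    count = 0
    for doc in revised_docs:
        # The server must first look up the document by _id, then replace it
        # (or insert it when no match exists, because upsert=True).
        collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        count += 1
    return count
```

The first function sends the whole list at once, while the second pays for a lookup on every document, which is the cost the answer is pointing at.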

  • Given that the revised documents are significantly bigger, should I be inserting only the new fields, or just replacing the document?
    • You need to understand your data to determine what is best. If the only change to the data is additional fields, or changes that only need to be considered from that point forward, then bulk inserting and setting a TTL on the older data is the better method from the standpoint of write operations, as opposed to having to seek, find, and update. When using this method you will want to use db.document.find_one() as opposed to db.document.find(), so that only the current record is returned.
  • Also, is it better to perform the write to the collection on a document-by-document basis or in bulk?
    • Bulk inserts are faster than inserting each one sequentially.
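As an illustration of the two points above, the following sketch writes documents in batches and then reads back only the most recent version of one of them. It assumes `collection` is a pymongo `Collection`. The `doc_key` field and the batch size of 1000 are hypothetical choices, not values from the original post.

```python
from itertools import islice


def batches(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def insert_in_batches(collection, docs, batch_size=1000):
    """Bulk insert documents without holding them all in memory at once."""
    total = 0
    for batch in batches(docs, batch_size):
        collection.insert_many(batch, ordered=False)
        total += len(batch)
    return total


def latest_version(collection, doc_key):
    """Return only the newest record for a key, or None if there is none."""
    # Sorting on _id descending (-1) picks the most recently inserted match,
    # since default ObjectId values start with a creation timestamp.
    return collection.find_one({"doc_key": doc_key}, sort=[("_id", -1)])
```

Batching keeps memory use bounded on a collection of around 1M documents, since the revised documents can be produced by a generator and flushed every `batch_size` items.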
