Occasional PostgreSQL "Duplicate key value violates unique constraint" error from Go insert


I have a table with a unique constraint:

    create unique index "bd_hash_index" on "public"."bodies" using btree ("hash");

I have a Go program that takes "body" values from a channel, filters out duplicates by hashing, and inserts the non-duplicates into the database. Like this:

    import (
        "crypto/md5"
        "database/sql"
        "encoding/hex"
        "log"
        "strings"
        "time"
    )

    type process struct {
        db         *sql.DB
        bodiesHash map[string]bool
        channel    chan BodyInterface
        logger     *log.Logger
    }

    func (pr *process) Run() {
        bodyInsert, err := pr.db.Prepare("insert into bodies (hash, type, source, body, created_timestamp) values ($1, $2, $3, $4, $5)")
        if err != nil {
            pr.logger.Println(err)
            return
        }
        defer bodyInsert.Close()

        hash := md5.New()

        for p := range pr.channel {
            nowUnix := time.Now().Unix()

            bodyString := strings.Join([]string{
                p.GetType(),
                p.GetSource(),
                p.GetBodyString(),
            }, ":")
            hash.Write([]byte(bodyString))
            bodyHash := hex.EncodeToString(hash.Sum(nil))
            hash.Reset()

            if _, ok := pr.bodiesHash[bodyHash]; !ok {
                pr.bodiesHash[bodyHash] = true

                _, err = bodyInsert.Exec(
                    bodyHash,
                    p.GetType(),
                    p.GetSource(),
                    p.GetBodyString(),
                    nowUnix,
                )
                if err != nil {
                    pr.logger.Println(err, bodyString, bodyHash)
                }
            }
        }
    }

But periodically I get the error

"pq: duplicate key value violates unique constraint "bd_hash_index""

in the log file. I can't imagine how that can be, because I check hash uniqueness before the insert. I am sure that when I call go processDebugBody.Run() the bodies table is empty.

The channel is created as a buffered channel with:

    processDebugBody.channel = make(chan BodyInterface, 1000)

When you execute a query outside of a transaction, sql.DB automatically retries when there's a problem with the connection. In the current implementation, up to 10 times. For an example, notice maxBadConnRetries in sql.Exec.

Now, this only happens when the underlying driver returns driver.ErrBadConn, and the specification states the following:

ErrBadConn should be returned by a driver to signal to the sql package that a driver.Conn is in a bad state (such as the server having earlier closed the connection) and the sql package should retry on a new connection.

To prevent duplicate operations, ErrBadConn should not be returned if there's a possibility that the database server might have performed the operation.

I think driver implementations are a little bit careless in implementing this rule, but maybe there is some logic behind it. I've been studying the implementation of lib/pq the other day, and I noticed this scenario would be possible.

As you pointed out in the comments that you have SSL errors issued just before seeing the duplicates, this seems like a reasonable guess.

One thing to consider would be to use transactions. If you lose the connection before committing the transaction, you can be sure it was rolled back. The statements of a transaction are not retransmitted automatically on bad connections, so this problem might be solved. You'll see the SSL errors being propagated directly to the application though, so you'll need to retry on your own.

I must tell you I've been seeing SSL renegotiation errors on Postgres using Go 1.3, and that's why I've disabled SSL for the internal DB for the time being (sslmode=disable in the connection string). I was wondering whether version 1.4 has solved the issue, as one thing on the changelog is that the crypto/tls package now supports ALPN as defined in RFC 7301 (ALPN stands for Application-Layer Protocol Negotiation extension).
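For reference, disabling SSL is just a connection-string parameter; this is a configuration fragment, not a full program, and host/user/dbname are placeholders:

    // assumes the lib/pq driver is imported somewhere in the program
    db, err := sql.Open("postgres", "host=127.0.0.1 user=app dbname=bodies sslmode=disable")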

