DocumentDB 'ReplicaSetNoPrimary' error

While using AWS Lambda with Node and Mongoose 5.x, we intermittently get the following error (a burst of errors every 10-15 minutes). Sometimes the connection is established just fine, but other times it throws a 'replica set no primary' error.

The DocDB service is in the same VPC with the Lambdas.

We have tried Mongoose 6.x as well. It performs worse.

As far as I can tell this cannot be a firewall issue (since it works most of the time). Profiler / audit logs do not seem to offer any hints either. Any ideas how to troubleshoot this?

ReplicaSetNoPrimary
MongooseServerSelectionError: Server selection timed out after 5000 ms
at NativeConnection.Connection.openUri (/opt/nodejs/node_modules/mongoose/lib/connection.js:847:32)
at /opt/nodejs/node_modules/mongoose/lib/index.js:351:10
at /opt/nodejs/node_modules/mongoose/lib/helpers/promiseOrCallback.js:32:5
at new Promise (<anonymous>)
at promiseOrCallback (/opt/nodejs/node_modules/mongoose/lib/helpers/promiseOrCallback.js:31:10)
at Mongoose._promiseOrCallback (/opt/nodejs/node_modules/mongoose/lib/index.js:1149:10)
at Mongoose.connect (/opt/nodejs/node_modules/mongoose/lib/index.js:350:20)
at connectToMongoDB (/var/task/app/init/db.js:68:20)
at Object.<anonymous> (/var/task/app/init/db.js:109:26)
at Module._compile (internal/modules/cjs/loader.js:1085:14)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1114:10)
at Module.load (internal/modules/cjs/loader.js:950:32)
at Function.Module._load (internal/modules/cjs/loader.js:790:12)
at Module.require (internal/modules/cjs/loader.js:974:19)
at require (internal/modules/cjs/helpers.js:93:18)
at Object.<anonymous> (/var/task/app/init/init.js:7:26)
at Module._compile (internal/modules/cjs/loader.js:1085:14)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1114:10)
at Module.load (internal/modules/cjs/loader.js:950:32)
at Function.Module._load (internal/modules/cjs/loader.js:790:12)
at Module.require (internal/modules/cjs/loader.js:974:19)
at require (internal/modules/cjs/helpers.js:93:18)
at Object.<anonymous> (/var/task/app/init/index.js:1:18)
at Module._compile (internal/modules/cjs/loader.js:1085:14)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1114:10)
at Module.load (internal/modules/cjs/loader.js:950:32)
at Function.Module._load (internal/modules/cjs/loader.js:790:12)
at Module.require (internal/modules/cjs/loader.js:974:19)

Our configuration looks like this:

url: 'mongodb://**********.cluster-*************.********.docdb.amazonaws.com:27017/',
opts: {
  dbName: '************',
  user: '***************',
  pass: '************',

  tls: true,
  tlsCAFile: caPemFile,

  useNewUrlParser: true,
  useUnifiedTopology: true,

  replicaSet: 'rs0',
  readPreference: 'secondaryPreferred',
  retryWrites: false,
  monitorCommands: true,

  maxPoolSize: 5,
  minPoolSize: 1,

  serverSelectionTimeoutMS: 5000,
  connectTimeoutMS: 5000,

  bufferCommands: false,

  autoCreate: false,
  autoIndex: false,

  authSource: 'admin',
},
  • If you have a multi-node cluster, are you noticing any failovers, or connection drops/memory contention from the metrics around the time you notice these errors?

aleksi
asked 2 years ago, 2864 views
2 Answers

For posterity: this error was caused by how Lambda treats code that runs outside the request handler. Our design connected to the database the moment the initialization module was imported (that is, outside the request context). Moving the connection setup inside the request handler solved the problem.

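// Don't do this: the connection is opened at module load time,
// outside of any request.
let connectionPromise = mongoose.connect();

// Do this: connect lazily on the first invocation and reuse the
// same promise on later invocations of a warm container.
let connectionPromise = null;

exports.handler = async function handler(event, context) {
  if (!connectionPromise) {
    connectionPromise = mongoose.connect();
  }

  await connectionPromise;

  ...
}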
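One caveat with caching the promise this way: if the first connection attempt fails, the rejected promise stays cached and every later invocation of the same container fails too. A minimal sketch of one way around that follows, assuming a hypothetical helper named `createLazyConnector` (not part of Mongoose) that wraps any promise-returning connect function and forgets a failed attempt:

```javascript
// Hypothetical helper (not a Mongoose API): wraps a promise-returning
// connect function so it is called lazily, shared between concurrent
// callers, and retried on the next call after a failure.
function createLazyConnector(connectFn) {
  let connectionPromise = null;

  return function getConnection() {
    if (!connectionPromise) {
      connectionPromise = Promise.resolve()
        .then(connectFn)
        .catch((err) => {
          // Drop the failed attempt so the next call tries again
          connectionPromise = null;
          throw err;
        });
    }
    return connectionPromise;
  };
}

// Usage inside a Lambda module. `url` and `opts` stand for the
// configuration shown in the question; they are placeholders here.
//
//   const mongoose = require('mongoose');
//   const getConnection = createLazyConnector(() => mongoose.connect(url, opts));
//
//   exports.handler = async function handler(event, context) {
//     await getConnection();
//     ...
//   };
```

The handler still awaits a single shared promise on warm invocations, so the behaviour on the success path is the same as in the snippet above.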
aleksi
answered 2 years ago

Hi, thank you for reaching out. Generally, the "ReplicaSetNoPrimary" error can occur because of a connection configuration issue, such as not adding the correct IP address. However, since it happens intermittently, I would suggest looking at the key cluster CloudWatch metrics and reviewing any contention or bottleneck in DatabaseConnections, FreeableMemory, CPUUtilization, etc.

https://docs.aws.amazon.com/documentdb/latest/developerguide/cloud_watch.html#cloud_watch-metrics_list
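As a rough illustration of that check, the sketch below builds CloudWatch `GetMetricStatistics` parameters for the three metrics named above, covering a window around the time an error was logged. The helper name `buildDocDbMetricQuery`, the cluster identifier `my-docdb-cluster`, the timestamp, and the 15-minute window are all assumptions made for the example, not values from this thread.

```javascript
// Hypothetical helper: builds GetMetricStatistics parameters for one
// DocumentDB cluster-level metric around the time an error occurred.
function buildDocDbMetricQuery(clusterId, metricName, errorTime, windowMinutes = 15) {
  const windowMs = windowMinutes * 60 * 1000;
  return {
    Namespace: 'AWS/DocDB',
    MetricName: metricName,
    Dimensions: [{ Name: 'DBClusterIdentifier', Value: clusterId }],
    StartTime: new Date(errorTime.getTime() - windowMs),
    EndTime: new Date(errorTime.getTime() + windowMs),
    Period: 60, // one datapoint per minute
    Statistics: ['Average', 'Maximum'],
  };
}

// Placeholder cluster name and error timestamp (assumptions).
const metrics = ['DatabaseConnections', 'FreeableMemory', 'CPUUtilization'];
const queries = metrics.map((name) =>
  buildDocDbMetricQuery('my-docdb-cluster', name, new Date('2023-01-01T12:00:00Z'))
);

console.log(JSON.stringify(queries, null, 2));
```

With the AWS SDK for JavaScript v3, each of these objects could be passed to a `GetMetricStatisticsCommand` and sent through a `CloudWatchClient`; spikes in the returned datapoints that line up with the error bursts would point at contention on the cluster side.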

From your response, it seems the issue was resolved after you changed the code. If you need further assistance analyzing the DocumentDB cluster performance, please feel free to reach out to the AWS Support team.

AWS
answered 2 years ago
