Saturday, December 27, 2014

[mongoose] Using .populate on a sharded database

Greetings!

We are in the process of choosing shard keys for our database in preparation for turning on sharding. While trying to choose a shard key, I have become uncertain about a possible limitation in the way we use mongoose.

We have a collection "students" in which each document has a "subscription". We also have a collection "teachers" in which each document has a "subscription". We then have a collection "classes" in which each document includes an array of students and a teacher, all using the ObjectId type with a "ref" to the correct collection/schema type. This works well and has been great!

My question, though, is about using .populate when loading such a class. I've checked, and given the size and number of students (and teachers) in the database, it is probably best to use the hash of "subscription" as the shard key for both students and teachers: almost all operations involve students or teachers from the same subscription, so that key would keep each query on a single shard. The problem is that .populate fetches the referenced documents by their _id field only. To make use of the shard key, it would also need to include the .subscription field in the query. Even though _id uniquely identifies the document, .subscription is what determines which shard the query should be sent to. Without it, the query is sent to all shards, and in that case we'd be better off sharding on the hash of _id, even though that would usually hit all shards as well.
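To make the difference concrete, here is a minimal sketch of the two filter shapes involved. The helper names (`scatterFilter`, `targetedFilter`) and the sample class document are hypothetical, and the sketch assumes the class document itself carries the `subscription` value shared by its students and teacher, which the description above does not state. It only builds plain filter objects and does not talk to a database.

```javascript
// Hypothetical helper: the kind of filter an _id-only lookup produces.
// It carries no shard key, so a cluster sharded on "subscription"
// cannot route it to a single shard.
function scatterFilter(ids) {
  return { _id: { $in: ids } };
}

// Hypothetical helper: the same lookup with the shard key added as an
// equality condition, which is what lets the router target one shard.
function targetedFilter(ids, subscription) {
  return { subscription: subscription, _id: { $in: ids } };
}

// Assumed shape of a class document (string ids used for brevity).
const classDoc = {
  subscription: 'sub-42',
  teacher: 't1',
  students: ['s1', 's2']
};

const studentFilter = targetedFilter(classDoc.students, classDoc.subscription);
const teacherFilter = targetedFilter([classDoc.teacher], classDoc.subscription);

console.log(JSON.stringify(studentFilter));
console.log(JSON.stringify(teacherFilter));
```

A filter like `studentFilter` could be passed to an ordinary find on the students collection as a manual replacement for .populate, at the cost of one explicit query per referenced path.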

I know mongoose has some small support for shardKey, but I'm assuming that it can't use this to automatically fill in the .populate queries with the correct values. Is there a solution for this?
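One possible direction, sketched here under assumptions rather than offered as a confirmed solution: .populate accepts an options object with a `match` field holding extra query conditions, so the shard key could be supplied by hand on each call. The helper name `populateOptionsFor` is hypothetical, the subscription value is assumed to be known to the caller, and whether the resulting query is actually routed to a single shard has not been verified against a sharded cluster.

```javascript
// Hypothetical helper: build one populate options object per path,
// each carrying the shard key as an extra match condition.
function populateOptionsFor(paths, subscription) {
  return paths.map(function (path) {
    return { path: path, match: { subscription: subscription } };
  });
}

const options = populateOptionsFor(['students', 'teacher'], 'sub-42');
console.log(JSON.stringify(options));

// Assumed usage with a mongoose query (not executed here; "Class" and
// "classId" are placeholder names):
//
//   var query = Class.findOne({ _id: classId });
//   options.forEach(function (opt) { query.populate(opt); });
//   query.exec(callback);
```

A caveat with this design: referenced documents that do not satisfy the `match` condition will not be populated, so a class that mixes subscriptions would come back with some references left unfilled.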

I'd be happy to do some coding myself to get this working, but I'd probably need some guidance as to where to start =]

Thank you,

