Designing your data schema

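Now that you are familiar with the basic API of Simple Schema, it's worth considering a few constraints of the Meteor data system that can influence the design of your data schema. Although generally speaking you can build a Meteor data schema much like any MongoDB data schema, there are some important details to keep in mind.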

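The most important consideration is the way DDP, Meteor's data loading protocol, communicates documents over the wire. The key thing to realize is that DDP sends changes to documents at the level of top-level document fields. This means that if a document has large, complex subfields that change often, DDP can send a lot of unnecessary data over the wire.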

For instance, in "pure" MongoDB you might design the schema so that each list document has a field called todos, which is an array of todo items:

Lists.schema = new SimpleSchema({
  name: {type: String},
  todos: {type: [Object]}
});

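The issue with this schema is that, due to the DDP behavior just mentioned, each change to any todo item in a list requires sending the entire set of todos for that list over the network. This is because DDP has no concept of "change the text field of the third item in the field called todos"; it can only say "change the field called todos to a totally new array".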
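To make the cost concrete, here is a minimal sketch of field-level change detection. It is a simplified illustration of the behavior described above, not DDP's actual implementation; the helper name `changedTopLevelFields` and the sample documents are assumptions made for this example.

```javascript
// Simplified illustration only: compares two versions of a document and
// returns the top-level fields whose values differ.
function changedTopLevelFields(before, after) {
  const changed = {};
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
  for (const key of keys) {
    // Compare by serialized value; good enough for plain JSON-like data
    if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      changed[key] = after[key];
    }
  }
  return changed;
}

const before = {
  name: 'Groceries',
  todos: [{text: 'Milk'}, {text: 'Eggs'}, {text: 'Bread'}]
};
// Change only the text of the third todo
const after = {
  name: 'Groceries',
  todos: [{text: 'Milk'}, {text: 'Eggs'}, {text: 'Rye bread'}]
};

const changes = changedTopLevelFields(before, after);
console.log(Object.keys(changes));  // [ 'todos' ]
console.log(changes.todos.length);  // 3
```

Even though a single string changed, the smallest unit that can be reported is the whole todos array, so all three items would be resent.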

Denormalization and multiple collections

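The implication of the above is that we need to create more collections to contain sub-documents. In the case of the Todos application, we need both a Lists collection and a Todos collection that holds each list's todo items. Consequently, we need to do some things that you'd typically associate with a SQL database, like using foreign keys (todo.listId) to associate one document with another.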

In Meteor, it's often less of a problem doing this than it would be in a typical MongoDB application, as it's easy to publish overlapping sets of documents (we might need one set of users to render one screen of our app, and an intersecting set for another), which may stay on the client as we move around the application. So in that scenario there is an advantage to separating the subdocuments from the parent.

However, given that MongoDB prior to version 3.2 doesn't support queries over multiple collections ("joins"), we typically end up having to denormalize some data back onto the parent collection. Denormalization is the practice of storing the same piece of information in the database multiple times (as opposed to a non-redundant "normal" form). MongoDB is a database where denormalizing is encouraged, and thus optimized for this practice.

In the case of the Todos application, as we want to display the number of unfinished todos next to each list, we need to denormalize list.incompleteTodoCount. This is an inconvenience but typically reasonably easy to do as we'll see in the section on abstracting denormalizers below.

Another denormalization that this architecture sometimes requires can be from the parent document onto sub-documents. For instance, in Todos, as we enforce privacy of the todo lists via the list.userId attribute, but we publish the todos separately, it might make sense to denormalize todo.userId also. To do this, we'd need to be careful to take the userId from the list when creating the todo, and updating all relevant todos whenever a list's userId changed.
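The two obligations described here (copy the userId when a todo is created, and propagate it when the list's owner changes) can be sketched with plain arrays standing in for the two collections. The function names `insertTodo` and `setListOwner` and the sample data are hypothetical; in a real app this logic would sit around the actual collection mutators.

```javascript
// In-memory stand-ins for the two collections (hypothetical, for illustration)
const lists = [{_id: 'list1', name: 'Private list', userId: 'alice'}];
const todos = [];

// Copy the owner from the parent list whenever a todo is created
function insertTodo(todo) {
  const list = lists.find(l => l._id === todo.listId);
  if (!list) {
    throw new Error(`No list with id ${todo.listId}`);
  }
  const doc = {...todo, userId: list.userId};
  todos.push(doc);
  return doc;
}

// When a list changes owner, update every todo that belongs to it
function setListOwner(listId, userId) {
  const list = lists.find(l => l._id === listId);
  if (!list) {
    throw new Error(`No list with id ${listId}`);
  }
  list.userId = userId;
  todos
    .filter(todo => todo.listId === listId)
    .forEach(todo => { todo.userId = userId; });
}

insertTodo({_id: 'todo1', listId: 'list1', text: 'Buy milk'});
console.log(todos[0].userId); // 'alice'

setListOwner('list1', 'bob');
console.log(todos[0].userId); // 'bob'
```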

Designing for the future

An application, especially a web application, is rarely finished, and it's useful to consider potential future changes when designing your data schema. As in most things, it's rarely a good idea to add fields before you actually need them (often what you anticipate doesn't actually end up happening, after all).

However, it's a good idea to think ahead to how the schema may change over time. For instance, you may have a list of strings on a document (perhaps a set of tags). Although it's tempting to leave them as a subfield on the document (assuming they don't change much), if there's a good chance that they'll end up becoming more complicated in the future (perhaps tags will have a creator, or subtags later on?), then it might be easier in the long run to make a separate collection from the beginning.

The amount of foresight you bake into your schema design will depend on your app's individual constraints, and will need to be a judgement call on your part.

Using schemas on write

Although there are a variety of ways that you can run data through a Simple Schema before sending it to your collection (for instance you could check a schema in every method call), the simplest and most reliable is to use the aldeed:collection2 package to run every mutator (insert/update/upsert call) through the schema.

To do so, we use attachSchema():

Lists.attachSchema(Lists.schema);

What this means is that now every time we call Lists.insert(), Lists.update(), Lists.upsert(), first our document or modifier will be automatically checked against the schema (in subtly different ways depending on the exact mutator).

`defaultValue` and data cleaning

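One thing that Collection2 does is "clean" the data before sending it to the database. This includes but is not limited to:

  1. Trying to coerce types, for example converting strings to numbers
  2. Removing attributes not in the schema
  3. Assigning default values based on the defaultValue in the schema definition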
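The three cleaning steps listed above can be illustrated with a small stand-alone function. This is only a sketch of the idea and not Collection2's real implementation; `cleanDocument` and the simplified schema shape (field name mapped to `{type, defaultValue}`) are assumptions made for this example.

```javascript
// Hypothetical, simplified schema description: field name -> {type, defaultValue}
const listSchema = {
  name: {type: String},
  incompleteCount: {type: Number, defaultValue: 0}
};

function cleanDocument(doc, schema) {
  const cleaned = {};
  for (const [key, definition] of Object.entries(schema)) {
    let value = doc[key];
    if (value === undefined) {
      // Step 3: fall back to the default value, if one is defined
      if ('defaultValue' in definition) {
        cleaned[key] = definition.defaultValue;
      }
      continue;
    }
    // Step 1: coerce numeric strings where the schema expects a number
    if (definition.type === Number && typeof value === 'string' && value.trim() !== '') {
      const asNumber = Number(value);
      if (!Number.isNaN(asNumber)) {
        value = asNumber;
      }
    }
    cleaned[key] = value;
  }
  // Step 2: keys that are not in the schema were never copied over
  return cleaned;
}

const cleaned = cleanDocument(
  {name: 'List A', incompleteCount: '3', color: 'red'},
  listSchema
);
console.log(cleaned); // { name: 'List A', incompleteCount: 3 }
```

Building the result from the schema's keys, rather than deleting keys from the input, means unknown attributes are dropped without mutating the caller's object.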

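However, sometimes it's useful to do more complex initialization of documents before inserting them into collections. For instance, in the Todos app, we want to name new lists List X, where X is the next available unique letter.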

To do so, we can subclass Mongo.Collection and write our own insert() method:

class ListsCollection extends Mongo.Collection {
  insert(list, callback) {
    if (!list.name) {
      let nextLetter = 'A';
      list.name = `List ${nextLetter}`;

      while (!!this.findOne({name: list.name})) {
        // not going to be too smart here, can go past Z
        nextLetter = String.fromCharCode(nextLetter.charCodeAt(0) + 1);
        list.name = `List ${nextLetter}`;
      }
    }

    // Call the original `insert` method, which will validate against the schema
    return super.insert(list, callback);
  }
}

Lists = new ListsCollection('Lists');

Hooks on insert/update/remove

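The technique we used above can also provide a single place to "hook" extra functionality into the collection. For instance, when removing a list, we always want to remove all of its todos at the same time.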

We can use a subclass for this case as well, overriding the remove() method:

class ListsCollection extends Mongo.Collection {
  // ...
  remove(selector, callback) {
    Package.todos.Todos.remove({listId: selector});
    return super.remove(selector, callback);
  }
}

This technique has a few disadvantages:

  1. Mutators can get very long when you want to hook in multiple times.
  2. Sometimes a single piece of functionality can be spread over multiple mutators.
  3. It can be a challenge to write a hook in a completely general way (that covers every possible selector and modifier), and it may not be necessary for your application (because perhaps you only ever call that mutator in one way).

A way to deal with points 1. and 2. is to separate out the set of hooks into their own module, and simply use the mutator as a point to call out to that module in a sensible way. We'll see an example of that below.

Point 3. can usually be resolved by placing the hook in the Method that calls the mutator, rather than in the mutator itself. Although this is an imperfect compromise (as we need to be careful if we ever add another Method that calls that mutator in the future), it is better than writing a bunch of code that is never actually called (which is guaranteed to not work!), or giving the impression that your hook is more general than it actually is.

Abstracting denormalizers

Denormalization may need to happen on various mutators of several collections. Therefore, it's sensible to define the denormalization logic in one place, and hook it into each mutator with one line of code. The advantage of this approach is that the denormalization logic is in one place rather than spread over many files, but you can still examine the code for each collection and fully understand what happens on each update.

In the Todos example app, we build an incompleteCountDenormalizer to abstract the counting of incomplete todos on the lists. This code needs to run whenever a todo item is inserted, updated (checked or unchecked), or removed. The code looks like:

const incompleteCountDenormalizer = {
  _updateList(listId) {
    // Recalculate the correct incomplete count direct from MongoDB
    const incompleteCount = Todos.find({
      listId,
      checked: false
    }).count();

    Lists.update(listId, {$set: {incompleteCount}});
  },
  afterInsertTodo(todo) {
    this._updateList(todo.listId);
  },
  afterUpdateTodo(selector, modifier) {
    // We only support very limited operations on todos
    check(modifier, {$set: Object});

    // We can only deal with $set modifiers, but that's all we do in this app
    if (_.has(modifier.$set, 'checked')) {
      Todos.find(selector, {fields: {listId: 1}}).forEach(todo => {
        this._updateList(todo.listId);
      });
    }
  },
  // Here we need to take the list of todos being removed, selected *before* the update
  // because otherwise we can't figure out the relevant list id(s) (if the todo has been deleted)
  afterRemoveTodos(todos) {
    todos.forEach(todo => this._updateList(todo.listId));
  }
};

We are then able to wire the denormalizer into the mutations of the Todos collection like so:

class TodosCollection extends Mongo.Collection {
  insert(doc, callback) {
    doc.createdAt = doc.createdAt || new Date();
    const result = super.insert(doc, callback);
    incompleteCountDenormalizer.afterInsertTodo(doc);
    return result;
  }
}
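The same wiring pattern applies to removal, where the affected todos have to be selected before they are deleted. The sketch below shows this with a minimal in-memory base class so that it runs without Meteor; `FakeCollection`, `countDenormalizer`, `listDocs`, and the predicate-based `find`/`remove` signatures are all stand-ins invented for this example and are not the Mongo.Collection API.

```javascript
// Minimal in-memory stand-in for a collection (NOT Mongo.Collection)
class FakeCollection {
  constructor() {
    this.docs = [];
    this.nextId = 1;
  }
  insert(doc) {
    const stored = {_id: String(this.nextId++), ...doc};
    this.docs.push(stored);
    return stored._id;
  }
  find(predicate) {
    return this.docs.filter(predicate);
  }
  remove(predicate) {
    const countBefore = this.docs.length;
    this.docs = this.docs.filter(doc => !predicate(doc));
    return countBefore - this.docs.length;
  }
}

// Stand-in for the Lists collection, keyed by list id
const listDocs = {list1: {incompleteCount: 0}};

const countDenormalizer = {
  // Recalculate the incomplete count for one list from the todos themselves
  updateList(todosCollection, listId) {
    listDocs[listId].incompleteCount = todosCollection
      .find(todo => todo.listId === listId && !todo.checked)
      .length;
  }
};

class TodosCollection extends FakeCollection {
  insert(doc) {
    const id = super.insert(doc);
    countDenormalizer.updateList(this, doc.listId);
    return id;
  }
  remove(predicate) {
    // Select the affected todos *before* removing them; afterwards their
    // list ids can no longer be looked up
    const affected = this.find(predicate);
    const removedCount = super.remove(predicate);
    affected.forEach(todo => countDenormalizer.updateList(this, todo.listId));
    return removedCount;
  }
}

const todosCollection = new TodosCollection();
todosCollection.insert({listId: 'list1', text: 'Buy milk', checked: false});
todosCollection.insert({listId: 'list1', text: 'Buy eggs', checked: false});
console.log(listDocs.list1.incompleteCount); // 2

todosCollection.remove(todo => todo.text === 'Buy milk');
console.log(listDocs.list1.incompleteCount); // 1
```

Each overridden mutator stays a thin wrapper: one call to the parent method and one call into the denormalizer.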

Note that we only handled the mutators we actually use in the application; we don't deal with all possible ways the todo count on a list could change. For example, if you changed the listId on a todo item, it would need to change the incompleteCount of two lists. However, since our application doesn't do this, we don't handle it in the denormalizer.

Dealing with every possible MongoDB operator is difficult to get right, as MongoDB has a rich modifier language. Instead we focus on just dealing with the modifiers we know we'll see in our app. If this gets too tricky, then moving the hooks for the logic into the Methods that actually make the relevant modifications could be sensible (although you need to be diligent to ensure you do it in all the relevant places, both now and as the app changes in the future).

It could make sense for packages to exist to completely abstract some common denormalization techniques and actually attempt to deal with all possible modifications. If you write such a package, please let us know!
