MongoDB Study Notes

Node.js MongoDB driver, Node.js MongoDB driver API

Miscellaneous notes

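MongoDB has three special databases: admin (related to authorization), local (stores data specific to the current server), and config (used by sharded clusters to store information about each shard);
Collection names can make collections look like sub-collections, for example blog.authors and blog.posts;
A namespace is the database name joined to the collection name, for example cms.blog.posts is the namespace of the blog.posts collection in the cms database;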
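After installing the MongoDB server, start it with mongod --dbpath ~/Documents/db, then operate on that instance through the MongoDB shell, which you enter by typing mongo (newer MongoDB releases ship mongosh instead);
Entering the shell creates a client that automatically connects to a database named test; typing db returns test;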
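A MongoDB document is similar to JSON but not identical. Supported value types include null, boolean, number, Date, regex, array, ObjectId, binary data, and Code; of these, Date, regex, ObjectId, binary data, and Code have no JSON equivalent;
To store a Date object in a document, use new Date(); calling Date() without new stores a date string instead;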
Inside the mongo shell, type help to list the available commands;
.mongorc.js is a file that mongo runs automatically at startup;
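$ is the positional operator. It has three forms: $, $[], and $[identifier]. The third is used together with arrayFilters and is called the filtered positional operator;
$[] stands for every item in the array and is used in update operations;
Besides being an update operator, $ can also be used as a projection operator, where it returns the first array element that matches the query condition;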
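upsert is a special kind of update, not an insert: it updates the matching document if one exists and inserts a new document otherwise;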
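A cursor is only a description of a query; creating it does not send a request to the server. The request is sent when you fetch data with next(), toArray(), forEach, or similar methods. The server then returns a first batch of docs (101 documents by default in current MongoDB versions, also bounded by a size limit). When that batch is used up, the cursor requests the next one, until all matched docs have been consumed. toArray() returns a promise when called without a callback; if you pass a callback, do not put await in front of it;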
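The cursor methods limit, skip, and sort are often used for pagination, but skip can hurt performance badly with large offsets; where possible, paginate some other way;
A cursor holds resources on the MongoDB server. They are released when the cursor has returned all docs, is closed, or has been idle for 10 minutes. If you do not want the cursor to be closed automatically after a while, you can make it an "immortal" cursor by disabling the timeout (the noCursorTimeout option);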
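A query that does not use an index is a collection scan; avoid collection scans where possible;
The cursor's explain method describes how a query is executed, including whether it is a collection scan;
Indexes greatly speed up queries but slow down writes, because each write must also update the indexes;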
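When a query uses an index and requests no other sort, MongoDB typically returns the docs in that index's order;
A key inside a sub-document can be indexed;
A query through an index performs two lookups: one in the index, then one following the index pointer to the document. If the query returns most or all of the documents, using the index can be slower than a collection scan;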
Files can be stored in a MongoDB database through GridFS;
From the command line, the mongofiles tool can store files in the database (it is a standalone tool, not a shell command);
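
The note above about skip suggests paginating some other way. One common alternative is range-based pagination: remember the last _id of the previous page and ask only for documents after it. The following is a minimal sketch under assumptions, not a definitive implementation: buildPageQuery, the posts collection, and the page size are hypothetical names chosen for illustration, and pages are assumed to be ordered by _id.

```javascript
// Hypothetical helper: builds the arguments for one page of a range-based query.
// lastSeenId is the _id of the last document on the previous page, or null for page one.
function buildPageQuery(lastSeenId, pageSize) {
  const filter = lastSeenId === null ? {} : { _id: { $gt: lastSeenId } };
  return { filter, sort: { _id: 1 }, limit: pageSize };
}

const firstPage = buildPageQuery(null, 20);
const nextPage = buildPageQuery(42, 20);
console.log(JSON.stringify(firstPage.filter)); // {}
console.log(JSON.stringify(nextPage.filter)); // {"_id":{"$gt":42}}

// Usage sketch (assumes a connected `db` from the Node.js driver):
// const { filter, sort, limit } = buildPageQuery(lastSeenId, 20);
// const page = await db.collection("posts").find(filter).sort(sort).limit(limit).toArray();
```

Because the filter can use the _id index, the server does not have to walk past all the skipped documents the way a large skip does.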

Common MongoDB shell commands

These are not exactly the same as the API of the MongoDB Node.js driver.

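db.collection.find({}, {}): the second argument is a projection that selects which fields of each doc to return, with 1 meaning include and 0 meaning exclude. find returns a cursor, and cursor methods usually return a cursor too, so they can be chained;
db.collection.insertOne({}) adds one document to the collection;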

// insertedId is the _id of the newly inserted document.
const { insertedId } = await db.collection("oppo").insertOne({ a: "b" });

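db.collection.insertMany([{}, {}]): to import data from files, mongoimport can be used instead. By default the documents are inserted in the order given; passing { ordered: false } makes the insert unordered, so an error on one document does not prevent the others from being inserted;
The mongo shell can also add a document with save: db.collection.save({}) (deprecated in newer shells in favor of insertOne and replaceOne);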
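db.collection.deleteOne({}): the empty filter matches every document, but because this is deleteOne, only the first matched document is deleted;
db.collection.deleteMany({}): the empty filter matches every document, so all documents are deleted. db.collection.drop() also removes all documents, along with the collection itself and its indexes;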
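db.collection.updateOne({}, {}): updates some fields of a doc. The second argument usually contains update operators. If the first argument is {}, it matches every document, but updateOne only updates the first match. With $set, a field that does not exist yet is added to the document;
To insert a document only if it does not already exist, you can do this: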

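// Seed one user so the collection has existing data.
await db.collection("user").insertOne({ id: 1, name: "tom" });
const doc = {
  id: 2,
  name: "jack",
  hobby: "swimming",
};
// With upsert: true, $setOnInsert only applies when no doc matches { id: 2 },
// so an existing doc is left unchanged. upsertedCount is 1 if a doc was inserted.
const { upsertedCount } = await db
  .collection("user")
  .updateOne({ id: 2 }, { $setOnInsert: doc }, { upsert: true });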

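db.collection.updateMany({}, {}): updates every document that matches the filter;
db.collection.replaceOne({}, {}): replaces the whole matched document with the second argument;
db.collection.findOneAndUpdate({}, {}, {}): updates one document and returns it;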
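
The difference between the second argument of updateOne and that of replaceOne is easy to mix up, so here is a small sketch of the two shapes. Leaving aside pipeline-style updates, an update document consists of update operators, while a replacement document contains plain fields only. isUpdateDocument is a hypothetical helper written for this illustration; it is not part of the driver.

```javascript
// Update document: every top-level key is an update operator.
const update = { $set: { hobby: "running" } };
// Replacement document: plain fields only, no operators.
const replacement = { id: 2, name: "jack", hobby: "running" };

// Hypothetical helper: true when every top-level key starts with "$".
function isUpdateDocument(doc) {
  const keys = Object.keys(doc);
  return keys.length > 0 && keys.every((key) => key.startsWith("$"));
}

console.log(isUpdateDocument(update)); // true
console.log(isUpdateDocument(replacement)); // false

// Usage sketch (assumes a connected `db` from the Node.js driver):
// await db.collection("user").updateOne({ id: 2 }, update);
// await db.collection("user").replaceOne({ id: 2 }, replacement);
```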

operator

update operator

The operators used in db.collection.updateOne() are called update operators. They fall into 3 groups: field operators, array update operators, and bitwise update operators.

$inc, $set, $unset, $setOnInsert, and so on are field operators. $push, $pull, $addToSet, and $pop are array update operators, and $each, $slice, and $sort are modifiers used with them. There is only one bitwise operator: $bit.
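
To make the three groups concrete, here is a sketch of one update document per group. The field names (views, title, draft, tags, readers, queue, flags) are made up for illustration.

```javascript
// Field operators: change individual fields of the matched document.
const fieldUpdate = {
  $inc: { views: 1 }, // add 1 to views
  $set: { title: "hello" }, // set title, creating it if missing
  $unset: { draft: "" }, // remove the draft field
};

// Array update operators; $each and $slice modify the $push.
const arrayUpdate = {
  $push: { tags: { $each: ["a", "b"], $slice: -5 } }, // append, keep the last 5 items
  $addToSet: { readers: "tom" }, // append only if not already present
  $pop: { queue: 1 }, // remove the last item (-1 removes the first)
};

// Bitwise operator: bitwise OR the integer field flags with 4.
const bitwiseUpdate = { $bit: { flags: { or: 4 } } };

console.log(Object.keys(fieldUpdate).join(", ")); // $inc, $set, $unset
console.log(Object.keys(arrayUpdate).join(", ")); // $push, $addToSet, $pop
console.log(Object.keys(bitwiseUpdate).join(", ")); // $bit
```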

$sort must be used together with $each. To only reorder the items of an array, pass an empty array to $each.
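
The empty $each trick can be sketched as follows. sortArrayUpdate is a hypothetical helper, and the students collection and scores field are assumed names.

```javascript
// Hypothetical helper: builds an update document that only re-sorts an array field.
// order is 1 (ascending) or -1 (descending) for arrays of plain values.
function sortArrayUpdate(field, order) {
  return { $push: { [field]: { $each: [], $sort: order } } };
}

console.log(JSON.stringify(sortArrayUpdate("scores", -1)));
// {"$push":{"scores":{"$each":[],"$sort":-1}}}

// Usage sketch (assumes a connected `db` from the Node.js driver):
// await db.collection("students").updateOne({ id: 1 }, sortArrayUpdate("scores", -1));
```

For an array of embedded docs, $sort takes a document such as { score: 1 } instead of a bare 1 or -1.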

query operator

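$lt, $lte, $gt, $gte, $ne: comparison operators, often used to query dates;
$in, $nin
$or
$mod takes an array [divisor, remainder] and matches docs where the key's value divided by the divisor leaves that remainder;
$not
null: matches every doc where the key does not exist, as well as docs where the key exists and its value is null;
$regex
$all, used for array queries
$size, used for array queries
$slice, a projection operator that limits how many array elements are returned
$elemMatch matches a doc when at least one element of an array field satisfies all the given conditions; it is mostly used when the array's elements are embedded docs;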
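
A few of the operators above, written out as filter documents. This is a sketch only: the field names (createdAt, status, views, pinned, n, deletedAt, comments) are invented for illustration, and matchesMod is a local helper that mimics the $mod rule for non-negative integers; it is not a MongoDB API.

```javascript
const filters = {
  // Dates within 2024.
  dateRange: { createdAt: { $gte: new Date("2024-01-01"), $lt: new Date("2025-01-01") } },
  // status is one of the listed values.
  inList: { status: { $in: ["draft", "published"] } },
  // Either condition may hold.
  either: { $or: [{ views: { $gt: 100 } }, { pinned: true }] },
  // n divided by 2 leaves remainder 1, i.e. n is odd.
  odd: { n: { $mod: [2, 1] } },
  // deletedAt is missing, or present with the value null.
  missingOrNull: { deletedAt: null },
  // At least one comment satisfies both conditions at once.
  elem: { comments: { $elemMatch: { author: "tom", score: { $gte: 5 } } } },
};

// Local illustration of the $mod rule: value % divisor === remainder.
function matchesMod(value, [divisor, remainder]) {
  return value % divisor === remainder;
}

console.log(matchesMod(7, filters.odd.n.$mod)); // true
console.log(matchesMod(8, filters.odd.n.$mod)); // false

// Usage sketch (assumes a connected `db` from the Node.js driver):
// const docs = await db.collection("posts").find(filters.elem).toArray();
```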

aggregation

The aggregation framework allows you to analyze your data in real time. Using the framework, you can create an aggregation pipeline that consists of one or more stages. Each stage transforms the documents and passes the output to the next stage.

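Aggregation comes in three forms (the aggregation pipeline, map-reduce, and single-purpose aggregation methods); the most commonly used is the aggregation pipeline framework.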

aggregation pipeline stages: $match, $project, $count, $sort, $limit, $skip, $set, and so on;
aggregation pipeline operators: $abs, $add, $concat, $ceil, $lt, $eq, and many more;
MongoDB Compass can build an aggregation pipeline visually; once built, the pipeline can be exported as code in several languages;
In Node.js, an aggregation is written as db.collection().aggregate([{}, {}, {}, {}...]); aggregate() takes an array whose elements are the individual stages.
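
As a sketch of what such a stage array can look like, the following builds a four-stage pipeline. topPostsPipeline, the posts collection, and the author, views, and title fields are assumptions made for illustration.

```javascript
// Hypothetical helper: the n most-viewed posts by one author.
function topPostsPipeline(author, n) {
  return [
    { $match: { author } }, // keep only this author's posts
    { $sort: { views: -1 } }, // most viewed first
    { $limit: n }, // keep the top n
    { $project: { _id: 0, title: 1, views: 1 } }, // return only title and views
  ];
}

const pipeline = topPostsPipeline("tom", 3);
console.log(pipeline.map((stage) => Object.keys(stage)[0]).join(" -> "));
// $match -> $sort -> $limit -> $project

// Usage sketch (assumes a connected `db` from the Node.js driver):
// const docs = await db.collection("posts").aggregate(pipeline).toArray();
```

Putting $match first lets the later stages work on fewer documents, since each stage only sees the output of the one before it.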