Building a GraphQL API (9)

Server-Side Caching

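In high-load applications, caching often plays a critical role and is found almost everywhere on the web. It is applied to static resources such as web pages, images, and style sheets, as well as to pure data such as database query results.

On the server side, caching relieves hardware I/O pressure; on the client side, it relieves network I/O pressure. The server-side caching mechanism covered in our Node.js training can be applied directly here.

Let's first look at where the performance problem arises, using the following GraphQL query: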

GraphQL query listEmployees

query listEmployees {
  totalEmployees
  allEmployees {
    empno
    ename
    hiredate
    income
    department {
      dname
      loc
    }
  }
}

In allEmployees of this listEmployees query, the problem shows up on the department edge. Look at its resolver: on line 7, where it returns the row actually fetched from the database, we add a console.log to observe its behavior:

resolvers/employee.js

...
Employee: {
  department: (parent, args, { orademo }) => {
    return orademo.department
      .findByDeptno(parent.deptno)
      .then(result => {
        console.log(result.rows[0]);
        return result.rows[0]
      });
  },
  ...

Now run the listEmployees query again from GraphQL Playground on the client. The server terminal shows:

GraphQL Server console.log

{ deptno: 10, dname: '會計部', loc: '紐約' }
{ deptno: 30, dname: 'SALES', loc: 'CHICAGO' }
{ deptno: 10, dname: '會計部', loc: '紐約' }
{ deptno: 20, dname: 'RESEARCH', loc: 'DALLAS' }
{ deptno: 20, dname: 'RESEARCH', loc: 'DALLAS' }
{ deptno: 20, dname: 'RESEARCH', loc: 'DALLAS' }
{ deptno: 30, dname: 'SALES', loc: 'CHICAGO' }
{ deptno: 20, dname: 'RESEARCH', loc: 'DALLAS' }
{ deptno: 30, dname: 'SALES', loc: 'CHICAGO' }
{ deptno: 40, dname: 'OPERATIONS', loc: 'BOSTON' }
{ deptno: 30, dname: 'SALES', loc: 'CHICAGO' }
{ deptno: 20, dname: 'RESEARCH', loc: 'DALLAS' }
{ deptno: 10, dname: '會計部', loc: '紐約' }
{ deptno: 30, dname: 'SALES', loc: 'CHICAGO' }
{ deptno: 70, dname: '資訊部', loc: '永康 B4' }
{ deptno: 70, dname: '資訊部', loc: '永康 B4' }
{ deptno: 40, dname: 'OPERATIONS', loc: 'BOSTON' }
{ deptno: 70, dname: '資訊部', loc: '永康 B4' }
{ deptno: 10, dname: '會計部', loc: '紐約' }

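There are 19 employees, so 19 requests were sent to the database and the database ran 19 queries. But there are only 5 departments, so 14 of those reads are redundant. What happens with tens of thousands of employees? This is exactly where the cache pattern pays off, so let's add a caching mechanism.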
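
To make the cost concrete, the following minimal sketch tallies the department numbers from the log above. The deptnos array is copied from that output; countLookups is a hypothetical helper name used only for illustration.

```javascript
// Department numbers in the order they appear in the log above.
const deptnos = [10, 30, 10, 20, 20, 20, 30, 20, 30, 40, 30, 20, 10, 30, 70, 70, 40, 70, 10];

// Hypothetical helper: compares the number of lookups issued with
// the number of distinct keys that actually had to be fetched.
function countLookups(keys) {
  const distinct = new Set(keys).size;
  return { total: keys.length, distinct, redundant: keys.length - distinct };
}

console.log(countLookups(deptnos)); // { total: 19, distinct: 5, redundant: 14 }
```

The redundant count grows with the number of employees, while the distinct count is bounded by the number of departments.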

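First, review the Node.js training document; we will use the Promise cache pattern from page 98. To preserve the original code, we create a new file, resolvers/employee-cache.js, and add the caching mechanism there.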

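In asynchronous control flow, promises greatly simplify asynchronous code, and they are also helpful for batching and caching. Recall two properties of promises that we can put to use:

Multiple then() listeners can be attached to the same promise.

A then() listener is guaranteed to be called at most once, and it works even when attached after the promise has already been resolved. Moreover, then() listeners are guaranteed to be invoked asynchronously.

The first property is exactly what batching requests needs. The second means that a promise is already a cache of its resolved value and naturally returns that cached value in a consistently asynchronous way. Given these properties, implementing batching and caching with promises is simple and clear.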
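
The two properties can be observed directly in a minimal sketch. The promise here resolves with a made-up department object; nothing in it comes from the project code.

```javascript
// A promise that resolves once with a fixed value.
const promise = Promise.resolve({ deptno: 10 });

const order = [];

// Property 1: several then() listeners can share one promise.
promise.then(value => order.push(`first listener: ${value.deptno}`));
promise.then(value => order.push(`second listener: ${value.deptno}`));

// Property 2: a listener attached later, after resolution, still
// receives the value, and is always invoked asynchronously.
const done = promise.then(() => {
  return promise.then(value => {
    order.push(`late listener: ${value.deptno}`);
    return order;
  });
});

// Runs before any listener, because then() never calls back synchronously.
order.push('synchronous code');

done.then(result => console.log(result));
```

The synchronous push is recorded first, followed by the two shared listeners and then the late listener.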

resolvers/employee-cache.js

const cache = {};

module.exports = {
  Query: {
    totalEmployees: (parent, args, { orademo }) =>
      orademo.employee.findAll().then(result => result.rows.length),
    allEmployees: (parent, args, { orademo }) => {
      return orademo.employee.findAll()
        .then(result => result.rows );
    }  
  }, 
  Employee: {
    department: (parent, args, { orademo }) => {
      if (cache[parent.deptno]) {
        console.log(`cache hit => ${parent.deptno}:${parent.empno}`);
        return cache[parent.deptno];
      }

      cache[parent.deptno] = orademo.department
        .findByDeptno(parent.deptno)
        .then(result => {
          console.log(`${JSON.stringify(result.rows[0])} => ${parent.deptno}:${parent.empno}`);
          setTimeout(() => { delete cache[parent.deptno] }, 30 * 1000);

          return result.rows[0]
        }).catch(err => {
          // Evict the failed lookup and rethrow so the error is not swallowed.
          delete cache[parent.deptno];
          throw err;
        });

      return cache[parent.deptno];
    },
    income: (parent) => {
      let sal = !isNaN(parseFloat(parent.sal)) ? parseFloat(parent.sal) : 0;
      let comm = !isNaN(parseFloat(parent.comm)) ? parseFloat(parent.comm) : 0;
      return sal + comm;
    }
  },
};

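In this example the cache lives in memory: line 1 uses a plain object, cache, as the cache store.
Line 13 is the department resolver, where we add the cache pattern. Line 19 stores the database query for a department (a Promise) in the cache, and line 14 checks whether a Promise for the same department is already cached. If it is, the resolver returns that same Promise instead of sending the same request to the database again.
Line 23 adds an expiry mechanism that deletes the cache entry automatically after 30 seconds.

When the promise resolves, the handler schedules the cache entry for removal after 30 seconds and returns the database result, which is then passed to the other then() listeners attached to the promise.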
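
The same pattern can be pulled out of the resolver into a reusable wrapper. The following is only a sketch under stated assumptions: cachePromise and fakeFindByDeptno are hypothetical names, and the fake lookup stands in for orademo.department.findByDeptno() so the example runs without a database.

```javascript
// Hypothetical helper: wraps an async loader so that calls with the
// same key share one promise until `ttlMs` after it resolves.
function cachePromise(loader, ttlMs) {
  const cache = new Map();

  return function load(key) {
    if (cache.has(key)) return cache.get(key);

    const promise = loader(key).then(
      value => {
        // Start the expiry clock only once the value has arrived.
        const timer = setTimeout(() => cache.delete(key), ttlMs);
        // Do not keep the process alive just for cache expiry.
        timer.unref();
        return value;
      },
      err => {
        // Drop failed lookups so the next call retries, and keep
        // the rejection visible to every caller.
        cache.delete(key);
        throw err;
      }
    );

    cache.set(key, promise);
    return promise;
  };
}

// Stand-in for the real database lookup; counts actual lookups.
let lookups = 0;
const fakeFindByDeptno = deptno =>
  new Promise(resolve => {
    lookups += 1;
    setTimeout(() => resolve({ deptno }), 10);
  });

const findDepartment = cachePromise(fakeFindByDeptno, 30 * 1000);

// Five requests, three distinct departments.
const results = Promise.all([10, 30, 10, 20, 10].map(k => findDepartment(k)));
results.then(rows => console.log(rows.length, 'results from', lookups, 'lookups'));
```

Using a Map keyed by the lookup argument keeps the wrapper independent of any particular resolver.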

Next, modify resolvers/index.js to use the employee-cache.js resolvers.

resolvers/index.js

...
const employeeResolvers = require('./employee-cache');
...

After restarting the GraphQL server, run the listEmployees query again. The server terminal now shows the cache at work.

query allEmployees GraphQL server console

cache hit => 10:7782
cache hit => 20:7788
cache hit => 20:7902
cache hit => 20:7369
cache hit => 30:7499
cache hit => 30:7654
cache hit => 30:7844
cache hit => 20:7876
cache hit => 30:7900
cache hit => 10:7934
cache hit => 70:7607
cache hit => 70:7609
cache hit => 40:9011
cache hit => 10:8907
{ deptno: 10, dname: '會計部', loc: '紐約' }
{ deptno: 30, dname: 'SALES', loc: 'CHICAGO' }
{ deptno: 20, dname: 'RESEARCH', loc: 'DALLAS' }
{ deptno: 40, dname: 'OPERATIONS', loc: 'BOSTON' }
{ deptno: 70, dname: '資訊部', loc: '永康 B4' }

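Here the database was read only 5 times; the rest came from the cache.

This kind of cache is convenient and elegant, but it is stored in memory and shares roughly 1.5 GB of memory with the GraphQL server, so it is not suitable for a very large cache. The cache then needs to move out of the server process, and the best choice for that is Redis.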

Implementing the Cache Pattern with Redis

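Implementing the cache with Redis solves the memory problem and also lets several GraphQL servers share one cache. It is best to consider performance from the start of a project and build the cache pattern in, and Redis is the natural choice for it.

We will add a new resolver file, resolvers/employee-redis.js, to implement the Redis cache pattern. Before that, we need an abstraction layer for Redis operations.

First install the node_redis package, a Redis client. The listings below use its callback-style API, as found in node_redis 3.x and earlier; version 4 moved to a promise-based API.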

redis install

npm install redis --save

Create a new file, models/redisCache.js, in the models subdirectory:

models/redisCache.js

const redis = require("redis");

const client = redis.createClient({
  host: "10.11.xx.xxx",
  port: 6379,
  password: "iscat",
  db: 0
});

const cachePrefix = "graphql:orademo:cache:";

function RedisCache(cacheName, expire = 30) {
  this._cache = cachePrefix + cacheName + ':';
  this._expire = expire;
}

RedisCache.prototype.set = function(item, value) {
  return new Promise((resolve, reject) => {
    const _this = this;
    client.set(_this._cache + item, value, "EX", _this._expire, (err, result ) => {
      if (err) return reject(err);
       
      return resolve(result);
    });
  });
}

RedisCache.prototype.get = function(item) {
  return new Promise((resolve, reject) => {
    const _this = this;
    client.get(_this._cache + item, (err, result) => {
      if (err) return reject(err);
  
      return resolve(result);
    });
  });
};

RedisCache.prototype.quit = function() {
  client.quit();
};

module.exports = RedisCache;

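Line 10 defines a prefix that becomes part of every Redis key; we will see what the stored keys look like later. By convention in the Redis ecosystem, a colon (:) separates the parts of a name to build a namespace.
Line 12: RedisCache is a constructor. Its expire argument sets the expiry time in seconds, with a default of 30; once that time elapses, Redis removes the data automatically.
Line 17: the set method stores data in Redis. If the key already exists, the old value is overwritten. The expiry time is set in the same call.
Line 28: the get method fetches data from Redis by key and returns null if the key does not exist.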
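
The contract described above (string values, an expiry in seconds, null on a miss) can be tried without a Redis server. The following is a sketch only: MemoryCache is a hypothetical in-memory stand-in, not part of the project, and the injectable now clock is an assumption added so that expiry can be shown without waiting.

```javascript
// Hypothetical in-memory stand-in with the same get/set contract as
// RedisCache: values are strings, a missing or expired key yields null.
const cachePrefix = "graphql:orademo:cache:";

class MemoryCache {
  constructor(cacheName, expire = 30, now = Date.now) {
    this._cache = cachePrefix + cacheName + ":";
    this._expire = expire; // seconds, as in RedisCache
    this._now = now;       // injectable clock, for testing
    this._store = new Map();
  }

  set(item, value) {
    const expiresAt = this._now() + this._expire * 1000;
    this._store.set(this._cache + item, { value, expiresAt });
    return Promise.resolve("OK"); // Redis replies "OK" to SET
  }

  get(item) {
    const key = this._cache + item;
    const entry = this._store.get(key);
    if (!entry) return Promise.resolve(null);
    if (this._now() >= entry.expiresAt) {
      this._store.delete(key);
      return Promise.resolve(null);
    }
    return Promise.resolve(entry.value);
  }
}

// Usage with a fake clock so that expiry can be shown without waiting.
let clock = 0;
const cache = new MemoryCache("department", 30, () => clock);

const demo = (async () => {
  await cache.set(20, JSON.stringify({ deptno: 20, dname: "RESEARCH" }));
  const fresh = await cache.get(20);   // still cached
  clock += 31 * 1000;                  // 31 seconds later
  const expired = await cache.get(20); // null, like a Redis miss
  return { fresh, expired };
})();

demo.then(result => console.log(result));
```

A stand-in like this can be handy for unit-testing resolvers, since it exposes the same two methods the resolver relies on.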

Next we apply this abstraction layer in resolvers/employee-redis.js.

resolvers/employee-redis.js

const RedisCache = require("../models/redisCache");
const cache = new RedisCache('department', 30);
const queues = {};

module.exports = {
  Query: {
    totalEmployees: (parent, args, { orademo }) =>
      orademo.employee.findAll().then(result => result.rows.length),
    allEmployees: (parent, args, { orademo }) => {
      return orademo.employee.findAll()
        .then(result => result.rows );
    }  
  }, 
  Employee: {
    department: async ({ deptno, empno }, args, { orademo }) => {
      let value = await cache.get(deptno);

      if (value) {
        console.log(`cache hit:  ${value} => ${deptno}:${empno}`);
        return JSON.parse(value);
      }

      if (queues[deptno]) {
        console.log(`Batching operation: ${deptno}:${empno}:queues:${Object.keys(queues).length}`);
        return queues[deptno];
      }

      queues[deptno] = orademo.department
        .findByDeptno(deptno)
        .then(result => {
          cache.set(deptno, JSON.stringify(result.rows[0]))
            .then((ack) => {
              console.log(`${ack}: ${JSON.stringify(result.rows[0])}`);
              process.nextTick(() => {
                delete queues[deptno];
              });
            });
          return result.rows[0];
        }).catch(err => {
          // Drop the failed in-flight entry and rethrow so the error is not swallowed.
          delete queues[deptno];
          throw err;
        });
      
      return queues[deptno];
    },
    income: (parent) => {
      let sal = !isNaN(parseFloat(parent.sal)) ? parseFloat(parent.sal) : 0;
      let comm = !isNaN(parseFloat(parent.comm)) ? parseFloat(parent.comm) : 0;
      return sal + comm;
    }
  },
};

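Line 2 uses the RedisCache constructor to create a department cache backed by Redis.
Line 3 declares queues, an in-memory cache used alongside Redis; it holds Promises for requests that are still in flight.
In asynchronous control flow, a request to the database does not wait for the reply. By the time the GraphQL server has issued all of its requests, the database may not have answered any of them, and Redis will not yet hold any cached data. Unless requests that have already been issued for the same department are kept in an in-memory cache, the database receives duplicate requests. So the Promise cache from the previous example is combined with Redis to intercept those duplicates. Once the data has come back from the database and been written to Redis, the in-memory Promise is removed, and later requests are served from the Redis cache.
Line 16 tries the Redis cache first; if the entry does not exist, Redis returns null.
Line 23 intercepts requests for the same data that have been sent to the database but not yet answered: if one exists, the resolver returns the same Promise from queues.
Line 28 is where the request is actually sent to the database. Line 31 stores the returned data in the Redis cache, and line 35 then deletes the Promise from memory.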
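
The lookup order described above (shared cache first, then in-flight Promise, then database) can be sketched as a standalone function. Everything here is an assumption for illustration: createLoader, fakeCache, and fakeDb are hypothetical names, and the fakes replace Redis and Oracle so the example runs on its own. Unlike the listing, this sketch waits for the cache write before resolving.

```javascript
// Hypothetical helper combining a shared cache (Redis-like: string
// values, null on a miss) with an in-process map of pending promises.
function createLoader(cache, load) {
  const pending = {};

  return async function get(key) {
    const cached = await cache.get(key);
    if (cached) return JSON.parse(cached);

    // A request for this key is already in flight: share its promise.
    if (pending[key]) return pending[key];

    pending[key] = load(key)
      .then(async value => {
        await cache.set(key, JSON.stringify(value));
        return value;
      })
      .finally(() => {
        // Success or failure, later calls go to the shared cache again.
        delete pending[key];
      });

    return pending[key];
  };
}

// Stand-ins for Redis and the database, both asynchronous.
const store = new Map();
const fakeCache = {
  get: async key => (store.has(key) ? store.get(key) : null),
  set: async (key, value) => { store.set(key, value); return "OK"; },
};
let dbCalls = 0;
const fakeDb = deptno =>
  new Promise(resolve => {
    dbCalls += 1;
    setTimeout(() => resolve({ deptno }), 10);
  });

const getDepartment = createLoader(fakeCache, fakeDb);

const run = (async () => {
  // First wave: nothing cached yet, duplicates share pending promises.
  const first = await Promise.all([10, 30, 10, 20, 10].map(k => getDepartment(k)));
  const afterFirst = dbCalls;
  // Second wave: everything is served from the shared cache.
  const second = await Promise.all([10, 20, 30].map(k => getDepartment(k)));
  return { first, second, afterFirst, afterSecond: dbCalls };
})();

run.then(r => console.log('db calls:', r.afterFirst, 'then', r.afterSecond));
```

In the first wave the database is hit once per distinct department; the second wave adds no further database calls.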

Now modify resolvers/index.js:

resolvers/index.js

...
const employeeResolvers = require('./employee-redis');
...

After restarting the GraphQL server, send the same listEmployees query again:

GraphQL query listEmployees

query listEmployees {
  totalEmployees
  allEmployees {
    empno
    ename
    hiredate
    income
    department {
      dname
      loc
    }
  }
}

The GraphQL server terminal now shows:

GraphQL Server console

Batching operation: 10:7782:queues:2
Batching operation: 20:7788:queues:3
Batching operation: 20:7902:queues:3
Batching operation: 20:7369:queues:3
Batching operation: 30:7499:queues:3
Batching operation: 30:7654:queues:4
Batching operation: 30:7844:queues:4
Batching operation: 20:7876:queues:4
Batching operation: 30:7900:queues:4
Batching operation: 10:7934:queues:4
Batching operation: 70:7607:queues:5
Batching operation: 70:7609:queues:5
Batching operation: 40:9011:queues:5
Batching operation: 10:8907:queues:5
OK: {"deptno":10,"dname":"會計部","loc":"紐約"}
OK: {"deptno":30,"dname":"SALES","loc":"CHICAGO"}
OK: {"deptno":40,"dname":"OPERATIONS","loc":"BOSTON"}
OK: {"deptno":20,"dname":"RESEARCH","loc":"DALLAS"}
OK: {"deptno":70,"dname":"資訊部","loc":"永康 B4"}

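Because there are only 5 distinct departments, only the last 5 OK lines are messages printed when data actually came back from the database and was written to the Redis cache. The 14 Batching operation lines before them were served from the in-memory cache. The Redis cache was not used at all here, because the GraphQL server had already issued every request before any data could be written to Redis.

If you send a second listEmployees query from GraphQL Playground within 30 seconds, however, the server terminal shows: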

GraphQL Server console

cache hit:  {"deptno":10,"dname":"會計部","loc":"紐約"} => 10:7839
cache hit:  {"deptno":30,"dname":"SALES","loc":"CHICAGO"} => 30:7698
cache hit:  {"deptno":10,"dname":"會計部","loc":"紐約"} => 10:7782
cache hit:  {"deptno":20,"dname":"RESEARCH","loc":"DALLAS"} => 20:7566
cache hit:  {"deptno":20,"dname":"RESEARCH","loc":"DALLAS"} => 20:7788
cache hit:  {"deptno":20,"dname":"RESEARCH","loc":"DALLAS"} => 20:7902
cache hit:  {"deptno":20,"dname":"RESEARCH","loc":"DALLAS"} => 20:7369
cache hit:  {"deptno":30,"dname":"SALES","loc":"CHICAGO"} => 30:7499
cache hit:  {"deptno":40,"dname":"OPERATIONS","loc":"BOSTON"} => 40:7608
cache hit:  {"deptno":30,"dname":"SALES","loc":"CHICAGO"} => 30:7654
cache hit:  {"deptno":30,"dname":"SALES","loc":"CHICAGO"} => 30:7844
cache hit:  {"deptno":20,"dname":"RESEARCH","loc":"DALLAS"} => 20:7876
cache hit:  {"deptno":30,"dname":"SALES","loc":"CHICAGO"} => 30:7900
cache hit:  {"deptno":10,"dname":"會計部","loc":"紐約"} => 10:7934
cache hit:  {"deptno":70,"dname":"資訊部","loc":"永康 B4"} => 70:9006
cache hit:  {"deptno":70,"dname":"資訊部","loc":"永康 B4"} => 70:7607
cache hit:  {"deptno":70,"dname":"資訊部","loc":"永康 B4"} => 70:7609
cache hit:  {"deptno":40,"dname":"OPERATIONS","loc":"BOSTON"} => 40:9011
cache hit:  {"deptno":10,"dname":"會計部","loc":"紐約"} => 10:8907

This time all data came from the Redis cache, and no request was sent to Oracle Database.

The cached data can be inspected in Redis:

Redis using redis-cli

127.0.0.1:6379> scan 0 match graphql*
1) "0"
2) 1) "graphql:orademo:cache:department:40"
   2) "graphql:orademo:cache:department:10"
   3) "graphql:orademo:cache:department:70"
   4) "graphql:orademo:cache:department:30"
   5) "graphql:orademo:cache:department:20"
127.0.0.1:6379> get graphql:orademo:cache:department:20
"{\"deptno\":20,\"dname\":\"RESEARCH\",\"loc\":\"DALLAS\"}"
127.0.0.1:6379> ttl graphql:orademo:cache:department:20
(integer) 16
127.0.0.1:6379> exists graphql:orademo:cache:department:20
(integer) 0

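"graphql:orademo:cache:department:20" is the key we set. Line 8 fetches the stored value by key; what we stored is a JSON string. Line 10 uses ttl to check the remaining time before the key expires. After waiting 16 seconds, exists can be used to check whether the key is still there, and it returns 0.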

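By convention in the Redis ecosystem, a colon (:) separates the parts of a name to build a namespace. Other common separators include the period (.) and the slash (/), and some people even use the pipe (|). Whichever separator you choose, use it consistently.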
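
One way to keep the separator consistent is to build every key through a single function. This is a minimal sketch; makeKey and SEPARATOR are hypothetical names, not part of models/redisCache.js.

```javascript
// Hypothetical helper: builds every key through one function so the
// separator stays consistent across the code base.
const SEPARATOR = ":";

function makeKey(...parts) {
  return parts.map(part => String(part)).join(SEPARATOR);
}

// Reproduces the key layout used by models/redisCache.js.
const key = makeKey("graphql", "orademo", "cache", "department", 20);
console.log(key); // graphql:orademo:cache:department:20
```

Changing the separator later then means editing one constant instead of every string concatenation.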

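A cache stored in Redis can be shared among different servers, which makes it the best choice for server-side caching.

Query again after 30 seconds and the data is gone; Redis manages the expiry policy for us.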

Redis using redis-cli

127.0.0.1:6379> scan 0 match graphql*
1) "0"
2) (empty list or set)
127.0.0.1:6379>

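Redis is a very fast non-relational database. It stores a mapping between keys and five different types of values, can persist the key-value data held in memory to disk, can use replication to scale read performance, and can use client-side sharding to scale write performance. Here we use only one value type, String, to store JSON-formatted text.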
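
Because the value is a String, each row is serialized on the way in and parsed on the way out. A minimal sketch of that round trip follows; encode and decode are hypothetical helper names, and the row is taken from the redis-cli output above.

```javascript
// Hypothetical helpers around the String value stored in Redis.
// encode() produces the JSON text that set() would store.
function encode(row) {
  return JSON.stringify(row);
}

// decode() turns the reply of get() back into an object; a miss
// (null reply) is reported as undefined rather than parsed.
function decode(reply) {
  return reply === null ? undefined : JSON.parse(reply);
}

const stored = encode({ deptno: 20, dname: "RESEARCH", loc: "DALLAS" });
console.log(stored);               // {"deptno":20,"dname":"RESEARCH","loc":"DALLAS"}
console.log(decode(stored).dname); // RESEARCH
console.log(decode(null));         // undefined
```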

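Redis's data structures aim to help users solve problems rather than, as with other databases, forcing users to bend the problem to fit the database. Redis is a problem-solving tool: it offers data structures that other databases lack, together with in-memory storage (which makes Redis very fast), remote access (which lets Redis connect to multiple clients and servers), persistence (which lets Redis keep its data across restarts), and scalability (through master-slave replication and sharding). These features let us build solutions to many different problems in a familiar way.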