JavaScript merkle-patricia-tree/secure - OutOfMemory when trying to traverse a large (~70 GB) state

March 7, 2018

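I have the JavaScript code below, which traverses the state trie of Ethereum mainnet block #5200035 from geth's leveldb.
After a period of high CPU usage, with about 6 GB of RAM held, it fails with FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory.
It looks like the library has trouble with the size of the mainnet database (about 70 GB after a fast sync). A simple key/value lookup with db.get works fine. If I traverse something smaller, such as a contract's storage trie (I tried 0xd0a6E6C54DbC68Db5db3A091B171A77407Ff7ccf), it also works fine.
Any idea how to solve this?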
var Trie = require('merkle-patricia-tree/secure');
var levelup = require('levelup');
var leveldown = require('leveldown');

//Connecting to the leveldb database
var db = levelup(leveldown('../datadir/geth/chaindata'));

//Adding the "stateRoot" value from the block so that we can inspect the state root at that block height.
var root = "0x4d675087a16f13fa5f61af74d79e08c82de7cf200e63d5b225b4a7937705a3e2"; // Block #5200035

//Creating a trie object of the merkle-patricia-tree library
var trie = new Trie(db, root);

//Creating a nodejs stream object so that we can access the data
var stream = trie.createReadStream()

//Turning on the stream (because the node js stream is set to pause by default)
stream.on('data', function (data){
 console.log(data)
});
This question is related to "How to collect all contract instances and calculate bytecode redundancy?"

Pull request with a fix: https://github.com/ethereumjs/merkle-patricia-tree/pull/38
Previous answer
I'm not 100% sure, but I think the problem is the use of async.forEachOf() to iterate over a node's children in the processNode function in baseTrie.js (https://github.com/ethereumjs/merkle-patricia-tree/blob/dc436426d717fed408f4d46fed23f6d26d03d39d/baseTrie.js#L447):
function processNode (nodeRef, node, key, cb) {
   if (!node) return cb()
   if (aborted) return cb()
   var stopped = false
   key = key || []

   var walkController = {
     stop: function () {
       stopped = true
       cb()
     },
     // end all traversal and return values to the onDone cb
     return: function () {
       aborted = true
       returnValues = arguments
       cb()
     },
     next: function () {
       if (aborted) {
         return cb()
       }
       if (stopped) {
         return cb()
       }
       var children = node.getChildren()
       async.forEachOf(children, function (childData, index, cb) {
         var keyExtension = childData[0]
         var childRef = childData[1]
         var childKey = key.concat(keyExtension)
         self._lookupNode(childRef, function (childNode) {
           processNode(childRef, childNode, childKey, cb)
         })
       }, cb)
     },
     only: function (childIndex) {
       var childRef = node.getValue(childIndex)
       self._lookupNode(childRef, function (childNode) {
         var childKey = key.slice()
         childKey.push(childIndex)
         processNode(childRef, childNode, childKey, cb)
       })
     }
   }
   onNode(nodeRef, node, key, walkController)
 }
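Because async is used here, the tree is traversed breadth first rather than depth first, as it would be with a synchronous traversal. (Breadth first basically means that the whole tree is loaded into memory before you get the first key from the stream, and with roughly 30 million accounts currently in the state that can easily exceed 6 GB.) I would try changing it to a simple for loop and see whether that solves the problem; unfortunately I can't test it right now because I don't have the mainnet chain data.
Alternatively, try another library for reading the Patricia Merkle Trie.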
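
To illustrate the difference between starting all child lookups at once and walking one child at a time, here is a minimal sketch on a toy in-memory tree. It is not the library's code and not the change made in the pull request: makeTree, lookupNode, drain, and the stats counters are all hypothetical names, and the "async" lookup is simulated with a plain FIFO queue so the example runs deterministically without a database.

```javascript
'use strict';

// Build a toy tree: `depth` levels below the root, `fanout` children per node.
function makeTree(depth, fanout) {
  const children = [];
  for (let i = 0; depth > 0 && i < fanout; i++) {
    children.push(makeTree(depth - 1, fanout));
  }
  return { children };
}

// Simulated event loop: a FIFO queue of pending "database reads".
const queue = [];

// Hypothetical stand-in for an async node lookup; the callback fires later, in FIFO order.
function lookupNode(ref, cb) {
  queue.push(() => cb(ref));
}

// Run queued callbacks until none are left (callbacks may enqueue more).
function drain() {
  for (let i = 0; i < queue.length; i++) queue[i]();
  queue.length = 0;
}

// Count a child as "held" from the moment it is requested until its subtree is done.
function acquire(stats) {
  stats.live++;
  if (stats.live > stats.peak) stats.peak = stats.live;
}

// Parallel: request every child immediately (the shape of an async.forEachOf call).
function walkParallel(node, stats, done) {
  stats.visited++;
  let pending = node.children.length;
  if (pending === 0) return done();
  node.children.forEach((childRef) => {
    acquire(stats);
    lookupNode(childRef, (child) => {
      walkParallel(child, stats, () => {
        stats.live--;
        if (--pending === 0) done();
      });
    });
  });
}

// Sequential: request the next child only after the previous subtree has finished.
function walkSeries(node, stats, done) {
  stats.visited++;
  let i = 0;
  function next() {
    if (i === node.children.length) return done();
    const childRef = node.children[i++];
    acquire(stats);
    lookupNode(childRef, (child) => {
      walkSeries(child, stats, () => {
        stats.live--;
        next();
      });
    });
  }
  next();
}

// Walk the tree with the given strategy and report how many nodes were held at peak.
function measure(walk, tree) {
  const stats = { visited: 0, live: 0, peak: 0, finished: false };
  walk(tree, stats, () => { stats.finished = true; });
  drain();
  return stats;
}

const tree = makeTree(6, 4);
console.log('parallel:', measure(walkParallel, tree));
console.log('series:  ', measure(walkSeries, tree));
```

In this toy model the sequential walk never holds more nodes than the depth of the tree, while the parallel walk holds almost every node before the first leaf completes. In the real processNode, one option to try might be replacing async.forEachOf with async.forEachOfSeries, which takes the same arguments; that is an untested suggestion, not a confirmed fix.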

Source: https://ethereum.stackexchange.com/questions/41749