Go-Ethereum

Dangling hash node ref: too many open files

  • April 7, 2016

I am running two supernodes with the geth client. At peak times they have 500-800 connections and handle about 16 GB of traffic per hour.

This is how I run the nodes:

geth --networkid "1" --identity "5chdn-supernode-deneb" --maxpeers "1024" --rpc console 2>>/tmp/geth.log

Now it crashes every few days with "too many open files", and errors appear on the chain data:

E0405 10:04:40.631505   17793 trie.go:309] Dangling hash node ref 3c356161363764616266326264373164613363306632313861373037393930376533633962616537313338333732383161623537656534333063396630363636313e20: open /home/user/.ethereum/chaindata/100673.ldb: too many open files
E0405 10:04:40.632653   17793 trie.go:309] Dangling hash node ref 3c386639363738343930626135663333623135656434616339663135373938623530313564393533323664626639333865383237623633383164616232666166333e20: open /home/user/.ethereum/chaindata/100596.ldb: too many open files
E0405 10:04:40.634643   17793 trie.go:309] Dangling hash node ref 3c353531373463643738393431336166666338313331633038313736656531663039383735326130353034306533326162323563663762363137376664633662643e20: open /home/user/.ethereum/chaindata/100662.ldb: too many open files
E0405 10:04:40.634690   17793 trie.go:309] Dangling hash node ref 3c353531373463643738393431336166666338313331633038313736656531663039383735326130353034306533326162323563663762363137376664633662643e20: open /home/user/.ethereum/chaindata/100662.ldb: too many open files
E0405 10:04:40.634817   17793 trie.go:309] Dangling hash node ref 3c353531373463643738393431336166666338313331633038313736656531663039383735326130353034306533326162323563663762363137376664633662643e20: open /home/user/.ethereum/chaindata/100662.ldb: too many open files
E0405 10:04:40.635057   17793 trie.go:309] Dangling hash node ref 3c643533653331616464343066653864633666363766373530646462626435636565306233383337643539643639626562636463356138613662366465346432633e20: open /home/user/.ethereum/chaindata/100627.ldb: too many open files
E0405 10:04:40.639013   17793 trie.go:309] Dangling hash node ref 3c633838383230656236353763303762663366366335343133386635366664656337323030623166333361363464353862306464393639636137643964383738373e20: open /home/user/.ethereum/chaindata/100622.ldb: too many open files
E0405 10:04:40.639076   17793 trie.go:309] Dangling hash node ref 3c633838383230656236353763303762663366366335343133386635366664656337323030623166333361363464353862306464393639636137643964383738373e20: open /home/user/.ethereum/chaindata/100622.ldb: too many open files
E0405 10:04:40.639281   17793 trie.go:309] Dangling hash node ref 3c633838383230656236353763303762663366366335343133386635366664656337323030623166333361363464353862306464393639636137643964383738373e20: open /home/user/.ethereum/chaindata/100622.ldb: too many open files
E0405 10:04:40.639367   17793 trie.go:309] Dangling hash node ref 3c323762626362616166306236663131386230333534313862303831373539653935616637333532383564323337656263623262643734363638643836333733363e20: open /home/user/.ethereum/chaindata/100219.ldb: too many open files
E0405 10:04:40.640494   17793 trie.go:309] Dangling hash node ref 3c326166613631323661333034656265356538643265633165623130316135366561303932373566373139353736306234373965313266663065323930643864643e20: open /home/user/.ethereum/chaindata/100544.ldb: too many open files
E0405 10:04:40.640977   17793 trie.go:309] Dangling hash node ref 3c613362623765633164313033316464633163366536316438313534396265613639633536336632653431353233653739633833303232303238343263373163633e20: open /home/user/.ethereum/chaindata/100605.ldb: too many open files
E0405 10:04:40.644731   17793 trie.go:309] Dangling hash node ref 3c316139653230373531643062346136383163323863616631353461376633396334643165623037396464646163343834393465343532393961396437663531363e20: open /home/user/.ethereum/chaindata/100065.ldb: too many open files
E0405 10:04:40.644986   17793 trie.go:309] Dangling hash node ref 3c663633343166326138393366313636356461336235353732666137386535373839626161386237373633643534653063623865303336396233343035666665333e20: open /home/user/.ethereum/chaindata/100638.ldb: too many open files
E0405 10:04:40.645818   17793 trie.go:309] Dangling hash node ref 3c323361343762373937356438626363636638643431616336666130336332653939353539646434396436336463333330656138353638323466663763656337383e20: open /home/user/.ethereum/chaindata/100211.ldb: too many open files
E0405 10:04:40.645878   17793 trie.go:309] Dangling hash node ref 3c323361343762373937356438626363636638643431616336666130336332653939353539646434396436336463333330656138353638323466663763656337383e20: open /home/user/.ethereum/chaindata/100211.ldb: too many open files
E0405 10:04:40.646031   17793 trie.go:309] Dangling hash node ref 3c323361343762373937356438626363636638643431616336666130336332653939353539646434396436336463333330656138353638323466663763656337383e20: open /home/user/.ethereum/chaindata/100211.ldb: too many open files
E0405 10:04:40.649087   17793 trie.go:309] Dangling hash node ref 3c356231383235393062666662663436643061653261303530653430313738333435616437646630303334643062666538333231343831396238343730663464613e20: open /home/user/.ethereum/chaindata/100675.ldb: too many open files
E0405 10:04:40.649151   17793 trie.go:309] Dangling hash node ref 3c356231383235393062666662663436643061653261303530653430313738333435616437646630303334643062666538333231343831396238343730663464613e20: open /home/user/.ethereum/chaindata/100675.ldb: too many open files
E0405 10:04:40.649349   17793 trie.go:309] Dangling hash node ref 3c356231383235393062666662663436643061653261303530653430313738333435616437646630303334643062666538333231343831396238343730663464613e20: open /home/user/.ethereum/chaindata/100675.ldb: too many open files
E0405 10:04:40.650758   17793 trie.go:309] Dangling hash node ref 3c393434663365613033363436313739663734326635636365366139323430343466383139663334363838386239643034323936336339656666626361396231643e20: open /home/user/.ethereum/chaindata/100598.ldb: too many open files
E0405 10:04:40.651084   17793 trie.go:309] Dangling hash node ref 3c353864343237306166363438373062643739626639363465653139363331633132376336376538356161326531316333636264383037343730333464306663363e20: open /home/user/.ethereum/chaindata/100669.ldb: too many open files
E0405 10:04:40.651348   17793 trie.go:309] Dangling hash node ref 3c633231643033396639393134353864316430336366666134663063653366636161363466383361616434646436346139343635333731356431663764666266373e20: open /home/user/.ethereum/chaindata/100620.ldb: too many open files
E0405 10:04:40.663574   17793 trie.go:309] Dangling hash node ref 3c663633343166326138393366313636356461336235353732666137386535373839626161386237373633643534653063623865303336396233343035666665333e20: open /home/user/.ethereum/chaindata/100638.ldb: too many open files
E0405 10:04:40.665738   17793 trie.go:309] Dangling hash node ref 3c346461643532333134616566613162313039613339626235373532376531326635653333366232303438333635333233623565663536636334383464613862323e20: open /home/user/.ethereum/chaindata/100643.ldb: too many open files
E0405 10:04:40.667364   17793 trie.go:309] Dangling hash node ref 3c343336383939613631616334336633396333366561636334353339376562643436396538303763356464356264623561653736373431663061306364346565333e20: open /home/user/.ethereum/chaindata/100511.ldb: too many open files
E0405 10:04:40.669370   17793 trie.go:309] Dangling hash node ref 3c356539623665376230376532643838643535626336343765646562653931303638613531646466363332333761323863303162623865323332373637666261313e20: open /home/user/.ethereum/chaindata/100683.ldb: too many open files
E0405 10:04:40.669724   17793 trie.go:309] Dangling hash node ref 3c383761313431313131303938646665613533393165353036653239383163373135663835633165336434366633353137396633363134333437623733383761323e20: open /home/user/.ethereum/chaindata/100593.ldb: too many open files
E0405 10:04:40.671738   17793 trie.go:309] Dangling hash node ref 3c366538653865633965616466356437313832663431333065666362623566346330663734326232653336383038306364643231646239363734356339663433383e20: open /home/user/.ethereum/chaindata/100574.ldb: too many open files

Why is this happening, and how can I fix it? I want to run a stable node.

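Check the open file descriptor limits

You may need to raise your Linux file limits to handle the number of connections your node is serving. Limits come in hard and soft variants, and they apply either system-wide or per user.

To check the maximum number of file descriptors configured on the Linux system: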

user@Kumquat:~$ cat /proc/sys/fs/file-max
793317

To check the maximum number of open file descriptors for the user (hard limit):

user@Kumquat:~$ ulimit -Hn
4096

To check the maximum number of open file descriptors for the user (soft limit):

user@Kumquat:~$ ulimit -Sn
1024

To check the limits of a running geth instance:

user@Kumquat:~$ ps -ef | grep geth
user      6492  6479 20 09:53 pts/6    00:00:01 geth console
user      6511 30948  0 09:53 pts/2    00:00:00 grep --color=auto geth
user@Kumquat:~$ cat /proc/6492/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             30995                30995                processes 
Max open files            1024                 4096                 files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       30995                30995                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        
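
Only one row of that table matters here. As a minimal sketch, the helper below pulls the soft and hard "Max open files" values out of a limits listing read from standard input. The function name parse_open_file_limits is an illustrative choice, not part of geth or any standard tool.

```shell
#!/usr/bin/env bash
# Sketch: extract the "Max open files" row from a /proc/<pid>/limits listing.
# Reads the listing on stdin and prints "soft=<n> hard=<n>".
parse_open_file_limits() {
    # Fields on that row: Max open files <soft> <hard> files
    awk '/^Max open files/ { print "soft=" $4 " hard=" $5 }'
}
```

For the process shown above it could be called as parse_open_file_limits < /proc/6492/limits.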

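To check the number of open file descriptors used by the geth instance (note that the lsof output also includes a header line and non-descriptor entries such as cwd, txt and mem):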

user@Kumquat:~$ lsof -p 6492 | wc -l
78
user@Kumquat:~$ lsof -p 6492
COMMAND  PID USER   FD   TYPE             DEVICE SIZE/OFF     NODE NAME
geth    6492  user cwd    DIR              252,1     4096 18748035 /home/user/ESE/WhoisBug
geth    6492  user rtd    DIR              252,1     4096        2 /
geth    6492  user txt    REG              252,1 17848784  7734154 /usr/bin/geth
geth    6492  user mem    REG              252,1    47712 21894327 /lib/x86_64-linux-gnu/libnss_files-2.19.so
geth    6492  user mem    REG              252,1    47760 21894319 /lib/x86_64-linux-gnu/libnss_nis-2.19.so
geth    6492  user mem    REG              252,1    97296 21894315 /lib/x86_64-linux-gnu/libnsl-2.19.so
geth    6492  user mem    REG              252,1    39824 21890648 /lib/x86_64-linux-gnu/libnss_compat-2.19.so
geth    6492  user mem    REG              252,1  1840928 21894324 /lib/x86_64-linux-gnu/libc-2.19.so
geth    6492  user mem    REG              252,1   141574 21894316 /lib/x86_64-linux-gnu/libpthread-2.19.so
geth    6492  user mem    REG              252,1  1071552 21890641 /lib/x86_64-linux-gnu/libm-2.19.so
geth    6492  user mem    REG              252,1   149120 21894317 /lib/x86_64-linux-gnu/ld-2.19.so
geth    6492  user   0u   CHR              136,6      0t0        9 /dev/pts/6
geth    6492  user   1u   CHR              136,6      0t0        9 /dev/pts/6
geth    6492  user   2u   CHR              136,6      0t0        9 /dev/pts/6
geth    6492  user   3uW  REG              252,1        0 17305945 /home/user/.ethereum/chaindata/LOCK
geth    6492  user   4w   REG              252,1   445185 17305532 /home/user/.ethereum/chaindata/LOG
geth    6492  user   6u  0000               0,10        0     8681 anon_inode
geth    6492  user   7u  IPv4             301021      0t0      TCP 192-168-1-14.tpgi.com.au:36388->77.70.125.246:30303 (SYN_SENT)
geth    6492  user   8uW  REG              252,1        0 17175139 /home/user/.ethereum/nodes/LOCK
geth    6492  user   9w   REG              252,1     1288 17171201 /home/user/.ethereum/nodes/LOG
geth    6492  user  10w   REG              252,1   899579 17313769 /home/user/.ethereum/chaindata/MANIFEST-656380
geth    6492  user  12u  IPv4             301019      0t0      TCP 192-168-1-14.tpgi.com.au:51259->pool-98-113-173-46.nycmny.fios.verizon.net:30303 (ESTABLISHED)
geth    6492  user  13r   REG              252,1  1752712 17313448 /home/user/.ethereum/chaindata/656104.ldb
geth    6492  user  14u  IPv4             300637      0t0      TCP 192-168-1-14.tpgi.com.au:57807->cpc78189-warw17-2-0-cust572.3-2.cable.virginm.net:30303 (ESTABLISHED)
geth    6492  user  15u  IPv4             301126      0t0      TCP 192-168-1-14.tpgi.com.au:47302->195-154-165-137.rev.poneytelecom.eu:30303 (ESTABLISHED)
geth    6492  user  16r   REG              252,1    22345 17313871 /home/user/.ethereum/chaindata/656914.ldb
geth    6492  user  17uW  REG              252,1        0 17047070 /home/user/.ethereum/dapp/LOCK
geth    6492  user  18w   REG              252,1      396 17039394 /home/user/.ethereum/dapp/LOG
geth    6492  user  19w   REG              252,1        0 17040771 /home/user/.ethereum/dapp/000420.log
geth    6492  user  20w   REG              252,1       43 17040772 /home/user/.ethereum/dapp/MANIFEST-000421
geth    6492  user  21u  IPv4             302150      0t0      TCP 192-168-1-14.tpgi.com.au:58595->adsl-203.91.140.34.tellas.gr:30303 (ESTABLISHED)
geth    6492  user  23r   REG              252,1  2109837 17305672 /home/user/.ethereum/chaindata/481514.ldb
geth    6492  user  24u  IPv4             298824      0t0      TCP 192-168-1-14.tpgi.com.au:36823->user-24-96-159-205.knology.net:30303 (ESTABLISHED)
geth    6492  user  26u  IPv6             299586      0t0      UDP *:30303 
geth    6492  user  27w   REG              252,1   581824 17171210 /home/user/.ethereum/nodes/001081.log
geth    6492  user  28w   REG              252,1     2049 17171217 /home/user/.ethereum/nodes/MANIFEST-001082
geth    6492  user  29r   REG              252,1  2128363 17171214 /home/user/.ethereum/nodes/001083.ldb
geth    6492  user  30u  IPv6             297760      0t0      TCP *:30303 (LISTEN)
geth    6492  user  31u  unix 0x0000000000000000      0t0   297761 /home/user/.ethereum/geth.ipc
geth    6492  user  32r   REG              252,1  2124899 17171216 /home/user/.ethereum/nodes/001084.ldb
geth    6492  user  33r   REG              252,1  2125504 17171225 /home/user/.ethereum/nodes/001085.ldb
geth    6492  user  34r   REG              252,1  2121011 17171229 /home/user/.ethereum/nodes/001086.ldb
geth    6492  user  35r   REG              252,1  2124190 17313829 /home/user/.ethereum/chaindata/658913.ldb
geth    6492  user  37u  IPv4             298768      0t0      TCP 192-168-1-14.tpgi.com.au:53786->sky.loxal.net:30303 (ESTABLISHED)
geth    6492  user  38u  IPv4             299790      0t0      TCP 192-168-1-14.tpgi.com.au:40468->static.177.39.9.176.clients.your-server.de:30303 (ESTABLISHED)
geth    6492  user  39r   REG              252,1    18099 17313873 /home/user/.ethereum/chaindata/656916.ldb
geth    6492  user  40r   REG              252,1  1709208 17313834 /home/user/.ethereum/chaindata/658914.ldb
geth    6492  user  42u  IPv4             297767      0t0      TCP 192-168-1-14.tpgi.com.au:40764->63.224.55.73:30303 (ESTABLISHED)
geth    6492  user  45r   REG              252,1  2129807 17310863 /home/user/.ethereum/chaindata/656209.ldb
geth    6492  user  46r   REG              252,1  2129957 17306835 /home/user/.ethereum/chaindata/656245.ldb
geth    6492  user  48u  IPv4             301020      0t0      TCP 192-168-1-14.tpgi.com.au:41864->217.96.253.11.ipv4.supernova.orange.pl:30303 (SYN_SENT)
geth    6492  user  49u  IPv4             300960      0t0      TCP 192-168-1-14.tpgi.com.au:52510->41.142.88.193:30303 (ESTABLISHED)
geth    6492  user  51r   REG              252,1  2145665 17309081 /home/user/.ethereum/chaindata/656168.ldb
geth    6492  user  52r   REG              252,1    12749 17313877 /home/user/.ethereum/chaindata/656920.ldb
geth    6492  user  53w   REG              252,1  3007664 17313949 /home/user/.ethereum/chaindata/658851.log
geth    6492  user  54r   REG              252,1  2130155 17312570 /home/user/.ethereum/chaindata/655955.ldb
geth    6492  user  55r   REG              252,1  2131971 17313153 /home/user/.ethereum/chaindata/655988.ldb
geth    6492  user  57r   REG              252,1  2130863 17313239 /home/user/.ethereum/chaindata/656279.ldb
geth    6492  user  58r   REG              252,1    28822 17313865 /home/user/.ethereum/chaindata/656908.ldb
geth    6492  user  59r   REG              252,1    25844 17313876 /home/user/.ethereum/chaindata/656919.ldb
geth    6492  user  64r   REG              252,1  2130876 17313198 /home/user/.ethereum/chaindata/656361.ldb
geth    6492  user  65r   REG              252,1    42947 17313875 /home/user/.ethereum/chaindata/656918.ldb
geth    6492  user  67r   REG              252,1  2129128 17312345 /home/user/.ethereum/chaindata/656240.ldb
geth    6492  user  68u  IPv4             300629      0t0      TCP 192-168-1-14.tpgi.com.au:39338->c-76-101-62-156.hsd1.fl.comcast.net:30303 (ESTABLISHED)
geth    6492  user  69r   REG              252,1     8376 17313861 /home/user/.ethereum/chaindata/656904.ldb

A faster way to check the number of open file descriptors used by a geth instance:

user@Kumquat:~$ ls /proc/6491/fd
0  10  12  14  16  18  2   24  26  28  3   31  33  35  37  39  40  43  45  47  50  53  55  6   63  65  68  7  9
1  11  13  15  17  19  20  25  27  29  30  32  34  36  38  4   41  44  46  5   51  54  56  60  64  67  69  8
user@Kumquat:~$ ls /proc/6491/fd | wc -l
66
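
The descriptor count is most useful next to the limit it is approaching. The following sketch combines the two on Linux. The function names are illustrative, and reading another user's /proc/<pid>/fd directory typically requires matching permissions.

```shell
#!/usr/bin/env bash
# Sketch: compare a process's open descriptor count with its soft limit.
# Linux only; all function names here are illustrative.

# Count the entries in a descriptor directory such as /proc/<pid>/fd.
count_fds() {
    local fd_dir="$1"
    # The arithmetic expansion strips any padding wc may add.
    echo $(( $(ls -A "$fd_dir" | wc -l) ))
}

# Integer percentage of the limit currently in use (rounded down).
fd_percent() {
    local used="$1" limit="$2"
    echo $(( used * 100 / limit ))
}

# Print "<used>/<limit> (<pct>%)" for the given PID.
fd_report() {
    local pid="$1" used limit
    used="$(count_fds "/proc/${pid}/fd")"
    limit="$(awk '/^Max open files/ { print $4 }' "/proc/${pid}/limits")"
    if [ "$limit" = "unlimited" ]; then
        echo "${used}/unlimited"
    else
        echo "${used}/${limit} ($(fd_percent "$used" "$limit")%)"
    fi
}
```

Calling fd_report with the PID of the geth process prints a single line that is easy to watch or log.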

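Change the soft limit for the geth instance

You can change geth's soft limit by running the ulimit command before starting geth. The soft limit can be raised no higher than the hard limit. Note that in bash, ulimit -n without -S or -H sets both the soft and the hard limit, which is why both columns show 4000 in the output below.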

user@Kumquat:~$ ulimit -n 5000
bash: ulimit: open files: cannot modify limit: Operation not permitted
# Can only change soft limit up to the hard limit
user@Kumquat:~$ ulimit -n 4000
user@Kumquat:~$ geth console
# In a separate window
user@Kumquat:~$ ps -ef | grep geth
user      6649  6479 14 10:14 pts/6    00:00:01 geth console
user      6667 30948  0 10:14 pts/2    00:00:00 grep --color=auto geth
user@Kumquat:~$ cat /proc/6649/limits
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             30995                30995                processes 
Max open files            4000                 4000                 files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       30995                30995                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        

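If you only need to raise the soft limit to a value below the default hard limit of 4096, create a bash script that runs geth and put the ulimit command in it before the line that starts geth.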
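
A minimal sketch of such a wrapper script follows. The file name run-geth.sh, the GETH_NOFILE variable and the default of 4000 are assumptions made for illustration. The script sets only the soft limit (-S) so that the hard limit is left untouched, and it caps the request at the current hard limit instead of failing.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (for example run-geth.sh): raise the soft open-file
# limit, then replace this shell with geth.

# Choose the soft limit to request: the wanted value, capped at the hard limit.
pick_soft_limit() {
    local wanted="$1" hard="$2"
    if [ "$hard" = "unlimited" ] || [ "$wanted" -le "$hard" ]; then
        echo "$wanted"
    else
        echo "$hard"
    fi
}

# Set only the soft limit (-S), leaving the hard limit as it is, then exec geth.
start_geth() {
    local wanted="${GETH_NOFILE:-4000}"   # GETH_NOFILE is an assumed variable name
    ulimit -S -n "$(pick_soft_limit "$wanted" "$(ulimit -H -n)")" || return 1
    exec geth "$@"
}

# Start geth only when arguments are passed, e.g. ./run-geth.sh console
if [ "$#" -gt 0 ]; then
    start_geth "$@"
fi
```

Run as ./run-geth.sh console, this behaves like geth console with the higher soft limit applied. A different value can be requested with GETH_NOFILE=3000 ./run-geth.sh console.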

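Change the hard limit for the user

As the previous section showed, you cannot raise the soft limit above the hard limit. To change the hard limit, add the following to /etc/security/limits.conf: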

* soft nofile 14096
* hard nofile 14096

Note that * changes the limits for all users except root. To change the limits for the root user, add lines that use "root" in place of *. For example:

* soft nofile 14096
* hard nofile 14096
root soft nofile 24096
root hard nofile 24096

For the above to take effect, you must edit /etc/pam.d/common-session* (multiple files) and add:

session required pam_limits.so

You may need to log out and back in, or reboot, for the changes to take effect.

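Reduce the number of simultaneous network connections

You can also reduce the number of peer connections each node handles with the --maxpeers parameter, which may keep you under your open file descriptor limit.

Network connections also count toward the open file limit.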
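
Because each peer connection occupies one descriptor, a rough upper bound for --maxpeers is the soft limit minus whatever the process needs for everything else (database files, logs, the IPC socket, listeners). The sketch below does that subtraction. The reserve is a hypothetical figure that you would have to estimate from your own lsof output, not a number geth documents.

```shell
#!/usr/bin/env bash
# Sketch: rough peer budget under an open-file limit.
# "reserved" is a hypothetical allowance for non-peer descriptors
# (database files, logs, IPC socket, listeners); it is not a geth constant.
peer_budget() {
    local soft_limit="$1" reserved="$2" budget
    budget=$(( soft_limit - reserved ))
    if [ "$budget" -lt 0 ]; then
        budget=0
    fi
    echo "$budget"
}
```

With a soft limit of 1024 and an assumed reserve of 300, peer_budget 1024 300 prints 724, which is already below the --maxpeers "1024" used in the question.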

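Possibly corrupted files

Because the file limit was exceeded, some of your files may be corrupted. Whether they are depends on how the trie.go code handles the error when it needs to open a file and fails, and on whether the files are opened for reading, writing, or both.

Potential bug

The problem may simply be that serving your current traffic needs more file descriptors open at once than the limit allows, or it may be that file descriptors are not being closed properly. In the latter case, raising the limit only delays the problem.

To test this, you can set up a cron job that periodically logs the date, the number of open file descriptors and the number of network connections. If there is an underlying bug, you should see the number of open file descriptors trend upward over time.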
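
A minimal sketch of such a logger for Linux is shown below. The script name fd-log.sh, the log format and the function names are assumptions. It counts sockets by looking for descriptors whose /proc link target starts with "socket:", which also includes Unix and UDP sockets, so treat the socket column as an approximation of the network connection count.

```shell
#!/usr/bin/env bash
# Hypothetical fd-log.sh: append one sample line per run for a given PID.
# Usage: fd-log.sh <pid> <log-file>

# Count all entries in a descriptor directory such as /proc/<pid>/fd.
count_fds() {
    local fd_dir="$1"
    echo $(( $(ls -A "$fd_dir" | wc -l) ))
}

# Count descriptors whose link target looks like "socket:[inode]".
count_sockets() {
    local fd_dir="$1" n=0 fd
    for fd in "$fd_dir"/*; do
        case "$(readlink "$fd" 2>/dev/null)" in
            socket:*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

# Append "<UTC timestamp> fds=<n> sockets=<n>" to the log file.
# The optional third argument overrides the descriptor directory.
log_fd_sample() {
    local pid="$1" log_file="$2" fd_dir="${3:-/proc/$1/fd}"
    printf '%s fds=%s sockets=%s\n' \
        "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" \
        "$(count_fds "$fd_dir")" \
        "$(count_sockets "$fd_dir")" >> "$log_file"
}

# Log a sample only when a PID and a log file are given.
if [ "$#" -ge 2 ]; then
    log_fd_sample "$@"
fi
```

A crontab entry along the lines of */5 * * * * /home/user/fd-log.sh "$(pgrep -o -x geth)" /home/user/geth-fd.log (both paths hypothetical) would take a sample every five minutes. If the fds column keeps climbing while the sockets column stays flat, that points to a descriptor leak and not to peer load.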

Source: https://ethereum.stackexchange.com/questions/2694