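Generating a password from a high-entropy file
Assume:
- We want to generate a 32-byte password using a key file as the source, not directly but through an intermediate step;
So:
- We have a 100 MB high-entropy file
- We ask the user for two values: (A) an offset into the file and (B) a chunk size
Then we: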
- Open the file and skip to offset (A)
- Read chunks of size (B) until the end of the file
- For each chunk, scrypt it into a value (scrypt tuned to take ~1 sec), CRC the output block to an integer value and add it to the 32-byte array (increment the index for each chunk, reset to 0 when it reaches 32)
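The steps above can be sketched roughly as follows. This is only one reading of the description, not a recommended design. Several details are assumptions, because the post does not specify them: the function name `derive_password`, the fixed salt, the scrypt cost parameters (placeholders, not tuned to ~1 sec), CRC-32 as the CRC, and "add" meaning addition modulo 256 into one byte of the array.

```python
import hashlib
import io
import zlib


def derive_password(stream, offset, chunk_size, *, salt=b"demo-salt",
                    n=2**14, r=8, p=1):
    """Sketch of the proposed scheme; all parameter defaults are assumptions."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    out = bytearray(32)          # the 32-byte password being built
    stream.seek(offset)          # skip to offset (A)
    index = 0
    while True:
        chunk = stream.read(chunk_size)   # read one chunk of size (B)
        if not chunk:                     # end of file
            break
        # scrypt the chunk into a 32-byte block
        block = hashlib.scrypt(chunk, salt=salt, n=n, r=r, p=p, dklen=32)
        # CRC the block to an integer and add it (mod 256) to the array
        out[index] = (out[index] + zlib.crc32(block)) % 256
        index = (index + 1) % 32          # wrap around at 32
    return bytes(out)


if __name__ == "__main__":
    demo = io.BytesIO(bytes(range(256)))  # stand-in for the key file
    print(derive_password(demo, 250, 2, n=2**10).hex())
```

With this reading, a run that sees fewer than 32 chunks leaves the remaining bytes of the array at zero, and each output byte depends only on the chunks that landed on its index.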
If I’m correct, it would take an attacker ~10 years to go through the whole keyspace with a brute-force attack. Would this be correct, or am I missing something here (probably)? The reasoning is that for each byte (offset) there are 100MB/32 possible variations in size. (There are some important details I’m skipping over, but this is the general idea I’m thinking of.)
If I’m correct it would take an attacker ~10 years to go through the whole keyspace with a brute force attack. Would this be correct or am I missing something here
No
First, consider a simple fact: the more computing power you have, the shorter the attack takes. There is no such thing as “it would take an attacker 10 years”. Also, 10 years is very, very short compared with a proper key, whose brute-force time is measured in far more than lifetimes.
You also seem to imply that having to try multiple files makes the attack hard. Imagine that I have a botnet with 40,000 machines. I can assign one machine to each of your files, so it will take me only as long as it took you to decrypt (if not less). I can then try the most common choices of A and B, which would take me maybe a day if I’m unlucky (assuming your decryption takes a minute, which is a massive amount of time for password-based encryption).
By contrast, if you had used a password of just 6 characters, I would have a somewhat harder time recovering it.
If you assume the attacker knows your file, your whole process is basically useless: there simply isn’t enough entropy in A, B and the file selection. If you assume the attacker doesn’t know your file, everything you do is still useless, because you would be just as secure taking the first 256 bytes of that file (you assume the file is high-entropy; if it isn’t consistently high-entropy, you simply hash the file and end up in the same place).
Also, one advantage of passwords is that users choose for themselves how secure they are. A 4-character password will be broken. A 20-character password has a good chance of never being broken. With your scheme, everyone who doesn’t have 20 TB of storage for keys is limited to the equivalent of 4 characters of security (and with 20 TB you won’t be much better off).
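To put rough numbers on the 4-character versus 20-character comparison, here is a small sketch. It assumes passwords are drawn uniformly at random from the 95 printable ASCII characters, which is an idealization (real user-chosen passwords have less entropy); the helper name `entropy_bits` is made up for illustration.

```python
import math


def entropy_bits(length, alphabet_size=95):
    """Bits of entropy of a uniformly random password (idealized assumption)."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    for length in (4, 6, 20):
        print(length, "chars:", round(entropy_bits(length), 1), "bits")
```

Under that assumption, 4 characters give roughly 26 bits and 20 characters roughly 131 bits, which is why the first is trivially searchable and the second is not.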
TL;DR: Use established best practices. With your scheme you are arguably worse off than with MD5.
As for the scheme itself, it seems to come from the world of “security by obscurity”. There is no point in using CRC-16 to shorten your key: scrypt returns a high-entropy, well-mixed key, which you can simply truncate. Worse, your scheme doesn’t spread entropy through the returned key, which is terrible, because knowing part of the file reveals part of the key. You also use scrypt to convert a key into a key, which is pointless, and your key generation takes longer the longer the file is, so long in fact that the user will get bored. Imagine a 60-byte file with A=0 and B=1: at ~1 second per chunk, it takes one minute to get the key. You would be far better off just scrypting a hash of the whole file (you could still configure the scrypt time to something reasonable).
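A minimal sketch of that last suggestion, hashing the whole file once and running scrypt a single time on the digest, might look like this. The function name `key_from_file`, the choice of SHA-256, and the cost parameters are assumptions for illustration; the salt should be random and stored alongside whatever the key protects.

```python
import hashlib
import io


def key_from_file(stream, *, salt, n=2**14, r=8, p=1, dklen=32):
    """Hash the whole file, then stretch the digest with one scrypt call."""
    digest = hashlib.sha256()
    # Read in fixed-size blocks so a large key file is never fully in memory
    for block in iter(lambda: stream.read(65536), b""):
        digest.update(block)
    # One scrypt call: cost is set by n, r, p, not by the file length
    return hashlib.scrypt(digest.digest(), salt=salt, n=n, r=r, p=p,
                          dklen=dklen)


if __name__ == "__main__":
    demo = io.BytesIO(b"stand-in for the key file contents")
    print(key_from_file(demo, salt=b"example-salt").hex())
```

Here every byte of the output depends on every byte of the file, and the derivation time no longer grows with one scrypt call per chunk.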
It seems you are mainly after two-factor authentication. In that case: perform scrypt (or any other password hashing scheme) and use the resulting symmetric key to wrap a pre-generated private key. Now you can store that private key on USB. Alternatively you could use a smart card or USB HSM.
Now you can authenticate using the private key instead of a passphrase. If private-key authentication is not possible, you could instead encrypt a passphrase (making sure you do not leak the length of the passphrase).
PGP would come to mind as it encrypts your private key as described for file encryption. For transport security SSH can do something similar as well.