Attempt to automatically correct any typographical errors in a BIP39 mnemonic
phrase, returning correction information for the closest matching valid
mnemonic phrase.
Note that by design, BIP39 allows seed derivation from any NFKD-normalized
string, including phrases containing an incorrect or unrecognized checksum.
Indiscriminate use of this function during BIP39 mnemonic phrase import
would prevent the use of seeds derived from such nonstandard phrases.
Instead, this function should be used to offer an end user the best possible
correction for the provided mnemonic phrase, e.g.:
[Button: "Use corrected phrase"] [Button: "Use phrase with errors"]
This function attempts the following corrections, returning a
Bip39MnemonicCorrection as soon as a BIP39 mnemonic phrase with a
valid checksum is produced:
1. Trim whitespace from the beginning and end of the phrase.
2. Convert the phrase to lowercase characters.
3. Identify the best candidate word list from possibleWordLists with which
   to correct errors by counting exact prefix matches, e.g. aban is a prefix
   match for both English and French. If two word lists share the same number
   of matches, the earlier index in possibleWordLists is prioritized. (Note
   that 100 words are shared between the French and English word lists, and
   1275 words are shared between the Chinese Traditional and Chinese
   Simplified word lists. Because this function is intended to correct
   standard BIP39 mnemonic phrases, we assume that all correct words are
   found in a single word list.)
4. Deduplicate spaces between words, ensuring the expected space separator is
   used (for the Japanese word list, an ideographic space separator: \u3000).
5. Attempt to verify the checksum for all valid subsets of the phrase by
   slicing the phrase at 24, 21, 18, 15, and 12 words. In these cases, the
   additional words may have been entered in error, or they may be part of a
   passphrase recorded in the same location as the phrase. The returned
   Bip39MnemonicCorrection.description indicates that the user should
   review the source material to see if the deleted word(s) are a passphrase.
6. For every word where an exact match is not found, develop a ranked list of
   possible matches:
   1. Attempt to extend the word by finding all word(s) in the selected word
      list with a matching prefix (i.e. only the first few characters of the
      correct word were included in the incorrect phrase). If multiple prefix
      matches are found, rank them in word list order (in later steps, all of
      these matches are considered to have a similarity of 1).
   2. If no prefix matches were found, compute the Jaro similarity between
      the unknown word and every word in the candidate word list, adding all
      words to the ranked list in descending-similarity, then word list order.
7. Attempt to find a corrected phrase with the minimum possible correction by
   validating the checksum for each candidate combination in ranked order:
   1. Beginning with a similarity target of 1, create candidate combinations
      by replacing unknown words with all possible matches having a similarity
      equal to or greater than the target.
   2. If any unknown words have no matches meeting the similarity target,
      lower the similarity target to the value of the next-most-similar match
      for that word, using only that match in this correction iteration.
   3. If no phrases with a valid checksum are found, repeat these steps with
      the lowered similarity target, excluding previously-tried combinations.
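
Because this function is not yet implemented, the following is only a minimal sketch of how the word list selection and match ranking described above could work. The names jaroSimilarity, rankMatches, and selectWordList are hypothetical and are not part of the library, the word lists in the usage note are tiny excerpts, and strings are compared by UTF-16 code unit (an assumption that is adequate for the BIP39 word lists).

```typescript
// Hypothetical helper: Jaro similarity, from 0 (nothing in common) to 1 (identical).
const jaroSimilarity = (a: string, b: string): number => {
  if (a === b) return 1;
  if (a.length === 0 || b.length === 0) return 0;
  // Characters match only if they are equal and within this distance of each other.
  const window = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched = new Array<boolean>(a.length).fill(false);
  const bMatched = new Array<boolean>(b.length).fill(false);
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const start = Math.max(0, i - window);
    const end = Math.min(b.length - 1, i + window);
    for (let j = start; j <= end; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;
  // Count matched characters that appear in a different order in each string.
  let outOfOrder = 0;
  let k = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k++;
    if (a[i] !== b[k]) outOfOrder++;
    k++;
  }
  const transpositions = outOfOrder / 2;
  return (
    (matches / a.length + matches / b.length + (matches - transpositions) / matches) / 3
  );
};

type RankedMatch = { word: string; similarity: number };

// Hypothetical helper: prefix matches (similarity 1, word list order) take
// priority; otherwise every word is ranked by descending Jaro similarity,
// with ties broken by word list order.
const rankMatches = (unknown: string, wordList: readonly string[]): RankedMatch[] => {
  const prefixMatches = wordList.filter((word) => word.startsWith(unknown));
  if (prefixMatches.length > 0) {
    return prefixMatches.map((word) => ({ word, similarity: 1 }));
  }
  return wordList
    .map((word, index) => ({ word, index, similarity: jaroSimilarity(unknown, word) }))
    .sort((x, y) => y.similarity - x.similarity || x.index - y.index)
    .map(({ word, similarity }) => ({ word, similarity }));
};

// Hypothetical helper: return the index of the word list with the most exact
// prefix matches; on a tie, the earlier index wins because only a strictly
// greater count replaces the current best.
const selectWordList = (
  words: readonly string[],
  possibleWordLists: readonly (readonly string[])[],
): number => {
  let bestIndex = 0;
  let bestCount = -1;
  possibleWordLists.forEach((list, index) => {
    const count = words.filter((word) =>
      list.some((candidate) => candidate.startsWith(word)),
    ).length;
    if (count > bestCount) {
      bestCount = count;
      bestIndex = index;
    }
  });
  return bestIndex;
};
```

For example, with the excerpt ["abandon", "ability", "able", "about"], rankMatches("aban", ...) yields only abandon with a similarity of 1, while rankMatches("abuot", ...) finds no prefix match and ranks about first by Jaro similarity.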
If no plausible matches are found, or if the provided phrase has an invalid
word count (after attempting to correct whitespace errors), an error (string)
is returned.
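
The whitespace handling, slicing, and word count check can be sketched as follows. This is an illustration under stated assumptions rather than the eventual implementation: splitWords, candidateSlices, and prepareCandidates are hypothetical names, slicing is assumed to keep the first N words of the phrase, and the error message text is invented for the example.

```typescript
// Word counts permitted by BIP39, longest first.
const validWordCounts = [24, 21, 18, 15, 12] as const;

// Hypothetical helper: trim, lowercase, and split on any run of whitespace
// (the \s class also matches the ideographic space, \u3000).
const splitWords = (phrase: string): string[] =>
  phrase
    .trim()
    .toLowerCase()
    .split(/\s+/u)
    .filter((word) => word.length > 0);

// Hypothetical helper: every slice of the phrase with a valid word count,
// longest first. Assumption: each slice keeps the first N words.
const candidateSlices = (words: readonly string[]): string[][] =>
  validWordCounts
    .filter((count) => count <= words.length)
    .map((count) => words.slice(0, count));

// Hypothetical helper: return candidate phrases rejoined with the expected
// separator, or an error string if no valid word count can be reached.
const prepareCandidates = (phrase: string, separator = ' '): string[] | string => {
  const words = splitWords(phrase);
  const slices = candidateSlices(words);
  return slices.length === 0
    ? `Invalid word count: ${words.length}` // invented message for illustration
    : slices.map((slice) => slice.join(separator));
};
```

A caller can distinguish the two outcomes with typeof result === 'string'. For the Japanese word list, '\u3000' would be passed as the separator.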
Note that this function does not attempt to correct mnemonic phrases with an
incorrect word count; in these cases, the user should be asked to either
identify and provide the missing words or use a dedicated brute-forcing tool
(if words have been lost).
TODO: not yet implemented; see also: attemptCashAddressFormatErrorCorrection