libreoffice / asking for cell password / brute force
While we were editing a provider-supplied Excel document using LibreOffice, at seemingly random times, it would show a popup asking us for a password to a cell. This popup would only go away if we set a new (non-blank) password on it. Annoying!
Apparently, it has to do with Sheet and Cell protection whereby an editing user is disallowed to edit certain cells/rows/sheets in a document.
Having certain cells marked read-only, sure. But protected using a password? That doesn’t make sense. We’re editing the document, it’s ours now.
So, you can disable sheet protection using the Tools menu, where there is a Protect sheet checkbox. Disabling it should give us full access. Unfortunately, in this case the correct non-blank password is needed.
As you may already know, an xlsx
file, is just a PK zipped archive
with XML and other content in it. Let’s see if we can find a password
in there:
$ cd $(mktemp -d)
$ unzip ~/Documents/annoying_document.xlsx
Archive: /home/walter/Documents/annoying_document.xlsx
inflating: xl/workbook.xml
inflating: xl/styles.xml
...
$ grep password . -r
./xl/worksheets/sheet1.xml:...
...</sheetData><sheetProtection sheet="true" password="ca5b"
formatColumns="false" formatRows="false"
insertRows="false"/><mergeCells count="7">...
Okay, there’s the password: ca5b
. Except it’s not a password. It’s a
(very short) hash.
We can simply remove the entire <sheetProtection/>
tag and zip it up
again. (Easy fix.)
Or, we could go for the hard fix, and try to brute force a password. This also helps if you have multiple documents with the same “protection”.
OpenOffice.org’s documentation has this to say about the password hash:
The PASSWORD record contains the hash value of the password used to protect the sheet.
[…]
The length of the password is restricted to 15 characters.
ALGORITHM Get_Password_Hash( password ) 1) hash <-0 ; char_index <-char_count <-character count of password 2) char_index <-char_index - 1 3) char <-character from password with index char_index {0 is leftmost character} 4) hash <-hash XOR char 5) rotate the lower 15 bits of hash left by 1 bit 6) IF char_index > 0 THEN JUMP 2) 7) RETURN hash XOR char_count XOR CE4BH
In Python, that might look like this:
def f(s): # takes a string
h = 0 # step 1
for idx in range(len(s) - 1, -1, -1): # step 1 and 2 and 6
h ^= ord(s[idx]) # step 3 and 4
h <<= 1 # step 5 (left shift)
if h & 0x8000: # step 5 (check high bit)
h = (h & 0x7fff) | 1 # step 5 (truncate + rotate)
return (h ^ len(s)) ^ 0xce4b # step 7
And running it on the string abcdefghij
yields FEF1
(in hex):
>>> hex(f('abcdefghij'))
'0xfef1'
>>> assert f('zzyw') == f('BBAb')
>>> assert f('zzyw') == f('pqpp')
>>> assert f('zzyw') != f('pqpr')
If we want to brute force it, we’ll optimize away the 0xce4b
and the
ord()
calls first:
def f_tuple(t): # takes a tuple of ints
h = 0 # step 1
for idx in range(len(t) - 1, -1, -1): # step 1 and 2 and 6
h ^= t[idx] # step 3 and 4
h <<= 1 # step 5 (left shift)
if h & 0x8000: # step 5 (check high bit)
h = (h & 0x7fff) | 1 # Step 5 (truncate + rotate)
return h ^ len(t) # step 7
>>> hex(f_tuple([ord(i) for i in 'abcdefghij']) ^ 0xce4b)
'0xfef1'
>>> import timeit
>>> abcdefghij_tuple = tuple(ord(i) for i in 'abcdefghij')
>>> result_tuple = 0xfef1 ^ 0xce4b
>>> timeit.timeit(lambda: f('abcdefghij') == 0xfef1)
1.3861091136932373
>>> timeit.timeit(lambda: f_tuple(abcdefghij_tuple) == result_tuple)
1.2009429931640625
And we can use the nice generator pattern provided by Python:
def generator():
"Yields 'A', 'B', 'C', .., 'Z', 'AA', 'AB', etc. as tuples of ints"
valid_tokens = tuple([ord(i) for i in (
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
'abcdefghijklmnopqrstuvwxyz'
'0123456789'
# You're free to add more tokens, but it may have an adverse effect
# on the search. YMMV.
# '!@#$%^&*()' ',./<>?;:[]{}-_=+|'
)])
max_token = len(valid_tokens)
# "The length of the password is restricted to 15 characters."
for length in range(1, 16): # length 1 through length 15
state = length * [0] # index to valid_tokens
while True:
pass_ = tuple(valid_tokens[i] for i in state)
yield pass_ # as tuple of ints, because f_tuple is faster than f
# Increase last element of state by one:
for tailidx in range(length - 1, -1, -1): # edit last index first
state[tailidx] += 1
if state[tailidx] != max_token:
break
state[tailidx] = 0
else:
# We did not break out of the loop, so all of state was
# max_key. Try the next length.
break
So.. that keeps repeating forever, until all combinations have been tried:
>>> g = iter(generator())
>>> [''.join(chr(ch) for ch in next(g)) for i in range(10)]
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
.. and after 10,000 iterations, it’s at:
>>> [''.join(chr(ch) for ch in next(g)) for i in range(10000)][-10:]
['BkS', 'BkT', 'BkU', 'BkV', 'BkW', 'BkX', 'BkY', 'BkZ', 'Bka', 'Bkb']
Wrapping it all up, a function to find a valid password for a specified hash:
def find_password_for(wanted):
expected = wanted ^ 0xce4b
for t in generator():
if f_tuple(t) == expected:
pwd = ''.join(chr(ch) for ch in t)
print('found password for 0x{:x}: {} [{!r}]'.format(
wanted, pwd, pwd))
return
>>> find_password_for(0xca5b)
found password for 0xca5b: BBAy ['BBAy']
.. or, if we want to know what it’s doing, we can add a bit of SIGALRM
magic:
def find_password_for(wanted):
# vvvvv-- print status
import signal
def print_status(*args):
print('(generator is at {!r})'.format(''.join(chr(ch) for ch in t)))
signal.alarm(1)
signal.signal(signal.SIGALRM, print_status)
signal.alarm(1)
try:
# ^^^^^-- print status
expected = wanted ^ 0xce4b
for t in generator():
if f_tuple(t) == expected:
pwd = ''.join(chr(ch) for ch in t)
print('found password for 0x{:x}: {} [{!r}]'.format(
wanted, pwd, pwd))
return
# vvvvv-- print status
finally:
signal.alarm(0) # stop alarm
Now you can see what’s going on (and how slow it is):
>>> find_password_for(f('ABBBO'))
(generator is at 'BuvU')
(generator is at 'EaWh')
(generator is at 'HGAG')
... 18 lines elided ...
(generator is at '5fau')
(generator is at '8Exw')
(generator is at 'AAmr9')
found password for 0xc014: ABBBO ['ABBBO']
So, that was a sort-of interesting side track. It’s noteworthy that it’s
still quite slow. I stopped the iterations before finding a collision
for 0xfef1
. Removing the <sheetProtection/>
is generally your best
bet.
Maybe this could be an interesting toy project to try out Rust-lang on…