Hacking Word Documents via Other Formats
https://www.cs.dartmouth.edu/~sws/word/index.shtml
Last modified: 09/22/02 03:07:07 PM
(a work-in-progress)
We've been kicking around an interesting way to
manipulate Word documents:
- exporting into a human-readable source format, like HTML
- editing that source
- bringing it back into Word.
Why I think this is interesting
A couple of reasons...
Initially, this is fun because it's a way
to use official, advertised functionality
in a surprising way, and to easily defeat
a security mechanism. (However, the mechanism is not one
designed to withstand a dedicated adversary anyway...)
More importantly:
- Word is the de facto standard for
electronic documents (even for security institutes!).
- Market forces have resulted in overly rich functionality.
- Users have an informal mental model of what an electronic
document is and what various clicks and actions should do.
- However, the overly rich functionality results
in a significant difference between this mental model
and the reality.
Some recent examples:
- The University of Denver discovery that one can plant
"bugs" in Word documents that report (via Web queries) when and
from what machine a document is read.
- Our lab's discovery of ways to make Word documents change
in usefully malicious ways without invalidating digital signatures
on them, even with macros disabled.
- The interesting historical relics left in the binary, if
"Fast Save" is enabled.
- The interesting historical relics left in the binary, even if
"Fast Save" is disabled.
- The recently publicized ways to use "fields" to steal files.
This new line of attack suggests a number of interesting ways
to go beyond the user interface:
- can we add extra functionality to fields?
- can we turn untrusted macros into trusted ones?
- can we change metadata in usefully malicious ways?
- for various Word-embedded signature formats: how much of the
Word document is covered? is the signature expressed in the
exported version? can we change something important about what
it covers?
If this stuff is going to be the standard,
then the reachable states need to be well specified and thought out.
In the long run, "formal methods for office tools" could be an interesting
line of work.
Forms Protection in Word
The Motivation
Suppose Alice wants to send Bob a complex document with questions that
Bob should answer. Bob should read the questions, fill in the appropriate
answers in the spaces provided, and send the document back.
In this scenario, to make it easier for Bob to do the right thing
(and make it harder for him to surreptitiously change the questions
before returning the form), Alice would like to write-protect the entire
document, EXCEPT the spaces for the answers.
The IRS got burned by such an attack back in the days of paper:
a client changed the wording of a waiver, signed it, and sent it back.
Since the IRS neither noticed nor objected, the courts held that the
client's altered version was binding.
How it works
Word has two types of forms protection: with and without a password.
Go to View->Toolbars->Forms to open the forms toolbar.
Here's how the transitions work:
- 1. To take an unprotected doc to full protection, select
"Tools," then "Protect Document," then enter a password.
- 2. To bring it back, select "Tools," then "Unprotect Document."
Then you'll need to enter the password again.
- 3. To take an unprotected doc to no-password protection,
select "Tools," then "Protect Document," but enter no password.
- 4. To bring this doc back,
just select "Tools," then "Unprotect Document."
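One way to read transitions 1 through 4 is as a small state machine. The sketch below is only an illustration: the state names and the needs-password flag are our own labels, not anything Word exposes. It computes which states someone can reach through the menus alone, with and without knowing the password.

```python
from collections import deque

# States of a document with respect to forms protection (our own labels).
UNPROTECTED = "unprotected"
PROTECTED_PW = "protected, with password"
PROTECTED_NO_PW = "protected, no password"

# (source state, menu action, needs the existing password?, target state)
# These four entries mirror transitions 1-4 in the list above.
TRANSITIONS = [
    (UNPROTECTED, "Protect Document (enter a password)", False, PROTECTED_PW),
    (PROTECTED_PW, "Unprotect Document", True, UNPROTECTED),
    (UNPROTECTED, "Protect Document (no password)", False, PROTECTED_NO_PW),
    (PROTECTED_NO_PW, "Unprotect Document", False, UNPROTECTED),
]


def reachable(start, knows_password):
    """Return the set of states reachable from start via the menu actions."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for src, _action, needs_pw, dst in TRANSITIONS:
            # Skip transitions from other states, and those that need a
            # password the user does not have.
            if src != state or (needs_pw and not knows_password):
                continue
            if dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


if __name__ == "__main__":
    print(sorted(reachable(PROTECTED_PW, knows_password=False)))
    print(sorted(reachable(PROTECTED_PW, knows_password=True)))
```

Within this model, a password-protected document stays put unless the user knows the password. That is the property the protection is meant to provide through the user interface.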
HTML and RTF Workarounds
Word permits easy export into other formats,
but expresses much Word-internal structure there,
to facilitate bringing the document back.
However, these other formats permit some interesting
ways to manipulate the Word doc that Word itself doesn't
allow.
- 5. Export into HTML or RTF format. (With HTML, you'll get a
warning saying the password will be lost. With RTF, you'll get no warnings.)
- 6. You could try editing the protected material directly here, but
you don't get the WYSIWYG functionality of Word.
No problem: in both HTML and RTF, you'll see
some text expressing the fact that forms protection is turned on.
Delete that text.
- 7. Import the document back into Word.
If you edited out the protections, you now have an unprotected document.
Even if you didn't edit them out, you now have a password-free protected
document, which can be easily turned into an unprotected doc.
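Step 6 can be scripted. The sketch below rests on two assumptions that the steps above do not spell out: that the HTML export records forms protection as a <w:DocumentProtection> element, and that the RTF export records it as the \formprot control word. Check an actual export before relying on either. The function names are hypothetical.

```python
import re

# ASSUMPTION: Word's HTML export marks forms protection with a
# <w:DocumentProtection> element inside its embedded XML block.
HTML_PROTECTION = re.compile(
    r"<w:DocumentProtection>.*?</w:DocumentProtection>\s*",
    re.IGNORECASE | re.DOTALL,
)

# ASSUMPTION: the RTF export marks forms protection with the \formprot
# control word. The negative lookahead leaves alone any longer control
# word that merely begins with the same letters; the optional trailing
# space is the control word's delimiter.
RTF_PROTECTION = re.compile(r"\\formprot(?![a-zA-Z]) ?")


def strip_html_protection(source: str) -> str:
    """Remove the (assumed) forms-protection element from exported HTML."""
    return HTML_PROTECTION.sub("", source)


def strip_rtf_protection(source: str) -> str:
    """Remove the (assumed) forms-protection control word from exported RTF."""
    return RTF_PROTECTION.sub("", source)


if __name__ == "__main__":
    html = ("<xml><w:WordDocument>"
            "<w:DocumentProtection>Forms</w:DocumentProtection>"
            "</w:WordDocument></xml>")
    rtf = r"{\rtf1\ansi\formprot\deflang1033 Question 1}"
    print(strip_html_protection(html))
    print(strip_rtf_protection(rtf))
```

This is plain text substitution, not a real HTML or RTF parser, so it would also touch a literal backslash followed by "formprot" in the document body. For a one-off experiment that seems an acceptable trade.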
OpenOffice Workarounds
A student reports that OpenOffice doesn't even understand the
forms protection.
- 8. Open the document with OpenOffice.
- 9. Save it as a Word document.
Contributors so far
- Dave Johnson noticed the HTML-edit path, and told me about it.
- Carl Ellison helped do some initial poking around.
- Sanket Agrawal noticed the OpenOffice angle.