Decryptum
 
Technical Information

Password protection of MS Word/Excel documents is not as secure as once thought. Even though the default cryptographic algorithms that MS Office uses are relatively strong, contemporary data recovery methods allow access to file contents almost instantly, regardless of password length and complexity.

There is no need to panic, though. In the "XP" and "2003" versions of MS Office, Microsoft introduced Additional Cryptographic Providers for document protection. These allow users to choose the encryption type and key size according to their security requirements. If one chooses, say, a 128-bit key, it becomes practically impossible to decrypt the file with today's computing resources.

However, all MS Word and Excel versions since "97" (including both the "XP" and "2003" versions) use the standard document encryption mode with a 40-bit key by default.

MS Word and Excel use the MD5 and RC4 cryptographic algorithms for standard 40-bit document encryption and password verification.

The MD5 algorithm, developed by R. Rivest, is one of the most popular hashing algorithms today. Its input can be arbitrary data, and its output is a 128-bit value. In spite of the longstanding efforts of cryptanalysts, no method of inverting MD5 that is substantially more efficient than an exhaustive search of the input data has been developed so far. The complexity of an exhaustive search (2^128) far exceeds existing computational capabilities. *
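
The fixed output size is easy to observe in practice. The following is a minimal sketch using Python's standard hashlib module; the sample inputs are arbitrary and chosen only for illustration.

```python
import hashlib

# Inputs of very different sizes (arbitrary example data).
samples = [b"", b"password", b"x" * 1_000_000]

for data in samples:
    digest = hashlib.md5(data).digest()
    # The digest is always 16 bytes, i.e. 128 bits, whatever the input size.
    print(len(data), "bytes in ->", len(digest) * 8, "bits out:", digest.hex())
```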

The RC4 algorithm, also developed by R. Rivest for the RSADSI Corporation, is widely used today in data security applications. Since the algorithm's source code appeared in open sources in 1994, its strength has been actively examined by many cryptanalysts. A number of interesting results have been obtained that reduce the complexity of attacking RC4 compared with an exhaustive search of all possible keys. Nevertheless, the huge size of the cipher's internal state space (256! permutations, i.e. approximately 2^1684) makes practical application of these methods impossible, given today's resources. **
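
The 256! figure can be checked directly. This short sketch computes the number of permutations of a 256-entry table and expresses it as a power of two; it relies only on Python's arbitrary-precision integers.

```python
import math

# Number of possible orderings of a 256-entry table.
permutations = math.factorial(256)

# The same quantity expressed as an exponent of two.
bits = math.log2(permutations)

print(f"256! is roughly 2^{bits:.0f}")
```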

Thus, in terms of the cryptographic algorithms used, the standard document protection technique in Word and Excel seems to be quite reliable.

The easiest way to recover a password is to try all possible candidates in succession (a brute-force attack). This method, in different variations, is implemented in most commercial password recovery products. As far back as 1998, Passware Inc. released the MSOfPass97 software product (known today as Office Key), which specifically allows password recovery for MS Word/Excel documents.

The algorithm for file-open password verification is described in open sources. Two check sequences are stored in each encrypted document. The first is a random sequence, while the second is computed from both the first sequence and the document encryption key. Thus, the second sequence is a key-based hash of the first. During password verification, the encryption key is derived from the password, and the check sequences are tested to see whether they correspond to each other. The MD5 and RC4 algorithms are used in this computation.
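
The general shape of such a scheme can be sketched as follows. This is a simplified illustration, not the actual MS Office procedure: the derive_key function (first 40 bits of the MD5 of the password), the text encoding, and the exact way the second sequence is built are assumptions made for the example. RC4 is written out in full because the Python standard library does not include it.

```python
import hashlib
import os


def rc4(key, data):
    """Encrypt or decrypt data with RC4 (the same operation does both)."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:  # XOR each data byte with the next gamma byte
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def derive_key(password):
    """Hypothetical derivation: the first 5 bytes (40 bits) of MD5(password)."""
    return hashlib.md5(password.encode("utf-8")).digest()[:5]


def make_check_sequences(password):
    """Return (first, second): a random sequence and its key-based hash."""
    first = os.urandom(16)
    second = rc4(derive_key(password), hashlib.md5(first).digest())
    return first, second


def verify(password, first, second):
    """Recompute the key-based hash of `first` and compare it with `second`."""
    expected = rc4(derive_key(password), hashlib.md5(first).digest())
    return expected == second


first, second = make_check_sequences("secret")
print(verify("secret", first, second))  # the right password matches
print(verify("guess", first, second))   # a wrong password does not
```

Because the first sequence is random, the stored pair differs from document to document even when the password is the same.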

At first sight, the use of strong algorithms makes a brute-force attack the only way to search for a password. Note that an exhaustive search is not guaranteed to produce a result within a reasonable period of time. For instance, a password of more than 11 characters drawn from lowercase and uppercase letters, digits, and special characters turns the search into a hopelessly long operation.
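
A rough calculation shows the scale. The alphabet size (95 printable ASCII characters) and the guessing rate (one billion passwords per second) are assumptions chosen for illustration, not measured figures.

```python
ALPHABET_SIZE = 95            # assumed: printable ASCII characters
LENGTH = 12                   # the shortest length that is "more than 11"
GUESSES_PER_SECOND = 10 ** 9  # assumed rate, for illustration only

passwords = ALPHABET_SIZE ** LENGTH
seconds = passwords / GUESSES_PER_SECOND
years = seconds / (365 * 24 * 60 * 60)

print(f"candidate passwords: {passwords:.2e}")
print(f"time to try them all: about {years:.1e} years")
```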

However, there is a way to decrypt document contents without recovering the password: the so-called keyspace attack. Keep in mind that a protected MS Word/Excel document is encrypted with the RC4 algorithm, using a key derived from the password. In the "97/2000 Compatible" mode, the encryption key is relatively short: only 40 bits. This makes it possible to try all 2^40 key values, find the true key, and decrypt the document. Such an approach requires considerable yet limited computing power, and it is guaranteed to produce a result.
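
The idea can be demonstrated on a toy scale. In the sketch below the keyspace is shrunk to 12 bits so that the loop finishes quickly, and a key is accepted when it decrypts the start of the data to a known prefix. The key size, the prefix, and the sample document are all hypothetical; a real attack covers 2^40 keys and uses the actual document structure.

```python
def rc4(key, data):
    """Encrypt or decrypt data with RC4 (the same operation does both)."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:  # XOR each data byte with the next gamma byte
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


KEY_BITS = 12               # toy size; the attack in the text covers 40 bits
KNOWN_PREFIX = b"HEADER--"  # hypothetical bytes known to start the plaintext


def find_key(ciphertext):
    """Try every key in the toy keyspace and return the first one that fits."""
    head = ciphertext[:len(KNOWN_PREFIX)]
    for candidate in range(2 ** KEY_BITS):
        key = candidate.to_bytes(2, "big")
        if rc4(key, head) == KNOWN_PREFIX:
            return key
    return None


secret_key = (0xABC).to_bytes(2, "big")  # stands in for the unknown key
document = KNOWN_PREFIX + b"confidential contents"
ciphertext = rc4(secret_key, document)

recovered = find_key(ciphertext)
print("recovered key:", recovered.hex())
print("plaintext:", rc4(recovered, ciphertext))
```

Note that the password never appears in the search: only the key matters.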

There are several commercial services for decrypting MS Office documents that implement the keyspace attack. As a rule, it takes about a week to recover a document. However, in many cases that is too long for a user.
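
As a back-of-the-envelope check, one can ask what key-testing rate a one-week sweep of the whole keyspace would imply. The sketch assumes that "about a week" means exactly seven days and that the whole keyspace is covered in that time; neither detail is stated by the services themselves.

```python
KEYSPACE = 2 ** 40               # number of 40-bit keys
WEEK_SECONDS = 7 * 24 * 60 * 60  # "about a week", taken as exactly 7 days

keys_per_second = KEYSPACE / WEEK_SECONDS
print(f"full sweep in a week: about {keys_per_second:,.0f} keys per second")
```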

The limited keyspace size allows the waiting period to be shortened through precomputation. However, the presence of a random sequence in the password verification scheme rules out such an optimization when the key is verified against the check sequences.

However, to verify the encryption key, one can use the encrypted file contents in addition to the document's check sequences. After a trial decryption of the document, we can test the result against pre-formulated conditions and determine whether the key is definitely wrong. Moreover, exploiting features of the encryption algorithm allows a noticeable decrease in the complexity of the attack. RC4 is a byte-oriented stream cipher that behaves like a block cipher in OFB (output feedback) mode. This means that during encryption each byte of plaintext is XORed with a byte of a pseudorandom keystream (the gamma), which is generated from the key. Each gamma byte depends only on the key and the position in the stream, not on the data being encrypted. This, in turn, means that the correspondence between keys and gamma can be precomputed.
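
This property can be seen in a few lines. The sketch below generates the RC4 gamma for one key, encrypts two unrelated plaintexts with it, and shows that XORing each ciphertext with its plaintext gives back the same gamma. The key and both plaintexts are arbitrary example values.

```python
def rc4_gamma(key, length):
    """Return the first `length` gamma bytes that RC4 produces for `key`."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for _ in range(length):  # no plaintext is involved in this loop
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def xor(a, b):
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


key = bytes.fromhex("0102030405")  # example 40-bit key
plain_1 = b"first document.."
plain_2 = b"other plaintext!"

gamma = rc4_gamma(key, 16)
cipher_1 = xor(plain_1, gamma)
cipher_2 = xor(plain_2, gamma)

# Both ciphertexts reveal the same gamma once the plaintext is known.
print(xor(cipher_1, plain_1) == gamma)
print(xor(cipher_2, plain_2) == gamma)
```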

Many records in the structure of an MS Office document have predefined (or partially predefined) values. Given an encrypted document, this makes it possible to make a number of assumptions about gamma values, and therefore about the encryption key as well. Suppose we know that the plaintext bytes at positions "m" and "n" have the same high bit. Then the XOR of the high bits of the corresponding ciphertext bytes equals the XOR of the high bits of the gamma bytes at those positions. This allows us to precompute the gamma for all 2^40 keys and divide the keys into two categories: those for which the XOR of the high bits of the gamma bytes at positions "m" and "n" equals 0, and those for which it equals 1. Now, given an encrypted document, we can easily determine which category contains the encryption key. Thus, the amount of necessary computation is halved. By using similar methods, we can noticeably speed up the key search.
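
The partitioning step can be sketched on a toy keyspace. The positions M and N, the 12-bit key size, and the sample plaintext are hypothetical values chosen so that the example runs quickly; the plaintext is plain ASCII, so every byte has a high bit of 0 and the assumption about positions M and N holds.

```python
def rc4_gamma(key, length):
    """Return the first `length` gamma bytes that RC4 produces for `key`."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


KEY_BITS = 12  # toy size; the text describes 2^40 keys
M, N = 3, 7    # hypothetical positions whose plaintext high bits coincide


def high_bit_xor(data):
    """XOR of the high bits of the bytes at positions M and N."""
    return (data[M] ^ data[N]) >> 7


# Precomputation: done once, before any document is seen.
buckets = {0: [], 1: []}
for candidate in range(2 ** KEY_BITS):
    key = candidate.to_bytes(2, "big")
    buckets[high_bit_xor(rc4_gamma(key, N + 1))].append(key)

# Later, an encrypted document arrives.
secret_key = (0xABC).to_bytes(2, "big")  # stands in for the unknown key
plaintext = b"plain ASCII text"
gamma = rc4_gamma(secret_key, len(plaintext))
ciphertext = bytes(p ^ g for p, g in zip(plaintext, gamma))

# The ciphertext alone tells us which category holds the key.
category = high_bit_xor(ciphertext)
print("keys left to search:", len(buckets[category]), "of", 2 ** KEY_BITS)
```

Each further independent relation of this kind would be expected to cut the remaining candidates roughly in half again.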

When using keyspace attack schemes based on precomputation, you may encounter technical difficulties bound up with the need to handle large amounts of data. Simply storing a full gamma-key list requires up to 10 TB. However, specially developed algorithmic and software/hardware solutions make it possible to work with the precomputed data at high speed.
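
The order of magnitude is easy to reproduce. The sketch below assumes that 9 bytes are stored per key; this per-entry size is a guess picked to land near the quoted figure (read here as roughly 10 terabytes), not a documented detail of any real table layout.

```python
KEYS = 2 ** 40       # one table entry per 40-bit key
BYTES_PER_ENTRY = 9  # hypothetical amount of data stored per key

total_bytes = KEYS * BYTES_PER_ENTRY
print(f"table size: about {total_bytes / 10 ** 12:.1f} TB")
```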

If you would like to see the results of the approach described above, please visit the http://www.decryptum.com website. The decryption service for password-protected MS Word/Excel documents, provided by Passware Inc., copes with almost any file instantly. Details about the features of the service are available on the same website.

* The MD5 hashing algorithm is described in RFC 1321 (http://www.ietf.org/rfc/rfc1321.txt).

** The RC4 stream encryption algorithm is the property of the RSADSI Corporation.

Copyright © 2005-2016 Passware Inc. All rights reserved.