Bugs item #473009, was opened at 2001-10-19 21:42
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=473009&group_id=5470

Category: Python Library
Group: Python 2.1.1
Status: Open
Resolution: None
Priority: 5
Submitted By: Dave Cinege (dcinege)
Assigned to: Nobody/Anonymous (nobody)
Summary: binascii_b2a_base64() improper str limit

Initial Comment:
Modules/binascii.c binascii_b2a_base64() contains the following
restrictive code:

    if ( bin_len > BASE64_MAXBIN ) {
        PyErr_SetString(Error, "Too much data for base64 line");
        return NULL;
    }

This is an error. The base64 method of encoding data has no length
limitation. The MIME message RFC places such a limitation on base64
encoded data. The function should not assume its only input must be
MIME compatible. The base64 Python module itself is designed for MIME
I/O only, and properly limits itself. The binascii function should be
left raw.

binascii_a2b_base64() properly accepts input of any size.

How I came across this bug:

I use base64 to ASCII-armor binary data in log entries in a
distributed network monitoring system. For ease of parsing (human and
machine), every log entry is kept on a single line. I commonly have
unbroken base64 encoded fields of 64KB in size or greater.

Unfortunately I am unable to encode this data like this:

    result64 = binascii.b2a_base64(s)

I must do this:

    result64 = re.sub('[ |\n]','',base64.encodestring(s))

Which is *much* slower. : <

I feel this is an outright bug and should be corrected. If there is
some argument for backward compatibility, an optional function
argument should be present to allow bypassing this limitation.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-10-21 18:26

Message:
Logged In: YES
user_id=6380

I'm with David. It's up to the higher level code (e.g.
the base64 module) to avoid writing lines longer than 76 characters;
the underlying function in binascii doesn't have to act as a
policeman here. There may be other applications of the same encoding
where the 76-char limit does not apply.

----------------------------------------------------------------------

Comment By: Dave Cinege (dcinege)
Date: 2001-10-20 21:34

Message:
Logged In: YES
user_id=314434

>Can you cite any relevant standard that defines base64 to
>work in that way? Base64 is defined in RFC 2045 section
>6.8., which clearly says

>The encoded output stream must be represented in lines
>of no more than 76 characters each.

This is difficult to do because base64 itself has not (yet) been
separately defined in its own RFC. It should be, and this issue has
been brought up recently on the W3 lists. E.g.:

http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001AprJun/0212.html
http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001AprJun/0210.html

The part of the RFC you have quoted is relevant to the use of base64
encoding in the context of MIME, the purpose clearly being to ensure
compatibility with email (SMTP, POP3, MUA, etc.) standards.

However, this 76 character line length rule is irrelevant when dealing
with arbitrary binary data not meant for MIME encapsulated
transmission. This is clearly seen in the description of the actual
base64 algorithm itself:

    The encoding process represents 24-bit groups of input bits as
    output strings of 4 encoded characters. Proceeding from left to
    right, a 24-bit input group is formed by concatenating 3 8bit
    input groups. These 24 bits are then treated as 4 concatenated
    6-bit groups, each of which is translated into a single digit in
    the base64 alphabet. When encoding a bit stream via the base64
    encoding, the bit stream must be presumed to be ordered with the
    most-significant-bit first.
    That is, the first bit in the stream will be the high-order bit in
    the first 8bit byte, and the eighth bit will be the low-order bit
    in the first 8bit byte, and so on.
    ...
    In base64 data, characters other than those in Table 1, line
    breaks, and other white space probably indicate a transmission
    error, about which a warning message or even a message rejection
    might be appropriate under some circumstances.

Additionally, the use of 'unlimited length' base64 encoding of binary
data has reached critical mass. As a broad-based example, HTTP Basic
authorization 'encrypts' the username:password in base64. However, no
length limit can be used, or else it would arbitrarily limit the
amount of data that could be passed without interfering with the HTTP
protocol itself.

E.g. (lines should not appear wrapped):

'Logging in' to a webserver with

Username:
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
Password: test

will have the web browser send the AUTH request header as follows:

Authorization: Basic YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXpBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWjAxMjM0NTY3ODlhYmNkZWZnaGlqa2xtbm9wcXJzdHV2d3h5ekFCQ0RFRkdISUpLTE1OT1BRUlNUVVZXWFlaMDEyMzQ1Njc4OTp0ZXN

The latter field is an 'unlimited' length base64 encoding.
(Testing done with KDE Konqueror; other browsers may vary.)

Due to its simple application you will find many a reference stating:
''The Base64 algorithm has become "the standard" for encoding binary
data.''

Clearly, line length limitations are counterproductive to such use.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-10-20 06:30

Message:
Logged In: YES
user_id=21627

Can you cite any relevant standard that defines base64 to
work in that way? Base64 is defined in RFC 2045 section
6.8., which clearly says

The encoded output stream must be represented in lines
of no more than 76 characters each.
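
Editorial note: the single-line encoding the submitter asks for can be
sketched without regular expressions, even while b2a_base64() rejects
long input, by feeding it short chunks and joining the results. This
is only a sketch in present-day Python 3 syntax, not the 2.1-era code
from the report. The helper name encode_unbroken and the constant
CHUNK are made up for illustration. 57 is an assumed chunk size,
chosen because 57 input bytes encode to exactly 76 characters and
because it is a multiple of 3, so padding can only appear at the very
end of the joined output.

```python
import binascii

# Assumption: 57 bytes per call, i.e. 57 / 3 * 4 = 76 output characters.
# Any multiple of 3 would keep '=' padding out of the middle of the result.
CHUNK = 57


def encode_unbroken(data: bytes) -> bytes:
    """Return the base64 encoding of data as one line, without newlines."""
    parts = []
    for start in range(0, len(data), CHUNK):
        piece = data[start:start + CHUNK]
        # b2a_base64() appends a single newline to each result; drop it.
        parts.append(binascii.b2a_base64(piece).rstrip(b"\n"))
    return b"".join(parts)


if __name__ == "__main__":
    sample = bytes(range(256)) * 256           # 64KB of binary data
    line = encode_unbroken(sample)
    assert b"\n" not in line                   # one unbroken field
    assert binascii.a2b_base64(line) == sample  # decodes back unchanged
```

Whether this is faster than the re.sub() workaround in the report
would have to be measured; the sketch only shows that the output is a
valid, unbroken base64 field.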