Affects: \Up to 5.3.12
Here's a web application based on Spring Webflux.
When receiving multipart/form-data POST requests from clients, the body parts will be parsed by org.springframework.http.codec.multipart.MultipartParser
. As reading the source code, we could find that the main logic of the MultipartParser is a state machine. It changes state between the state of header parsing and the state of body parsing for most time. Now take the following request body as an example:
------WebKitFormBoundarySOIUhndhpIQUSFc1
Content-Disposition: form-data; name="size"25015919
------WebKitFormBoundarySOIUhndhpIQUSFc1
Content-Disposition: form-data; name="parentId"
------WebKitFormBoundarySOIUhndhpIQUSFc1 Content-Disposition: form-data; name="file"; filename="data.zip" Content-Type: application/x-zip-compressed(binary data)
------WebKitFormBoundarySOIUhndhpIQUSFc1--
Upstream provides DataBuffer
s to MultipartParser sequently. MultipartParser handles these DataBuffer
s and transit among states. When parsing the body above, PREAMBLE -> HEADERS -> BODY -> HEADERS -> BODY -> HEADERS -> BODY -> DISPOSED should be performed. First, we get header Content-Disposition: form-data; name="size"
, then body 25015919
, then header Content-Disposition: form-data; name="parentId"
, so far so good.
But here comes up a malformed situation. Let's say if the current DataBuffer
contains the value of the "parentId" part (which is an empty string), and ends with the boundary plus CRLF. i.e. the byte array under the DataBuffer
should be looked like:[..., 45, 45, 45, 45, 45, 45, 87, 101, 98, 75, 105, 116, 70, 111, 114, 109, 66, 111, 117, 110, 100, 97, 114, 121, 83, 79, 73, 85, 104, 110, 100, 104, 112, 73, 81, 85, 83, 70, 99, 49, 13, 10]
Now let's check out the part of codes of BodyState
about handling the body of the part.
int endIdx = this.boundary.match(buffer); if (endIdx != -1) { if (logger.isTraceEnabled()) { logger.trace("Boundary found @" + endIdx + " in " + buffer); } int len = endIdx - buffer.readPosition() - this.boundary.delimiter().length + 1; if (len > 0) { // buffer contains complete delimiter, let's slice it and flush it DataBuffer body = buffer.retainedSlice(buffer.readPosition(), len); enqueue(body); enqueue(null); } else if (len < 0) { // buffer starts with the end of the delimiter, let's slice the previous buffer and flush it DataBuffer previous = this.previous.get(); int prevLen = previous.readableByteCount() + len; if (prevLen > 0) { DataBuffer body = previous.retainedSlice(previous.readPosition(), prevLen); DataBufferUtils.release(previous); this.previous.set(body); enqueue(null); } else { DataBufferUtils.release(previous); this.previous.set(null); } } else /* if (sliceLength == 0) */ { // buffer starts with complete delimiter, flush out the previous buffer enqueue(null); } DataBuffer remainder = MultipartUtils.sliceFrom(buffer, endIdx); DataBufferUtils.release(buffer); changeState(this, new HeadersState(), remainder); } else { enqueue(buffer); requestBuffer(); }
Since the current buffer contains the complete boundary, the endIdx
should be greater than -1. In this example the value of endIdx
should be the length of the buffer minus 2 (exclude the tail CRLF). As we could see the remainder
is sliced from the original buffer started from position endIdx + 1
. That means the remainder
contains two bytes of CRLF in this example. Next, the state is changed to HEADERS.
Let's check out the part of codes of HeadersState
long prevCount = this.byteCount.get(); long count = this.byteCount.addAndGet(buf.readableByteCount()); if (prevCount < 2 && count >= 2) { if (isLastBoundary(buf)) { if (logger.isTraceEnabled()) { logger.trace("Last boundary found in " + buf); } if (changeState(this, DisposedState.INSTANCE, buf)) { emitComplete(); } return; } } else if (count > MultipartParser.this.maxHeadersSize) { if (changeState(this, DisposedState.INSTANCE, buf)) { emitError(new DataBufferLimitException("Part headers exceeded the memory usage limit of " + MultipartParser.this.maxHeadersSize + " bytes")); } return; } int endIdx = this.endHeaders.match(buf); if (endIdx != -1) { if (logger.isTraceEnabled()) { logger.trace("End of headers found @" + endIdx + " in " + buf); } DataBuffer headerBuf = MultipartUtils.sliceTo(buf, endIdx); this.buffers.add(headerBuf); DataBuffer bodyBuf = MultipartUtils.sliceFrom(buf, endIdx); DataBufferUtils.release(buf); emitHeaders(parseHeaders()); changeState(this, new BodyState(), bodyBuf); } else { this.buffers.add(buf); requestBuffer(); }
The buf
here is just exactly the same as the remainder
sliced from the buffer
during the previous state BODY, i.e. contains two bytes of CRLF. For now the value of prevCount
is 0, count
is 2, and this.byteCount
is added to 2. Since the buf
doesn't contain the endHeaders
, endIdx
is -1, and another DataBuffer
is requested at line 2 count from backward.
When new DataBuffer
is provided, the codes above will be excuted again. This time since the this.byteCount
is added to 2, the prevCount
is assigned to 2. By default the readableByteCount()
of the fresh DataBuffer
could be (probably) up to 8192 (8KB), so we assume that here the readable byte count of the buf
is 8192, and the count
is added to 8194. However, the default value of MultipartParser.this.maxHeadersSize
is set to 8192, so the condition of the second if
is matched. DataBufferLimitException("Part headers exceeded the memory usage limit of " + MultipartParser.this.maxHeadersSize + " bytes")
is emitted. This is apperently not the expected behavior.
This issue could not be 100% reproduced, but occasionally occurs after serveral (plenty?) attempts. The temporary workaround is set maxHeadersSize
to 8194. It's an ugly workaround. In my opinion the default setting should be work well for most cases. So probably I'm using it in a wrong way or I've misunderstood something. Correct me if I make a mistake.
Thanks for any help.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4