Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

request.form empty for certain form POST request payloads #2734

Closed
dividebysandwich opened this issue Jun 17, 2023 · 12 comments
Closed

request.form empty for certain form POST request payloads #2734

dividebysandwich opened this issue Jun 17, 2023 · 12 comments
Milestone

Comments

@dividebysandwich
Copy link

I have a weird one, and I admit I am not entirely sure if it's a problem in Werkzeug or maybe somewhere deeper:

I have an ecowitt weather station and am happily getting data pushed to a python script that works great. Recently someone else wanted to do the same, installed my script but... it just didn't work. We found that the only difference was that I was still on python 3.6.9 and Werkzeug 2.0.3, whereas he was on Python 3.9.2 and Werkzeug 2.3.6

What happens is that in the route handler, request.form is empty depending on how the form request looks in detail.

Here's the code snippet:

@app.route('/weather/report', methods=['POST'])
def receiveEcoWitt():
    print("Data size: " + str(len(request.form)))
    data = request.form
    for key in data:
        print(f"{key}: {data[key]}\n")

So far so easy, right? Now sending a POST request using a simple html form works fine as long as it looks like this:

POST /weather/report HTTP/1.1
Host: 192.168.178.208:8100
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/114.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Content-Length: 20
Origin: null
Connection: keep-alive
Upgrade-Insecure-Requests: 1

fname=John&lname=Doe

With that request, request.form is filled and the data can be parsed normally.
However, the Ecowitt station sends requests that have a different set of header fields, which looks like this:

POST /weather/report HTTP/1.1
HOST: 192.168.178.208
Connection: Close
Content-Type: application/x-www-form-urlencoded
Content-Length:502

PASSKEY=78D8DCE861B2164BAD38A473167ACFD2&stationtype=GW2000A_V2.2.4&runtime=1247&dateutc=2023-06-17+10:34:52&tempinf=87.98&humidityin=43&baromrelin=29.158&baromabsin=29.158&tempf=71.60&humidity=57&winddir=44&windspeedmph=0.00&windgustmph=0.00&maxdailygust=1.12&solarradiation=5.52&uv=0&rrain_piezo=0.000&erain_piezo=0.000&hrain_piezo=0.000&drain_piezo=0.000&wrain_piezo=0.000&mrain_piezo=0.000&yrain_piezo=0.000&ws90cap_volt=2.2&ws90_ver=132&wh26batt=0&wh90batt=2.98&freq=868M&model=GW2000A&interval=16

Sending that request results in an empty request.form. It's also no use to try and use request.get_data() or stuff like that. The method in both requests is POST.

What should happen: Both requests should work and request.form should be filled in both cases.

This works fine in Python 3.6.9 and Werkzeug 2.0.3

Environment:

Python v3.9.2
Werkzeug v2.3.6

@dividebysandwich
Copy link
Author

dividebysandwich commented Jun 18, 2023

Update: I recently updated the already-running weather station and now it has the same problem there with Werkzeug v2.0.3.
I don't know exactly which headers Ecowitt used to send back then, but there's clearly been a change in their headers in the latest firmware. Still, I don't think the form data should be empty in such cases, especially considering the request is otherwise accepted just fine.

@davidism
Copy link
Member

In your second request example, you have Content-Length:502 without a space, is that a typo? I don't think it should matter (and Werkzeug isn't the thing parsing the headers at that point), but it's the only thing that looks weird.

I can't reproduce this issue, although I don't have a way to reproduce that typo. request.form is populated correctly.

@djjudas21
Copy link

I have the same issue (which I logged here pallets/flask#5168 and also discussed with another weather station hacker here bentasker/Ecowitt_to_InfluxDB#1).

I'm on Python v3.11.3, Flask v2.3.2 and Werkzeug v2.3.6.

It seems we can deduce that something has changed with the format that the Ecowitt is sending in, but it also looks like this should be parsed correctly by Flask/Werkzeug

I'm running an Ecowitt WS2910 with firmware version v5.1.1, and the top item in the firmware changelog is "Optimize the http server" 🤦‍♂️

@djjudas21
Copy link

Just to add, I've worked around this issue by placing an NGINX reverse proxy in front of Flask, with no special config, and it seems to magically fix the dodgy header in flight.

Both NGINX and the http echo server I used for testing were able to accept and correct the header, so it feels like Flask should be able to, too.

@davidism
Copy link
Member

You still haven't provided a way to reproduce the issue locally, so I still can't do anything. Saying you "feel like Flask should be able" to handle it isn't enough.

It sounds like you might be running the development server, and assuming that is "Flask". This would only be an issue with Werkzeug if it also happens when running a production server such as Gunicorn, Waitress, or uWSGI. And you'd still need to provide a reproducible example.

The development server is a thin wrapper around Python's built-in http.server, issues with how it parses headers should be reported (again with a minimal reproducible example) to https://github.com/python/cpython.

@bentasker
Copy link
Contributor

Based on the description, this should be quite easy to repro - you just need to use something like netcat to place the request manually rather than curl (which'll likely fix things for you).

However, I'm not so sure it is that missing space (and if it is, the problem is in the header parsing as spaces are optional, per RFC 7231).

If it were the space, this should repro it

Stand up flask

from flask import Flask, request

app = Flask(__name__)


@app.route('/', methods=["POST"])
def hello():
    return request.get_data(as_text=True)
 
app.run(host="0.0.0.0", port=8090, debug=True)    

If the issue were the space, we should get an empty response, but

ben@optimus:~/tmp$ printf "POST / HTTP/1.1\r\nHost:127.0.0.1:8090\r\nContent-Length:15\r\nContent-Type: application/x-www-form-urlencoded\r\n\r\nfoo=bar&sed=zoo" | nc 127.0.0.1 8090
HTTP/1.1 200 OK
Server: Werkzeug/2.3.6 Python/3.10.6
Date: Wed, 21 Jun 2023 13:35:50 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 15
Connection: close

foo=bar&sed=zoo

That being said, I'm running Python 3.10.6, so perhaps that's why

$ docker run --rm -it python:3.11.2 bash
... dep installs happen ...

root@103c3467b0d8:/# printf "POST / HTTP/1.1\r\nHOST:127.0.0.1:8090\r\nContent-Length:15\r\nContent-Type: application/x-www-form-urlencoded\r\n\r\nfoo=bar&sed=zoo" | nc 127.0.0.1 8090
HTTP/1.1 200 OK
Server: Werkzeug/2.3.6 Python/3.11.2
Date: Wed, 21 Jun 2023 13:48:25 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 15
Connection: close

foo=bar&sed=zoo

@djjudas21 be worth seeing if that repro works for you though, there might be something different in my env.

Just run a capture of traffic from my Ecowitt and minne also lacks the space on Content-Length, so I'm assuming it should be capable of reproing. I don't see anything awry in the request, it's using CRLF between headers as it should etc.

@bentasker
Copy link
Contributor

bentasker commented Jun 21, 2023

Got it!

It's not visible in the description (for obvious reasons) but that content length header is not content-length:461, it's content-length:461 (trailing space):

We have a repro:

ben@optimus:~/tmp$ printf "POST / HTTP/1.1\r\nHost:127.0.0.1:8090\r\nContent-Length:15 \r\nContent-Type: application/x-www-form-urlencoded\r\n\r\nfoo=bar&sed=zoo" | nc 127.0.0.1 8090
HTTP/1.1 200 OK
Server: Werkzeug/2.3.6 Python/3.10.6
Date: Wed, 21 Jun 2023 16:11:30 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 0
Connection: close

So, the header parsing is non-RFC compliant (7231 says the value can be prefixed and suffixed by optional whitespace).

Unless I've gone awry, Python's header parsing relies on the parsing in it's email class (here, originally called from here).

Once we get down the callchain, the parsing class is here, but it ultimately calls header_source_parse()

If we take a copy of that and pass headers into it:

>>> header_source_parse(["content-length:15 "])
('content-length', '15 ')

Notice that it hasn't stripped the trailing whitespace.

This'll bubble all the way back up through the callchain until we're back in Werkzeug. When handling form parsing, Werkzeug calls _plain_int

That applies the following regex

_plain_int_re = re.compile(r"-?\d+", re.ASCII)

Which will fail to match because of that trailing space

>>> r = re.compile(r"-?\d+", re.ASCII)
>>> b[1]
'15 '
>>> 
>>> r.fullmatch(b[1])
>>> 

As a result, the response body isn't actually read.

So, there are two things that can/could be done

  • Get a PR in against Python to fix the header parser so that it strips trailing whitespace
  • Werkzeug could mitigate in the meantime by adjusting that regex to permit optional trailing whitespace (the subsequent caste with int() will strip it).

@bentasker
Copy link
Contributor

I'll look at raising a PR for the first in a bit.

@davidism let me know if you'd be willing to accept a PR for the second

@ThiefMaster
Copy link
Member

Considering this regex is meant to remove stuff that's allowed in int() but not in a plain number, I think allowing surrounding whitespace makes perfect sense

@davidism
Copy link
Member

PR to fix that is fine, probably just do value = value.strip() before the regex check.

bentasker added a commit to bentasker/werkzeug that referenced this issue Jun 21, 2023
bentasker added a commit to bentasker/werkzeug that referenced this issue Jun 21, 2023
@bentasker
Copy link
Contributor

PR posted here: #2738

I'll look at getting one into Python for the root cause soon (or possibly tomorrow)

@pgjones
Copy link
Member

pgjones commented Aug 12, 2023

Closed by #2764

@pgjones pgjones closed this as completed Aug 12, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 27, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants