Hello, I am using a Render Static Site with rewrites in front of two Web Services. One of the services serves HTTP streaming responses for an LLM chat application.
When I request the streaming endpoint on the API directly and compare that to hitting the same endpoint through the Render static site rewrite, the response through the rewrite appears to be buffered: it is delivered as a smaller number of larger chunks.
Here is a comparison of the per-line timings from hitting the API directly vs. through the static site: 0-measurement-curl.sh (GitHub gist)
With the static site rewrite, lines 0-20 are all received at 1569ms, then lines 21-43 at 2206ms, etc.
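The gist itself is a curl script; in Python terms the measurement is roughly equivalent to the sketch below (the URL and request body here are placeholders, not my real endpoint):

```python
# Rough Python equivalent of the curl-based measurement in the gist:
# print the elapsed time at which each response line arrives.
import time
import httpx

URL = "https://example.onrender.com/chat/stream"  # placeholder endpoint

start = time.monotonic()
with httpx.stream("POST", URL, json={"prompt": "hello"}, timeout=None) as resp:
    for i, line in enumerate(resp.iter_lines()):
        elapsed_ms = int((time.monotonic() - start) * 1000)
        print(f"{i}\t{elapsed_ms}ms\t{line[:40]}")
```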
Is it possible to control or disable this buffering in some way, either through HTTP response headers from the API backend, through static site / rewrite configuration, or some other mechanism? If not, is there a reverse proxy you would recommend running on Render, e.g. nginx? (I have had difficulty configuring Caddy.)
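For concreteness, this is the kind of header-based change I had in mind on the backend. It is only a sketch assuming a FastAPI/Starlette app on uvicorn (my real handler is similar but not identical), and I do not know whether Render's rewrite layer honors these hints:

```python
# Sketch: emit proxy "do not buffer" hints on a streaming SSE response.
# Assumes FastAPI/Starlette behind uvicorn; route and generator are examples.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def token_stream():
    # Stand-in for the real LLM token generator.
    for token in ["Hello", " ", "world"]:
        yield f"data: {token}\n\n"

@app.get("/chat/stream")
async def chat_stream():
    headers = {
        # Hints that some proxies (e.g. nginx) respect to avoid buffering/transforming.
        "Cache-Control": "no-cache, no-transform",
        "X-Accel-Buffering": "no",
    }
    return StreamingResponse(
        token_stream(), media_type="text/event-stream", headers=headers
    )
```

I can add headers like these easily enough; the question is whether anything in the static site rewrite path will respect them.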
Here are the HTTP response headers from hitting the API directly on Render:
HTTP/2 200
date: Fri, 31 May 2024 21:12:32 GMT
content-type: text/event-stream; charset=utf-8
cf-ray: 88c9f672fe711736-SJC
cf-cache-status: DYNAMIC
access-control-allow-credentials: true
rndr-id: b4efbbc1-9dd8-40fa
x-render-origin-server: uvicorn
server: cloudflare
alt-svc: h3=":443"; ma=86400
And the HTTP response headers when requesting through the static site rewrite:
HTTP/2 200
date: Fri, 31 May 2024 21:13:06 GMT
content-type: text/event-stream; charset=utf-8
cf-ray: 88c9f73c8d7e1739-SJC
cf-cache-status: DYNAMIC
cache-control: max-age=0
vary: Accept-Encoding
access-control-allow-credentials: true
alt-svc: h3=":443"; ma=86400
rndr-id: ae173047-7a5b-4f20
x-render-origin-server: uvicorn
server: cloudflare
Thank you for any suggestions!