I'm having trouble with web scraping on my Render deployment. The code and configuration work perfectly on my local machine, but once deployed, the scraping process never starts, and nothing shows up in the output: no logs and no error messages. I'm looking for guidance on how to troubleshoot this on Render.
I am using a Dockerfile to install the dependencies required for scraping in a production environment. Below is a snippet of my Dockerfile:
FROM ghcr.io/puppeteer/puppeteer:19.7.2
USER root
RUN apt-get update && \
    apt-get install -y xvfb
USER node
ENV DISPLAY=:99
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true \
    PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome-stable
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm ci
COPY . .
# Start xvfb and run the application
CMD xvfb-run --auto-servernum --server-args="-screen 0 1280x1024x24" npx nodemon app.js
This builds and deploys on Render without any issues. Here is my scraping logic file:
import * as cheerio from "cheerio";
import puppeteer from "puppeteer";
import * as dotenv from "dotenv";
dotenv.config();
export default async function scrapeProduct(url) {
  if (!url) return;
  const browser = await puppeteer.launch({
    args: [
      "--disable-setuid-sandbox",
      "--no-sandbox",
      "--single-process",
      "--no-zygote",
    ],
    headless: "new",
    executablePath:
      process.env.NODE_ENV === "production"
        ? process.env.PUPPETEER_EXECUTABLE_PATH
        : puppeteer.executablePath(),
  });
  try {
    // scraping logic here...
  } catch (error) {
    console.error("Error in Puppeteer script:", error);
    console.log(error);
  } finally {
    await browser.close();
  }
}
This logic seems to fail at puppeteer.launch(), even though I think I have passed the necessary parameters. What is strange is that when my API route calls the scraping function, no errors appear in the Render logs. I suspect the problem could be my PUPPETEER_EXECUTABLE_PATH environment variable.
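One detail that may explain the silent failure: in the file above, puppeteer.launch() is called before the try block, so a rejected launch is never seen by the catch that logs errors. Whether it shows up in the logs then depends on how the API route handles the rejected promise. Below is a minimal diagnostic sketch, not a confirmed fix. The helper names (resolveExecutablePath, describePathProblem, launchWithDiagnostics) are hypothetical and exist only for illustration. It assumes the launcher is passed in as a function, for example (opts) => puppeteer.launch(opts).

```javascript
// Hypothetical helper: choose the Chrome path the same way the scraper does.
function resolveExecutablePath(env, fallbackPath) {
  return env.NODE_ENV === "production"
    ? env.PUPPETEER_EXECUTABLE_PATH
    : fallbackPath;
}

// Hypothetical helper: return a short description of why the path is
// unusable, or null if the file exists and is executable.
async function describePathProblem(executablePath) {
  if (!executablePath) return "executable path is empty or undefined";
  // Dynamic import keeps this snippet usable from both ESM and CommonJS;
  // in an ESM file a static `import fs from "node:fs"` works as well.
  const fs = await import("node:fs");
  try {
    await fs.promises.access(executablePath, fs.constants.X_OK);
    return null;
  } catch (error) {
    return `cannot execute ${executablePath}: ${error.code}`;
  }
}

// Hypothetical wrapper: `launch` is any function that returns a promise,
// e.g. (opts) => puppeteer.launch(opts). Failures are logged, then rethrown
// so the caller still sees them.
async function launchWithDiagnostics(launch, options) {
  const problem = await describePathProblem(options.executablePath);
  if (problem) console.error("Chrome path check failed:", problem);
  try {
    return await launch(options);
  } catch (error) {
    console.error("puppeteer.launch() failed:", error);
    throw error;
  }
}
```

In scrapeProduct this could replace the direct call, for example `const browser = await launchWithDiagnostics((opts) => puppeteer.launch(opts), { ...launchOptions, executablePath: resolveExecutablePath(process.env, puppeteer.executablePath()) })`, where launchOptions stands for the args and headless settings already shown. If the path check prints a message on Render, the environment variable or the image's Chrome location is the likely culprit. If nothing prints at all, the route may not be reaching the function.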
I would really appreciate any solution to this, or help finding what I am missing in my logic.
Thanks!