Cron Job - Storing last data entry point

I have a cron job running a Python script that pulls data from a website, parses it, and inserts it into my database. The cron job runs once a day at the same time. The data I get from the website is in a dataframe that grows every day, but the old content does not change. I would like a way to store the last dataframe index that was written to the database so that I do not insert duplicate data. Basically, if I entered 20 rows today, the number 20 is stored; tomorrow the program reads the number 20, drops the first 20 rows from the dataframe, inserts the remaining rows, and updates the stored last-row value.

The easiest way to do this would be a JSON file, but this does not work because on each run the JSON file reverts to the state it was in when it was deployed from GitHub.

I would also like to avoid checking the last row entered using the database itself for various reasons.

Any ideas how best to achieve this please?

I would query the database to get the last record you inserted and determine the starting point that way. Why would you avoid doing that? You need to figure out that last record, and querying it from the database would be the most reliable way to do it.
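A minimal sketch of that approach, assuming the scraped data only ever grows at the end and that every row in the table came from the script. The table name `readings`, its columns, and the helper `insert_new_rows` are hypothetical, and SQLite stands in for whatever database is actually in use:

```python
import sqlite3

def insert_new_rows(conn, rows):
    """Insert only the rows not yet stored; return how many were added."""
    # The number of rows already stored is the offset of the first new row.
    (already_stored,) = conn.execute("SELECT COUNT(*) FROM readings").fetchone()
    new_rows = rows[already_stored:]  # with a DataFrame: df.iloc[already_stored:]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO readings (day, value) VALUES (?, ?)", new_rows
        )
    return len(new_rows)

# Hypothetical in-memory database and sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (day TEXT, value REAL)")

day_one = [("2024-01-01", 1.0), ("2024-01-02", 2.0)]
day_two = day_one + [("2024-01-03", 3.0)]  # yesterday's rows plus one new row

first_run = insert_new_rows(conn, day_one)
second_run = insert_new_rows(conn, day_two)
total = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

Because the offset is read from the database on every run, nothing has to survive on the filesystem between deploys.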


John B
Render Support, UTC :uk:

I know that's the most logical way, but not all data entry will be done this way, which is why using the database is not the best idea. Still, given the circumstances, it is probably worth figuring out a workaround and using the database to determine my next data entry point.
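One possible workaround, sketched here under assumptions: tag each row with where it came from and count only the rows written by the cron script, so entries made by other means do not shift the offset. The `source` column, the `"cron"` label, and the table layout are hypothetical, and SQLite is used only to keep the example self-contained:

```python
import sqlite3

CRON_SOURCE = "cron"  # hypothetical label for rows written by the scheduled script

def insert_new_scraped_rows(conn, rows):
    """Insert scraped rows the cron script has not stored yet; return the count added."""
    # Count only rows this script wrote, ignoring entries made by other means.
    (cron_rows,) = conn.execute(
        "SELECT COUNT(*) FROM readings WHERE source = ?", (CRON_SOURCE,)
    ).fetchone()
    new_rows = [(day, value, CRON_SOURCE) for day, value in rows[cron_rows:]]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO readings (day, value, source) VALUES (?, ?, ?)", new_rows
        )
    return len(new_rows)

# Hypothetical in-memory database with one manually entered row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (day TEXT, value REAL, source TEXT)")
conn.execute("INSERT INTO readings VALUES ('2023-12-31', 9.9, 'manual')")

scraped = [("2024-01-01", 1.0), ("2024-01-02", 2.0)]
added_first = insert_new_scraped_rows(conn, scraped)
added_again = insert_new_scraped_rows(conn, scraped + [("2024-01-03", 3.0)])
total = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

The manual row is left alone and does not affect how many scraped rows are skipped on the next run.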
