Google Sheets – Faster Data Exports Options, Including CSVs

Intro

I recently came across this thread on Twitter, which captured my interest:

If you have Twitter embeds blocked, the Tweet thread is a discussion of an issue with Google Sheets, where the public “published” CSV feed URL does not return up-to-date data when fetched via curl, or at least not consistently and quickly.

I fired up a demo sheet, and lo-and-behold, I could easily reproduce the same issue they were seeing; Google Sheets was:

  • Taking a decent amount of time to update the published CSV
  • The published data would sometimes fluctuate, and rows would randomly disappear from the CSV feed
  • Both of the above issues were occurring randomly; I could not determine a consistent pattern

Now, I used to use the published CSV/TSV feature all the time and rarely ever saw this, but was also not relying on real-time data feeds, so my guess is that it is a replication issue that goes away with time; James is looking for data ASAP, and the Google servers are likely still passing around the data as it is added (could be multiple layers, like CDN + replication + sharding).

But, if you are in James position, and you are looking for the ability to have a CSV feed that always returns an up-to-date fresh export from your Google Sheet, is there an option for you? Yes! Several!

Option A – Use the Official Sheet APIs (v4)

This (using the Official v4 Sheets API) was suggested pretty quickly in the Twitter thread:

But, if you follow the linked StackOverflow answer, you’ll see that there are some limitations with the new v4 API:

  • You need an API key (which requires using Google Cloud Platform)
  • Data is always returned as JSON

However, this is your best bet for a more robust, best-practices, “future-proof” choice, as v3 of the API is being deprecated very soon.

Option B – Google Sheets Public Feed URLs (v3)

To start with, if you have a Google Sheet and want a public CSV feed, you can use the Publish to the Web option, and get a feed URL that looks like this:

https://docs.google.com/spreadsheets/d/e/{UNIQUE_ID}/pub?gid={SHEET_ID}&single=true&output=csv

… where {UNIQUE_ID} is not the normal document ID, but instead a unique random publication ID. {SHEET_ID} does however, correspond to normal sheet (pages of the spreadsheet) IDs.

As already noted, an issue with the above feed can be a delay in accurate data being reflected after the source is updated. However, that is not the only feed URL that is available to us without an API key. Until Google deprecates v3 of the Sheets API, there are still v3 API endpoints that you can use to get fresh data, without needing an API key:

🚨 WARNING 🚨: All these URLs will likely stop working on June 8, 2021, which is when Google is planning on shutting down the v3 version of the API. They are not likely to extend the deadline, as they already did so once already.

  • All values on a sheet
    • XML: https://spreadsheets.google.com/feeds/cells/{DOC_ID}/{PAGE_NUM}/public/full
    • JSON: https://spreadsheets.google.com/feeds/cells/{DOC_ID}/{PAGE_NUM}/public/full?alt=json
  • Specific range of values
    • Use the same endpoints as above, but control range with:
      • min-row=
      • max-row=
      • min-col=
      • max-col=

There are a lot of other advanced feeds you can get with the v3 API – see Docs: “Manage List-based and Cell-based Feeds”.

📄 You can find a more “step-by-step” guide here.

Option C – Google Apps Scripting

All this research into Google Sheets got me thinking about Google Apps Scripts (aka GAS), which I have used many times before, and wondering if it could be fitted to this task. GAS can live with a spreadsheet document, access cell values, and can even be published as a web app and respond to web requests!

I spent a little time and threw together a fun mini-API that can export just the last X number of rows of data, or the entire document, as CSV, TSV, or JSON! If you don’t want to require a Google login, but also don’t want it 100% public, you can even lock it down with an authentication key, entered into a named range of your document.

Of course, after finishing my script, I found that someone had already put together a similar solution. Feel free to check that out as well: gist.github.com/ronaldsmartin/47f5239ab1834c47088e.

Leave a Reply

Your email address will not be published. Required fields are marked *