Writing some JavaScript to download files from the web, with help from an AI chatbot

A real-life example of boosting your productivity as a web developer with ChatGPT.

ChatGPT really does supercharge your productivity as a developer. Code that would have taken maybe an hour or two (or more) to write and test using the power of human thought can now be produced by said human explaining the problem to ChatGPT and having it write the code.

For this blog post I’ve chosen a practical, real-life example from my archive of ChatGPT conversations. The problem in this case: I wanted to download a bunch of files (PDFs mostly) for which I had a list of URLs. My initial thought was to look into whether I could run something in the browser, for three reasons:

  1. I’m a front-end dev whose natural habitat is the browser, so I should be able to whip up a solution quickly.
  2. You don’t need to worry about authentication (if the files are only accessible when you’re logged in) as you can run the code from within an authenticated browser session.
  3. The code can be run by others easily on their own machines.

On being presented with this problem, rather than start from a blank slate, I had a conversation with ChatGPT (GPT-4) about it. I’ve reproduced a lightly edited version of it below.

I’m aware that in blogging about a ChatGPT conversation I could be accused of being a bore (like somebody going on about the dream they had last night). However, large language model-based products, like ChatGPT and GitHub Copilot, look like they’re ushering in a major change in how software developers work, so I thought why not write about how I’m using them.

ChatGPT maintains context throughout the conversation so I can ask it questions about the information it’s given me (‘What does async do?’ and ‘Is that called “destructuring”?’), and ask it to refactor code to meet new requirements (‘Can you refactor downloadPDF() to handle multiple PDFs?’, ‘Can I determine the MIME type?’).

If you’re in a hurry, you can jump straight to the final iteration of the code.


JavaScript function to fetch and download PDF.

To fetch and download a PDF using JavaScript, you can use the fetch function to get the PDF data as a Blob and then create a download link using that Blob. Here’s an example function that takes a URL and a filename as parameters:

async function downloadPDF(url, filename) {
  try {
    // Fetch the PDF data as a Blob
    const response = await fetch(url);
    const blob = await response.blob();

    // Create a download link and set its attributes
    const link = document.createElement('a');
    link.href = URL.createObjectURL(blob);
    link.download = filename;

    // Append the link to the document, click it, and remove it
    document.body.appendChild(link);
    link.click();
    document.body.removeChild(link);
  } catch (error) {
    console.error('Error while downloading PDF:', error);
  }
}

// Usage example
const pdfURL = 'https://example.com/path/to/your/pdf-file.pdf';
const pdfFilename = 'downloaded-pdf-file.pdf';
downloadPDF(pdfURL, pdfFilename);

Make sure to replace the pdfURL variable with the URL of the PDF file you want to download. The pdfFilename variable is the name that will be given to the downloaded file.

[Tangential question removed.]

Re the downloadPDF() function you wrote, can you refactor to handle a list of PDFs to download, each with a specific, defined value[?]

Certainly! Here’s a modified version of the downloadPDF function that accepts an array of objects, each containing the URL and filename for a PDF. The function will download all PDFs one after another.

async function downloadPDFs(pdfs) {
  // [Deleted]

  for (const pdf of pdfs) {
    const { url, filename } = pdf;
    try {
      // Fetch the PDF data as a Blob
      const response = await fetch(url);
      const blob = await response.blob();

      // Create a download link and set its attributes
      const link = document.createElement('a');
      link.href = URL.createObjectURL(blob);
      link.download = filename;

      // Append the link to the document, click it, and remove it
      document.body.appendChild(link);
      link.click();
      document.body.removeChild(link);

      // Add a delay to avoid triggering browser's security mechanisms
      await new Promise((resolve) => setTimeout(resolve, 1000));
    } catch (error) {
      console.error(`Error while downloading ${filename}:`, error);
    }
  }
}

// Usage example
const pdfList = [
  {
    url: 'https://example.com/path/to/your/pdf-file1.pdf',
    filename: 'downloaded-pdf-file1.pdf',
  },
  {
    url: 'https://example.com/path/to/your/pdf-file2.pdf',
    filename: 'downloaded-pdf-file2.pdf',
  },
  // Add more PDFs as needed
];

downloadPDFs(pdfList);

Replace the pdfList array with the URLs and filenames for the PDF files you want to download. The function will iterate through the list and download each PDF file with the specified filename.

Please note that downloading multiple files in quick succession might be flagged by some browsers as suspicious behavior, potentially causing the downloads to be blocked. To avoid this, a delay is added between downloads using setTimeout. You can adjust the delay time as needed.
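
The delay line could also be pulled out into a small named helper, which keeps the loop readable and makes the pause easier to tune. This is only a sketch: sleep, runWithPauses and DELAY_MS are hypothetical names, and the 1000 ms default is simply the figure used above, not a known browser threshold.

```javascript
// Hypothetical default pause between items, in milliseconds.
const DELAY_MS = 1000;

// Returns a Promise that resolves after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Runs `task` for each item in turn, pausing after each one.
async function runWithPauses(items, task, delayMs = DELAY_MS) {
  for (const item of items) {
    await task(item);
    await sleep(delayMs);
  }
}

// Usage example with a stand-in task instead of a real download
runWithPauses(
  ['file1.pdf', 'file2.pdf'],
  async (name) => console.log(`Would download ${name}`),
  50
);
```

Because each iteration awaits both the task and the pause, the items are handled strictly one after another.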

What does the async before the function definition mean?

The async keyword before a function definition indicates that the function is asynchronous. An asynchronous function is a function that can perform operations without blocking the execution of other code, allowing other tasks to run concurrently.

Asynchronous functions use the await keyword within their bodies to pause the execution of the function until a Promise is resolved or rejected. When a Promise is resolved, the function continues execution from the point where it was paused. When a Promise is rejected, await throws the rejection reason, which a try...catch block can handle.

Here’s a simple example to illustrate how async functions work:

async function fetchData() {
  console.log('Fetching data...');

  // Simulate an asynchronous operation (e.g., fetching data from an API)
  const result = await new Promise((resolve) =>
    setTimeout(() => resolve('Data fetched!'), 1000)
  );

  console.log(result);
}

console.log('Starting...');
fetchData();
console.log('Waiting for data...');

The output will be:

Starting...
Fetching data...
Waiting for data...
Data fetched!

As you can see, the async function does not block the execution of the rest of the code. While it’s waiting for the Promise to resolve, other code (in this case, the console.log('Waiting for data...')) can run.

In the downloadPDFs function example, the async keyword allows the function to wait for the fetching and downloading of each PDF without blocking the execution of other code. This is useful when working with network requests or other asynchronous operations.
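
The catch block in downloadPDFs relies on a related behaviour: when an awaited Promise is rejected, the error surfaces at the await and can be caught like any other exception. A small sketch to illustrate, where failingOperation and fetchDataSafely are made-up names and the failure is simulated with a timer:

```javascript
// Hypothetical operation that fails after a short delay.
function failingOperation() {
  return new Promise((_, reject) =>
    setTimeout(() => reject(new Error('Network down')), 100)
  );
}

async function fetchDataSafely() {
  try {
    await failingOperation(); // the rejection is thrown here
    return 'Data fetched!';
  } catch (error) {
    return `Failed: ${error.message}`;
  }
}

fetchDataSafely().then((message) => console.log(message));
// Output: Failed: Network down
```

Note that fetch itself only rejects on network-level failures. An HTTP error status such as 404 still resolves, so checking response.ok is a separate step.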

In the refactored downloadPDFs() function, what does const { url, filename } = pdf; do? Is that called ‘destructuring’ or something?

Yes, you’re correct! The line const { url, filename } = pdf; is an example of object destructuring in JavaScript. Object destructuring is a syntax feature that allows you to extract properties from an object and create new variables that hold the values of these properties.

In the downloadPDFs() function, the pdfs parameter is an array of objects, where each object has a url and filename property. The line const { url, filename } = pdf; is used to extract the url and filename properties from the current pdf object in the loop. This creates two new variables url and filename holding the respective values, which can then be used in the following lines of code.

Here’s a simple example to illustrate object destructuring:

const person = {
  name: 'John',
  age: 30,
};

const { name, age } = person;

console.log(name); // Output: 'John'
console.log(age); // Output: 30

In this example, the name and age properties are extracted from the person object, and new variables name and age are created with the corresponding values.

Can you do this with arrays, for example if pdfList was an array of arrays, each PDF array being the format ['https://example.com/pdfs/pdf1.pdf', 'pdf1.pdf']?

Yes, you can use array destructuring to handle the pdfList if it’s an array of arrays. Array destructuring works similarly to object destructuring but is used to extract elements from arrays. Here’s the modified version of the downloadPDFs() function that works with an array of arrays:

async function downloadPDFs(pdfs) {
  for (const pdf of pdfs) {
    const [url, filename] = pdf; // Array destructuring
    try {
      // Fetch the PDF data as a Blob
      const response = await fetch(url);
      const blob = await response.blob();

      // Create a download link and set its attributes
      const link = document.createElement('a');
      link.href = URL.createObjectURL(blob);
      link.download = filename;

      // Append the link to the document, click it, and remove it
      document.body.appendChild(link);
      link.click();
      document.body.removeChild(link);

      // Add a delay to avoid triggering browser's security mechanisms
      await new Promise((resolve) => setTimeout(resolve, 1000));
    } catch (error) {
      console.error(`Error while downloading ${filename}:`, error);
    }
  }
}

// Usage example
const pdfList = [
  [
    'https://example.com/path/to/your/pdf-file1.pdf',
    'downloaded-pdf-file1.pdf',
  ],
  [
    'https://example.com/path/to/your/pdf-file2.pdf',
    'downloaded-pdf-file2.pdf',
  ],
  // Add more PDFs as needed
];

downloadPDFs(pdfList);

In this version, the line const [url, filename] = pdf; uses array destructuring to extract the URL and filename from each array element in pdfList. The rest of the function remains the same.
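
Destructuring can also happen directly in the head of the for...of loop, which saves a line. A minimal sketch, using a hypothetical describeDownloads function and the same array-of-arrays shape as above:

```javascript
// Hypothetical helper: returns one description string per [url, filename] pair.
function describeDownloads(pdfs) {
  const descriptions = [];
  for (const [url, filename] of pdfs) { // destructure each pair in the loop head
    descriptions.push(`${url} -> ${filename}`);
  }
  return descriptions;
}

const examplePairs = [
  ['https://example.com/pdfs/pdf1.pdf', 'pdf1.pdf'],
  ['https://example.com/pdfs/pdf2.pdf', 'pdf2.pdf'],
];

console.log(describeDownloads(examplePairs));
```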

Say some of the files were Word docs (.docx); can I determine the MIME type from the response so I can give the file to be downloaded the correct file extension?

Yes, you can determine the MIME type of the fetched file from the Content-Type header in the response object. Based on the MIME type, you can set the appropriate file extension for the downloaded file. Here’s the modified version of the downloadPDFs() function that handles different file types and sets the correct file extension:

async function downloadFiles(files) {
  for (const file of files) {
    const [url, filename] = file;
    try {
      // Fetch the file data as a Blob
      const response = await fetch(url);
      const blob = await response.blob();

      // Get the MIME type from the response headers
      const contentType = response.headers.get('content-type');
      let extension;

      // Determine the file extension based on the MIME type
      switch (contentType) {
        case 'application/pdf':
          extension = '.pdf';
          break;
        case 'application/vnd.openxmlformats-officedocument.wordprocessingml.document':
          extension = '.docx';
          break;
        // Add more cases for other MIME types if needed
        default:
          extension = ''; // Fallback to no extension if MIME type is unknown
      }

      // Create a download link and set its attributes
      const link = document.createElement('a');
      link.href = URL.createObjectURL(blob);
      link.download = filename + extension;

      // Append the link to the document, click it, and remove it
      document.body.appendChild(link);
      link.click();
      document.body.removeChild(link);

      // Add a delay to avoid triggering browser's security mechanisms
      await new Promise((resolve) => setTimeout(resolve, 1000));
    } catch (error) {
      console.error(`Error while downloading ${filename}:`, error);
    }
  }
}

// Usage example
const fileList = [
  [
    'https://example.com/path/to/your/pdf-file1.pdf',
    'downloaded-pdf-file1',
  ],
  [
    'https://example.com/path/to/your/word-file1.docx',
    'downloaded-word-file1',
  ],
  // Add more files as needed
];

downloadFiles(fileList);

In this version, the function is renamed to downloadFiles() to reflect that it can handle different file types. The MIME type is obtained from the response headers using response.headers.get('content-type'), and a switch statement is used to determine the appropriate file extension based on the MIME type. The link.download attribute is set to the filename concatenated with the determined file extension.
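
One detail worth guarding against: a Content-Type header may carry parameters after the MIME type (for example, text/html; charset=utf-8), in which case an exact string comparison in a switch statement would not match. A minimal sketch, assuming a hypothetical helper named parseMimeType, that strips any parameters before the comparison:

```javascript
// Hypothetical helper: returns just the MIME type from a Content-Type value,
// lower-cased and without parameters such as "; charset=utf-8".
function parseMimeType(contentType) {
  if (!contentType) return ''; // headers.get() returns null when the header is absent
  return contentType.split(';')[0].trim().toLowerCase();
}

console.log(parseMimeType('application/pdf; charset=binary')); // Logs: application/pdf
console.log(parseMimeType('Application/PDF')); // Logs: application/pdf
```

The result of parseMimeType(response.headers.get('content-type')) could then be passed to the switch statement in place of the raw header value.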

Please note that this approach relies on the server providing the correct Content-Type header for the files. If the server doesn’t provide the correct MIME type, you might need to use an alternative method, such as checking the file extension in the URL or using a library to determine the file type based on the content.
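
The URL-based fallback mentioned above might look something like this. It is a rough sketch: extensionFromUrl is a hypothetical helper name, and it assumes the URL's path ends in a filename with an extension, which won't always be true.

```javascript
// Hypothetical helper: returns the extension (with leading dot, lower-cased)
// from the last path segment of a URL, or '' if there isn't one.
function extensionFromUrl(url) {
  const { pathname } = new URL(url); // pathname excludes the query string and fragment
  const lastSegment = pathname.split('/').pop();
  const dotIndex = lastSegment.lastIndexOf('.');
  // dotIndex > 0 skips names with no dot and dotfiles such as ".htaccess"
  return dotIndex > 0 ? lastSegment.slice(dotIndex).toLowerCase() : '';
}

console.log(extensionFromUrl('https://example.com/path/to/your/word-file1.docx')); // Logs: .docx
console.log(extensionFromUrl('https://example.com/files/Report.PDF?download=1')); // Logs: .pdf
```

In downloadFiles(), this could be used in the default branch of the switch statement instead of falling back to an empty extension.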

And that was me with the code I needed. I manually tidied it up a bit, and this is the format of the final version that I ran:

async function downloadFiles(files) {

  // MIME type: file extension
  const fileExtensions = {
    'application/pdf': 'pdf',
    'application/vnd.openxmlformats-officedocument.wordprocessingml.document': 'docx',
    'application/msword': 'doc',
    'application/zip': 'zip'
  };

  const mimeTypes = [];

  // Collate MIME types
  Object.keys(fileExtensions).forEach(key => {
    mimeTypes.push(key);
  });

  for (const file of files) {
    const [url, filename] = file;

    try { // Fetch the file data as a Blob

      const response = await fetch(url);
      const contentType = response.headers.get('Content-Type');

      if (response.ok && mimeTypes.includes(contentType)) {
        const blob = await response.blob();

        // Create a download link and set its attributes
        const link = document.createElement('a');
        link.href = URL.createObjectURL(blob);
        link.download = `${filename}.${fileExtensions[contentType]}`;

        // Append the link to the document, click it, and remove it
        document.body.appendChild(link);
        link.click();
        document.body.removeChild(link);
      } else {
        console.error(`Error while downloading ${filename}: Invalid content type (${response.headers.get('Content-Type')}) or response error`);
      }

      // Add a delay to avoid triggering browser's security mechanisms
      await new Promise((resolve) => setTimeout(resolve, 300));
    } catch (error) {
      console.error(`Error while downloading ${filename}:`, error);
    }
  }
}

// Usage example
var fileList = [ // `const` → `var` so the script can be run more than once in a single browser console session
  ['https://example.com/path/to/your/pdf-file1.pdf', 'pdf-file1'],
  ['https://example.com/path/to/your/word-file1.docx', 'word-file1'],
];

downloadFiles(fileList);
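
For a long list of URLs, the [url, filename] pairs could be generated rather than typed by hand. The following is a sketch of one way to do that, not part of the script above: toFileList is a hypothetical helper, and it assumes each URL's path ends in a usable filename (percent-encoded characters are left as they are).

```javascript
// Hypothetical helper: turns a plain list of URLs into [url, filename] pairs,
// using the last path segment minus its extension as the filename.
function toFileList(urls) {
  return urls.map((url) => {
    const lastSegment = new URL(url).pathname.split('/').pop();
    const dotIndex = lastSegment.lastIndexOf('.');
    const baseName = dotIndex > 0 ? lastSegment.slice(0, dotIndex) : lastSegment;
    return [url, baseName || 'download']; // fall back to a generic name
  });
}

// Usage example
const urls = [
  'https://example.com/path/to/your/pdf-file1.pdf',
  'https://example.com/path/to/your/word-file1.docx',
];

console.log(toFileList(urls));
```

The resulting array has the same shape as fileList, so it could be passed straight to downloadFiles().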

This was a pretty straightforward series of prompts. The problem was a relatively simple one, and ChatGPT was able to output what I wanted from start to finish. In more complex scenarios it might only get you started, or you’ll want to ask it about so much code that you run out of tokens. ChatGPT also didn’t hallucinate in this session. When it does, you can politely let it know it’s havering and should provide a new answer.

And as to the way I went about solving this problem (running a script from the browser console), some may think it a bit inelegant or hacky. If that’s you, you’re probably right! But it did the job and that’s what matters. And thanks to ChatGPT it took less time and effort than it would have otherwise.