wget Command: Tutorial & Examples
Download files from the web
wget is a command-line utility for downloading files from the web. It supports various protocols such as HTTP, HTTPS,
and FTP, and can be used to download files from websites, servers, and other resources on the internet.
Here's the basic syntax for using wget:
wget [options] URL
URL is the web address of the file you want to download.
options are optional flags that modify how wget behaves.
Here are a few examples of how you can use wget.
To download a file from a website, you can use the following command:
wget http://www.example.com/files/example.zip
To download a file and save it with a different name, you can use the -O option:
wget -O new_name.zip http://www.example.com/files/example.zip
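As a small sketch of the -O example above, the following shell function wraps the download and then checks that the output file is not empty. The check is useful because wget creates the -O file as soon as it starts, so a failed download can leave an empty file behind. The function name download_as and the WGET override variable are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# download_as URL FILE: save URL under the name FILE using wget -O.
# download_as and WGET are illustrative names, not part of wget itself.
download_as() {
    url=$1
    out=$2
    # WGET lets you substitute another binary (or a stub when testing).
    ${WGET:-wget} -O "$out" "$url" || return 1
    # -s is true only if the file exists and is not empty.
    [ -s "$out" ]
}

# Example (commented out so that sourcing this file downloads nothing):
# download_as http://www.example.com/files/example.zip new_name.zip
```

The function returns a non-zero status both when wget itself fails and when the resulting file is empty, so a calling script can test it with a plain if statement.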
There are many other options available for wget, such as -c for continuing interrupted downloads, -t for specifying the number of retries, and -q for running wget in quiet mode.
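These options can be combined in a single call. The following is a minimal sketch, assuming a hypothetical helper named fetch and a hypothetical WGET override variable; the retry count of 5 is an arbitrary choice for illustration.

```shell
#!/bin/sh
# fetch URL: download URL quietly, resuming a partial file if one exists.
# fetch and WGET are illustrative names, not part of wget itself.
fetch() {
    # -c    continue a partially downloaded file
    # -t 5  try up to 5 times before giving up
    # -q    quiet mode: print no progress output
    ${WGET:-wget} -c -t 5 -q "$1"
}

# Example (commented out so that sourcing this file downloads nothing):
# fetch http://www.example.com/files/example.zip || echo "download failed" >&2
```

Because -q suppresses all output, the exit status is the only sign of failure: wget exits with 0 on success and a non-zero value otherwise.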
One powerful feature of wget is the ability to mirror a complete website.
Here's the basic syntax:
wget --mirror -p --convert-links -P /path/to/local/directory http://www.example.com
Here's an explanation of the different flags used in this command:
--mirror: This flag tells wget to download the website recursively, in a way that replicates the directory structure of the original site.
-p: This flag tells wget to download all the files (such as images and stylesheets) needed to display the website locally.
--convert-links: This flag tells wget to convert links in the downloaded HTML files to work locally, rather than pointing to the original website.
-P: This flag specifies the local directory where wget should save the mirrored website.
--no-parent: This optional flag, not used in the command above, tells wget not to ascend to the parent directory while recursively downloading files. It is useful when you mirror only a subdirectory of a site.
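The mirror command and the flags explained above can be wrapped in a small function. This is a sketch under stated assumptions: mirror_site and the WGET override variable are hypothetical names, and --no-parent is included here even though the command above omits it.

```shell
#!/bin/sh
# mirror_site URL DIR: mirror URL into the local directory DIR.
# mirror_site and WGET are illustrative names, not part of wget itself.
mirror_site() {
    url=$1
    dest=$2
    # --mirror         recursive download that keeps the directory structure
    # -p               also fetch images, CSS and other page requisites
    # --convert-links  rewrite links so the local copy works offline
    # --no-parent      stay at or below the starting directory
    # -P DIR           save everything under DIR
    ${WGET:-wget} --mirror -p --convert-links --no-parent -P "$dest" "$url"
}

# Example (commented out so that sourcing this file downloads nothing):
# mirror_site http://www.example.com /path/to/local/directory
```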
Keep in mind that mirroring a website can be a time-consuming process, especially if the site is large or has many links. You may want to use the -w flag to specify the number of seconds wget should wait between requests, or add the --random-wait flag to make wget vary that wait time randomly from one request to the next. Both help reduce the load on the server.
wget --mirror -p --convert-links -w 2 --random-wait -P /path/to/local/directory https://www.example.com
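To make the effect of combining -w with --random-wait concrete: according to the wget manual, --random-wait varies the delay between 0.5 and 1.5 times the -w value. The sketch below only computes that range and does not call wget; the function name random_wait_range is hypothetical.

```shell
#!/bin/sh
# random_wait_range SECS: print the shortest and longest delay wget may use
# when -w SECS is combined with --random-wait (0.5x to 1.5x of SECS).
random_wait_range() {
    awk -v w="$1" 'BEGIN { printf "%.1f %.1f\n", w * 0.5, w * 1.5 }'
}

random_wait_range 2   # prints "1.0 3.0"
```

For the command above, with -w 2, each pause therefore falls somewhere between 1 and 3 seconds.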
If the process gets interrupted, you may want to restart it, skipping files which are already there. You can use a command like this:
wget -r --wait=1 --reject='*.bak' --no-parent -nc --compression=auto --no-check-certificate https://www.example.com/dir
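The important parameter is -nc, which skips files that have already been downloaded. Other interesting parameters are --reject to exclude certain files, --no-parent to avoid crawling parent directories, and --compression=auto to use compressed data transfer if the server supports it. Note that --no-check-certificate turns off TLS certificate verification, so use it only with hosts you trust.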
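Restarting by hand can be automated with a simple retry loop. The following is a sketch, not a definitive implementation: resume_download, WGET and RETRY_DELAY are hypothetical names, and the default pause of 5 seconds is arbitrary. It relies on -nc so that each rerun only fetches files that are still missing.

```shell
#!/bin/sh
# resume_download MAX_ATTEMPTS URL: rerun a recursive download until it succeeds.
# resume_download, WGET and RETRY_DELAY are illustrative names, not part of wget.
resume_download() {
    max_attempts=$1
    url=$2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # -nc skips files that already exist locally, so each rerun only
        # fetches what is still missing.
        if ${WGET:-wget} -r --wait=1 --no-parent -nc "$url"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Pause before the next attempt, but not after the last one.
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "${RETRY_DELAY:-5}"
        fi
    done
    return 1
}

# Example (commented out so that sourcing this file downloads nothing):
# resume_download 3 https://www.example.com/dir
```

One caveat: wget also exits with a non-zero status when the server returns an error for any single file, such as a broken link, so on a site with dead links the loop may use up all its attempts even though everything reachable was downloaded.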