WebPicDownloader is a website image scraping application that lets you quickly download all the images of a website while avoiding anti-robot protection.
This repository has been archived on 2023-11-29. You can view files and clone it, but cannot push or open issues or pull requests.

WebPicDownloader


What is WebPicDownloader?

WebPicDownloader is a scraping tool that allows you to download all the images of a website. Basically, WebPic is a Python script wrapped in a graphical interface to make it easier to use.

This README explains how to use the Windows application WebPicDownloader.exe, and how to use or integrate the Python script WebPicDownloader.py in your own application (without the graphical interface).

Windows application

Using WebPic on Windows could not be simpler: download the .exe executable of the latest release (make sure you pick the latest stable release and not a pre-release).

Run WebPicDownloader.exe and enjoy! 👌

Use Python script

To start, find the script to use or to add to your code here.

CLI Run Requirements

To use the script, check that the following prerequisites are met:

  • Python >= 3.10.6
  • beautifulsoup4 >= 4.11.1
  • bs4 (BeautifulSoup) >= 0.0.1
  • urllib3 >= 1.26.12
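
The prerequisites above can be verified from Python before running the script. The following is only a sketch: the helper names (parse_version, meets_minimum, check_requirements) are hypothetical and not part of WebPicDownloader, and the version comparison is deliberately simple (numeric dotted parts only).

```python
import sys
from importlib import metadata

# Minimum versions copied from the prerequisites list.
MIN_PYTHON = (3, 10, 6)
MIN_PACKAGES = {
    "beautifulsoup4": "4.11.1",
    "bs4": "0.0.1",
    "urllib3": "1.26.12",
}


def parse_version(text):
    """Turn '1.26.12' into (1, 26, 12); stops at the first non-numeric part."""
    parts = []
    for chunk in text.split("."):
        if not chunk.isdigit():
            break
        parts.append(int(chunk))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted version strings numerically, not alphabetically."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements():
    """Return a list of human-readable problems (empty when all is fine)."""
    problems = []
    if sys.version_info[:3] < MIN_PYTHON:
        problems.append(f"Python {sys.version.split()[0]} is older than 3.10.6")
    for name, minimum in MIN_PACKAGES.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

If the script prints nothing, every prerequisite was found at a sufficient version.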

Console use

If you just want to use the console version of the script without the built-in GUI then you just need to check the prerequisites and run the script as follows:

python3 WebPicDownloader.py

Integrate into your code

First of all, you should know that WebPicDownloader has a daemon worker that downloads all the images asynchronously, so your program is not blocked while a download is in progress. Because it is a daemon, this worker is killed automatically as soon as your program finishes. WebPicDownloader therefore provides a blocking stop function that lets you wait for the download to complete before exiting (see the steps below). The prerequisites are the same as for running the script from the command line, see CLI Run Requirements.
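
To illustrate why a blocking wait matters with a daemon worker, here is a generic sketch built only on the Python standard library. It is not WebPicDownloader's actual implementation: the queue, the worker function and the file names are assumptions made for the example.

```python
import queue
import threading
import time

tasks = queue.Queue()
finished = []


def worker():
    """Process tasks forever; as a daemon it dies when the main thread exits."""
    while True:
        name = tasks.get()
        time.sleep(0.05)  # stand-in for a slow download
        finished.append(name)
        tasks.task_done()


# daemon=True: the interpreter will not wait for this thread at exit.
threading.Thread(target=worker, daemon=True).start()

for name in ("a.png", "b.png", "c.png"):
    tasks.put(name)  # returns immediately, the caller is never blocked

# Without this blocking wait, the program could exit before the worker is done.
tasks.join()
print(finished)
```

Removing the final tasks.join() would let the main thread end while tasks are still pending, which is the situation the blocking stop function is meant to avoid.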

Step 1

Instantiate your WebPicDownloader object like this:

from WebPicDownloader import WebPicDownloader, MessageType

webpic = WebPicDownloader()

The constructor can take several parameters (path: str, headers: dict, messenger, success, failure) (see the documentation).

Step 2

Define the WebPicDownloader callback functions. There are 3 main ones:

  • The messenger callback is called at each system event and takes the parameters (message: str, type: MessageType).
  • The success callback is called at the end of processing if no major error occurred and takes the parameter (message: str).
  • The failure callback is called if a major error occurs or the download fails and takes the parameter (message: str).

By default, these functions simply print their message to the console with print(message). If you integrate WebPicDownloader into a graphical program, you should by convention remove all console printing from your application and therefore define your own callback functions for WebPicDownloader. Below is an example:

from WebPicDownloader import WebPicDownloader, MessageType

# Consider instantiating before the main loop of your program is launched.
webpic = WebPicDownloader()

# Pay attention to the signature of the functions
webpic.set_success_callback(lambda message: print(f"Success ! [{message}]."))
webpic.set_failure_callback(lambda message: print(f"Failure ! [{message}]."))
webpic.set_messenger_callback(lambda message, msg_type: print(f"[{msg_type}]: {message}."))
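
For a graphical program, one possible approach is to have the callbacks push events into a queue that the GUI loop reads, instead of printing. This is only a sketch: the function names (on_message, on_success, on_failure, drain_events) are hypothetical, and it assumes the callbacks may be invoked from the worker thread, which this README does not confirm. The registration lines are commented out so the block runs without the library.

```python
import queue

# Events produced by the downloader, consumed later by the GUI loop.
events = queue.Queue()


def on_message(message, msg_type):
    """Messenger callback: (message: str, type: MessageType)."""
    events.put(("message", msg_type, message))


def on_success(message):
    """Success callback: (message: str)."""
    events.put(("success", None, message))


def on_failure(message):
    """Failure callback: (message: str)."""
    events.put(("failure", None, message))


def drain_events():
    """Call this periodically from the GUI thread to collect pending events."""
    pending = []
    while True:
        try:
            pending.append(events.get_nowait())
        except queue.Empty:
            return pending


# Registration with the setters described in this step (needs WebPicDownloader):
# webpic.set_messenger_callback(on_message)
# webpic.set_success_callback(on_success)
# webpic.set_failure_callback(on_failure)
```

A queue keeps widget updates on the GUI thread, which most toolkits require, while the callbacks themselves stay cheap.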

Step 3

Once WebPicDownloader is instantiated and the callback functions are configured, you have to launch the download and then stop it. Note that the script has no function to interrupt a download in progress: the stop function either lets you wait for the end of the download before shutting down the program, or leaves the worker to be killed automatically when the main thread dies.

from time import sleep
from WebPicDownloader import WebPicDownloader, MessageType

webpic = WebPicDownloader()

# ... callbacks ...

# Webpic will give the task to its worker and start downloading the images
webpic.start_downloading('https://www.endmove.eu/', 'EndMove-website-images')

# We wait for the worker to start the task (once the task has started it cannot be stopped)
sleep(1)

# Webpic will ask the program to stop in blocking mode (it will join the worker to wait for the end of its execution)
webpic.stop_downloading(True)

Improvements (TODO list)

Here are some improvements I would like to add to the program. You can also contribute by forking the repository and submitting a pull request.

  • Check for updates button.
  • Integrated file explorer.
  • Viewing the downloads already made.
  • Rework the WebPicDownloader script to support concurrent downloads, so that several workers can be launched and share tasks via a download pool.

This program is only a free utility tool and has not been developed in depth. In a future version it would be interesting to manage concurrent downloads in a thread pool.
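
As a rough idea of what concurrent downloads in a thread pool could look like, here is a minimal sketch using the standard concurrent.futures module. It is not part of WebPicDownloader: fake_download and download_all are hypothetical names, and fake_download performs no network access (it only returns the length of the URL as a placeholder size).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_download(url):
    """Hypothetical stand-in for fetching one image; returns (url, size)."""
    return url, len(url)


def download_all(urls, max_workers=4):
    """Run downloads concurrently and collect results keyed by URL."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fake_download, url) for url in urls]
        # as_completed yields each future as soon as its task finishes.
        for future in as_completed(futures):
            url, size = future.result()
            results[url] = size
    return results


if __name__ == "__main__":
    print(download_all(["https://example.com/a.png", "https://example.com/b.png"]))
```

Leaving the with block waits for all submitted tasks, so no separate blocking stop call is needed in this design.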