Hi
Here is a script that creates yearly text data files and one solid text file per station from the page
www.tutiempo.net/en/Climate/. It covers more than 300 selected stations with synoptic IDs (from the GSOD database): all Czech, Slovak and Polish stations plus more than 200 selected stations from the rest of the world.
The script also creates yearly graphs (temperature, precipitation, wind, visibility, pressure, humidity, and occurrence of rain, snow, thunderstorm and fog). The graphs are PNG images with maximum compression (1280x960, 256 colors).
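A minimal sketch of how one graph could be reduced to 256 colors and recompressed with ImageMagick. The function name shrink_png and the CONVERT_BIN override are assumptions made for illustration, not names taken from the script, and the real script may convert the images differently. On ImageMagick 7 the binary may be called magick instead of convert.

```shell
# Sketch only: shrink_png and CONVERT_BIN are hypothetical names, not from the script.
# Reduce a PNG to at most 256 colors and write it with the highest
# PNG compression level (9).
shrink_png() {
    local input=$1 output=$2
    "${CONVERT_BIN:-convert}" "$input" -colors 256 \
        -define png:compression-level=9 "$output"
}
```

Usage would look like: shrink_png graph_raw.png graph.png (both file names are placeholders).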
Finally, the script creates 7-Zip archives (maximum compression: LZMA/Ultra/1024 MiB/273): one with the solid station files, one with the yearly files, one with the yearly files and graphs, and one with the graphs only.
The script runs on Linux. You need these programs: Gnuplot, 7-Zip, wget, gawk or awk, and ImageMagick for converting images.
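Before starting a run that takes many hours, it may help to check that the required programs are installed. The following is a small sketch; the function name missing_tools is made up for this example, and the binary names in the usage line are assumptions (7-Zip may be installed as 7z, 7za or 7zr, and ImageMagick as convert or magick, depending on the distribution).

```shell
# Sketch only: missing_tools is a hypothetical helper, not part of the script.
# Prints the name of every given command that is not found on PATH,
# one per line, and returns 1 if at least one is missing.
missing_tools() {
    local tool status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}

# Possible usage (binary names are assumptions, adjust to your system):
# missing_tools gnuplot 7z wget awk convert || echo "Install the tools listed above."
```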
7-Zip uses a 1024 MiB dictionary, which requires about 11 GiB of RAM. You can use a smaller dictionary (16, 32, 48, 64, 96, 128, 256, 384, 512 or 768 MiB), which needs a smaller amount of RAM. The script uses about 2-3.5 GB of disk space (less after compression). A full run takes about 10-25 hours (6-9 hours downloading, 3-9 hours creating and converting graphs, 1-4 hours 7z compression).
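To show where the dictionary size enters, here is a sketch of a 7z call with the settings LZMA/Ultra/273 and a selectable dictionary. The function name compress_7z and the SEVENZ_BIN override are assumptions for this example; the actual command line in the script may differ.

```shell
# Sketch only: compress_7z and SEVENZ_BIN are hypothetical names, not from the script.
# Arguments: archive name, dictionary size in MiB, then the files or directories to add.
#   -m0=lzma  LZMA method
#   -mx=9     Ultra compression level
#   -md=<N>m  dictionary size in MiB
#   -mfb=273  number of fast bytes (called "word size" in the 7-Zip GUI)
#   -ms=on    solid archive
compress_7z() {
    local archive=$1 dict_mib=$2
    shift 2
    "${SEVENZ_BIN:-7z}" a -t7z -m0=lzma -mx=9 -md="${dict_mib}m" \
        -mfb=273 -ms=on "$archive" "$@"
}
```

For example, compress_7z graphs.7z 64 GRAPHS/ would use a 64 MiB dictionary instead of 1024 MiB (archive and directory names are placeholders).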
The number of files during the process is about 200 000 - 400 000 (most of them are graphs).
The bash script and the scripts for Gnuplot are available here:
http://meteotommy.twilightsparkle.cz/Tutiempo_SCRIPTS.7z
First, you have to define your user name at the beginning of the bash script. You have to change the user name and the ${W} variable in the bash script. Data are in /home/${USER_NAME}/TUTIEMPO/ and the scripts (Gnuplot and bash) are in /home/${USERNAME}/SKRIPT/Tutiempo/
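The directory layout can be sketched as a small helper that prints both paths for a given user name, so you can check them before a run. The function name tutiempo_paths is made up for this example, and it assumes that the same user name is used in both paths; the ${W} variable is not covered here.

```shell
# Sketch only: tutiempo_paths is a hypothetical helper, not part of the script.
# Assumes one user name applies to both the data and the script directory.
# Prints the data directory on the first line and the script directory on the second.
tutiempo_paths() {
    local user_name=$1
    printf '%s\n' "/home/${user_name}/TUTIEMPO"
    printf '%s\n' "/home/${user_name}/SKRIPT/Tutiempo"
}
```

Calling tutiempo_paths "$(id -un)" prints the two directories for the current user.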
Do not run these scripts in parallel from many places; it might overload the server www.tutiempo.net.
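One way to avoid accidental parallel runs on the same machine, and to keep the download rate low, is sketched below. All names (acquire_lock, release_lock, polite_fetch, WGET_BIN) are assumptions for this example and are not taken from the script. The lock relies on mkdir failing when the directory already exists, and the download uses the GNU wget options --wait and --random-wait, which pause between retrievals when several URLs are read from a list.

```shell
# Sketch only: all function and variable names here are hypothetical.

# Take a lock by creating a directory; mkdir is atomic and fails
# if the directory already exists (another run holds the lock).
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

# Release the lock taken by acquire_lock.
release_lock() {
    rmdir "$1" 2>/dev/null
}

# Download all URLs listed in a file (one per line) into a directory,
# waiting about 2 seconds (randomized) between retrievals.
polite_fetch() {
    local url_list=$1 target_dir=$2
    "${WGET_BIN:-wget}" --wait=2 --random-wait -i "$url_list" -P "$target_dir"
}
```

A run could then start with acquire_lock /tmp/tutiempo.lock || exit 1 and end with release_lock /tmp/tutiempo.lock (the lock path is a placeholder). This only guards a single machine; it does not stop runs on other computers.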