gusl | bleg: web mirroring tool

My current project is to annotate web pages. Since these pages could change, go down, etc., I need to make a static mirror of them.

I have used WebSuck+WebGet, which mirror the HTML files they find. I imagine this worked great 10 years ago, before the era of dynamically generated web content.

This setup has a few problems:
* if it visits a URL that ends in "/" (i.e. one that is served from index.html or similar), it won't know to save the file as index.html.
* if it visits a dynamically generated page, it won't save the content as an HTML file. If I wanted to save PHP pages as PHP, I would need some way to set up a server, etc., which is a bad idea. The ideal solution is to rename the saved PHP file to .html (what gets saved is just its static output anyway) and fix the links.
* it won't fix the links to point to content in the mirror. This shouldn't be too hard to do with a search-and-replace script.
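
To make the three fixes above concrete, here is a minimal Python sketch of what such a post-processing script might look like. Everything in it is an assumption for illustration, not something WebSuck or WebGet provides: the function names `local_path` and `rewrite_links`, the `DYNAMIC_EXTS` list, the way query strings are folded into file names, and the `mirrored` set of already-downloaded URLs are all hypothetical. The link rewriting is a plain regex search-and-replace, so it will miss links built by JavaScript or written with HTML entities.

```python
import posixpath
import re
from urllib.parse import urljoin, urlsplit

# Extensions treated as "dynamic" and renamed to .html (an assumed list).
DYNAMIC_EXTS = {"", ".php", ".asp", ".aspx", ".jsp", ".cgi", ".pl"}


def local_path(url):
    """Map a URL to a relative file path inside the mirror."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"  # directory-style URL: save as index.html
    root, ext = posixpath.splitext(path)
    if parts.query:
        # Fold the query string into the name so ?id=1 and ?id=2 don't collide.
        root += "_" + re.sub(r"[^A-Za-z0-9]+", "_", parts.query).strip("_")
    if ext.lower() in DYNAMIC_EXTS:
        ext = ".html"  # the saved copy is static output, so name it as HTML
    return parts.netloc + root + ext


# Matches href="..." and src="..." with either quote style.
LINK_RE = re.compile(r"""(\b(?:href|src)\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE)


def rewrite_links(html, page_url, mirrored):
    """Point links at mirrored files; leave links to unmirrored URLs alone."""
    here = posixpath.dirname(local_path(page_url))

    def fix(match):
        prefix, quote, link = match.groups()
        # Resolve relative links against the page, then split off #fragment.
        target, _, fragment = urljoin(page_url, link).partition("#")
        if target not in mirrored:
            return match.group(0)
        # Leading "/" keeps relpath from consulting the working directory.
        rel = posixpath.relpath("/" + local_path(target), "/" + here)
        if fragment:
            rel += "#" + fragment
        return prefix + quote + rel + quote

    return LINK_RE.sub(fix, html)
```

With this naming scheme, `local_path("http://example.com/docs/")` gives `example.com/docs/index.html` and `local_path("http://example.com/view.php?id=7")` gives `example.com/view_id_7.html`. Running `rewrite_links` over each saved page, with `mirrored` holding the absolute URLs that were actually downloaded, rewrites only the links that have a local copy and leaves external links untouched.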

Any ideas?

Gustavo Lacerda
