[personal profile] gusl
My current project is to annotate web pages. Since these pages could change, go down, etc., I need to make a static mirror.

I have used WebSuck+WebGet, which mirror the HTML files they find. I imagine this worked great 10 years ago, before the era of dynamically generated web content.

This setup has a few problems:
* If it visits a URL that ends in "/" (i.e. one served by index.html or similar), it doesn't know to save the file as index.html.
* If it visits a dynamically generated page, it doesn't save the content as an HTML file. Saving PHPs as PHP would mean setting up a server, etc., which is a bad idea. The ideal solution is to rename the saved PHP file to .html (what gets saved is just the static output anyway) and fix the links.
* It doesn't fix the links to point to content in the mirror. This shouldn't be too hard to do with a search-and-replace script.
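
To make the three fixes above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the host name `example.com`, the list of "dynamic" extensions, and the function names `local_path` and `rewrite_links` are hypothetical and are not part of WebSuck or WebGet. It uses a regular expression in the spirit of a search-and-replace script; a real HTML parser would likely be more robust.

```python
import posixpath
import re
from urllib.parse import urljoin, urlsplit

MIRROR_HOST = "example.com"  # hypothetical site being mirrored
# Assumed list of extensions that mark dynamically generated pages.
DYNAMIC_EXTS = {".php", ".asp", ".aspx", ".jsp", ".cgi"}

def local_path(url):
    """Map a URL to a relative file path inside the mirror."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"  # directory-style URL: save as index.html
    root, ext = posixpath.splitext(path)
    if ext.lower() in DYNAMIC_EXTS:
        if parts.query:
            # Fold the query string into the name so that
            # page.php?id=1 and page.php?id=2 get separate files.
            root += "_" + re.sub(r"[^A-Za-z0-9=._-]", "_", parts.query)
        path = root + ".html"  # saved output is static HTML
    return path.lstrip("/")

# Matches href="..." and src="..." with either quote style.
LINK_RE = re.compile(r'(href|src)=(["\'])(.*?)\2', re.IGNORECASE)

def rewrite_links(html, page_url):
    """Point same-site links at their files in the mirror."""
    page_dir = posixpath.dirname(local_path(page_url)) or "."

    def fix(match):
        attr, quote, target = match.groups()
        absolute = urljoin(page_url, target)
        parts = urlsplit(absolute)
        if parts.scheme not in ("http", "https") or parts.netloc != MIRROR_HOST:
            return match.group(0)  # leave external links, mailto:, etc. alone
        rel = posixpath.relpath(local_path(absolute), page_dir)
        if parts.fragment:
            rel += "#" + parts.fragment
        return f"{attr}={quote}{rel}{quote}"

    return LINK_RE.sub(fix, html)

if __name__ == "__main__":
    page = '<a href="../view.php?id=3">view</a>'
    print(local_path("http://example.com/docs/"))
    print(rewrite_links(page, "http://example.com/docs/"))
```

The links are rewritten as relative paths so the mirror can be browsed straight from disk without a server.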

Any ideas?