• kif@lemmy.nz
    16 hours ago

    Not certain, but I’m guessing it has something to do with how archive.org captures pages. It probably archived some JavaScript that reads window.location.host, which would resolve to the original host (say lemmy.nz) on the original page but to web.archive.org on the snapshot.
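To make that hypothesis concrete, here is a minimal sketch of how the same script could see two different hosts. The two URLs and the `hostSeenByScript` helper are made up for illustration, and the standard `URL` class stands in for `window.location` so the snippet also runs outside a browser. It is not taken from Lemmy's code.

```typescript
// Hypothetical URLs, for illustration only (the snapshot timestamp is a placeholder).
const originalPage = new URL("https://lemmy.nz/post/123");
const archivedPage = new URL(
  "https://web.archive.org/web/2024/https://lemmy.nz/post/123"
);

// Stand-in for a script reading window.location.host on the page it runs in.
function hostSeenByScript(pageUrl: URL): string {
  return pageUrl.host;
}

console.log(hostSeenByScript(originalPage)); // lemmy.nz
console.log(hostSeenByScript(archivedPage)); // web.archive.org
```

If a page derived its instance name this way, the snapshot would report the archive's host even though the page content is unchanged.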

    • pet the cat, walk the dog@lemmy.world
      8 hours ago

      Update: you seem to be at least partially correct about JS being involved. The page source as downloaded from archive.org has a shitton of data in JS structures, while the actual final HTML of that element is nowhere to be found, meaning the DOM is assembled from the JS data on the fly. Now, the page URL, as I predicted, doesn’t seem to figure in this, because the data itself contains numerous instances of ‘web.archive.org’. I’m guessing that Archive’s algorithm rewrote the site’s domain so that it’s prefixed with Archive’s domain and went a bit overboard about it, which then seems to have confused Lemmy’s JS into using web.archive.org as the instance domain when rendering the page.
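A rough sketch of that guess, not the Wayback Machine's actual rewriting logic: a rewriter that prefixes every absolute URL in the page source will also rewrite URLs sitting inside embedded JSON data, and a client that takes the host from that data then sees the archive's domain. `ARCHIVE_PREFIX`, `overeagerRewrite`, and the shape of the embedded data are all assumptions made up for this example.

```typescript
// Hypothetical snapshot prefix; real snapshot URLs carry a full timestamp.
const ARCHIVE_PREFIX = "https://web.archive.org/web/2024/";

// Naive rewriter: prefixes every absolute URL it finds in the source text,
// including URLs that live inside JSON data rather than in links.
function overeagerRewrite(source: string): string {
  return source.replace(/https?:\/\/[^\s"']+/g, (url) => ARCHIVE_PREFIX + url);
}

// Made-up stand-in for data embedded in the page for client-side rendering.
const embeddedData = JSON.stringify({
  community: { name: "example", actor_id: "https://lemmy.nz/c/example" },
});

const rewrittenData = JSON.parse(overeagerRewrite(embeddedData));

// A client that derives the instance from the data now gets the archive host.
const instanceHost = new URL(rewrittenData.community.actor_id).host;
console.log(instanceHost); // web.archive.org
```

The page URL never enters into this: the wrong host comes entirely from the rewritten data, which matches the observation that the data itself contains ‘web.archive.org’.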

      For better or worse, I don’t use stimulants harder than tea, and amn’t so young anymore as to reverse-engineer this thing further.

    • pet the cat, walk the dog@lemmy.world
      13 hours ago

      That’s very doubtful: since a post might belong to any of a multitude of instances, the API must send the instance name in its response to the client. The client’s host doesn’t figure in that. Additionally, the screenshot looks like the default UI, whatever it’s called (at least the default for lemmy.world), and that UI doesn’t show the instance domain for communities on the same instance (at least it doesn’t currently).
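As a sketch of the argument above, a label could be built from the response data alone, with no reference to the host the page is served from. The `CommunityInfo` shape, its field names, and `communityLabel` are hypothetical and only loosely modelled on what such an API might return. This is not Lemmy's actual client code.

```typescript
// Hypothetical shape of the community data sent by the API.
interface CommunityInfo {
  name: string;
  actorId: string; // absolute URL of the community on its home instance
  local: boolean; // true when the community lives on the serving instance
}

// Local communities are shown by name only; remote ones get "@host" appended.
function communityLabel(c: CommunityInfo): string {
  if (c.local) {
    return c.name;
  }
  return `${c.name}@${new URL(c.actorId).host}`;
}

const localCommunity: CommunityInfo = {
  name: "example",
  actorId: "https://lemmy.nz/c/example",
  local: true,
};
const remoteCommunity: CommunityInfo = {
  name: "example",
  actorId: "https://lemmy.world/c/example",
  local: false,
};

console.log(communityLabel(localCommunity)); // example
console.log(communityLabel(remoteCommunity)); // example@lemmy.world
```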

      But I’ll look into the JS hypothesis later. Weird shit is afoot there.