• R00bot@lemmy.blahaj.zone
    1 year ago

    i tried to get access to facebook’s api to mess around (as a student) but they declined my request. i ended up making a bot that ran in a headless browser, wasting far more of facebook’s resources, and i used it to create shitposts that updated themselves with their own reaction count lmao.

  • b3nsn0w@pricefield.org
    1 year ago

    fun fact: on the r-site, you can still append .json to the end of any path (before the query params) to get the formatted data
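A minimal sketch of the URL rewrite described above, in Python with only the standard library. The helper name `with_json_suffix` and the `example.com` host are placeholders of mine, not anything the site documents, and whether a given path actually answers with JSON is up to the site.

```python
from urllib.parse import urlsplit, urlunsplit


def with_json_suffix(url: str) -> str:
    """Append ".json" to the path of a URL, leaving the query string untouched."""
    parts = urlsplit(url)
    # Drop a trailing slash so we get ".../abc.json" rather than ".../abc/.json".
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    # Reassemble with the original query and fragment in their usual place.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Hypothetical URL, used only to show where the suffix ends up.
    print(with_json_suffix("https://example.com/r/test/comments/abc/?sort=new&limit=5"))
```

The point is that the suffix goes on the path, so it has to be inserted before the `?`, not tacked onto the end of the full URL.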

    fun fact 2: on the same site you get similar json if you grab the script tag with id="data" (trivial with jsdom if you run nodejs), eval it in a sandbox (node’s built-in vm module), and read the $.___r property off the global object you passed in

    fun fact 3: also on the same site, if you use the old interface it’s full of data tags intended for css, jsdom goes brrr
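The comment uses jsdom for this; the same idea can be sketched in Python with the standard-library `html.parser`, swapped in here so the example needs no installs. The class name `thing`, the `data-*` attribute names, and the sample markup are all invented for illustration and are not taken from the real site.

```python
from html.parser import HTMLParser


class DataAttrCollector(HTMLParser):
    """Collect the data-* attributes of every element that carries a given class."""

    def __init__(self, css_class: str) -> None:
        super().__init__()
        self.css_class = css_class
        self.records: list[dict[str, str]] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if self.css_class in classes:
            # Keep only data-* attributes and strip the "data-" prefix from the keys.
            self.records.append(
                {k[len("data-"):]: (v or "") for k, v in attr_map.items() if k.startswith("data-")}
            )


def extract_records(html: str, css_class: str = "thing") -> list[dict[str, str]]:
    parser = DataAttrCollector(css_class)
    parser.feed(html)
    parser.close()
    return parser.records


# Made-up markup standing in for an old-style listing page.
SAMPLE_HTML = """
<div class="thing link" data-fullname="t3_abc" data-score="42" data-author="alice"></div>
<div class="thing link" data-fullname="t3_def" data-score="7" data-author="bob"></div>
<div class="sidebar">not a post</div>
"""

if __name__ == "__main__":
    for record in extract_records(SAMPLE_HTML):
        print(record)
```

Because the values already sit in attributes, there is no need to parse visible text; each matching element turns directly into a dict.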

    fun fact 4: even if they stopped all of this you could use a headless browser and grab the data in flight from the api calls (virgin dom scrubber vs chad api capturer)

    i don’t know much about the t-site and can’t check right now because you can’t even access it the normal way, lol

  • Shit@sh.itjust.works
    1 year ago

    This cracked me up. Especially the 10 minute delay and rate limiting making it better to just scrape.

  • Jackolantern@lemmy.world
    1 year ago

    Can someone eli5 this for me? What’s scraping and how does it work? For example, with twitter’s current limitations, will scraping still work?

    • 1rre@discuss.tchncs.de
      1 year ago

      Scraping is fetching a webpage as if you’re a normal user visiting it in firefox/chrome, then extracting the bits you want from it. If Twitter makes you sign in to view tweets (which I guess it will now?) then scraping won’t help much. Otherwise it probably will, though it may take a fair bit of trickery to get working.
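To make the fetch-then-extract idea concrete, here is a rough sketch in Python using only the standard library. The function names, the User-Agent string, and the choice to pull out link targets are my own assumptions for illustration. It does not handle logins, JavaScript-rendered pages, or rate limits, which is where the "trickery" comes in.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag seen while parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Step 1: request the page roughly the way a browser would."""
    # Illustrative User-Agent; some sites reject urllib's default one.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0 (example scraper)"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html: str) -> list[str]:
    """Step 2: pull the bits you want out of the returned HTML."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Parsing an inline snippet keeps the demo offline; swap in
    # fetch_html("https://example.com/") to try it against a live page.
    print(extract_links('<p><a href="/a">A</a> <a href="https://example.com/b">B</a></p>'))
```

If the page only shows its content after sign-in, `fetch_html` just gets the login page back, which is the limitation described above.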