I was told that I should post this here.

cross-posted from: https://lemmy.world/post/932750

Say you decide to self-host a Lemmy instance. When you create that instance, do you immediately need to download and store all the data that has ever been posted to all federated Lemmy instances? Or perhaps you only need to download and store everything that is posted to the federated Lemmy instances from that point forward? Or better yet, do you only store what the users on that instance do (i.e. their posts, and posts to the communities hosted on that instance)?

  • Max-P@lemmy.max-p.me
    1 year ago

    It’s not quite as bad, because you’re still being pushed what you subscribe to. You do indeed get a fair bit of content you might never see, but it’s necessary so that you can browse those communities, and so that your instance can compute which threads are active/trending/hot/updated or whatever other filter you use, because all of that is computed locally on your instance.

    It’s also an efficiency advantage: if your instance has a lot of users, having everything stored locally gives them a much smoother experience. It also reduces load on the remote instance, since you’re not just proxying every request to it.

    For your storage concerns, there’s nothing preventing you from purging content older than a week or two regularly via a cronjob.
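
As a rough illustration of the purge idea above, here is a minimal Python sketch of what such a cron-driven script might compute. It is a dry run that only prints a statement and never touches the database. The table name `post`, the columns `published` and `local`, and the 14-day retention are assumptions for illustration, not something confirmed in this thread; check your own instance's schema, foreign keys, and backups before deleting anything.

```python
from datetime import datetime, timedelta, timezone

# Assumption: keep two weeks of content ("a week or two" in the comment above).
RETENTION_DAYS = 14


def purge_cutoff(now, retention_days=RETENTION_DAYS):
    """Return the timestamp before which content would be purged."""
    return now - timedelta(days=retention_days)


def build_purge_statement(table, timestamp_column):
    """Build a parameterized DELETE for rows older than a cutoff.

    The table and column names are hypothetical and must be verified
    against the real schema. The `local = false` filter (also an assumed
    column) is meant to leave content created on this instance alone.
    """
    # SQL identifiers cannot be bound as parameters, so only plain names are accepted.
    if not (table.isidentifier() and timestamp_column.isidentifier()):
        raise ValueError("table and column must be plain identifiers")
    return (
        f"DELETE FROM {table} "
        f"WHERE local = false AND {timestamp_column} < %s"
    )


if __name__ == "__main__":
    cutoff = purge_cutoff(datetime.now(timezone.utc))
    # Dry run: show what would be executed instead of executing it.
    print(build_purge_statement("post", "published"))
    print("cutoff:", cutoff.isoformat())
```

A script like this could be scheduled daily from cron, with the printed statement handed to a PostgreSQL client only once you are confident it matches your schema.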

    It’s not that bad so far:

    8,0K    volumes/lemmy-ui
    887M    volumes/pictrs
    646M    volumes/postgres
    1,5G    total
    
    • Lodion 🇦🇺@aussie.zone
      1 year ago

      Your instance must be very new, have very few users, be very inactive… or all of the above. I stood up aussie.zone just under a month ago, and its Postgres DB is currently 9.6GB.