Anyone else getting some random errors and 500s on here recently?

    • crbn@sh.itjust.works
      1 year ago

      Just happy you have a server running. No rush, get the sleep you need, and thanks for powering things (:

    • Speex@sh.itjust.works
      1 year ago

      Do you have some monitoring in place? I can offer some monitoring ideas if you need them; it's part of what I do.

      Also take care of yourself. We can go outside if we can’t log in. Or go back to work…

        • Speex@sh.itjust.works
          1 year ago

          I can give a brief(ish) overview, sure.

          Monitor everything :P

          But really, monitor meaningfully. CPU usage matters, but high CPU usage on its own doesn't indicate an issue. Neither does high load.
          High CPU for a long period of time, or outside normal time frames, does mean something. High load outside normal usage times, or when the service isn't running, could indicate an issue. Understand your key metrics and what they mean for failures, end user experience, and business expectations.
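          A rough sketch of the "sustained, not spiky" idea, assuming you already sample CPU somewhere at a fixed interval. The class name, the 90% threshold, and the window size are made up for illustration, not taken from any particular tool:

```python
from collections import deque


class SustainedThreshold:
    """Fire only when a metric stays above a threshold for a full window."""

    def __init__(self, threshold: float, window: int):
        self.threshold = threshold
        # Keep only the most recent `window` samples.
        self.samples = deque(maxlen=window)

    def observe(self, value: float) -> bool:
        """Record one sample; return True if every sample in a full window is high."""
        self.samples.append(value)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(v > self.threshold for v in self.samples)


# Hypothetical usage: CPU percent sampled once a minute, alert after 3 high minutes.
cpu = SustainedThreshold(threshold=90.0, window=3)
readings = [95, 40, 96, 97, 98]
fired = [cpu.observe(r) for r in readings]
```

          A single spike to 95 never fires here; only the run of three high readings at the end does.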

          Start all projects with monitoring in mind. The earlier you begin monitoring, the easier it is to implement. Reconfiguring code and infrastructure after the fact is a lot of technical debt. If you are willing to take that on and can guarantee the debt will be handled at a later time, then good luck. But we know how projects go.

          Assign flags to calls. If your application produces a response that starts from and ends up back at an end user, send an identifying flag with the request. Let that flag travel the entire call path and you can break down traces and find failures. Failures don't have to be errors or timeouts: a call that takes 10x longer than the rest can cascade, and it exposes inefficiency and unreliability.
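          One way to sketch the "flag that travels the whole call" using only the Python standard library. The names `request_id`, `build_logger`, `lookup_user`, and `handle_request` are hypothetical; a real setup would also forward the ID to downstream services (for example in a request header):

```python
import logging
import uuid
from contextvars import ContextVar

# Holds the identifying flag for whatever request is currently being handled.
request_id: ContextVar[str] = ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Stamp every log record with the current request's ID."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id.get()
        return True


def build_logger(stream) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.addFilter(RequestIdFilter())
    handler.setFormatter(logging.Formatter("%(request_id)s %(message)s"))
    logger = logging.getLogger("app")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def lookup_user(logger: logging.Logger) -> None:
    # Deep in the call stack, the ID is still attached without being passed around.
    logger.info("db query")


def handle_request(logger: logging.Logger, incoming_id=None) -> str:
    # Reuse the caller's ID if one was sent, otherwise mint a new one.
    rid = incoming_id or uuid.uuid4().hex
    token = request_id.set(rid)
    try:
        logger.info("request started")
        lookup_user(logger)
        logger.info("request finished")
        return rid
    finally:
        request_id.reset(token)
```

          With this in place you can grep the logs for one ID and see every step a single user request went through, including the slow ones.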

          Spend time on log and error handling. These are your gatekeepers to troubleshooting. The more time spent upfront making them valuable, the less time you have to look at them when shit hits the fan.

          Alerts and monitors MUST mean something. Alert fatigue is real; you experience it every day, I'm sure. That email that comes in with some kind of daily/weekly status information and gets right-clicked and marked as read? That's alert fatigue. Alerts should be built in a way that scales:

          • Take a look as time allows - logs with potential issues
          • Investigate, as something could be wrong - warnings
          • Shit's down, fix it - alert
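          The three tiers above could be sketched roughly like this. The thresholds and the destination names ("dashboard", "ticket", "page") are assumptions for illustration; pick whatever fits your own service:

```python
from enum import Enum
from typing import Optional


class Severity(Enum):
    LOG = "take a look as time allows"
    WARNING = "investigate, something could be wrong"
    ALERT = "it's down, fix it"


# Hypothetical destinations: only ALERT is allowed to wake a human up.
ROUTES = {
    Severity.LOG: "dashboard",
    Severity.WARNING: "ticket",
    Severity.ALERT: "page",
}


def classify(service_up: bool, error_rate: float) -> Optional[Severity]:
    """Map a health check result to a tier; None means nothing to report."""
    if not service_up:
        return Severity.ALERT
    if error_rate >= 0.05:  # assumed threshold: 5% of requests failing
        return Severity.WARNING
    if error_rate > 0.0:
        return Severity.LOG
    return None


def route(severity: Severity) -> str:
    return ROUTES[severity]
```

          The point of keeping the mapping this small is that anything landing in the "page" bucket is always worth acting on.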

          APM matters. Collect that data: you want to see everything from processor usage to response times, latency, and performance. These metrics will help you identify not only alerting opportunities but also efficiency opportunities. We know users can be fickle. How long are people willing to sit and wait for a webpage to load? Unlike in the 1990s, 10-30 seconds is not groovy. Use the metrics and try to compare and marry them with business key performance indicators (KPIs). What is the business side looking at to show things are successful? How can you use application metrics and server metrics to match their KPIs?
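          As a small illustration of why response-time data is worth collecting in detail: averages hide the slow tail that users actually notice. This sketch uses a simple nearest-rank percentile and made-up latency numbers; real APM tools compute this for you:

```python
import math
from statistics import mean


def percentile(samples, pct: int):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up data: 95 fast responses (100 ms) and 5 slow ones (3000 ms).
latencies_ms = [100] * 95 + [3000] * 5

average = mean(latencies_ms)       # looks tolerable
p50 = percentile(latencies_ms, 50)  # the typical request is fast
p99 = percentile(latencies_ms, 99)  # but the tail waits three seconds
```

          Here the average is 245 ms while the 99th percentile is 3000 ms, so a KPI like "pages load in under a second" is better tracked with percentiles than with the mean.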

          Custom scripts are great. They are part of the cycle that companies go through:
          Custom scripts to monitor -> too much to monitor, not enough staff -> SaaS and off-the-shelf solutions (Datadog, SolarWinds, Prometheus, Grafana, New Relic) -> company is huge, SaaS costs are high, and the tools don't accurately monitor our own custom applications -> and we're back to custom scripts. Netflix, Google, and Twitter all have custom monitoring tools.

          Many of the SaaS solutions are low cost, and some even have free tiers. The open source solutions also include excellent, industry-grade tools. All solutions require the team to actively work on them in a collaborative way. Buy-in is required for successful monitoring, alerting, and incident response.

          Log everything, parse it all, win.

  • GhostedIC@sh.itjust.works
    1 year ago

    Is anyone else having trouble posting? I’ve had a lot of times when posts/comments fail to go through (“post” button stays as a spinning circle). Sometimes I can get it after a couple times, but I’ve been stubbornly trying to reply to one comment in c/games all day with no success, despite posting a thread there and making some other posts.

    • crbn@sh.itjust.works
      1 year ago

      Yup just ran into this problem. First reply gave me the endless spin. Second reply on the same sub went through no problem.

    • Zeppo@sh.itjust.works
      1 year ago

      Both on here and lemmy.world, yes. Comments and edits successfully post, but the spinner sometimes goes forever.

  • Tree6024@sh.itjust.works
    1 year ago

    I know this is an older post but I’ll comment anyway.

    I haven’t been getting any errors, but sometimes when I browse, I’d be scrolling through the comments of a post, and suddenly the post would change.

    I didn’t dig into it much, but if it starts happening more often, I’ll record my network traffic (for the browser) and attach the .har file.

  • Red@sh.itjust.works
    1 year ago

    Yeah, this is specifically on sh.itjust.works for those viewing this thread elsewhere, but I’m sure other instances are struggling too. I’m not even seeing 500s, just a generic Firefox “Cannot complete request” on mobile and connection resets on Jerboa.

    Edit: ah, yep, here come the 500s mixed in there as well.

    • imaqtpie@sh.itjust.works (mod)
      1 year ago

      We are absolutely testing server capacity right now. And the reddit blackout is just starting. Gotta be prepared.

      Edit: Apparently we are not testing server capacity just yet, there was a configuration that needed to be updated on the back end, as per the Dude.

  • PCChipsM922U@sh.itjust.works
    1 year ago

    Yeah, experiencing the same as well.

    This is gonna get out of hand quickly… there should have been an announcement telling people NOT to join the most popular instances and instead to scroll down the list and join an instance preferably at the very bottom.

    • God@sh.itjust.works
      1 year ago

      When I joined this one was one of the smaller ones lol. This instance has been alive for like 6 days.

        • God@sh.itjust.works
          1 year ago

          same, I’m always in places like this, my e-mails are all from cock.li. I just love giving someone something like “yeah hi, e-mail me the job offer to [email protected]”. And together with being a bit annoyed at the inefficiency of other lemmy instances and how much they focused on which political party and ideology they should follow, this one was a breath of fresh air :)

          PS. I still find it to be the best working instance out of all that I have tried. The big ones are slow, the small ones have wrong setups and are buggy, this one is just perfect 💯

          God’s stamp of approval