@Badland9085

Badland9085@lemm.ee · 9 months ago

I’m probably replying to a troll, but I will do so anyways for the sake of those who need to read this.

If we aren’t in any way bothered to see such narrow-minded reactions to a wrong being righted, then humankind is definitely headed for a few horrible decades ahead, filled with unnecessary strife and conflict out of pure indifference to each other’s backgrounds and current understanding of the world. And I’d even imagine it’d be worse than what we’re already seeing this decade. I suggest you go back and rethink what really matters as humans, instead of focusing on just some narrow definition of what a win is.

Badland9085@lemm.ee · 9 months ago

Wow, wtf is wrong with this comment section? People don’t realize how laws made in the past just stay around until someone steps up to change them? Or y’all don’t have the capacity to look at the world through a different mindset, even if you disagree with the mindset? As much as we all hope that people around the world are accepting, it doesn’t just happen, and you can’t just hope people who don’t understand your PoV will wake up one day and realize something’s wrong.

Either that, or y’all have grown too cynical or are trying to be cynical just for the sake of it.

Can’t y’all just celebrate the fact that this is happening in Japan, an infamous nation that usually tries fervently to preserve their tradition and status quo?

Badland9085@lemm.ee · 9 months ago

Doesn’t mean the defaults shouldn’t be sane

Badland9085@lemm.ee · 9 months ago

When you have a large population with a strongman government, having little appetite for upheavals is likely impossible. Interacting with the Chinese, I’ve learned that their government is always actively monitoring online spaces to silence dissidents amongst their own people, so people have gotten used to speaking in code words, and they switch to a new one whenever a code gets added to the list of words that would trigger the attention of authorities. Some years ago, it was something related to meditation (it was used as a code for their gatherings), and so the Buddhists had to change the word they use to avoid unnecessary trouble with the authorities. These don’t get mentioned very often in Western news sources.

Badland9085@lemm.ee · 9 months ago

It’s just the CCP imo. Most Chinese people I’ve met are just regular folks. Governments rarely represent their people culturally these days, if at all. Pretty odd when you think about it.

Badland9085@lemm.ee · 10 months ago

For those who’re interested: https://this-week-in-rust.org/

Badland9085@lemm.ee · 10 months ago

Fancy speak for “cheap salesman who has a large network” /0.5s

Badland9085@lemm.ee · 11 months ago

That’s why I get out of my car, look if I’m getting in someone’s way, and adjust as needed.

Takes no more than a minute to be civil and nice to other people, especially to those with special needs.

Badland9085@lemm.ee · edit-2 11 months ago

Kinda don’t like how my handwavy idea is just taken for the most naive turn. I’m not even trying to give precise solutions. I’ve never worked with software at scale, and I expect the playing ground to be pretty different, but I think you’re exaggerating.

Storing all 18 years’ worth of data in all its iterations is ridiculous in the first place, and should never cross the mind of any dev worth their salt for more than a mere nanosecond. Cut all that data down to 3 years, 1 year, or even just a few months, and that’s probably all Reddit needs for backup and analytics. Have separate strategies for backup and analytics if needed. They’ve been doing ads and analytics stuff for a while now, so I expect them to have some architecture in place for that.
Dealing with deleted comments is easy — just unmark them for deletion (hard delete is generally not a thing). It’s most probably not in a backup. It’s just not a user accessible feature to unmark deletion. Even if they do get deleted eventually, what’s the time frame for a cleanup like? Every day? A few months? They still need an entry for that comment for the threads feature to work, so at best, they null the content of the comment out.
ChatGPT is just an example. No need to beat a bad example to death and use that as an argument against a whole argument. And I’m pretty sure you’ve not read the rest of the last comment.
I think you’re over-estimating how much of an impact the API pricing fiasco had, and once again, you don’t seem to have read my previous comment and acknowledged that. Nobody in their right mind is going to do this comment read and scan for every single Reddit user. Not manually for sanity. Not programmatically for cost. It’s why they need some way(s) to identify which users to watch out for. They’re not going to do that manually though, right? That would be costly too, from a manpower’s perspective, and human labor is expensive, and scales much worse than programs.
Common sense would ask that if all they did is to restore their database to a certain state, how do they deal with new comments and changes that were added between the PiTR and whenever they make the restore? Are they just gone now? Isn’t that bad, cause they’re potentially losing new, quality content?

Look buddy, all I want to say is that I don’t think your method against Reddit would work. It’s basically a gamble though, so I’m definitely not against an attempt at it. I just want to point out the possibility of it not working. I don’t think there are surefire ways against their attempt at restoring content.

Badland9085@lemm.ee · edit-2 11 months ago

It’s hard to say that without knowing what their infrastructure’s like, even if we think it’s expensive. And if they built their stack with OLAP being an important part of it, I don’t see why they wouldn’t have our comment edit histories stored somewhere that’s not a backup, and maybe they just toss dated database partitions into some cheap cold storage that allows for occasional, slow reads. They’re not gonna make a backup of their entire fleet of databases for every change that happens. That would be literally insane.

Also, tracking individual edit and delete rates over time isn’t expensive at all, especially if they just keep an incremental day-by-day, maybe more or less frequent, change over time. Or, just slap a counter for edits and deletes in a cache, reset that every day, and if either one goes higher than some threshold, look into it. There are probably many ways to achieve something similar in a cheap way.
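
To make the counter idea concrete, here’s a rough sketch of what I mean. Everything in it is made up for illustration: the class name, the threshold value, and the plain in-memory `Counter` standing in for whatever cache they’d actually use. I have no idea what Reddit really runs.

```python
from collections import Counter
from datetime import date

# Hypothetical value; the idea only calls for "some threshold".
DAILY_THRESHOLD = 50


class DailyActivityCounter:
    """In-memory stand-in for a cache of per-user edit/delete counts."""

    def __init__(self, threshold=DAILY_THRESHOLD):
        self.threshold = threshold
        self.day = None
        self.counts = Counter()

    def record(self, user, action, today):
        """Count one 'edit' or 'delete'; return True once the user is over the threshold."""
        if today != self.day:
            # New day: reset every counter.
            self.day = today
            self.counts.clear()
        self.counts[(user, action)] += 1
        return self.counts[(user, action)] > self.threshold


counter = DailyActivityCounter(threshold=3)
day = date(2023, 7, 1)
flags = [counter.record("alice", "edit", day) for _ in range(5)]
next_day_flag = counter.record("alice", "edit", date(2023, 7, 2))
```

With a threshold of 3, the fourth and fifth edits on the same day get flagged, and the count starts over the next day. That’s one dictionary bump per edit or delete, which is why I don’t think this part is expensive.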

And ChatGPT is just an example. I’m sure there already are other out-of-fashion-but-totally-usable language models or heuristics that are cheap to run and easy to use. Anything that can give a decent amount of confidence is probably good enough.

At the end of the day, the actual impact of the API fiasco on their business is limited to a subset of power users and tech enthusiasts, which is vanishingly small. I know many that still use Reddit, some begrudgingly, despite knowing the news pretty well. Why? Cause the contents are already there. Restoring valuable content is important for Reddit, so I don’t see why they wouldn’t want to sink some money into ensuring that they keep what makes em future money. It’s basically an investment. There are some risks, but the chance of earning the cost back with returns on top is high.

Badland9085@lemm.ee · edit-2 11 months ago

Just China things.

Hold up

Badland9085@lemm.ee · 11 months ago

You misunderstood my comment. Reddit probably has every version of your edits, so all they need to do is to put all your past comments through ChatGPT or something, by time in descending order. The first sensible one gets accepted. In some sense, that’s just like how a person would do it. This way, they don’t have to deal with individual approaches to obfuscating or messing with their data.
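
Roughly what I have in mind, as a sketch. The `looks_sensible` check is a dumb placeholder heuristic I made up so the example runs on its own; in the scenario I’m describing it would be ChatGPT or some other model. The `(timestamp, text)` history format is also just an assumption.

```python
def looks_sensible(text):
    """Placeholder check, NOT a language model: at least 3 words, mostly distinct."""
    words = text.split()
    if len(words) < 3:
        return False
    return len(set(words)) / len(words) > 0.5


def pick_version_to_restore(history):
    """history is a list of (timestamp, text) pairs.

    Walk the versions newest first and return the first one that passes
    the check, or None if none of them do.
    """
    for _, text in sorted(history, key=lambda version: version[0], reverse=True):
        if looks_sensible(text):
            return text
    return None


history = [
    (1, "Use a heap here, it keeps the smallest item on top."),
    (2, "Use a binary heap here, it keeps the smallest item on top."),
    (3, "potato potato potato potato"),
    (4, ""),
]
restored = pick_version_to_restore(history)
```

Here the blanked version and the “potato” overwrite both get skipped, and the newest real version (timestamp 2) is what comes back. The point is that it doesn’t matter how you scrambled the comment, only whether an older version still passes.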

I was gonna just wait till this whole fiasco dies down, let it sit for a couple of months to a year, before going ahead and slowly removing my comments over time. It’s easy to build triggers for individual users to detect attempts at mass edit or mass deletion of comments after all, which may trigger some process in their systems. Doing it the low profile way is likely the best way to go.

Badland9085@lemm.ee · 11 months ago

Not too hard to defeat this solution though: put your comments through something like ChatGPT and if it can understand what you wrote, it’s probably good enough for em to restore it.

Maybe the answer is to write some nonsensical answer that’s understood by human readers as utter nonsense, but still recognized by LLMs as a “good comment”.

Badland9085@lemm.ee · edit-2 11 months ago

Won the battle in a Pyrrhic victory, but (maybe an “and” instead?) lost the war

Badland9085@lemm.ee · 11 months ago

Just like my luck IRL! :’)

Badland9085@lemm.ee · 11 months ago

You are correct. This notion of “size” of sets is called “cardinality”. For two sets to have the same “size” is to have the same cardinality.

The set of natural numbers (whole, counting numbers, starting from either 0 or 1, depending on which field you’re in) and the integers have the same cardinality. They also have the same cardinality as the rational numbers, numbers that can be written as a fraction of integers. However, none of these have the same cardinality as the reals, and the way to prove that is through Cantor’s well-known Diagonal Argument.

Another interesting thing that makes integers and rationals different, despite them having the same cardinality, is that the rationals are “dense” in the reals. What “rationals are dense in the reals” means is that if you take any two distinct real numbers, you can always find a rational number between them. This is, however, not true for integers. Pretty fascinating, since this shows that the intuitive notion of “relative size” actually captures the idea of, in this case, distance, aka a metric. Cardinality is thus defined to remove that notion.
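
For anyone who wants to see the two claims written out, here’s a quick sketch. I’m assuming the naturals start at 0, and the particular map below is just one standard choice of pairing, not the only one.

```latex
% Same cardinality: an explicit bijection f from the naturals to the integers.
\[
  f : \mathbb{N} \to \mathbb{Z}, \qquad
  f(n) =
  \begin{cases}
    \dfrac{n}{2}      & \text{if } n \text{ is even}, \\[6pt]
    -\dfrac{n + 1}{2} & \text{if } n \text{ is odd},
  \end{cases}
\]
% so 0, 1, 2, 3, 4, ... maps to 0, -1, 1, -2, 2, ...

% Density: for real numbers a < b, pick a natural n with 1/n < b - a
% and set m = \lfloor n a \rfloor + 1. Then n a < m \le n a + 1, hence
\[
  a < \frac{m}{n} \le a + \frac{1}{n} < b .
\]
% The integers fail this: no integer lies strictly between 0.2 and 0.8.
```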

Badland9085@lemm.ee · 11 months ago

Not sure why artists are brought up here but I guess that’s one of the highly affected groups.

Just to talk about that particular consequence, however, I don’t agree with your take. There are AI models trained on the works of specific artists, and the end result is that the AI is really good at producing work that’s similar to that artist’s work, effectively creating an alternative to that artist, even if it’s of slightly lesser quality and lacks the depth of the original. While this would likely not affect the artist in the short term, in the long term, new prospects who don’t yet know the artist well enough would likely be unable to tell the difference in quality, and may even go straight to the AI model since that’s distributed cheaply or even for free. It may also reflect negatively on the original artist among people who don’t know the artist, as the works from the AI would likely be more abundant, and people not in the know may think that the original artist was in fact just producing their works through AI. It is highly discouraging for artists who have worked hard to hone their craft, only to have people think that their works are little different from the AI’s, or even a mimicry of it (don’t underestimate misinformation).

There have been many instances where such training was done without the knowledge of the artist. Imagine just waking up one day, and finding that there’s someone or something that can very closely reproduce your works, ones you’ve taken many years of practice to produce, whose quality is almost unique to you. There’s a blatant lack of respect for the hard work that people put into their craft, one that seemingly belittles their blood and tears, and could even be a mockery of their existence. Some artists don’t have other jobs; their art and craft is their job, and some may have even sacrificed learning the skills needed for other jobs to pursue their passion.

Saying that AI is not intended to replace artists, but to improve accessibility, is like saying ATMs weren’t meant to replace bank tellers. True, there’s much less skill required for bank tellers, and getting cash out of banks is an important process that should be swift with almost no errors, so replacing bank tellers with ATMs is a general good, except for the bank tellers, whom banks can then retrain for other jobs. Since then, the job has virtually gone extinct, and almost nobody would want to become a bank teller, and if anyone would like to, they would need to perform better than ATMs. Artists require great skills and creativity, many of which are not easily trained or obtained. Seeing an automated system produce works that are acceptable to most people would either greatly discourage new artists or perhaps even entirely remove the idea of becoming an artist for most people. It raises the barrier to becoming an artist: not only do you need to stand out, you also need to be good enough that people can’t just train an AI model on your work to produce results that are nearly indistinguishable from yours. How many more years do people need to train to be that good? For those who have a job but wish to become an artist, abandoning their job to focus on their craft will likely become a much more difficult choice to make. Also, I don’t doubt this would further raise the prices of commissions due to how much work artists would have to put in, and this would only get worse at a rate that’s much faster than in a scenario without AI.

So a line should be drawn somewhere. AI trained on public works or artist-approved works is definitely okay. All other options will likely need further discussion and scrutiny. We’re talking about the possibility of ruining an already perilous career path, one whose works are coveted.

Badland9085@lemm.ee · 11 months ago

They’re literally burning bridges after crossing them huh. Web scraping is illegal? Their fucking search engine was powered by a web scraper.

WEI is plain anti-competition to me now. Most, if not all, of their stated reasons are now just a facade to me.

Fuck Google. I know this isn’t constructive or helpful, but fuck em.

Badland9085@lemm.ee · 11 months ago

And so you end up driving up food and housing demand, with no guarantee that the revived population can provide to the supply side. :P