Slashdot reader Unpopular Opinions requests suggestions from the Slashdot community:
Lately, a boom of companies has decided to play the “nice guy” card, providing us with a trove of information about our own sites, DNS servers, email servers, and pretty much any other online service we host.
Which is not anything new… Companies have been doing this for decades, except as paid services you requested. Now the trend is that basically anyone can do it to my systems, and they are always more than happy to sell anyone, me included, the data they collected without authorization or consent. It’s data they never had the rights to collect and/or compile to begin with, including data collected through access attempts via known default accounts (Administrator, root, admin, guest) and/or leaked credentials from hacked databases when a few elements seemingly match…
“Just block those crawlers”? That’s what some of those companies advise, but not only does the site operator have to automate it themselves, not all companies publish lists of their source IP addresses or identify their crawlers. Some use multiple crawler domain names that differ from their commercial product’s, or use cloud providers such as Google Cloud, AWS and Azure, so one can’t just block access to their company’s networks without massive implications. They also change their own information with no warning and, many times, with no updates to their own lists. Then there is the indirect cost: computing cost, network cost, development cost, review cycle cost. It is a cat-and-mouse game that has become very boring.
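For readers wondering what "automating it" looks like in practice, here is a minimal sketch, assuming a scanner company publishes its source ranges as CIDR blocks. The ranges below are reserved documentation networks, not any real company's addresses, and the names `SCANNER_RANGES`, `load_networks` and `is_blocked` are hypothetical. It uses only Python's standard `ipaddress` module.

```python
import ipaddress

# Hypothetical scanner ranges. These are reserved documentation networks
# (not real crawler addresses); a real list would come from the vendor.
SCANNER_RANGES = ["192.0.2.0/24", "198.51.100.0/25", "2001:db8::/32"]


def load_networks(cidrs):
    """Parse CIDR strings into network objects, skipping malformed entries."""
    networks = []
    for cidr in cidrs:
        try:
            networks.append(ipaddress.ip_network(cidr.strip(), strict=False))
        except ValueError:
            # A bad line in a published list should not break the whole filter.
            continue
    return networks


def is_blocked(client_ip, networks):
    """Return True if client_ip falls inside any of the listed networks."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Unparseable addresses are not matched here; handle them elsewhere.
        return False
    return any(addr.version == net.version and addr in net for net in networks)


if __name__ == "__main__":
    nets = load_networks(SCANNER_RANGES)
    for ip in ("192.0.2.55", "198.51.100.200", "2001:db8::1"):
        print(ip, is_blocked(ip, nets))
```

Even this small check only works while the published list is complete and current, which is exactly what the submitter says cannot be relied on, and it does nothing about crawlers running from shared cloud address space.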
With rising concerns and ethical questions about AI harvesting and learning from copyrighted work, how are those security companies any different from AI, and how could one legally put a stop to this?
Block those crawlers? Change your Terms of Service? What’s the best fix… Share your own thoughts and suggestions in the comments.
How can you stop security firms from harvesting your data?