damn spammers
Saturday 19 November, 2005 at 8:29AM (Nereus)
I have switched off trackbacks completely due to the sheer amount of spam I've been getting hit with. Nothing was getting published thanks to SpamLookUp, but the CPU load was getting hogged by the SpamLookUp process - something I predicted would happen when 6Apart (creators of MT) decided to get rid of MT-Blacklist in favor of SpamLookUp. I am really not impressed - in fact, pissed would be a better description. I hardly get any comments to this weblog now since only TypeKey registered comments get published immediately, which kills any flow of conversation due to the nature of visitors to this blog - something I also predicted would happen when 6Apart dropped MT-Blacklist. If I opened up commenting again, this weblog would be inundated with spam comments and trackbacks in no time - something that MT-Blacklist was able to control that SpamLookUp cannot. I can only presume that traffic to this site has more than halved since upgrading to MT3.2 and the SpamLookUp system as a result, as illustrated in the following bar chart. Unfortunately Jay Allen (creator of MT-Blacklist and now 6Apart Product Manager) decided not to upgrade MT-Blacklist for MT3.2, so that was the end of that, unless someone who has the skills decides to take up the reins. Sadly, nobody has.
Another issue is that SpamLookUp blocks trackback pings where the ping IP doesn't match the IP of the weblog it supposedly comes from. Good idea on the surface, but it doesn't take into account services like Haloscan, which host comments and trackbacks off-site. I found this out by chance when glancing at the junked trackback pings one day and found valid trackbacks being wiped out thanks to SpamLookUp - no idea how many had been deleted prior to that. As a result I now have to take time to check through all junked pings, which renders SpamLookUp pretty much pointless in that area. Sure I could whitelist Haloscan and other similar services, but then spammers also use those services, so once again I'd be stuffed. Nice one.
I used to get a lot of 'drive-by' comments and trackbacks on this site - that is, comments and trackbacks by first-time visitors, often visitors who returned if a conversation started up. This return traffic has been lost, with a significant portion of the blame being the SpamLookUp process not allowing open commenting. Yes, SpamLookUp has a word filter option, but it is not nearly as effective as MT-Blacklist imho - particularly when it comes to the different data types. I also have a suspicion that if I relied only on that and put a huge list in there like I had on MT-Blacklist, CPU load would again get out of hand, something that MT-Blacklist had dealt with. Perhaps I should try it anyway? Has anyone had success using SpamLookUp purely as a wordfilter blacklist with open commenting, and not had CPU load jump up?
I need a service that works, allows open commenting, and does not use up my time administering it. MT-Blacklist served that purpose comparatively well, SpamLookUp presently does not, and I don't have time to feck about with these options - I have study to do.
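For anyone wondering why the trackback IP check mentioned above trips over hosted services like Haloscan, here is a rough sketch in Python. It is only an illustration of the general idea (compare the IP that sent the ping with the IPs the claimed weblog resolves to), not SpamLookUp's actual code; the function name, the `resolve` parameter and the example addresses are all made up for the sketch.

```python
import socket
from urllib.parse import urlparse


def ping_ip_matches_source(ping_ip, source_url, resolve=socket.gethostbyname_ex):
    """Return True if the pinging IP is one of the IPs the source weblog resolves to.

    Hypothetical helper: 'resolve' defaults to a real DNS lookup but can be
    swapped for a stub so the check can be tried without a network.
    """
    host = urlparse(source_url).hostname
    if host is None:
        return False  # no usable hostname in the claimed source URL
    try:
        _name, _aliases, addresses = resolve(host)
    except OSError:
        return False  # lookup failed, so the claim cannot be confirmed
    return ping_ip in addresses


# Stub resolver standing in for DNS: the weblog lives at 192.0.2.10.
def fake_dns(host):
    return (host, [], ["192.0.2.10"])


# A ping sent by the weblog's own server passes the check.
own_server = ping_ip_matches_source("192.0.2.10", "http://blog.example/post", fake_dns)

# A ping sent on the weblog's behalf by an off-site service fails it,
# even though the trackback itself is perfectly valid.
hosted_service = ping_ip_matches_source("198.51.100.7", "http://blog.example/post", fake_dns)
```

The second case is the Haloscan situation: the ping is genuine, but it comes from the hosting service's address rather than the weblog's, so a strict comparison treats it as junk.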
Apparently MT-Blacklist will work with MT3.2 with some minor adjustments to the codebase, so perhaps I'll try that, although the damage has already been done, and the master blacklist for MT-Blacklist is no longer readily available or updated.
Arvind
In the hope of starting some conversation :) With the CPU load thing I find it a bit odd that you didn't experience the same with MT Blacklist. I think perhaps you never experienced as severe an attack as you are now, because both MT Blacklist and SpamLookup are plugins, hence any antispamming will consume CPU cycles (unlike mod_security). Also SpamLookup is far more efficient than MT Blacklist. I can't remember off the top of my head, but I could swear MT Blacklist did something like check a comment/trackback against *every* entry on your blacklist, which would obviously take up CPU cycles....
Nereus
Heya Arvind :) Yeah there was a load issue with MTB at one stage, but from memory that problem was due to rebuilding.. whatever it was, Jay resolved it. As far as looking up matches, SpamLookUp is having to check every spam against external domain and IP blacklist service databases (opm.blitzed.org, surbl.org) rather than its own database, which could certainly slow things down, with the addition that we as users have no control as to what may be on those databases. Additionally, SLU is looking to see if the commenter's email addy was ever used previously on an approved comment, and likewise if their web URL was ever approved on a previous comment.. this is a whole lot more searching and matching than MTB ever did. As far as the process goes, MTB would stop searching on first 'deny' match (and would only continue if it was just a 'moderate' match, in case there was a 'deny' match further on in the search), and the way it was set up made it very quick too. As far as I understand it, SLU does not stop on first match, but takes into account all matches and gives an average scoring from there, and what should be a denied message can end up just being moderated instead because of other scoring attributes (in fact, this happens a significant amount in SLU).
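To make the difference described above a bit more concrete, here is a rough sketch in Python. It is not the code of either plugin; the rule list, the test functions and the score values are invented purely to show the two approaches as described in the comment: stop at the first 'deny' match, versus run every test and average the scores.

```python
import re

# Hypothetical blacklist entries: (action, pattern).
RULES = [
    ("moderate", re.compile(r"casino", re.I)),
    ("deny", re.compile(r"cheap-pills\.example", re.I)),
]


def first_match_filter(text, rules=RULES):
    """Blacklist style: return at the first 'deny' match.

    A 'moderate' match is remembered, but the search carries on in case
    a 'deny' entry matches further down the list.
    """
    verdict = "publish"
    for action, pattern in rules:
        if pattern.search(text):
            if action == "deny":
                return "deny"
            verdict = "moderate"
    return verdict


def keyword_test(comment):
    """Score -1 when the text hits a 'deny' entry, otherwise abstain."""
    return -1.0 if first_match_filter(comment["text"]) == "deny" else None


def known_email_test(comment, approved=frozenset({"regular@example.org"})):
    """Score +1 when the email was seen on an approved comment, otherwise abstain."""
    return 1.0 if comment["email"] in approved else None


def scoring_filter(comment, tests=(keyword_test, known_email_test), junk_threshold=0.0):
    """Scoring style: run every test, average the scores, compare to a threshold."""
    scores = [s for s in (test(comment) for test in tests) if s is not None]
    average = sum(scores) / len(scores) if scores else 0.0
    return "junk" if average < junk_threshold else "moderate"


spam = {"text": "visit cheap-pills.example casino", "email": "regular@example.org"}
blacklist_verdict = first_match_filter(spam["text"])
scoring_verdict = scoring_filter(spam)
```

With these made-up numbers the blacklist approach denies the comment outright, while the averaged score comes out at exactly 0 (the -1 keyword hit is cancelled by the +1 for a previously approved email address), so the same comment only ends up in moderation.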
I also liked the 'comment denied' message from MTB - the few manual spammers are left in no doubt the weblog is protected, whereas SLU just gives the moderated message regardless, so the spammers may keep trying in the hope something gets through.. maybe. If the keyword filter feature on SpamLookUp was worked on so it could be used and administered the same way as MTB was, then I'd use it alone and deactivate the other parts of SLU, since essentially it would be the same service as MTB was.. although it would probably do a full search due to the scoring system.. Whatever the case, SLU is causing me more administrative work than MTB was ..although I had developed a pretty effective DB on MTB with some of the patterns and a few regex.. Ah well. I wrote this blog entry when I was pretty pissed - I've calmed down a bit now.
erin
Nereus,
Nereus
Hiya Erin :) Yeah I was at around 3,000 entries at one point, although adding a bunch of good URL patterns made a big dent in that number. Possibly an expiry on some would be handy too.. ie: if no hits on a particular URL string for over X months, then drop it off the active list, just to keep the size of the list down. I have too many other plugins active on the site to change back to an earlier version of MT now though. MTB can work on MT3.2 with some tweaking (patch available here and more details here), but there is no master blacklist available anymore. I've asked Jay for a copy if he still has one. There certainly are a lot of good features and new plugins available that couldn't work with pre-MT3.2 versions, so it's still certainly worth considering an upgrade, just a shame MTB wasn't at least updated to work fluently with MT3.2 before being left to die, considering what a big overhaul MT3.2 was.
Nereus
Great..
I no longer have trackbacks because of spam, and now I'm getting hammered with comment spam that, for the most part, is not getting junked but instead requires my approval, so I have to go through and select and delete every one of them a number of times every day, but make sure I don't accidentally junk the valid ones - so I have to fucking check each one!
Jay Allen
"How is SpamLookUp so much more superior than MT-Blacklist??? Hmmm???" Because for the vast majority of Movable Type users, Spamlookup works very, very well and requires absolutely NO attention. If you don't pay very careful attention to MTB or make any mistakes, it will work just as well as a two legged table. I get on average about one non-junked spam per day (and that's moderated). If you're not experiencing the same, you may want to re-examine any changes you made to the SpamLookup configuration or to the junk threshold (which should be 0). The defaults work quite well. Also, let's be clear. Six Apart didn't "give up" on Blacklist. It was never theirs. I did, but that's because I couldn't do it justice. At this point, as you mentioned, it is open-source. Have at it. As you know, I understand how frustrating it is to get spammed and have to deal with it. But instead of directing your anger at the people who are trying to help, perhaps it's better to channel that towards the spammers somehow?
Nereus
Nods, I have SLU set to the defaults, and yes, it was working well for quite a while, which is why I had to eat the words I said in the beginning when it first came out ..but more recently there has been a steadily increasing amount of spam getting past most of the SLU blocks to the moderation point where I have to check them individually (don't get me wrong, there is still a heap getting blocked direct to the junk file). Presumably spammers are working out ways to get around SLU, and it is solely the TypeKey rego that stops some of them getting to published status right now.
I've started using the wordfilter more extensively to try and combat it, which should help. Ironic that I'm falling back to the very feature that made MTB so good in order to try and keep SLU more effective though. If I'm getting a steady increase in spam getting past most of the SLU blocks, then others will start to notice it soon too I'd imagine. It's still a trickle at the moment, but enough to be annoying, and quite likely a precursor to a bigger spam flow getting past most of SLU's blocks. Re 6A 'dropping' MTB, okay yeah, bad way to phrase it - as you said, it was never theirs - although 6A sure as hell endorsed it in a big way by awarding it the top plugin award in a high-profile (relative to MT) international competition (well deserved too imho).. regardless, you now work there, so I'm in no position to comment further. Believe me Jay, if I had even half your skills, I would certainly be having a go at working on MTB, but alas I do not have those skills and am busy studying full time already, but not in comp sci. As for assigning blame, of course the spammers are the ones at fault here. As I mentioned earlier in my comments, I was pissed when I wrote what I wrote in the first place, so of course there's going to be a certain amount of irrationality as a result. I'm frustrated having to compromise when having open commenting was something really important to me, which MTB worked well with compared to SLU, at least for me. I dunno. I'm starting to wonder if it's even worth the frustration - there are more important things to be concerned about in life than a friggin weblog. Hmm.. I think that pretty much hits the nail on the head..
Nereus
Ohhh I see what their latest trick is..
I've been getting a bunch of spam comments getting past all the SLU blocks and being stopped only by the TypeKey registration, and they've all contained links to valid sites that most people would not want to add to their SLU word filter - sites such as google.com, forbes.com, foxnews.com and so on. I was wondering why the spammers would want to bother spamming using these URLs, then suddenly it clicked: if a comment gets published, the email addy that comment used will then give +1 to the scoring of any future comments that use that same email address. Nice one, using SLU's own features against itself. It seems that every day I'm having to manually sort through a few more moderated comments than the day prior, comments that have only been moderated because of TypeKey registration requirements. This is becoming a chore, and not an enjoyable one. If I could get my hands on one of these fucking spammers...
Nereus
I'm once again forced to limit commenting to TypeKey users only, and this is after releasing that restriction only yesterday so that non-registered commenters would be moderated rather than denied outright.. in just the small amount of time it took me to write the previous comment, I received fifteen more spam comments using the same tactic to bypass SLU, and they were only moderated because they were not TypeKey registered. I am sick and tired of having to manually check every single one of these moderated comments every frigging day. MT-Blacklist did not require anywhere near this much maintenance to keep the spammers out. This has become a time-consuming and frustrating chore, and I don't want to hear anyone tell me how much better and more effective SLU is any more, because it might have a bunch of different features and maybe the standard of coding is better (not my claim btw), but the fact is that when it comes to the crunch, SLU has created far more administrative work for me than MTB did.
If I still had my weblog running with open commenting (TypeKey registration not required) like I did with MTB, this weblog would have thousands upon thousands of published spam comments by now. If ANYONE has a copy of the last full master blacklist, please email it to me. I asked Jay for a copy but never received a response. If you're considering commenting here about the pros of SLU and cons of MTB, don't bother - I've most likely already heard it (and I've certainly experienced the difference) and I remain completely unconvinced. Oh, and if you're thinking "what's so bad about typekey only commenting" then I will tell you: since switching to SLU (because it came bundled with MT3.2 and MTB was never upgraded for MT3.2 - I never would have changed from MTB otherwise), valid comments to this website have dropped by about 90 to 95% because many people just can't be farked registering and signing in. Hell, I'm one of them! Unless I'm a regular visitor to a site, if I stumble into a site and decide to comment and find they have registered commenting only, then 9 times out of 10 forget it, I'm gone and I'm not coming back. Had I been able to comment without pissing around registering and logging in, I would've, and I would've most likely returned to see if anyone replied to my comment. WHY IS IT SO DAMN HARD FOR PEOPLE TO UNDERSTAND THAT THERE IS A HUGE HUGE AMOUNT OF WEBSURFERS OUT THERE THAT THINK EXACTLY THE SAME WAY I DO? Hell, I've proved it - my comments have died and my traffic has dropped by more than 50% since changing to MT3.2 and SLU, and I'm spending waaay more time having to clean up spam.. WTF? I guess my traffic is different from the traffic the developers get (I wonder if they researched it) - traffic here is much more the drive-by and perhaps return type, as opposed to a bunch of regulars. GRRRR. In fact fark this.. I might just delete this weblog in the new year - it's not really enjoyable anymore.
I don't write as often and don't write much of interest or post entries that are enjoyable to read anymore, basically because by the time I get through all the moderated spam I'm just not in the mood to write anything - I just want to logoff and get the hell away from this thing that's tied around my neck like a great big ungainly dead albatross. ACK!
Nereus
November 19, 2005 9:47 AM
Archive structure change also has much to do with the drop in traffic, but the commenting is pretty much totally dead now ..except for the spam.