ikiwiki/doc/todo/commandline_comment_moderat...

115 lines
5.7 KiB
Plaintext
Raw Normal View History

2020-05-27 06:18:38 +02:00
So I have enabled the [[plugins/moderatedcomments]] plugin on my wiki. and good thing that! around 1000 spammy comments showed up in the last 3 months! Awful!
2015-07-23 23:03:14 +02:00
It's pretty hard to figure out the ham and the spam in there. One thing I was hoping was to use the power of the commandline to filter through all that stuff. Now, it seems there's only a "ikiwiki-comment" tool now, and nothing to examine the moderated comments.
2015-07-23 23:14:00 +02:00
It seems to me it would be great to have some tool to filter through that...
So it turns out it was over 3000 comments. The vast majority of those (every one but 42 comments) were from the IP `46.161.41.34` which i recommend null-routing everywhere. I used the following shell magic to figure that out:
<pre>
2015-09-10 16:32:48 +02:00
#!/bin/sh
set -e
cd .ikiwiki/transient || {
echo could not find comments, make sure you are in a ikiwiki source directory.
exit 1
}
2015-07-23 23:14:00 +02:00
# count the number of comments
2015-09-10 16:32:48 +02:00
echo found $(find . -name '*._comment_pending' | wc -l) pending comments
2015-07-23 23:14:00 +02:00
# number of comments per IP
2015-09-10 16:32:48 +02:00
echo IP distribution:
2015-07-23 23:14:00 +02:00
find . -name '*._comment_pending' | xargs grep -h ip= | sort | uniq -c | sort -n
2015-09-10 16:32:48 +02:00
# generate a banlist for insertion in `banusers`, assuming all the
# pending comments are spam
echo banlist would look like:
2015-07-23 23:14:00 +02:00
find . -name '*._comment_pending' | xargs grep -h ip= | sort -u| sed 's/ ip="//;s/"//;s/^/- ip(/;s/$/)/'
2015-09-10 16:32:48 +02:00
echo to remove comments from a specific IP, use one of those:
find . -name '*._comment_pending' | xargs grep -h ip= | sort -u \
| sed 's/ ip="//;s/"//;' \
| while read ip; do
echo "find . -name '*._comment_pending' | xargs grep -l 'ip=\"$ip\"'| xargs rm"
done
echo to flush all pending comments, use:
echo "find . -name '*._comment_pending' -delete"
2015-07-23 23:14:00 +02:00
</pre>
The remaining 42 comments I reviewed throught the web interface, then flushed using the above command. My final addition to the banlist is:
<pre>
- ip(159.224.160.225)
- ip(176.10.104.227)
- ip(176.10.104.234)
- ip(188.143.233.211)
- ip(193.201.227.41)
- ip(195.154.181.152)
- ip(213.238.175.29)
- ip(31.184.238.11)
- ip(37.57.231.112)
- ip(37.57.231.204)
- ip(46.161.41.34)
- ip(46.161.41.199)
- ip(95.130.13.111)
- ip(95.181.178.142)
</pre>
--[[anarcat]]
2015-09-10 16:32:48 +02:00
Update: i made a script, above. And the banlist is much larger now so the above list is pretty much out of date now... --[[anarcat]]
2020-05-27 06:18:38 +02:00
Another update, 2020: I rewrote the script to support interactive batch approval and running from a cron job. I have published the [script in my own repo](https://gitlab.com/anarcat/scripts/blob/master/ikiwiki-comment-moderation) since it's python (and not perl), but I would be happy to provide it as a patch here if that's acceptable.
The basic usage is as follows. First, you add the script in a cron job:
9 * * * * /home/anarcat/bin/ikiwiki-comment-moderation --source-dir=$HOME/source/
This will run every hour. When there are no comments to moderate, the script is silent and you will not get mail. Otherwise you will get something like this:
date ip user subject content
2020-05-27T03:42:05Z 192.168.0.116 spammer name subject spammy comment
1 comments pending moderation
Then you can either go through the web interface to approve/deny the
comments, or call the script by hand, interactively, for example:
w-anarcat@marcos:~/source$ ~anarcat/bin/ikiwiki-comment-moderation --source-dir=$HOME/source -i --verbose
Date : 2020-05-27T04:00:23Z
Ip : 192.168.0.116
Claimedauthor : spammer name
Subject : subject
Content : spammy comment
approve, delete, ignore? [a/d/I] a
moving /home/w-anarcat/source/.ikiwiki/transient/blog/2020-04-27-drowning-camera/comment_1_07f43231a14d0ee6e78d1030aa6a7985._comment_pending to /home/w-anarcat/source/blog/2020-04-27-drowning-camera/comment_1_07f43231a14d0ee6e78d1030aa6a7985._comment
adding to git...
[master 8f5cb10f] approve comment
1 file changed, 9 insertions(+)
create mode 100644 blog/2020-04-27-drowning-camera/comment_1_07f43231a14d0ee6e78d1030aa6a7985._comment
Énumération des objets: 8, fait.
Décompte des objets: 100% (8/8), fait.
Compression par delta en utilisant jusqu'à 12 fils d'exécution
Compression des objets: 100% (5/5), fait.
Écriture des objets: 100% (5/5), 566 bytes | 566.00 KiB/s, fait.
Total 5 (delta 3), réutilisés 0 (delta 0)
To /home/w-anarcat/source.git
91669038..8f5cb10f master -> master
approved comment
And you're done! In the above case, the test comment was actually
approved (by pressing `a`), but you can also hit `d` to just delete
the comment. The default (`i`) is to ignore the comment.
I find that this is generally faster than going through a web browser, although to be as fast as the CGI interface, there would need to be a final dialog that says "delete all ignored comments" like in the CGI. Exercise for the reader or, I guess, myself when I got too many junk comments to process...
Feedback, as usual, welcome. -- [[anarcat]]
2020-11-02 10:15:23 +01:00
> great stuff, thanks! I've got a similar-ish script that sends me an email summary, but I don't
> have the CLI stuff myself. What I'd like to see on the web form is: display the IP (or user)
> responsible for each comment and enable banning them at the same time as moderating the comment.
> However I'd also like to expand the data attached to the banlist, so it's clear *why* an entry
> was added (e.g. blog spam), and when — because I would prefer all such bans to be time limited.
> (implementation wise, this might be easiest achieved if one of the comment plugins was responsible
> for maintaining a separate data structure with this info which was processed into a ban list to
> append to the main one.) *—[[Jon]], 2020-11-02*