How-To: Extract PHP-Nuke Articles from an SQL INSERT Statement in Python

I found an old SQL dump of this site yesterday which I thought was lost to time. The site ran php-nuke back then, and rather than importing the SQL into a database and fiddling with it to export all the old articles I thought I’d take the python hammer to it. It’s a bit of a bodge/hack – but it seems to work well enough.

The idea is you copy JUST the insert statement which deals with “stories” (nuke’s word for articles / posts) from the SQL dump and put it in a file (mine’s called “nuke_stories.sql”), then run the parse_nuke_stories python script and it’ll dump out all the articles into separate files with filenames like “Article_ID_YEAR_MONTH_DAY.html” and such.

Here’s the top few lines of my SQL insert statement:

INSERT INTO `nuke_stories` (`sid`, `catid`, `aid`, `title`, `time`, `hometext`, `bodytext`, `comments`, `counter`, `topic`, `informant`, `notes`, `ihome`, `alanguage`, `acomm`, `haspoll`, `pollID`, `score`, `ratings`, `associated`) VALUES
(1, 0, 'r3duxAdmin', 'W00t! The site it be a-changin''...', '2004-09-10 12:35:06', 'Well, PHP-Nuke seems to be up and running, so now it''s time to mess w/ it in a huge way and configure all the millions of options needed to make it just the way I want.\r\n

\r\nDaunted? Me? Er, yeah. You could say that...', '', 0, 83, 1, 'r3duxAdmin', '', 1, '', 0, 0, 0, 0, 0, ''),

Here’s what it’ll dump out into a file called Article_001_2004_09_10.html if this was all that was there:

Article ID : 1
Author : r3duxAdmin
Title : W00t! The site it be a-changin’…
Date : 2004-09-10 12:35:06

Well, PHP-Nuke seems to be up and running, so now it’s time to mess w/ it in a huge way and configure all the millions of options needed to make it just the way I want.

Daunted? Me? Er, yeah. You could say that…

And here’s the actual python (version 3) code to parse the SQL file:

# Function to find a single apostrophe that closes the field and return its location
def find_end_apostrophe(location, story):
    while True:
        # Move along
        location += 1
        # Grab two chars
        char1 = story[location:location+1]
        char2 = story[location+1:location+2]
        # If they're both apostophes then skip ahead 2 chars
        if char1 == "'" and char2 == "'":
            location += 2
        # An apostrophe and something else? It's the end apostrophe so bail!
        elif char1 == "'" and char2 != "'":
    return location
# Function to parse a story into separate fields and place them into the parsed_story list
def parse_story(story):
    parsed_story = []
    # Fields:
    # 0 = `sid`, 1 = `catid`, 2 = `aid`, 3 = `title`, 4 = `time`, 5 = `hometext`, `6 = bodytext`, 7 = `comments`,
    #`8 = `counter`, 9 = `topic`, 10 = `informant`, 11 = `notes`, 12 = `ihome`, 13 = `alanguage`, 14 = `acomm`,
    # 15 = `haspoll`, 16 = `pollID`, 17 = `score`, 18 = `ratings`, 19 = `associated`
    # 0 - Add story ID
    temp_end = story.find(",")
    field = story[1:temp_end]
    #print("ID:", field)
    # 1 - Add category ID
    temp_start = temp_end + 1
    temp_end = story.find(",", temp_start)
    field = story[temp_start:temp_end]
    #print("Category:", field)
    # 2 - Add author
    temp_start = temp_end + 3
    temp_end = story.find("'", temp_start)
    field = story[temp_start:temp_end]
    #print("Author:", field)
    # 3 - Add title
    temp_start = temp_end + 4
    temp_end   = find_end_apostrophe(temp_start, story)    
    #print("Title locs:", temp_start, temp_end)
    field = story[temp_start:temp_end].replace("''", "'")
    #print("Title:", field)
    # 4- Add time
    temp_start = temp_end + 4
    temp_end = story.find("'", temp_start)
    #print("Time locs:", temp_start, temp_end)
    field = story[temp_start:temp_end]
    parsed_story.append(field) # initial_story_split[4][2:-1])
    #print("Date:", field)
    # 5 - Add home text (i.e. the first paragraph)
    temp_start = temp_end + 4
    temp_end   = find_end_apostrophe(temp_start, story)
    #print("Home locs:", temp_start, temp_end)
    field = story[temp_start:temp_end].replace("''", "'").replace("<br><br>\\r\\n", "<br/><br/>").replace("\\r\\n", " ").replace("  ", "")
    #print("Home:", field)        
    # 6 = Add body text
    temp_start = temp_end + 3
    temp_end   = find_end_apostrophe(temp_start, story)
    #print("Body locs:", temp_start, temp_end)
    field = story[temp_start:temp_end].replace("''", "'").replace("<br><br>\\r\\n", "<br/><br/>").replace("\\r\\n", " ").replace("  ", "") # Last one is 2 'A with a hat' chars used for bullet points
    # If there is no body blank the field, otherwise strip the first char
    if field == "'":
        field = ""
        field = field[1:]        
    #print("Body:", field)
    # Not bothered about the rest of the fields
    return parsed_story
# ---------------------- Main --------------------------
#import codecs
#sql_file ="nuke_stories.sql", encoding="utf-8")
# Read the SQL dump of stories into a list of lines where each line is a story
sql_file = open("nuke_stories.sql", "r")
story_list = sql_file.readlines();
num_stories = len(story_list)
print("Number of stories: ", num_stories, "\n")
# Newline char
nl = "<br/>"
outfile = 0
for story_number in range(1, num_stories - 1): # Start from 1 because line 0 is the insert statement itself
    # Parse the story (i.e. article)
    parsed_story = parse_story( story_list[story_number] )
    # Grab the first section of the date only. Format: YYYY-MM-DD
    date_str = parsed_story[4][:parsed_story[4].find(" ")].replace("-", "_")
    # Construct the filename then create the file for writing
    filename = "Article_" + parsed_story[0].zfill(3) + "_" + date_str + ".html"
    outfile = open(filename, "w")
    # Dump the parsed story details to file
    outfile.write("Article ID : " + parsed_story[0] + nl)
    # Skipping Category ID - don't care.
    outfile.write("Author     : " + parsed_story[2] + nl)
    outfile.write("Title      : " + parsed_story[3] + nl)
    outfile.write("Date       : " + parsed_story[4] + nl + nl)
    outfile.write(parsed_story[5] + nl + nl) # Home text (i.e. first paragraph)
    outfile.write(parsed_story[6])           # Body text (i.e. rest of article)
    # Close the file to save it
print("All done! =D")

As mentioned, it’s a complete hack – but if you need to do something similar it may work as a starter to get you going…

How To: Stop Apache DOS attacks with Fail2Ban

I had to install and configure fail2ban yesterday to stop some hacking attempts on my FTP server, and when I was looking through the fail2ban configs I saw that you could stop DOS (Denial of Service) attacks with it too. As this site’s been hit by the occasional DOS from people with an axe to grind and too much time on their hands, I thought I may as well set up a DOS mitigation strategy while I was at it. Here’s how:

  1. Install fail2ban through the method of your choice.
  2. Edit the file /etc/fail2ban/jail.local and add the following section:
    enabled = true
    port = http,https
    filter = http-get-dos
    logpath = /var/log/apache2/YOUR_WEB_SERVER_ACCESS_LOG
    # maxretry is how many GETs we can have in the findtime period before getting narky
    maxretry = 300
    # findtime is the time period in seconds in which we're counting "retries" (300 seconds = 5 mins)
    findtime = 300
    # bantime is how long we should drop incoming GET requests for a given IP for, in this case it's 5 minutes
    bantime = 300
    action = iptables[name=HTTP, port=http, protocol=tcp]
  3. Don’t forget to replace YOUR_WEB_SERVER_ACCESS_LOG with the actual access log for your webserver! Note: This doesn’t have to be an apache log, I just happen to be using apache.

  4. Now we need to create the filter file, so create the file /etc/fail2ban/filters.d/http-get-dos.conf and place the following contents in it:
    # Fail2Ban configuration file
    # Author:
    # Option: failregex
    # Note: This regex will match any GET entry in your logs, so basically all valid and not valid entries are a match.
    # You should set up in the jail.conf file, the maxretry and findtime carefully in order to avoid false positives.
    failregex = ^<HOST> -.*"(GET|POST).*
    # Option: ignoreregex
    # Notes.: regex to ignore. If this regex matches, the line is ignored.
    # Values: TEXT
    ignoreregex =
  5. Now we just need to restart fail2ban for the new jail & filter to come into affect:
    /etc/init.d/fail2ban restart
  6. Or if your machine is on systemd, use:

    systemctl start fail2ban

    Also on systemd, if you want fail2ban to start on boot (and the chances are that you do), run the additional:

    systemctl enable fail2ban

    With all that done your site should be pretty safe from casual DOS attacks, although you’d likely need more stringent maxretry and findtime settings to really help against Distributed DOS (DDOS) attacks.


To check if fail2ban is seeing the logs, check out /var/log/fail2ban.log and you should see things like:

[http-get-dos] Found w.x.y.z

showing up as visitors view your site.

If you want to test if it’s really working, a nice way to do so is to use ab (Apache Benchmark – part of the apache2-utils package), like this:

ab -n 500 -c 10 http://your-web-site-dot-com/

This will kick off 500 page-loads in 10 concurrent connections against your site. When the ban kicks in the page-loads will stop (as incoming GET requests from your IP will be dropped), then when the bantime expires you’ll be able to access the site again. If you then take a look in your /var/log/fail2ban.log file you should see something like this:

2013-06-22 05:37:21,943 fail2ban.actions: WARNING [http-get-dos] Ban YOUR_IP_ADDRESS
2013-06-22 05:42:22,341 fail2ban.actions: WARNING [http-get-dos] Unban YOUR_IP_ADDRESS

Pretty neat, huh?

Many thanks to the authors of the following great articles for helping me to get this set up in no-time:– on-webserver.html

Cheers! =D

How To: Block FTP hacking attempts using Fail2Ban

I noticed that my FTP server was getting hit up with huge streams of access attempts, which just won’t do. Thankfully, it’s really easy to block these access attempts using the awesome fail2ban script.

  1. First, install fail2ban either manually or if it’s in your repos use:
    apt-get install fail2ban
  2. Next, go to the relevant section of the file /etc/fail2ban/jail.conf for your FTP server (mine is proftpd) and enable jailing by flipping the enabled flag to true:
    enabled = true
    port = ftp,ftp-data,ftps,ftps-data
    filter = proftpd
    logpath = /var/log/proftpd/proftpd.log
    maxretry = 5
    bantime = 3600
  3. Set your retries and bantime as you see fit, and make sure the log file path is correct (i.e. that it’s actually the log you want to monitor!)

  4. Restart fail2ban with a swift:
    /etc/init.d/fail2ban restart
  5. If your FTP server is controlled via inted/xinetd you don’t need to restart the FTP server as it’s started when required. If your FTP server is standalone then it probably won’t hurt to restart the service manually through /etc/init.d/[your-ftp-server-management-script-here]

That should be pretty much it, if there’s still access attempts going on they’ll be banned from connecting for the bantime you defined, and you’ll be able to see ban details in /var/log/fail2ban.log.

Many, many thanks to the excellent Block FTP Hacking tutorial on The Art of Web – fantastic stuff =D

GTA 4 Carmageddon

This made me smile the other day, so I thought I’d share…

Apparently, you can do this on the PC version just by going into the handling.dat file in your GTA4 folder, and editing “wbias” to be around -9 for all the vehicles you want to go crazy. Something about negative wheel friction, I believe =D

On a title-related bent, I hadn’t thought about the Carmageddon games for a long while – they were pretty cool in their day, but you look now and it’s… well, downright ugly. Still, they used to be a lot of fun to play…

Kevin Butler (re)tweets the PS3 Master Key

Fictional character Kevin Butler is the Sony PS3 spokesperson of sorts – which might lead you to believe that given Sony’s recent sue-a-thon against anyone even daring to mention the hallowed hex digits of the Master Key (Thanks, GeoHot!) – it could, and I’m not an expert, but it could, be a construed as a bad idea for whatever PR firm behind his twitter account (TheKevinButler) to re-tweet said hacked PS3 Master Key.

Which when used correctly will make a PS3 think it’s running legitimate, signed code as opposed to homebrew or warez.

To around 69,000 subscribers.

Kevin Butler PS3 Master Key Retweet

Hahahaha =D Well played, exiva, well played…

Additional: As seen on

PS3 You Sunk My Battleship