How-To: Extract PHP-Nuke Articles from an SQL INSERT Statement in Python

I found an old SQL dump of this site yesterday which I thought was lost to time. The site ran php-nuke back then, and rather than importing the SQL into a database and fiddling with it to export all the old articles I thought I’d take the python hammer to it. It’s a bit of a bodge/hack – but it seems to work well enough.

The idea is you copy JUST the insert statement which deals with “stories” (nuke’s word for articles / posts) from the SQL dump and put it in a file (mine’s called “nuke_stories.sql”), then run the parse_nuke_stories python script and it’ll dump out all the articles into separate files with filenames like “Article_ID_YEAR_MONTH_DAY.html” and such.

Here’s the top few lines of my SQL insert statement:

INSERT INTO `nuke_stories` (`sid`, `catid`, `aid`, `title`, `time`, `hometext`, `bodytext`, `comments`, `counter`, `topic`, `informant`, `notes`, `ihome`, `alanguage`, `acomm`, `haspoll`, `pollID`, `score`, `ratings`, `associated`) VALUES
(1, 0, 'r3duxAdmin', 'W00t! The site it be a-changin''...', '2004-09-10 12:35:06', 'Well, PHP-Nuke seems to be up and running, so now it''s time to mess w/ it in a huge way and configure all the millions of options needed to make it just the way I want.\r\n

\r\nDaunted? Me? Er, yeah. You could say that...', '', 0, 83, 1, 'r3duxAdmin', '', 1, '', 0, 0, 0, 0, 0, ''),

Here’s what it’ll dump out into a file called Article_001_2004_09_10.html if this was all that was there:

Article ID : 1
Author : r3duxAdmin
Title : W00t! The site it be a-changin’…
Date : 2004-09-10 12:35:06

Well, PHP-Nuke seems to be up and running, so now it’s time to mess w/ it in a huge way and configure all the millions of options needed to make it just the way I want.

Daunted? Me? Er, yeah. You could say that…

And here’s the actual python (version 3) code to parse the SQL file:

# Function to find a single apostrophe that closes the field and return its location
def find_end_apostrophe(location, story):
    while True:
        # Move along
        location += 1
        # Grab two chars
        char1 = story[location:location+1]
        char2 = story[location+1:location+2]
        # If they're both apostophes then skip ahead 2 chars
        if char1 == "'" and char2 == "'":
            location += 2
        # An apostrophe and something else? It's the end apostrophe so bail!
        elif char1 == "'" and char2 != "'":
    return location
# Function to parse a story into separate fields and place them into the parsed_story list
def parse_story(story):
    parsed_story = []
    # Fields:
    # 0 = `sid`, 1 = `catid`, 2 = `aid`, 3 = `title`, 4 = `time`, 5 = `hometext`, `6 = bodytext`, 7 = `comments`,
    #`8 = `counter`, 9 = `topic`, 10 = `informant`, 11 = `notes`, 12 = `ihome`, 13 = `alanguage`, 14 = `acomm`,
    # 15 = `haspoll`, 16 = `pollID`, 17 = `score`, 18 = `ratings`, 19 = `associated`
    # 0 - Add story ID
    temp_end = story.find(",")
    field = story[1:temp_end]
    #print("ID:", field)
    # 1 - Add category ID
    temp_start = temp_end + 1
    temp_end = story.find(",", temp_start)
    field = story[temp_start:temp_end]
    #print("Category:", field)
    # 2 - Add author
    temp_start = temp_end + 3
    temp_end = story.find("'", temp_start)
    field = story[temp_start:temp_end]
    #print("Author:", field)
    # 3 - Add title
    temp_start = temp_end + 4
    temp_end   = find_end_apostrophe(temp_start, story)    
    #print("Title locs:", temp_start, temp_end)
    field = story[temp_start:temp_end].replace("''", "'")
    #print("Title:", field)
    # 4- Add time
    temp_start = temp_end + 4
    temp_end = story.find("'", temp_start)
    #print("Time locs:", temp_start, temp_end)
    field = story[temp_start:temp_end]
    parsed_story.append(field) # initial_story_split[4][2:-1])
    #print("Date:", field)
    # 5 - Add home text (i.e. the first paragraph)
    temp_start = temp_end + 4
    temp_end   = find_end_apostrophe(temp_start, story)
    #print("Home locs:", temp_start, temp_end)
    field = story[temp_start:temp_end].replace("''", "'").replace("<br><br>\\r\\n", "<br/><br/>").replace("\\r\\n", " ").replace("  ", "")
    #print("Home:", field)        
    # 6 = Add body text
    temp_start = temp_end + 3
    temp_end   = find_end_apostrophe(temp_start, story)
    #print("Body locs:", temp_start, temp_end)
    field = story[temp_start:temp_end].replace("''", "'").replace("<br><br>\\r\\n", "<br/><br/>").replace("\\r\\n", " ").replace("  ", "") # Last one is 2 'A with a hat' chars used for bullet points
    # If there is no body blank the field, otherwise strip the first char
    if field == "'":
        field = ""
        field = field[1:]        
    #print("Body:", field)
    # Not bothered about the rest of the fields
    return parsed_story
# ---------------------- Main --------------------------
#import codecs
#sql_file ="nuke_stories.sql", encoding="utf-8")
# Read the SQL dump of stories into a list of lines where each line is a story
sql_file = open("nuke_stories.sql", "r")
story_list = sql_file.readlines();
num_stories = len(story_list)
print("Number of stories: ", num_stories, "\n")
# Newline char
nl = "<br/>"
outfile = 0
for story_number in range(1, num_stories - 1): # Start from 1 because line 0 is the insert statement itself
    # Parse the story (i.e. article)
    parsed_story = parse_story( story_list[story_number] )
    # Grab the first section of the date only. Format: YYYY-MM-DD
    date_str = parsed_story[4][:parsed_story[4].find(" ")].replace("-", "_")
    # Construct the filename then create the file for writing
    filename = "Article_" + parsed_story[0].zfill(3) + "_" + date_str + ".html"
    outfile = open(filename, "w")
    # Dump the parsed story details to file
    outfile.write("Article ID : " + parsed_story[0] + nl)
    # Skipping Category ID - don't care.
    outfile.write("Author     : " + parsed_story[2] + nl)
    outfile.write("Title      : " + parsed_story[3] + nl)
    outfile.write("Date       : " + parsed_story[4] + nl + nl)
    outfile.write(parsed_story[5] + nl + nl) # Home text (i.e. first paragraph)
    outfile.write(parsed_story[6])           # Body text (i.e. rest of article)
    # Close the file to save it
print("All done! =D")

As mentioned, it’s a complete hack – but if you need to do something similar it may work as a starter to get you going…

How To: Fix Thunderbird failing to open HTTP / HTTPS links

If you recently upgraded Thunderbird to v52 onwards and all links in emails stopped working (i.e. clicking on a link doesn’t open them in your browser) then you can fix it like this:

  1. In Thunderbird, go to Tools | Clear Recent History and clear everything (this just removes your usage history, not any emails), then
  2. Close Thunderbird, go into your profile folder (i.e. where the emails are stored) and delete all the places.sqlite files.

Tap-tap. Job done.


How To: Connect to a Linux shared drive from a Windows guest in VMware

I always forget how to do this, so I’m writing it down…

1 – Getting and installing VMware Tools

First, we’ll need the VMware Tools installed. If VMware is being a dick and failing to get the tools ISO automatically then you can power down the VM, then in the main VMware Player window before you’ve opened any VM up, go to File | Player Preferences and click [Download All Components Now].

Once done, the iso files will be in a location such as: /usr/lib/vmware/isoimages. In this case (with a Windows 10 guest) I want the windows.iso image – so launch the VM, mount the ISO (from Virtual Machine | Removable Devices | CD/DVD | Settings…) then go to “This PC”, right-click on the mounted drive and select “Open Autoplay…” and let it install.

Update – 2017-03-03

Sometimes VMware tools is also a dick and needs repairing, so do the above but select the “Repair” option to uninstall/re-install it, then power down the VM.

As I found out today, the repair option is rubbish and doesn’t work – to get mapping of network drives working if it isn’t already you need to completely uninstall VMware tools in your guest OS, then reboot, then reinstall VMware tools, then reboot again. On this final reboot (and assuming you have the ‘Map to network drive’ option ticked in your VM settings – see below) then it should all be sorted and if you go to My PC | Network you should see a vmware-host folder which is mapped to Z: (by default – map it to something else if you want to) – from which you can then use Z:\WHATEVER_SHARE_NAME_YOU_GAVE to access your shared folders.

2 – Enabling Shared Folders

With the VM still off, go to the settings for your virtual machine select the Options tab and then the Shared Folders option, then select the “Always Enabled” radio-button. Now add a folder to share and give it a friendly name, I chose “Linux” and pointed it at my linux user’s home folder (so for me that’d be “/home/r3dux”).

I haven’t been able to get the “Map as a network drive in Windows guests” option working for a while, but it shouldn’t be a massive problem as you can access it via the double-backslash notation in the next step.

3 – Access the shared folder from the guest

Boot up your VM (i.e. Windows guest), and from the search bar or URL entry in windows explorer enter:

\\vmware-host\Shared Folders\

So for example, I’d access my shared folder via going to:

\\vmware-host\Shared Folders\Linux

This should work, or at least it works fine for me. However, if I try to map the drive – like if I want to map that location to Z:, and I enter the exact same (known working) path to the shared folder… it doesn’t work. And I have absolutely no idea why. If you let the troubleshooter run it just shrugs at you, which to be fair is basically all any Windows troubleshooter has pretty much ever done in my experience.

To access the network shared folder easily without assigning a drive letter you can just go up one level (i.e. to \\vmware-host\Shared Folders) and then right-click on your shared folder and choose to send a shortcut to the windows desktop, from which you can access it easily enough just like any other folder.

Wrap Up

Hope this gets things working for you, even if it won’t map the drive to a letter for some bizarre reason. If you figure out why that might be I’d love to know!.


How To: Fix Eye of Gnome / Cairo crashes when loading large images

If Eye of Gnome is crashing when attempting to load large images (think 10MB and up), you can typically fix it by modifying the kernel SHM (shared memory) settings – in this case upping the max shared memory to 512MB thus:

Try adding the following to /etc/sysctl.conf and run ‘sysctl -p’ or reboot:



This worked for me, so hopefully it’ll work for you also.


How To: Fix Intel 8260 (rev 3a) slow / rubbish wireless issues in Linux

My laptop has an Intel Corporation Wireless 8260 (rev 3a) wireless card. It says so right in the lspci -v output:

02:00.0 Network controller: Intel Corporation Wireless 8260 (rev 3a)
	Subsystem: Intel Corporation Dual Band Wireless-AC 8260
	Flags: bus master, fast devsel, latency 0, IRQ 327
	Memory at dc400000 (64-bit, non-prefetchable) [size=8K]
	Capabilities: [c8] Power Management version 3
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [40] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number a4-34-d9-ff-ff-38-91-54
	Capabilities: [14c] Latency Tolerance Reporting
	Capabilities: [154] L1 PM Substates
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi

However, in Linux it’s been dropping out and going slow and all sorts of rubbish. Looking in dmesg you can typically see stuff like (edited):

Loaded firmware version: 22.361476.0
iwlwifi 0000:02:00.0: 0x00000084 | NMI_INTERRUPT_UNKNOWN
Lots more red text here...

So can we fix it? Like Bob the Builder, yes we can – but it’s a two-step…

Part 1 – Kernel Modules

The 8260 card uses the iwlwifi kernel module, and the microcode for that is stored in /lib/firmware.

Specifically, you’re looking for the files: iwlwifi-8000C-SOME_NUMBER.ucode.

So for example, I see the following:

r3dux [ /lib/firmware ]$ ls -alh /lib/firmware/iwlwifi-8000C*
-rw-r--r-- 1 root root 1.7M Oct  5 21:35 /lib/firmware/iwlwifi-8000C-13.ucode
-rw-r--r-- 1 root root 2.3M Oct  5 21:35 /lib/firmware/iwlwifi-8000C-16.ucode
-rw-r--r-- 1 root root 2.3M Oct  5 21:35 /lib/firmware/iwlwifi-8000C-21.ucode
-rw-r--r-- 1 root root 2.1M Oct  5 21:35 /lib/firmware/iwlwifi-8000C-22.ucode

The kernel seems to pick the highest number in the 8000C range, so it’ll pick the 8000C-22 variant. Only this is borked. To revert to the previous 21 revision, simply rename the file extension of the 22 version to something different, for example:

sudo mv /lib/firmware/iwlwifi-8000C-22.ucode /lib/firmware/iwlwifi-8000C-22.ucode.BORKED

However, at least in my experience, this isn’t enough to stop the module crash/restart issues – so we need to…

Part 2 – Disable Wireless N

If I just do the above, I still get issues in dmesg where the wireless card’s crashing and resetting itself – so to bypass the failing code, we need to disable wireless N (and only use B/G). Sure, this is going to be slower than N, but it’s going to be faster than a borked version of N – so off we go…

The parameters to the iwlwifi module include one called 11n_disable – and to set that on boot we need to have a /etc/modprobe.d folder (create the directory if necessary), then into that put a file with any name ending in .conf such as iwlwifi.conf (makes sense, right?) with the following contents:

options iwlwifi 11n_disable=1

Once that’s in and saved, reboot and your wireless should work properly again – no dmesg crash data, no slow-downs, no bullshit.

There are actually a few different values that can be used, but “1” works for me. The array of valid values for the 11n_disable property can be seen by entering:

modinfo iwlwifi

And the current settings can be checked by hitting:

systool -v -m iwlwifi

With the 21 revision of the microcode and wireless-N disabled you should find your wireless card now works properly. Huzzah!


You may want to know that I did this on an Arch Linux system (kernel: 4.8.13-1-ARCH linux-firmware: 20161005.9c71af9-1), and that I also set my regulatory authority code which controls allowable wireless frequencies/channels (via installing the crda package and setting the config to my local country, which is Australia, so “AU” – further reading: – although I’m not sure if changing the regulatory domain actually did anything to the above fix instructions. Thought I’d mention it all the same.