Migrating from Statpress plugins to StatComm: a how-to guide

The first release of StatComm 1.7.00, codename Taal, is all about migrating data to StatComm.

As you may already know, there are many statistics plugins in the WordPress repository. Many are under active development, but a lot of them have been abandoned, or their authors have simply decided not to continue working on them.

We decided to focus our analysis on plugins that meet certain criteria:

  • The plugin uses a prefix_statpress table (for example, wp_statpress in a default WP installation).
  • The plugin is Statpress-like or closely resembles it, usually as an improvement.
  • The plugin has not been updated for a relatively long time (a year or more) and/or is not supported at all, apart from a few isolated community contributions and/or small tips & tricks.


The following plugins match those requirements, with some notes:

Statpress (1.4.1): last update 2010-1-5, more than 2 years ago.
Statpress Reloaded (1.5.21): last update 2009-5-10, more than three years ago.
Smashing Analytics (1.5.22): last update 2011-03-12, more than 1 year ago.
Statpress Seolution (0.4.2.2): last update 2010-9-28, more than a year and a half ago.
StatpressCN: last update 2011-1-27, more than one year ago. Note that StatpressCN has Chinese support (StatComm does not) and a couple of features that StatComm does not cover yet.
StatSurfer (1.2.1): last update 2010-10-31, a year and a half ago. Note that StatSurfer has a couple of features that StatComm does not cover yet.

Data migration: is it for you?

The first question to ask yourself is: why would I want to migrate my data?
There are several reasons you might want to:

  • You will be moving from an unsupported or outdated plugin to a supported one.
  • You want to preserve your data for future analysis.
  • You want to preserve your data while discarding unnecessary information.
  • You want to test StatComm but don’t want to stop using the previous plugin.
  • You want to compare both plugins working side by side.
  • Future StatComm releases will improve the tools for analyzing historical data, so migrating the data now could be useful for further processing.
  • … (fill in the blanks)


Migration Options

Currently there are two ways to migrate data to StatComm:
1) The Migration Tool. This option lets you migrate and filter information from older plugins to StatComm. Safer, but slower.
2) The hack way: a faster but trickier solution.

Let’s see what each option offers.

1) Migration tool

Difficulty level: low.

Before explaining the tool, you should know a few things about how StatComm and most Statpress plugins currently store information.

When a user visits your site, StatComm stores data about the visit in a database table. Some of this information must be saved (the mandatory data), and the rest is inferred from those fields.
Example: StatComm saves a timestamp field when the user arrives. From this timestamp we can infer the date and time fields. So we have three fields: timestamp (mandatory), and date and time (both inferred from the timestamp field).

The same idea applies to other fields: ip and (nation, country), agent and (os, browser, spider), urlrequested and (feed), etc.
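
As a rough illustration of the timestamp example, here is a minimal Python sketch that infers the date and time fields from a stored timestamp. This is not StatComm's actual code: the function name is hypothetical, and the YYYYMMDD and HH:MM:SS output formats and the use of UTC are assumptions made for the example.

```python
from datetime import datetime, timezone
from typing import Tuple


def infer_date_and_time(timestamp: int) -> Tuple[str, str]:
    """Derive the inferred date and time fields from a Unix timestamp.

    Assumptions (not confirmed by the plugin): the timestamp is in seconds,
    it is interpreted as UTC, and the formats are YYYYMMDD and HH:MM:SS.
    """
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y%m%d"), moment.strftime("%H:%M:%S")


date_field, time_field = infer_date_and_time(1338664336)
print(date_field, time_field)  # -> 20120602 19:12:16
```

Only the timestamp needs to be stored; the other two values can always be recomputed from it.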

The migration tool works on the mandatory fields and infers the other fields. In this way, incoming data is normalized as it is migrated.

What are the mandatory fields?

These are: ip, urlrequested, agent, referrer, user, timestamp, language, threat_score, threat_type and statuscode.

Of these fields, three are optional (threat_score, threat_type, statuscode), since many older plugins do not support them. The other, non-optional fields are present in every other Statpress plugin.
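
To make the split between non-optional and optional fields concrete, here is a small Python sketch that reduces a row from an older plugin to its mandatory fields. The field names come from the list above; the function name, the dictionary representation of a row, and the choice of None as the default for absent optional fields are assumptions made for illustration only.

```python
# Field names taken from the list above.
REQUIRED_FIELDS = ("ip", "urlrequested", "agent", "referrer",
                   "user", "timestamp", "language")
OPTIONAL_FIELDS = ("threat_score", "threat_type", "statuscode")


def extract_mandatory(row: dict) -> dict:
    """Keep only the mandatory fields of a legacy row (hypothetical helper).

    Raises KeyError if a non-optional field is missing. Optional fields the
    old plugin did not store default to None (an assumption for this sketch).
    """
    missing = [name for name in REQUIRED_FIELDS if name not in row]
    if missing:
        raise KeyError(f"missing required fields: {missing}")
    result = {name: row[name] for name in REQUIRED_FIELDS}
    for name in OPTIONAL_FIELDS:
        result[name] = row.get(name)
    return result


legacy_row = {
    "ip": "192.0.2.10", "urlrequested": "/about", "agent": "Mozilla/5.0",
    "referrer": "", "user": "", "timestamp": 1338664336, "language": "en",
    "os": "Linux", "browser": "Firefox",  # inferred fields, dropped here
}
print(extract_mandatory(legacy_row))
```

Inferred fields such as os and browser are discarded at this stage because they can be rebuilt from the mandatory ones.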

Summing up, seven mandatory fields are responsible for almost 21 fields in the current StatComm table, of which 11 are inferred, one is auto-generated (id) and three can vary from plugin to plugin.

A special case is the nation field, which is inferred through hostipaddress(ip), a time-consuming operation. For that reason, this field is simply copied.

The Migration Panel


StatComm 1.7.00 provides a new menu option named Migration. The menu is straightforward, with very few options. There are a few things to consider when you decide to migrate data to StatComm:

  1. The tool works only if you have a statpress table in your WordPress database (generated by another plugin).
  2. The tool provides two filters: filtering out spiders and/or selecting a date to drop data up to that date. Configure them to meet your needs and press Save settings so they take effect before you start the migration.
  3. Save settings does NOT start the migration, but the migration process follows the settings you specify.
  4. The migration deletes all StatComm data before proceeding, so be sure of what you are doing! This is a tool for first-time users; if you want to keep the StatComm data, you will have to back up the table beforehand.
  5. Although it is not mandatory, we recommend deactivating the older plugin to avoid capturing data while migrating.
  6. VERY IMPORTANT: While StatComm is migrating data, it suspends all traffic capture. This is done to avoid inconsistencies in the final migrated records. When it finishes, it resumes capturing data.
  7. VERY IMPORTANT (2): Once the migration has started, you’ll need to wait until it is finished. Avoid closing the page or browsing to another one while migrating. If you do, two things will happen:
    • The migration will be incomplete. Solution: you can restart it, but you cannot resume a migration. It will start from zero.
    • You won’t give the plugin a chance to resume capturing data. The usual result is that the plugin stops capturing traffic. Solution: you can easily fix this by restarting the migration and waiting for it to finish, or by deactivating and reactivating the plugin from the control panel.
  • By our standards, the process is slow. On our system (not optimized) we migrated 120k records in about 30 minutes (around 68 rec/sec). This can be improved by filtering spiders. Typical speeds are 100 rec/sec, which translates to 6k records/minute. Estimate your time by checking the record count in the menu’s header (in green).
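
As a worked example of the speed figures above, this short Python sketch estimates how long a migration might take for a given record count. The function name is hypothetical, and the rates are simply the ones quoted in this post; your server may be faster or slower.

```python
def estimate_minutes(record_count: int, records_per_second: float) -> float:
    """Estimate migration time in minutes at a constant processing rate."""
    if records_per_second <= 0:
        raise ValueError("records_per_second must be positive")
    return record_count / records_per_second / 60


# 120k records at the typical rate of 100 rec/sec quoted above.
print(estimate_minutes(120_000, 100))  # -> 20.0

# The same table at the slower rate we measured (around 68 rec/sec).
print(round(estimate_minutes(120_000, 68), 1))
```

Take the record count from the menu header and plug in a rate that matches your own server to get a rough idea before you start.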

2) Hack way

Difficulty level: medium

If you have many records (200k or more), the usual migration tool will be slow. If you are skilled enough, you may want to try this approach first. It is a shortcut that requires access to a MySQL console or a tool like phpMyAdmin. The process is (much) faster than the migration tool, but:

  • No data will be normalized in the process.
  • Useless fields will remain after the process. You can delete them, but be advised that doing so is risky (more on this later).
  • Field order could differ from the original structure (this won’t affect the plugin’s operation).

We assume that you:

  • Have a default installation of WordPress with the wp_ prefix.
  • Have access to a MySQL console (or a tool like phpMyAdmin) to access your database.
  • Use the default table names (wp_statpress, wp_statcomm). If you have a different prefix, adjust the steps accordingly.

The process is very straightforward:

  • Deactivate both statistics plugins before you start (the old Statpress plugin and StatComm).
  • If you want to back up the StatComm data table, now is the time(!)
  • Delete the wp_statcomm table.
  • Copy wp_statpress to wp_statcomm (structure and data).
  • Reactivate StatComm. This should trigger a table update process when it finds table structure discrepancies.
  • Done. All Statpress data is migrated to StatComm.
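
The database steps above could be sketched as the following Python helper, which only builds the MySQL statements for a given table prefix so you can review them before pasting them into your console. This is an illustration under assumptions, not an official script: the function name and the _backup table name are hypothetical, and deactivating and reactivating the plugins still has to be done from the WordPress control panel.

```python
import re


def hack_migration_sql(prefix: str = "wp_", backup: bool = True) -> list:
    """Build the MySQL statements for the copy-based migration.

    The backup table name (<prefix>statcomm_backup) is a hypothetical choice.
    """
    # Only allow simple prefixes so the generated SQL stays predictable.
    if not re.fullmatch(r"[A-Za-z0-9_]+", prefix):
        raise ValueError(f"unexpected table prefix: {prefix!r}")
    source = f"{prefix}statpress"
    target = f"{prefix}statcomm"
    statements = []
    if backup:
        backup_table = f"{target}_backup"
        statements += [
            f"CREATE TABLE `{backup_table}` LIKE `{target}`;",
            f"INSERT INTO `{backup_table}` SELECT * FROM `{target}`;",
        ]
    statements += [
        f"DROP TABLE `{target}`;",                          # delete wp_statcomm
        f"CREATE TABLE `{target}` LIKE `{source}`;",        # copy the structure
        f"INSERT INTO `{target}` SELECT * FROM `{source}`;",  # copy the data
    ]
    return statements


for statement in hack_migration_sql("wp_"):
    print(statement)
```

Printing the statements instead of executing them keeps the destructive DROP TABLE step under your control.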


On plugin activation, StatComm converts the table to its own internal structure, adapting the fields as necessary. This process takes a little while. In our case, with a 100k-record table, it should take less than a minute. Once the plugin is activated, the migration is complete.

The hack way was tested on the following plugin tables:

  • Statpress (migration ok)
  • Statpress Reloaded (migration ok)
  • Smashing Analytics (migration ok)
  • Statpress Seolution (migration ok)
  • StatpressCN (migration ok, although there are two fields that Statcomm will not use: ptype and pvalue)
  • Statsurfer (migration ok).

If you are coming from other plugins, try one or both migration procedures and tell us what you think.

Comments

  1. Hi!
    I have a problem with the new migration feature. I used the first method and sadly I can only migrate yesterday’s data (without feeds) from StatPress (1.4.1). Everything else, like March 1st or even May 28th 2012, fails after a few seconds.

    Greetings and cheers

    Dennis

  2. admin says:

    Hi Dennis:
    Did you receive any errors during the process? If so, please send us the messages and we’ll try to help you out.
    Thank you!

  3. Hey :)
    There are no error messages or anything in the process. It just stops and shows the “Migration In Progress” report. I also waited a few more minutes in case the tool was still working, but unfortunately nothing happens after that. Here are two examples of these reports. The first is the “successful” migration of the data (without feeds) from June 1st 2012 and the second is a failed attempt for the data from May 28th 2012.

    1)

    Migration In Progress…
    2012-06-2 07:12:16 PM – Starting migration process…
    2012-06-2 07:12:16 PM – Current settings:
    2012-06-2 07:12:16 PM – Filter Spiders: OFF
    2012-06-2 07:12:16 PM – Cutting date: 20120601
    Initializing migration: 4075 records, 5 iterations
    2012-06-2 07:12:16 PM – Clearing destination table…
    2012-06-2 07:12:16 PM – Done. Starting migration…
    2012-06-2 07:12:16 PM – Processing 1000…(24%) (from 4075)
    Average: 307 rec/sec | Time elapsed: 00:00:03
    2012-06-2 07:12:20 PM – Processing 2000…(49%) (from 4075)
    Average: 389 rec/sec | Time elapsed: 00:00:05
    2012-06-2 07:12:22 PM – Processing 3000…(73%) (from 4075)
    Average: 358 rec/sec | Time elapsed: 00:00:08
    2012-06-2 07:12:25 PM – Processing 4000…(98%) (from 4075)
    Average: 406 rec/sec | Time elapsed: 00:00:11
    2012-06-2 07:12:28 PM – Processing 5000…(100%) (from 4075)
    Average: 2137 rec/sec | Time elapsed: 00:00:11

    Migration completed!!!

    Migration resume:

    Final total time:00:00:11 , 4075 records processed, 4075 migrated
    No errors found.

    ————————————————

    2)

    Migration In Progress…
    2012-06-2 07:13:37 PM – Starting migration process…
    2012-06-2 07:13:37 PM – Current settings:
    2012-06-2 07:13:37 PM – Filter Spiders: OFF
    2012-06-2 07:13:37 PM – Cutting date: 20120528
    Initializing migration: 14617 records, 15 iterations
    2012-06-2 07:13:37 PM – Clearing destination table…
    2012-06-2 07:13:37 PM – Done. Starting migration…
    2012-06-2 07:13:37 PM – Processing 1000…(6%) (from 14617)
    Average: 309 rec/sec | Time elapsed: 00:00:03
    2012-06-2 07:13:40 PM – Processing 2000…(13%) (from 14617)
    Average: 368 rec/sec | Time elapsed: 00:00:05
    2012-06-2 07:13:43 PM – Processing 3000…(20%) (from 14617)
    Average: 373 rec/sec | Time elapsed: 00:00:08
    2012-06-2 07:13:46 PM – Processing 4000…(27%) (from 14617)
    Average: 327 rec/sec | Time elapsed: 00:00:11
    2012-06-2 07:13:49 PM – Processing 5000…(34%) (from 14617)
    Average: 343 rec/sec | Time elapsed: 00:00:14
    2012-06-2 07:13:52 PM – Processing 6000…(41%) (from 14617)

    It seems that the tool only works for maybe 11 or 12 seconds and then stops every time.

  4. We are currently checking if anything is wrong inside the migration procedure and we will get back to you. Thank you for the information you provided.

  5. admin says:

    We found a possible cause of this issue.
    The procedure calls set_time_limit to avoid timeout problems. Unless set_time_limit takes effect, the process stops partway through without any warning. We believe this is your case.

    On some servers, we cannot change this setting, and the call ends in a warning like:
    Warning: set_time_limit() has been disabled for security reasons in xxxxx.php on line yyy

    The problem is that our procedure silently ignores this warning and assumes the setting was changed, and that is the root of the problem.

    One workaround is to ask your server admin to enable set_time_limit so the migration can complete.

    On our side, we’ll try to detect this condition before the process starts.
    Our apologies for the inconvenience, and thank you for letting us know about this issue.

  6. Did the migration issue get fixed? I tried migrating data again with the new update and the migration process still stops at about 30%… no error codes, just stops.

  7. admin says:

    Thanks for your comments.
    Currently, the migration tries to detect whether the set_time_limit setting can be changed.
    If it can’t, an error is raised, captured and displayed. Otherwise, everything works as normal.
    We haven’t found any other failure scenarios. We’ll re-check that everything is working as expected and contact you later.

  8. Robert Dawson says:

    Hey, I have followed the steps all the way, and it’s showing my table records, however the button still says migration disabled?

    Any help would be great

    Cheers

  9. admin says:

    That is an issue that will be solved in 1.7.50, due this week. Our apologies for the problem.
