User Interface: Applying Changes

In the previous post, I introduced the user interface where the task “Posts, Pages & Media” is defined by the user (search, replace) and where the user can request immediate previews of changes, without actually affecting the database (and without reloading the page).

As soon as a “profile” has been defined, the job/task can be executed by applying all calculated changes to the affected database tables. This is done by clicking the button “Apply Changes + Options”, which presents some options before the changes are actually applied. The underlying software design can easily be extended with new or better techniques by using “factory classes” such as a backup factory, a snapshot factory and an execution factory. Because of the inherent runtime restrictions of PHP scripts, additional options like the maximum execution time and the memory limit per request are used so that this software can cope with very large databases and large changes. Although these restrictions can be changed on the fly, this is not possible in every web hosting environment, and ignoring them would break the execution. In principle, changes are applied to the database as long as each request stays within the user-defined, application-controlled (and fine-tuned) environment restrictions. Once a request reaches these limits, the current execution is paused, its state is stored to disk, and it is resumed from the same state by the next authenticated request (very much like process execution in modern operating systems).

User Interface: Tweak options before actually applying changes. Software is easily extensible.
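The pause-and-resume idea described above can be sketched as follows. This is only an illustration in Python, not the plugin's PHP code: the function name `run_slice`, the JSON state file and the `clock` parameter are assumptions made for this example, and the memory limit is left out to keep it short.

```python
import json
import os
import time


def run_slice(rows, state_path, apply_change, max_seconds, clock=time.monotonic):
    """Apply changes until the time budget is spent, then persist the position.

    Returns True when all rows have been processed, False when paused.
    """
    # Resume from the stored position if an earlier request was paused.
    start = 0
    if os.path.exists(state_path):
        with open(state_path) as fh:
            start = json.load(fh)["next_row"]

    deadline = clock() + max_seconds
    for index in range(start, len(rows)):
        if clock() >= deadline:
            # Budget exhausted: store the execution state and end this request.
            with open(state_path, "w") as fh:
                json.dump({"next_row": index}, fh)
            return False
        apply_change(rows[index])

    # Finished: the stored state is no longer needed.
    if os.path.exists(state_path):
        os.remove(state_path)
    return True
```

Each call stands for one request: the caller keeps calling `run_slice` with the same state file until it returns True.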

Clicking the button “Apply Changes Permanently” reveals the user interface that keeps the user informed about the running process. It also triggers the corresponding requests to execute the task: it retrieves and parses each server response, updates the UI and issues the next request with the parameters returned by the previous one (along with some authentication keys). There are two progress bars: a big one shows the overall progress, while the smaller one shows the progress of the current request (a 20 s request can feel so long that the user might think nothing is happening any more, although it is). A “details” section shows the standard output of each request (similar to terminal output on Linux systems).

User Interface: Progress bars, current status information and detailed output when actually applying changes.
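The request chain behind this interface might look roughly like the sketch below. It is written in Python for illustration only; the real client runs in the browser. The response fields `done`, `total` and `next`, the `key` parameter and the stand-in `fake_server` are all assumptions, and only the overall progress bar is modeled.

```python
def drive_job(send_request, first_params):
    """Repeat requests until the server reports completion.

    Yields the overall progress (0.0 to 1.0) after every response.
    """
    params = first_params
    while params is not None:
        response = send_request(params)
        yield response["done"] / response["total"]
        # The server tells the client which parameters to send next;
        # None means that the whole task has finished.
        params = response.get("next")


def fake_server(params):
    """Stand-in for the real endpoint: processes 40 cells per request."""
    total = 100
    done = min(params["offset"] + 40, total)
    nxt = {"offset": done, "key": params["key"]} if done < total else None
    return {"done": done, "total": total, "next": nxt}
```

Iterating over `drive_job(fake_server, {"offset": 0, "key": "secret"})` issues one request per step, which is where a UI would redraw its progress bar.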

User Interface: First Impressions

The main workspace will have two columns. The left column lists all installed tasks that are available to the user, each with a short description of what the task does. The right column, which is a bit smaller, contains a) a list of recent jobs that were created and executed from task settings, and b) a list of past reports that were created automatically during job execution. On the top right, you can find a submenu to configure the plugin and to read what's new, tutorials, changelogs, credits and statistics (maybe sometime in the future).

Main workspace with a list of available tasks and recent jobs (history) and reports.

The “About section” contains a short software description, along with version number, credits and where to find more information, e.g. tutorials. It also contains the full development and donations plan.

The about section. It briefly describes the plugin, along with the installed version number, homepage URL and the current development and donation plan.

The first task to be implemented is “Posts, Pages & Media”. In the first, upper part, you define what the plugin should search for and what each match should be replaced with. You can define as many search and replace criteria as you want, but at least one pair of search and replace criteria has to be defined. A criterion can be defined in these forms:

General search/replace form | Description
word1 word2 word3 | At least one of the words must be present in the searched field (e.g. post title).
L"a sample phrase" | The exact character sequence must be present in the searched field.
E"a sample phrase" | The searched field must be exactly equal to the given character sequence.
M"a mysql regular expression" | A MySQL regular expression must match the searched field.
R"a full php regular expression" | A full (i.e. PHP) regular expression must match the searched field. This is a very expensive operation and should only be used if none of the methods above can be used. It cannot be combined with MySQL regular expressions within one job or task execution. Its time complexity is at least linear in the size of the database table used by the corresponding task and criteria.
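To make the forms above more concrete, here is a small parser sketch in Python (the plugin itself is written in PHP). The function names, the mode names and the assumption that plain double quotes delimit the phrase are mine, not part of the plugin.

```python
import re

# Prefix letter -> matching mode, following the forms listed above.
MODES = {"L": "contains", "E": "equals", "M": "mysql_regex", "R": "php_regex"}

PREFIXED = re.compile(r'^([LEMR])"(.*)"$', re.DOTALL)


def parse_criterion(text):
    """Return (mode, value) for one search criterion string."""
    match = PREFIXED.match(text.strip())
    if match:
        return MODES[match.group(1)], match.group(2)
    # No prefix: a list of words, of which at least one must be present.
    return "any_word", text.split()


def check_combination(criteria):
    """Reject jobs that mix MySQL and PHP regular expressions."""
    modes = {parse_criterion(c)[0] for c in criteria}
    if {"mysql_regex", "php_regex"} <= modes:
        raise ValueError("M and R criteria cannot be combined in one job")
```

For example, `parse_criterion('L"a sample phrase"')` yields the mode `contains` together with the phrase, while an unprefixed string is split into its words.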

All possible changes to the database can be previewed in the section below, by pressing the button “Preview Changes”. A list of changes will be displayed just a few seconds later, without the need to reload the current page, i.e. it’s done dynamically via AJAX.

Task "Posts, Pages & Media": user interface and options. A preview section can be used to dynamically preview any changes to the database without actually affecting the database.

XML Storage Formats

In a first revision, data are stored in the following XML formats. All objects implement the XMLObject interface.

Jobs.xml: Stores how the root job is partitioned into smaller child jobs (because of the limited script execution time)

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <job id="" snapshot="[path]" snapshotOffset="0" snapshotLength="0" 
         report="[path]" author="[wp_user_id]:[wp_user_name]:[wp_user_role]"
         backup="[path]" taskmodule="[unique_name]">
        <job id="" snapshot="[path]" snapshotOffset="0" snapshotLength="0"></job>
        <job id="" snapshot="[path]" snapshotOffset="0" snapshotLength="0">
            <job id="" snapshot="[path]" snapshotOffset="0" snapshotLength="0"></job>
            <job id="" snapshot="[path]" snapshotOffset="0" snapshotLength="0"></job>
        </job>
        <job id="" snapshot="[path]" snapshotOffset="0" snapshotLength="0"></job>
    </job>
</root>
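One way to read such a hierarchy is to collect the jobs that have no children of their own. The Python sketch below assumes that only these leaf jobs are executed and that document order is the execution order; neither is stated by the format itself, and the ids in the sample are made up.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<root>
    <job id="root">
        <job id="a"></job>
        <job id="b">
            <job id="b1"></job>
            <job id="b2"></job>
        </job>
        <job id="c"></job>
    </job>
</root>
"""


def leaf_jobs(job):
    """Yield the jobs without child jobs, in document order."""
    children = job.findall("job")
    if not children:
        yield job
        return
    for child in children:
        yield from leaf_jobs(child)


root_job = ET.fromstring(SAMPLE).find("job")
leaf_ids = [job.get("id") for job in leaf_jobs(root_job)]
```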

Report.xml: How to store reports on disk

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <report id="" job="[path]">
        <name>[string]</name>
        <title>[nice string]</title>
        <description>1</description>
        <date>[rfc xxx]</date>
        <author>[name]</author>
        <statistics>
            <cputime>[time_in_sec]</cputime>
            <malloc>[min]:[average]:[peak]</malloc>
            <affectedcells>[number]</affectedcells>
            <affectedtables>[number]</affectedtables>
            <notifications count="[number]">
                <email recipient="[name] &lt;[mail_address]&gt;" subject="[string]" contents="[summary|changes|statistics]" size="[size in bytes]" role="[superadmin|admin|editor|author]" />
                <email recipient="[name] &lt;[mail_address]&gt;" subject="[string]" contents="[summary|changes|statistics]" size="[size in bytes]" role="[superadmin|admin|editor|author]" />
                <email recipient="[name] &lt;[mail_address]&gt;" subject="[string]" contents="[summary|changes|statistics]" size="[size in bytes]" role="[superadmin|admin|editor|author]" />
            </notifications>
        </statistics>
        <protocol>1</protocol>
        <meta>
            <taskmodule name="[unique_name]" />
            <dbtables>[name1],[name2],[name3]</dbtables>
        </meta>
    </report>
</root>

Backups.xml: An index-like structure that points to backup data

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <backup database="[name]" table="[name]" path="[path_to_mysql_dump_data]" date="[rfc xxx]" job="[path]"
            arguments="[arg1],[arg2],[arg3],..." cputime="" malloc="[min]:[average]:[peak]" />
    <backup database="[name]" table="[name]" path="[path_to_mysql_dump_data]" date="[rfc xxx]" job="[path]"
            arguments="[arg1],[arg2],[arg3],..." cputime="" malloc="[min]:[average]:[peak]" />
    <backup database="[name]" table="[name]" path="[path_to_mysql_dump_data]" date="[rfc xxx]" job="[path]"
            arguments="[arg1],[arg2],[arg3],..." cputime="" malloc="[min]:[average]:[peak]" />
    <backup database="[name]" table="[name]" path="[path_to_mysql_dump_data]" date="[rfc xxx]" job="[path]"
            arguments="[arg1],[arg2],[arg3],..." cputime="" malloc="[min]:[average]:[peak]" />
</root>

Snapshot.xml: where all differential data for one table is stored

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <meta database="[name]" table="[name]" primary_column="[name]" version="[internal_structure_version]">
        <column name="[name]" cid="[column_id]" />
        <column name="[name]" cid="[column_id]" />
        <column name="[name]" cid="[column_id]" />
    </meta>
    <body>
        <cell pid="[primary_row_id]" cid="[column_id]">
            <pre>1</pre>
            <post>1</post>
        </cell>
        <cell pid="[primary_row_id]" cid="[column_id]">
            <pre>1</pre>
            <post>1</post>
        </cell>
        <cell pid="[primary_row_id]" cid="[column_id]">
            <pre>1</pre>
            <post>1</post>
        </cell>
        <cell pid="[primary_row_id]" cid="[column_id]">
            <pre>1</pre>
            <post>1</post>
        </cell>
    </body>
</root>

History.xml: An index of previous tasks that point to root jobs

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <job date="[rfc xxx]" title="[string]" path="[path]" />
    <job date="[rfc xxx]" title="[string]" path="[path]" />
    <job date="[rfc xxx]" title="[string]" path="[path]" />
    <job date="[rfc xxx]" title="[string]" path="[path]" />
</root>

Locks.xml: A list of all locks to ensure database coherence (for one part)

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <lock database="[name]" table="[name]" date="[rfc xxx]" taskmodule="[name]" job="[path]" />
    <lock database="[name]" table="[name]" date="[rfc xxx]" taskmodule="[name]" job="[path]" />
    <lock database="[name]" table="[name]" date="[rfc xxx]" taskmodule="[name]" job="[path]" />
    <lock database="[name]" table="[name]" date="[rfc xxx]" taskmodule="[name]" job="[path]" />
</root>
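A lock list like this could be consulted before a job touches a table. The following Python sketch shows the idea with hypothetical helper names; it works on an in-memory tree and does not address two requests racing to write the file at the same time.

```python
import xml.etree.ElementTree as ET


def is_locked(locks_root, database, table):
    """Return True if the lock list already contains a lock for this table."""
    return any(
        lock.get("database") == database and lock.get("table") == table
        for lock in locks_root.findall("lock")
    )


def acquire_lock(locks_root, database, table, taskmodule, job, date):
    """Add a lock entry, or return False if the table is already locked."""
    if is_locked(locks_root, database, table):
        return False
    ET.SubElement(locks_root, "lock", {
        "database": database, "table": table,
        "date": date, "taskmodule": taskmodule, "job": job,
    })
    return True
```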

Taskpresets.xml: Easily exchange task settings (e.g. where to search for what). Other plugin authors can use it to provide presets when their plugins change. The user can easily insert or upload such a file.

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <preset title="[string]" date="[rfc xxx]" author="[name]" authormail="[mail_address]" authorurl="[url]">
        <taskmodule name="[name]">
            <param name="[name]" value="[string]" />
            <param name="[name]" value="[string]" />
            <param name="[name]" value="[string]" />
            <param name="[name]" value="[string]" />
            <meta>
                <selector search="[string]" order="[number]" />
                <selector search="[string]" order="[number]" />
                <selector search="[string]" order="[number]" />
                <selector search="[string]" order="[number]" />  
                <filter search="[string]" replace="[string]" order="[number]" />
                <filter search="[string]" replace="[string]" order="[number]" />
                <filter search="[string]" replace="[string]" order="[number]" />
                <filter search="[string]" replace="[string]" order="[number]" />
            </meta>
        </taskmodule>
    </preset>
</root>

As with database design, grouping data logically and not repeating similar data were strong requirements when designing these XML structures. Sometimes, data is repeated only to avoid reading multiple XML files for simple things (to speed up processing a bit). A compromise!

Of course, the database could have been used as well, but these XML files simplify things a lot, and using multiple database tables would only be a burden on the WordPress database (esp. in case of plugin updates and possibly database table updates!).

Conceptual Design Revision 2

The design model has been extended with some activities for reporting and for the execution of root and single jobs, and with a new class diagram showing which classes implement the interface XMLObject (those objects can be saved in an XML structure, named SimpleXMLElement in PHP). The diagrams suggested in the last post have been updated correspondingly.

Simple things first: ReportingActivity. In essence, reporting is done at the level of root jobs, which most likely contain multiple child jobs (due to the partitioning of snapshots). So a report spans several client-side requests to execute child jobs because they logically belong to one operation/task: the user does not need to know that the operation is split into multiple chunks that are reliably executed one by one; only the interactive AJAX-based interface generated by UI_ExecuteJob is displayed. During each such request, statistics, meta data and the protocol are collected, stored within the report and saved to disk. The child jobs of one job should not be executed in parallel, but in sequence. Otherwise, a lost update on the report file is possible. Parallel execution of child jobs would not yield a speed-up anyway because all child jobs modify the same database table, which will be locked in most circumstances. The application checks whether the current child job is the last one within its hierarchy; if so, additional bookkeeping, formatting of meta data according to the definition of the task module being used, and automated notification (version 2.1) are carried out.

Activity diagram "ReportingActivity": how to generate one report across multiple client requests.

Next: the UndoOperationActivity should be rather self-explanatory from the model itself because of the detailed notes. In principle, the undo operation just reverses each snapshot within a given time interval and for given database tables or task modules, and their temporal order is reversed too. They are then added to one new job, the undo operation. Executing this new job is pretty much the same as a “normal” (i.e. forward) execution of tasks. So there is no need to repeat core logic like the backup before running a root job, splitting into good chunks (child jobs), displaying an interactive UI, client/server communication etc.

Activity diagram "UndoOperationActivity": the additional logic to undo previous operations, stored as Snapshots.
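The reversal described for the undo operation can be illustrated with a few lines of Python. The `Cell` class mirrors the pid, cid, pre and post fields of Snapshot.xml; the function names and the dictionary that stands in for a database table are assumptions for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    pid: int   # primary row id
    cid: int   # column id
    pre: str   # value before the change
    post: str  # value after the change


def build_undo(snapshots):
    """Turn a chronological list of snapshots into snapshots for an undo job.

    Each snapshot is a list of Cell objects. The newest snapshot is undone
    first, and within each cell the pre and post values trade places.
    """
    return [
        [Cell(c.pid, c.cid, pre=c.post, post=c.pre) for c in reversed(snapshot)]
        for snapshot in reversed(snapshots)
    ]


def apply_snapshots(table, snapshots):
    """Write every post value into a {(pid, cid): value} table, in order."""
    for snapshot in snapshots:
        for cell in snapshot:
            table[(cell.pid, cell.cid)] = cell.post
```

Because the undo job is just another list of snapshots, `apply_snapshots` handles it exactly like a forward execution, which matches the point made above about not repeating core logic.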

There are two different activities, one for the execution of a root job, and one for its child jobs. The simpler one, ExecuteSingleJobActivity, shows what the server does for a client request to execute a single job, i.e. a child job. No other jobs within the hierarchy are executed (because of the limited execution time). It's triggered by one interactive request from the UI_ExecuteJob user interface.

Activity diagram "ExecuteSingleJobActivity": how the server processes a single request to execute one child job (being part of one task).

The rather complete picture of executing one task (generated by a task module) is modeled with ExecuteRootJobActivity. What changed? This model has been renamed, and undoOperationActivity() and executeSingleJobActivity() are both used instead of literal descriptions. This makes it easier to understand how the latter activity fits into the whole process. This model shows how this software tries to simulate/implement database coherence (backups, undos etc.) because MyISAM does not support transactions!

Activity diagram "ExecuteRootJobActivity": from client and server side setup to the execution of each individual (pre-partitioned) child job, reporting, database state restoration (revision 2)

Concerning the class diagram CoreFunctionality, the class Operation was too general and has been renamed to Job, with much better methods, again to respect the aforementioned activities. Another difference is the addition of the class PerformanceMonitor (used for measuring CPU time and memory allocations to optimize the execution of large tasks) and of the new interface XMLObject.

Class diagram "CoreFunctionality": all classes needed to implement core functionality like reporting, database coherence, undo and tasks. Design is modular (easy extension with more task modules)
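What a class like PerformanceMonitor measures can be sketched in Python with the standard library. This is not the plugin's PHP class: the attribute names are assumptions, and only the peak allocation is recorded here, whereas the report format also lists minimum and average values.

```python
import time
import tracemalloc


class PerformanceMonitor:
    """Measure CPU time and peak memory allocation of a code section."""

    def __enter__(self):
        tracemalloc.start()
        self._cpu_start = time.process_time()
        return self

    def __exit__(self, exc_type, exc, tb):
        # CPU time used by this process since entering the section.
        self.cpu_seconds = time.process_time() - self._cpu_start
        # get_traced_memory() returns (current, peak) in bytes.
        _, self.peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        return False  # do not swallow exceptions
```

Wrapping one request in `with PerformanceMonitor() as monitor:` would give the numbers needed to tune how much work the next request may take on.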

The interface XMLObject simply forces all classes to adhere to certain “standards” so that it’s much clearer for the programmer which objects can be easily stored in an XML hierarchy and loaded from it. Storing and loading is a matter for the task, and not for someone else (to reduce complexity and to adhere to logic encapsulation).

Class diagram "XMLObjectSpecialisations": objects can be stored to and parsed from XML structures in a distributed fashion.
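An interface of this kind might be expressed as follows. The sketch uses Python's abstract base classes rather than a PHP interface, and the method names `to_xml` and `from_xml` as well as the example class `Lock` are assumptions; the diagrams may define different methods.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class XMLObject(ABC):
    """Contract for objects that can be stored in and loaded from XML."""

    @abstractmethod
    def to_xml(self):
        """Return an ElementTree element describing this object."""

    @classmethod
    @abstractmethod
    def from_xml(cls, element):
        """Build an object from an element produced by to_xml()."""


class Lock(XMLObject):
    """Example implementation for one entry of a lock list."""

    def __init__(self, database, table):
        self.database = database
        self.table = table

    def to_xml(self):
        return ET.Element("lock", {"database": self.database, "table": self.table})

    @classmethod
    def from_xml(cls, element):
        return cls(element.get("database"), element.get("table"))
```

Each class knows how to write and read its own XML fragment, so no central serializer has to know the internals of every object.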

Again, all models have been created with ArgoUML.

First Draft: A Conceptual Design

With the first draft for the conceptual design of SSR2, implementation can begin very soon, for release 2.0. I just have to make sure that the proposed design meets all anticipated requirements for SSR2 while allowing easy extension and, more importantly, a clean, minimal source code to keep things as simple as possible. Major design goals are: ease of use, reliability, robustness, performance, scalability, correctness, database coherence (esp. for MyISAM tables).

I am not going into further details right now (but later), and I am not going to write half a book about the following two models. When studying them, just keep in mind that the first model (a class model) shows the major classes and their dependencies, but omits (most) class methods to keep things simple. The idea is that the core functionality takes care of correct, reliable and coherent operation execution, while the tasks “simply” build all parameters, filters and selectors that make up an operation (also called a job). Because PHP is based on script execution with a limited (often fixed) amount of time, big jobs must be split into smaller chunks; otherwise their execution will be terminated by PHP. When writing large amounts of data to the database, execution can easily reach this limit.

Class Diagram showing major classes and class methods for SSR2
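Splitting a big job into chunks can be as simple as computing offset and length pairs. The Python sketch below assumes a fixed number of cells per request, which is a simplification; the function name and the parameter names are hypothetical.

```python
def partition(total_cells, cells_per_request):
    """Split a job over total_cells into (offset, length) chunks."""
    if cells_per_request <= 0:
        raise ValueError("cells_per_request must be positive")
    return [
        (offset, min(cells_per_request, total_cells - offset))
        for offset in range(0, total_cells, cells_per_request)
    ]
```

A job over 10 cells with 4 cells per request is split into three chunks, the last of which covers only the remaining 2 cells.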

The second model (an activity diagram) simply shows what will happen between client (user) and server (SSR2) when actually executing one big operation. Note that one operation is split into multiple jobs (if required). Also note that restoring the database state is already an intrinsic part of this model, because SSR2 considers data coherence to be very important.

Activity diagram about how to actually execute an operation/job. A job is created by "task modules", like "Posts, Pages & Media".

All diagrams have been modeled with ArgoUML for Ubuntu.

As always: should you have any questions, just get in touch with me. Thanks.