Get started - cleaning up pages

The most common use of Extreme Clean is to clean up old page versions. Here are some tips to get you started.

You can find the Extreme Clean dashboard page at Dashboard > System and Settings > Extreme Clean > Configure and Run Cleaners, or simply enter 'Clean' or 'Extreme' in the search window.

This page is multi-purpose: cleaners can be selected, configured, run directly from the dashboard, or enabled for the Extreme Clean job.

Select the Cleaner

Click 'Choose a Cleaner' and a modal overlay will list all the installed and available cleaners. Scroll down and select Page Version Cleaner to begin working with it.

As an aside, clicking the 'i' icons in the modal will show more information about each cleaner.

Page Version Cleaner

The Page Version Cleaner removes old page versions, beginning with the oldest and continuing until only a configured number of versions remain. The currently published page and more recent un-published versions are never cleaned.

Start by selecting the maximum number of versions to keep. If you make this "1", only the currently approved version will be kept, so any Lorem-Ipsum in old versions of pages will be purged. Extreme Clean always leaves the current version of a page (and the most recent un-published version if newer than the approved version).
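To make the retention rule concrete, here is a minimal sketch in Python. It is not Extreme Clean's actual code. The function name and the list-of-ids data model are hypothetical, and it assumes that versions newer than the approved one do not count towards the 'versions to keep' limit.

```python
def versions_to_remove(version_ids, approved_id, max_keep):
    """Return the version ids that would be removed, oldest first.

    version_ids: ids ordered oldest to newest (hypothetical data model).
    approved_id: id of the currently approved version.
    max_keep:    the 'maximum number of versions to keep' setting.
    """
    approved_index = version_ids.index(approved_id)
    # Only versions older than the approved one are candidates;
    # the approved version and anything newer are never touched.
    candidates = version_ids[:approved_index]
    # Assumption: the approved version uses one of the 'keep' places,
    # so the rest go to the most recent of the older versions.
    older_to_keep = max(max_keep - 1, 0)
    remove_count = max(len(candidates) - older_to_keep, 0)
    return candidates[:remove_count]


# Six versions; version 5 is approved, version 6 is a newer draft.
print(versions_to_remove([1, 2, 3, 4, 5, 6], approved_id=5, max_keep=1))
print(versions_to_remove([1, 2, 3, 4, 5, 6], approved_id=5, max_keep=3))
```

With the keep setting at 1, every version older than the approved one is listed for removal. With it at 3, the two most recent older versions survive alongside the approved version.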

Versions in slices

Cleaning can require a lot of processing, so Extreme Clean spreads this processing across a number of slices. The next option, 'Number of page versions to remove with each slice', governs how much work is done in each slice.

When pages have lots of attributes, or on a slow server, you can decrease the number of page versions cleaned in each slice to stay within processor limits. When pages are relatively simple and the site is hosted on a powerful server, you can increase the number of versions cleaned with each slice to speed things up. 

For now, while you are new to Extreme Clean, keep this at '1'. You can experiment with bigger slices later as you gain experience with Extreme Clean.

Closely related is 'Massive Mode', which changes the way the Page Version Cleaner crawls pages. Under normal operation, the Page Version Cleaner makes a pre-selection of the pages with old versions that require cleaning, then each slice does some of that cleaning to remove 1 or more old versions.

On very big sites, the pre-check that selects the pages with most old versions and queues them up for cleaning can run into limits. In Massive Mode, all pages are queued and processed for old versions without any pre-check. This reduces the initial overhead, at the expense of subsequently processing pages that may not have any old versions to clean, which adds unnecessary steps.
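The trade-off between the two modes can be sketched as follows. This is an illustration only, not the add-on's implementation. The function `build_queue` and the `count_old_versions` callback are hypothetical names, and the sort order in normal mode is an assumption based on the description above.

```python
def build_queue(page_ids, count_old_versions, massive_mode):
    """Return the list of pages to hand to the cleaning slices.

    page_ids:           every page on the site (hypothetical ids).
    count_old_versions: callable giving the old-version count for a page;
                        this stands in for the expensive pre-check.
    massive_mode:       True to skip the pre-check entirely.
    """
    if massive_mode:
        # No pre-check: every page is queued, including pages that
        # turn out to have nothing to clean.
        return list(page_ids)
    # Normal mode: inspect every page up front, keep only those with
    # old versions, and queue the pages with most old versions first.
    counts = {page: count_old_versions(page) for page in page_ids}
    with_old = [page for page in page_ids if counts[page] > 0]
    return sorted(with_old, key=lambda page: counts[page], reverse=True)


old_versions = {"home": 0, "about": 3, "blog": 12}
pages = list(old_versions)
print(build_queue(pages, old_versions.get, massive_mode=False))
print(build_queue(pages, old_versions.get, massive_mode=True))
```

In normal mode the queue is shorter but every page is inspected before any cleaning starts. In Massive Mode the queue is built immediately, and the page with nothing to clean still costs a step later.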

Unless you have a lot of pages, or unless the Page Version Cleaner stalls when you run it, you can leave Massive Mode un-checked.

Global Options

The remaining settings are global to all cleaners and to the Extreme Clean job.

As mentioned above, Extreme Clean spreads cleaning across a number of slices. The 'Maximum number of steps to run' setting controls the maximum number of slices that can be attempted each time any cleaner is run from the dashboard or when the Extreme Clean job is run.

For page version cleaning, the slices required will be the sum of all old versions across all pages divided by the number of page versions to remove with each slice. If the maximum number of steps to run is bigger than the slices required, Extreme Clean simply finishes early. If the maximum number of steps to run is smaller than the number of slices required, you can simply run Extreme Clean again.
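As a worked example of the calculation described above, the following Python sketch estimates the slices required and how many runs that takes for a given maximum number of steps. The function names and the sample figures are illustrative assumptions, and rounding up to whole slices and whole runs is also an assumption.

```python
import math


def slices_required(old_versions_per_page, versions_per_slice):
    """Sum of old versions across all pages, divided by the slice size."""
    total_old_versions = sum(old_versions_per_page)
    return math.ceil(total_old_versions / versions_per_slice)


def runs_required(slices, max_steps):
    """How many times Extreme Clean must be run to get through all slices."""
    return math.ceil(slices / max_steps)


# Hypothetical site: four pages with 10, 4, 0 and 25 old versions.
slices = slices_required([10, 4, 0, 25], versions_per_slice=5)
print(slices, runs_required(slices, max_steps=500))

# A larger hypothetical site: 1200 old versions, cleaned 1 per slice.
slices = slices_required([1200], versions_per_slice=1)
print(slices, runs_required(slices, max_steps=500))
```

In the first case the maximum number of steps is bigger than the slices required, so a single run finishes early. In the second, the run stops at 500 steps and has to be repeated.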

500 is a good number if you are thinking of using the Extreme Clean job because it avoids a long-standing core bug that breaks job execution if there are too many slices. If you are only working from the dashboard cleaners, you can use higher numbers of slices.

If you like, you can set this to a small number while experimenting, then increase to a more practical number to use Extreme Clean in anger.

Check only or Actually clean

When 'check only' is selected, cleaners will be run, but nothing will actually be deleted. Use this option as a preview of what could be cleaned. This is most useful for dangerous cleaners such as the Unused Block Cleaner, but can also be a good confidence builder when new to Extreme Clean.

Beware: the reporting from 'check only' is not strictly accurate for cleaners that apply iterative algorithms, such as page version cleaning.

When 'check only' is selected, any cleaner run from the dashboard or through the Extreme Clean job will not actually clean anything. As a reminder, the background and action button change from green for 'check only' to red for 'actually clean'. Think of 'check only' as the safety catch on a gun.

Any time you run a cleaner with Check Now or Clean Now, the settings are saved. You can also save the settings without running a cleaner using the light blue Save Settings button.

Clean Now

When you click 'Check Only' or 'Clean Now' (bottom right), the next page will show a progress bar that creeps up to 100% and a growing list that shows the progress of each cleaning slice.

When this completes, a final note will summarise how much has been cleaned and if any further cleaning is required.

For example "10 old page versions removed from 1 page. Processing completed. Click "Finish" to return."

The Finish button returns you to the main Configure and Run Cleaners page.

Depending on the summary, you may then want to run the Page Version Cleaner again with another click of Clean Now, perhaps having adjusted the settings above.

Enable for Extreme Clean job

The setting we skipped past was 'Enable for Extreme Clean job'. Checking this will make a cleaner available to the Extreme Clean job. Details of the job are configured on a separate dashboard page at Dashboard > System and Settings > Extreme Clean > Job Settings.

Additional Pages

Strategy

Back up your site before first using Extreme Clean. I accept no responsibility if you end up cleaning critical data! Once you have Extreme Clean configured, a safe strategy is to run it after your regular backups, keeping the database clean and ready for the next backup.

Experiment Safely

Use 'Check Only' to experiment with a cleaner without any risk.

Remember to change it back to 'Actually Clean' when you are ready to use Extreme Clean in anger.