Logging system. Optimize Magento database by cleaning logs

Magento is a powerful and, at the same time, complex eCommerce system. One of the important parts of this system is a different events logging. After some time of functioning, Magento collects more and more events data and it might have the performance impact. Of course, the logs information can be cleaned manually in the database to avoid issues with performance, but this approach is not always handy. Therefore, in this article we want to review Magento logging system and built in tools for logs handling.

As you may also know, Magento uses two types of logs data collecting. The first type is “System and Exception Logging” to a local filesystem (it’s disabled by default). So, we can use admin panel to enable/disable this type of logging (System -> Configuration -> Advanced -> Developer -> Log settings).

log settings in magento system config

The system logs are located at “var/log/” directory. You have to check and remove them periodically, but we will not stay at this point for now. Let’s move on to more interesting part of Magento logs.

The second logging type is log information about visitors and visitors actions stored in a database. You can see many tables used for that purpose there. We will briefly describe some of them below.

For example, in Magento Community Edition we have the following list of the tables that are used for logging purposes:

  1. log_customer – information about customer login/logout time and date
  2. log_visitor – information about visitor session
  3. log_visitor_info – visitor locale preferences
  4. log_visitor_online – date and time of the first and the last visit, the last visited URL
  5. log_summary – visitors summary report by a date
  6. log_summary_type – summary types and descriptions
  7. log_url – visited URLs for each visitor
  8. log_url_info – information about visited URLs
  9. log_quote – contains quotes history by visitors
  10. index_event – events that occurred in Magento for products, catalogs. Editing or deleting etc
  11. report_event – history of some visitors/customers actions in store
  12. report_viewed_product_index – history of viewed products by visitors/customers
  13. report_compared_product_index – history of compared products by visitors/customers
  14. catalog_compare_item – compared products according to customer
  15. dataflow_batch_export – history of batches exports
  16. dataflow_batch_import – history of batches imports

log tables from magento community edition database

Magento Enterprise Edition has two more tables:

  1. enterprise_logging_event – history of events performed in admin panel
  2. enterprise_logging_event_changes – history of data changes

log tables from magento enterprise edition database

As follows, these tables can contain much unnecessary information and it is recommended cleaning them periodically. There are three different ways to do it:

  1. Manual cleaning using any MySQL client or CLI.
  2. Configure auto logs cleaning via Magento admin panel.
  3. Use shell/log.php.

Manual Cleaning via MySQL client or CLI

You can remove data from Magento logs tables by any MySQL client using GUI or run SQL DELETE query from console. By cleaning log and report tables they lose the visitors history that might be useful (for example, for sales analytics etc). That is why make sure you don’t need this information anymore before removing it. We also suggest to make the backup before this data removal.

Configure automatic logs cleaning via Magento admin panel

In admin panel, go to System -> Configuration -> Advanced -> System -> Log (or “Log Cleaning” for older Magento versions). Don’t be surprised if this feature is disabled in your store, but you can enable/disable automatic Log Cleaning, configure how often the cleaning process will be performed, set time for that and when (in days) the log data will be marked as expired via the admin panel. Moreover, there is an ability to specify for what customers you want to enable logging. Also you are able to set an email or few emails (separated with comma) in “Error Email Recipient” setting if you want to receive a notification of all errors/warnings that occur during the system log cleaning.

Enable logs and log cleaning in Magento

As you can see, because of the flexible configuration this feature will not actually remove the whole data from the log tables unlike the manual way described before.

Database maintenance by shell/log.php

Magento has a shell utility log.php stored in the shell/ folder. It’s execution can be configured via Magento cron job or this utility can be run manually. The script is used for cleaning and displaying statistics of the following tables:

  1. log_customer
  2. log_visitor
  3. log_visitor_info
  4. log_url
  5. log_url_info
  6. log_quote
  7. report_viewed_product_index
  8. report_compared_product_index
  9. report_event
  10. catalog_compare_item

As we mentioned before, the same tables are being cleaned by “Log Cleaning” configured via admin panel. You can receive the short help message typing the following command from the Magento root directory:

php -f shell/log.php help

Here is the result:

$ php -f shell/log.php help  

Usage:  php -f log.php -- [options]
        php -f log.php -- clean --days 1

clean             Clean Logs
--days <days>     Save log, days. (Minimum 1 day, if defined - ignoring system value)
status            Display statistics per log tables
help              This help

As you see, it allows to set the number of days for period you want to save log. Let’s check the logs statistic:

$ php -f shell/log.php status

-----------------------------------+------------+------------+------------+
Table Name                         | Rows       | Data Size  | Index Size |
-----------------------------------+------------+------------+------------+
log_customer                       | 49.01K     | 2.64Mb     | 1.59Mb     |
log_visitor                        | 38.43K     | 3.69Mb     | 0 b        |
log_visitor_info                   | 44.98K     | 5.78Mb     | 0 b        |
log_url                            | 124.92K    | 6.83Mb     | 8.42Mb     |
log_url_info                       | 119.03K    | 22.61Mb    | 0 b        |
log_quote                          | 1.37K      | 81.92Kb    | 0 b        |
report_viewed_product_index        | 346.83K    | 16.30Mb    | 55.16Mb    |
report_compared_product_index      | 15.40K     | 1.59Mb     | 3.26Mb     |
report_event                       | 1.06M      | 50.94Mb    | 125.60Mb   |
catalog_compare_item               | 56.98K     | 2.64Mb     | 10.04Mb    |
-----------------------------------+------------+------------+------------+
Total                              | 1.85M      | 113.10Mb   | 204.08Mb   |
-----------------------------------+------------+------------+------------+

And now we are going to clean these tables by running the following command:

$ php -f shell/log.php clean
Log cleaned

Also, you can run the Cron job with ID “log_clean” instead. It does the same operations as “log.php clean”.

$ n98-magerun.phar sys:cron:run log_clean
Run Mage_Log_Model_Cron::logClean done

MageRun is a very useful CLI tool for Magento developers, it has been described in our blog earlier. If you are still not familiar with it, here you can read more about MageRun tool.

Logs cleaning by log.php takes some time that depends on your logs data size, so you should just wait while it’s running and then you will see the result:

$ php -f shell/log.php status
-----------------------------------+------------+------------+------------+
Table Name                         | Rows       | Data Size  | Index Size |
-----------------------------------+------------+------------+------------+
log_customer                       | 48.96K     | 2.64Mb     | 1.59Mb     |
log_visitor                        | 11         | 16.38Kb    | 0 b        |
log_visitor_info                   | 11         | 16.38Kb    | 0 b        |
log_url                            | 129        | 16.38Kb    | 32.77Kb    |
log_url_info                       | 199        | 1.06Mb     | 0 b        |
log_quote                          | 0          | 16.38Kb    | 0 b        |
report_viewed_product_index        | 338.38K    | 15.25Mb    | 53.07Mb    |
report_compared_product_index      | 14.59K     | 1.59Mb     | 3.26Mb     |
report_event                       | 1.05M      | 50.94Mb    | 125.60Mb   |
catalog_compare_item               | 56.62K     | 2.64Mb     | 10.04Mb    |
-----------------------------------+------------+------------+------------+
Total                              | 1.50M      | 74.19Mb    | 193.59Mb   |
-----------------------------------+------------+------------+------------+

Have you been waiting for some other result? As it was mentioned before, this script doesn’t blindly remove the whole data from these tables. So, let’s dig deeper and find out what’s actually going on inside of Log.php script. There you can see the Run() method. In case of “clean” argument it applies _getLog() method to the current object.

/**
* Run script
*/
public function run()
{
    if ($this->getArg('clean')) {
        $days = $this->getArg('days');
        if ($days > 0) {
            Mage::app()->getStore()->setConfig(Mage_Log_Model_Log::XML_LOG_CLEAN_DAYS, $days);
        }
        $this->_getLog()->clean();
        echo 'Log cleaned\n';
    } else if ($this->getArg('status')) {
        ...
    }
}

Here is the _getLog() method itself, it just returns Mage_Log_Model_Log model:

/**
* Retrieve Log instance
* @return Mage_Log_Model_Log
*/
protected function _getLog()
{
   if (is_null($this->_log)) {
       $this->_log = Mage::getModel('log/log');
   }
   return $this->_log;
}

Look into the resource model of the Mage_Log_Model_Log class (Mage_Log_Model_Resource_Log) and find method named “clean()” inside:

/**
* Clean logs
* @param Mage_Log_Model_Log $object
* @return Mage_Log_Model_Resource_Log
*/
public function clean(Mage_Log_Model_Log $object)
{
   $cleanTime = $object->getLogCleanTime();

   Mage::dispatchEvent('log_log_clean_before', array(
       'log'   => $object
   ));

   $this->_cleanVisitors($cleanTime);
   $this->_cleanCustomers($cleanTime);
   $this->_cleanUrls();

   Mage::dispatchEvent('log_log_clean_after', array(
       'log'   => $object
   ));

   return $this;
}

This method dispatches two events: log_log_clean_before and log_log_clean_after. Also there are calls of three methods for logs clean up. Let’s review one of them – “_cleanVisitors()” method. You can find the SELECT database query inside of it:

$select = $readAdapter->select()
->from(
   array('visitor_table' => $this->getTable('log/visitor')),
   array('visitor_id' => 'visitor_table.visitor_id'))
->joinLeft(
   array('customer_table' => $this->getTable('log/customer')),
   'visitor_table.visitor_id = customer_table.visitor_id AND customer_table.log_id IS NULL',
   array())
->where('visitor_table.last_visit_at < ?', $timeLimit)
->limit(100);

As you can also see, this method removes only the selected visitors information by ID from “visitor_info”, “quote”, “url” and “visitor” tables. The similar operations are performed for other cleaning methods. That’s why we can’t remove whole data from all the log tables using these methods.

Bringing this topic to the end, we can say that logs cleaning of Magento store’s database is a very important and effective thing to improve performance especially if you have hundreds of visitors per day. Also, it’s really good idea to setup up scripts to automate this process.

Thanks for reading the article and feel free to share your own experience of Magento performance improvement in comments.