How to Create a Magento Data Feed and Where to Use It

Every eCommerce site will need a data feed at one point or another. A data feed is a text-based list of all or some of the entities in a particular data set on your eCommerce store. Data feeds are most commonly created for products, which gives third-party services easy access to your store’s offerings.

Depending on how the data feed is generated, it could be in one of a few formats: delimited (e.g. CSV or TSV), XML, or JSON. The format of the feed will depend on what the third-party service requires. This article will show you how to implement a simple Magento data feed by extending Magento’s shell class, producing a script that can be scheduled to regenerate an up-to-date feed automatically.
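
To make the three formats concrete, here is a minimal sketch that serializes the same two records as CSV, XML, and JSON using only PHP built-ins. The product data, the function names, and the products/product element names are made up for illustration; a real service will dictate its own field names and structure.

```php
<?php
// Hypothetical sample data; a real feed would pull this from the store.
$products = array(
    array('name' => 'Blue T-Shirt', 'price' => '19.99'),
    array('name' => 'Mug, Large',   'price' => '8.50'),
);

// Delimited (CSV): a header row followed by one row per product.
function toCsv(array $rows)
{
    $stream = fopen('php://temp', 'r+');
    // Delimiter, enclosure and escape character are passed explicitly.
    fputcsv($stream, array_keys($rows[0]), ',', '"', '\\');
    foreach ($rows as $row) {
        fputcsv($stream, $row, ',', '"', '\\');
    }
    rewind($stream);
    $csv = stream_get_contents($stream);
    fclose($stream);
    return $csv;
}

// XML: one <product> element per record, one child element per field.
function toXml(array $rows)
{
    $xml = new SimpleXMLElement('<products/>');
    foreach ($rows as $row) {
        $node = $xml->addChild('product');
        foreach ($row as $field => $value) {
            // addChild() does not escape ampersands, so escape the value first.
            $node->addChild($field, htmlspecialchars($value, ENT_XML1));
        }
    }
    return $xml->asXML();
}

// JSON: an array of objects.
function toJson(array $rows)
{
    return json_encode($rows);
}

echo toCsv($products), "\n", toXml($products), "\n", toJson($products), "\n";
```

The same records come out three ways, so switching formats later mostly means swapping the serializer while the code that collects the data stays the same.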

Writing a Simple Magento Product Feed

The first example will involve the creation of a product feed which will generate a CSV file of all products in the database.

In the root of your Magento site you should find a folder named shell, which is where we will house our scripts. We will take things a step further and create a Namespace folder in /shell for organizational purposes. In that folder, create a file named ProductExport.php.

You can use whatever name makes sense to you for the filename and in place of “Namespace” wherever it is used.

You should have the following folder structure and file:

/shell/Namespace/ProductExport.php

Now we will create a basic script that extends Mage_Shell_Abstract.

<?php

/* To start we need to include abstract.php, which is located
 * in /shell/abstract.php and contains Magento's Mage_Shell_Abstract
 * class.
 *
 * Since ProductExport.php is in /shell/Namespace/, abstract.php is
 * one directory up. A bare '../abstract.php' would be resolved against
 * the directory the command is run from, so we anchor the path to this
 * file's own directory instead.
 */
require_once dirname(__FILE__) . '/../abstract.php';

class Namespace_ProductExport extends Mage_Shell_Abstract
{
   // Implement abstract function Mage_Shell_Abstract::run();
   public function run()
   {

   }
}

// Create a new instance of our class and run it.
$shell = new Namespace_ProductExport();
$shell->run();

This is the bare skeleton we need to get a shell program to run successfully using Magento’s Mage_Shell_Abstract class. You can run it by executing the following command on the command line.

php -f /shell/Namespace/ProductExport.php

That command can also be set up to run as a cron job, which periodically regenerates your data feed and keeps it up to date.

Why Do You Have To Extend Mage_Shell_Abstract?

Rather than writing a plain PHP script, we include /shell/abstract.php and create a class that extends Mage_Shell_Abstract. Mage_Shell_Abstract is a solid base class to build a script on. It includes everything needed to interact with Magento’s components, but more importantly it contains a security feature that ensures your script can only be run from the shell (it cannot be run from the browser!).

If you try to access the PHP file from a browser, you will see an error. The function that prevents the file from being accessed is Mage_Shell_Abstract::_validate().

protected function _validate()
{
    if (isset($_SERVER['REQUEST_METHOD'])) {
        die('This script cannot be run from Browser. This is the shell script.');
    }
}
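
The check works because a web server populates $_SERVER['REQUEST_METHOD'] for every HTTP request, while the command-line interpreter leaves it unset. The standalone sketch below mirrors that logic outside Magento. The function name isBrowserRequest is hypothetical, and it takes the server array as a parameter so that both cases can be tried from the shell.

```php
<?php
// Hypothetical helper that mirrors the test inside _validate().
function isBrowserRequest(array $server)
{
    return isset($server['REQUEST_METHOD']);
}

// Simulated environments: the CLI has no request method, a web request does.
$cliEnvironment = array('argv' => array('ProductExport.php'));
$webEnvironment = array('REQUEST_METHOD' => 'GET');

var_dump(isBrowserRequest($cliEnvironment)); // bool(false)
var_dump(isBrowserRequest($webEnvironment)); // bool(true)
```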

Implementing a Product Feed

Time to make our run() function actually do something. The following is a simple implementation that will generate a comma delimited file that contains the name and price of every product in your Magento database.

public function run()
{
    // Pull product collection
    $productCollection = Mage::getModel('catalog/product')->getCollection()
        ->addAttributeToSelect(array('name', 'price'));

    // Divide the collection into "pages" that contain 100 products each
    $productCollection->setPageSize(100);

    /* Get the page number of the last page, so we know how many pages of
     * products we need to iterate through.
     *
     * pages = ceil(total items in collection / page size)
     */
    $pages = $productCollection->getLastPageNumber();

    // Start on page 1
    $currentPage = 1;

    // Open our file for writing
    $write = fopen('ProductExport.csv', 'w');

    // Create our first row, which holds the column names for our data
    fwrite($write, implode(",", array('Name', 'Price')) . "\r\n");

    // Iterate until $currentPage reaches the total number of pages.
    do {
        $productCollection->setCurPage($currentPage);

        /* When passing a collection into a foreach loop
         * load() is automagically called on the collection.
         */
        foreach ($productCollection as $_product) {
            // write our comma-delimited line of data to our file
            fwrite($write, implode(",", array(
                $_product->getName() ,
                $_product->getFinalPrice()
            )) . "\r\n");
        }

        // Proceed to the next page
        $currentPage++;

        /* Here we take advantage of the fact that we are only
         * loading 100 products at a time. Once we finished processing
         * the first page of 100, we can clear the collection data
         * which frees up memory in the system. 
         */
        $productCollection->clear();
    } while ($currentPage <= $pages);

    // Close file stream
    fclose($write);
}

The collection is paged to prevent your system from running out of memory when working with large collections of data. Without paging, every product pulled from the database would be held in memory at once, and if the script exhausts the memory available to PHP before it finishes, it will crash.

The page size of 100 can be increased, depending on how much memory is available on the server running the Magento site.
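
One caveat with building rows through implode(): a product name that itself contains a comma or a double quote will shift the columns of that row. PHP’s built-in fputcsv() quotes such fields automatically. The sketch below shows the difference on a made-up row. It writes to an in-memory stream rather than ProductExport.csv, and note that fputcsv() ends each line with "\n" by default rather than "\r\n".

```php
<?php
// Made-up row; the name contains both a comma and double quotes.
$row = array('Poster, 24" x 36"', '12.00');

// Approach used in the simple script: the embedded comma is not protected.
$naive = implode(',', $row);

// fputcsv() wraps the field in quotes and doubles the embedded quotes.
$stream = fopen('php://memory', 'r+');
fputcsv($stream, $row, ',', '"', '\\');
rewind($stream);
$escaped = stream_get_contents($stream);
fclose($stream);

// Parse both lines back to see how many columns a consumer would get.
$naiveColumns   = str_getcsv($naive, ',', '"', '\\');
$escapedColumns = str_getcsv(rtrim($escaped), ',', '"', '\\');

echo count($naiveColumns), " columns from implode()\n";
echo count($escapedColumns), " columns from fputcsv()\n";
```

If a third-party service requires "\r\n" line endings or a different delimiter, check its specification before swapping fwrite() for fputcsv() in the export script.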

This basic script can be adapted to any data set you like. Here is a more refined version of the product export that uses Varien_Io_File when saving data to the file system.

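<?php

// Anchor the path to this file's directory so the script works from any working directory.
require_once dirname(__FILE__) . '/../abstract.php';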

class Mbrzuzy_ProductExport extends Mage_Shell_Abstract
{
    protected $_io;
    protected $_folder       = 'exports';
    protected $_name         = 'ProductExport.csv';
    protected $_delimiter    = ',';
    protected $_attributeMap = array(
        'ProductName'  => 'name',
        'ProductPrice' => 'price'
    );

    protected function _construct()
    {
        $this->_io = new Varien_Io_File();
        $this->_io->setAllowCreateFolders(true);

        parent::_construct();
    }

    public function run()
    {
        // Load only the attributes that appear in the attribute map
        $productsCollection = Mage::getModel('catalog/product')->getCollection()
            ->addAttributeToSelect(array_values($this->_attributeMap));

        $productsCollection->setPageSize(100);

        $pages = $productsCollection->getLastPageNumber();

        $currentPage = 1;

        $this->_io->open(array('path' => $this->_folder));
        $this->_io->streamOpen($this->_name);
        $this->_io->streamWrite($this->_convertArrayToDelimitedString(array_keys($this->_attributeMap)) . "\r\n");

        do {
            $productsCollection->setCurPage($currentPage);

            foreach ($productsCollection as $_product) {
                $mappedValues = $this->_getMappedValues($_product);
                if (!empty($mappedValues)) {
                    $this->_io->streamWrite($this->_convertArrayToDelimitedString($mappedValues) . "\r\n");
                }
            }

            $currentPage++;
            $productsCollection->clear();
        } while ($currentPage <= $pages);

        $this->_io->streamClose();
    }

    protected function _convertArrayToDelimitedString($data)
    {
        return implode($this->_delimiter, $data);
    }

    protected function _getMappedValues($product)
    {
        $data = array();

        foreach ($this->_attributeMap as $column => $attribute) {
            $data[] = $product->getData($attribute);
        }

        return $data;
    }
}

$shell = new Mbrzuzy_ProductExport();
$shell->run();

Third Party Extensions

Many extensions provide a graphical user interface (GUI) for creating and managing data feeds. These are nice to have but not necessary, as the examples here show.

The benefit of writing your own data feed scripts is that you have something that can be very easily customized to meet any kind of requirement.

What Can I Do With My Data Feeds?

There are certain features that Magento does not implement very well or even at all. You can bridge the gap by taking advantage of third party services that focus on doing one thing and doing it very well.

Magento Search

Google

Analytics

Affiliates