Magento doesn't trigger setup script

I found a nice new, rarely triggered Magento bug.

In one of our projects, we have an example file to create new products:

data-upgrade-example-new-product-import.php

Besides this, we had the following data scripts:

data-install-1.0.1.php
data-upgrade-1.0.0-1.0.1.php
data-upgrade-1.0.5-1.0.6.php

Unfortunately data-upgrade-1.0.5-1.0.6.php didn't fire.

It took me a while, but I found the problem:

// \Mage_Core_Model_Resource_Setup::_getModifySqlFiles
protected function _getModifySqlFiles($actionType, $fromVersion, $toVersion, $arrFiles)
{
    $arrRes = array();
    switch ($actionType) {

        // ... 

        case self::TYPE_DB_UPGRADE:
        case self::TYPE_DATA_UPGRADE:
            uksort($arrFiles, 'version_compare');
            foreach ($arrFiles as $version => $file) {
                $versionInfo = explode('-', $version);

                // In array must be 2 elements: 0 => version from, 1 => version to
                if (count($versionInfo)!=2) {
                    break;
                }

In $arrFiles we find an array with all files in the data dir which match a certain regex in \Mage_Core_Model_Resource_Setup::_getAvailableDataFiles. In short: all files whose names start with data-.

The problem is that the version part of data-upgrade-example-new-product-import.php doesn't split into exactly two elements, so the if (count($versionInfo)!=2) check is true and break is called. That kills the complete loop, although it should only be continue;.
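The effect can be reproduced outside Magento. The following is a minimal sketch, not the core code: collectUpgradeFiles() is a hypothetical helper that mimics only the sorting and the count check, the array keys are my assumption of what the version part of each file name looks like, and the $stopOnBadName flag exists only to compare break with continue.

```php
<?php
/**
 * Simplified stand-in for the upgrade branch of _getModifySqlFiles():
 * sorts the files by version key and collects those named "<from>-<to>".
 * The real method also compares against $fromVersion/$toVersion; that is left out.
 */
function collectUpgradeFiles(array $arrFiles, $stopOnBadName)
{
    $arrRes = array();
    uksort($arrFiles, 'version_compare');
    foreach ($arrFiles as $version => $file) {
        $versionInfo = explode('-', $version);
        if (count($versionInfo) != 2) {
            if ($stopOnBadName) {
                break;    // core behaviour: ends the whole loop
            }
            continue;     // proposed behaviour: skips only this file
        }
        $arrRes[] = $file;
    }
    return $arrRes;
}

// Assumed keys: the part of the file name after "data-upgrade-", without ".php"
$arrFiles = array(
    'example-new-product-import' => 'data-upgrade-example-new-product-import.php',
    '1.0.0-1.0.1'                => 'data-upgrade-1.0.0-1.0.1.php',
    '1.0.5-1.0.6'                => 'data-upgrade-1.0.5-1.0.6.php',
);

$withBreak    = collectUpgradeFiles($arrFiles, true);
$withContinue = collectUpgradeFiles($arrFiles, false);
```

With break, the 1.0.5-1.0.6 script is never collected, because version_compare sorts the non-numeric example key ahead of the numeric ones and the loop ends right there. With continue, both real upgrade scripts are collected.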

So either we hack the core or we rename data-upgrade-example-new-product-import.php. I decided to rename it.

INSERT INTO, increased auto_increment, but no data

Missing data, failed transactions and unassigned products

Last week my job was to port a module back from Magento 1.9 to Magento 1.4.

Allowed core hacks

In the end, the only real problem I had was that Mage_Core_Model_Resource_Db_Abstract doesn't exist yet in Magento 1.4. Therefore I created the file in core with this content:

<?php
class Mage_Core_Model_Resource_Db_Abstract extends Mage_Core_Model_Mysql4_Abstract {}

If you ask me, this is one of the very rare cases where a core hack is OK. When Magento is updated, the file is overwritten and then exists anyway. If you create the file in community or local, this doesn't happen automatically.

Raised auto_increment but missing data - no quote

I copied the module from Magento 1.9 to 1.4 without problems, created the core file and tested it. In my first tests, I didn't get a quote item. Whatever I did, my cart was empty. The bad thing I discovered was that Magento didn't even create a quote: sales_flat_quote was empty, but its auto_increment value went up by 1. I tried to google it, but searching for "INSERT is ignored" or "quote is not written" didn't help much. After a day of searching, I found a MySQL forum post which described the same behaviour. Thankfully, the answer was there too. What happened?

  1. Transaction is started
  2. Quote is written to the database
  3. Autoincrement is increased
  4. Something fails
  5. Rollback.

OK, but what fails? In the end I discovered that the problem was a killed model rewrite, which led to an unset quote_id on the quote_payment. I have no clue why this happened, but after deleting the model and setting up the old rewrite again, the problem was solved.

Quote item in database, but cart empty

I thought I was done. I told the client it was only a matter of minutes, but the next test still failed. The cart was still empty. I started investigating again. The quote was created, the item written, and then deleted.

I learned that when the quote item collection is loaded, all products are assigned to the quote items. If the product doesn't exist, the quote item is deleted. So far, so understandable.

\Mage_Sales_Model_Resource_Quote_Item_Collection::_afterLoad
\Mage_Sales_Model_Resource_Quote_Item_Collection::_assignProducts

$product = $productCollection->getItemById($item->getProductId());
if ($product) {
    // [...]
} else {
    $item->isDeleted(true);
    $recollectQuote = true;
}

Long story short: The collection is loaded from the flat tables (if activated). My product was missing there.

$productCollection = Mage::getModel('catalog/product')->getCollection()
    ->setStoreId($this->getStoreId())
    ->addIdFilter($this->_productIds)
    ->addAttributeToSelect(Mage::getSingleton('sales/quote_config')->getProductAttributes())
    ->addOptionsToResult()
    ->addStoreFilter()
    ->addUrlRewrite()
    ->addTierPriceData();

Unfortunately, I had forgotten to assign the product to the desired website. We don't check whether the product is available on the website, so no error is thrown when it is added to the cart. The item is then silently deleted before anyone even sees it.
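To make the mechanism concrete, here is a minimal sketch of the check from _assignProducts, assuming plain arrays in place of the Magento item and product objects. flagItemsWithoutProduct() and the product IDs are made up for illustration.

```php
<?php
/**
 * Simplified model of the check in _assignProducts(): every quote item whose
 * product is not part of the loaded product collection is flagged as deleted.
 */
function flagItemsWithoutProduct(array $quoteItems, array $loadedProductIds)
{
    foreach ($quoteItems as $key => $item) {
        $quoteItems[$key]['deleted'] = !in_array($item['product_id'], $loadedProductIds, true);
    }
    return $quoteItems;
}

// Product 42 exists in the catalog, but is not assigned to the website,
// so the filtered product collection only contains product 7.
$quoteItems = array(
    array('product_id' => 7,  'deleted' => false),
    array('product_id' => 42, 'deleted' => false),
);
$result = flagItemsWithoutProduct($quoteItems, array(7));
```

The item for product 42 ends up flagged as deleted although it was written to the quote correctly, which matches what I saw in the database.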

I hope this helps someone track down similarly weird, hard-to-trace bugs.

Mage_Weee and why it is important for tax calculation

TL;DR

Install Topological Sort to fix the sorting algorithm.

Problem

Mage_Weee is the Magento module which calculates the fees of the so-called Waste Electrical and Electronic Equipment (WEEE) Directive. It is a fee on electronic goods in Europe which pays for their disposal in advance, before you throw them away. Don't ask me. We don't sell any kind of electric stuff, so removing the module looked like a good decision.

<?xml version="1.0"?>
<!-- app/etc/modules/zzzzzz_DeactivatedModules.xml -->
<config>
    <modules>
        <Mage_Weee>
            <active>false</active>
        </Mage_Weee>
    </modules>
</config>

Missing Mage_Weee adds tax on product instead of including it

What happened next was very confusing for me.

Before:

After:

You might spot the miscalculation of the tax. Our tax settings are still correct, but the tax is now added to the price instead of being included in it.

The easiest fix is to just NOT deactivate Mage_Weee, but this is unsatisfying. So I dug deeper (and already had an idea of what was happening):

Magento total models

Magento calculates all the amounts in quote and order with total models. You can check which total models are called in

// \Mage_Sales_Model_Quote_Address::collectTotals
foreach ($this->getTotalCollector()->getCollectors() as $model) {
    echo get_class($model) . "\n"; // debug output: class of each total model
    $model->collect($this);
}

Billing address with Mage_Weee

  • Mage_Sales_Model_Quote_Address_Total_Nominal
  • Mage_Sales_Model_Quote_Address_Total_Subtotal
  • Mage_Sales_Model_Quote_Address_Total_Msrp
  • Mage_SalesRule_Model_Quote_Freeshipping
  • Mage_Tax_Model_Sales_Total_Quote_Subtotal
  • Mage_Weee_Model_Total_Quote_Weee
  • Mage_Sales_Model_Quote_Address_Total_Shipping
  • Mage_Tax_Model_Sales_Total_Quote_Shipping
  • Mage_SalesRule_Model_Quote_Discount
  • Mage_Tax_Model_Sales_Total_Quote_Tax
  • Mage_Sales_Model_Quote_Address_Total_Grand

Billing address without Mage_Weee

  • Mage_SalesRule_Model_Quote_Freeshipping
  • Mage_Tax_Model_Sales_Total_Quote_Subtotal
  • Mage_Tax_Model_Sales_Total_Quote_Shipping
  • Mage_SalesRule_Model_Quote_Discount
  • Mage_Tax_Model_Sales_Total_Quote_Tax
  • Mage_Sales_Model_Quote_Address_Total_Msrp
  • Mage_Sales_Model_Quote_Address_Total_Nominal
  • Mage_Sales_Model_Quote_Address_Total_Subtotal
  • Mage_Sales_Model_Quote_Address_Total_Shipping
  • Mage_Sales_Model_Quote_Address_Total_Grand

As you can see, the order is totally different. I'm cleaning up the list a little bit:

With Weee                                      Without Weee
Mage_Sales_Model_Quote_Address_Total_Subtotal  Mage_Tax_Model_Sales_Total_Quote_Subtotal
Mage_Tax_Model_Sales_Total_Quote_Subtotal      Mage_Tax_Model_Sales_Total_Quote_Shipping
Mage_Sales_Model_Quote_Address_Total_Shipping  Mage_Tax_Model_Sales_Total_Quote_Tax
Mage_Tax_Model_Sales_Total_Quote_Shipping      Mage_Sales_Model_Quote_Address_Total_Subtotal
Mage_Tax_Model_Sales_Total_Quote_Tax           Mage_Sales_Model_Quote_Address_Total_Shipping
Mage_Sales_Model_Quote_Address_Total_Grand     Mage_Sales_Model_Quote_Address_Total_Grand

I think (I haven't checked it yet) that the Mage_Tax totals expect the Mage_Sales totals to have run already and to have collected all the sums. So when the order changes from "first Mage_Sales, then Mage_Tax" to the reverse, you get wrong results.

Ordering multiple dependent nodes in a graph is a hard problem

The problem lies in the ordering algorithm. All the totals have a before and an after in their definition.

<!-- app/code/core/Mage/Tax/etc/config.xml:160 -->
    <sales>
        <quote>
            <totals>
                <tax_subtotal>
                    <class>tax/sales_total_quote_subtotal</class>
                    <after>freeshipping</after>
                    <before>tax,discount</before>
                </tax_subtotal>
                <tax_shipping>
                    <class>tax/sales_total_quote_shipping</class>
                    <after>shipping,tax_subtotal</after>
                    <before>tax,discount</before>
                </tax_shipping>
               [...]

All the ordering is done in \Mage_Sales_Model_Config_Ordered, and this class is buggy.

How to solve it the great way

There are two hard solutions for this:

  1. Implement a real graph algorithm which resolves all the dependencies and orders the total models correctly
  2. Play with all the before and after nodes until the order is correct.

I have neither implemented (or looked up) a cool algorithm to solve the problem, nor have I played with the values until everything is fine. For the moment I'm staying with the easy solution: not deactivating Mage_Weee.
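For reference, a real graph algorithm for the first option can be quite small. The following is a minimal sketch of a topological sort (Kahn's algorithm), not the code of any existing module and not a drop-in replacement for Mage_Sales_Model_Config_Ordered. sortTotals() is a hypothetical function; the tax_subtotal and tax_shipping constraints are taken from the config above, while the other four totals are listed without their real constraints to keep the example short.

```php
<?php
/**
 * Order total codes so that every before/after constraint holds.
 * Constraints that point to unknown codes (e.g. a disabled module) are ignored.
 */
function sortTotals(array $totals)
{
    $codes    = array_keys($totals);
    $edges    = array_fill_keys($codes, array()); // code => codes that must run later
    $inDegree = array_fill_keys($codes, 0);       // code => number of unmet predecessors

    $addEdge = function ($from, $to) use (&$edges, &$inDegree) {
        if (!isset($edges[$from]) || !isset($inDegree[$to]) || in_array($to, $edges[$from], true)) {
            return;
        }
        $edges[$from][] = $to;
        $inDegree[$to]++;
    };

    foreach ($totals as $code => $def) {
        foreach ($def['after'] ?? array() as $other) {
            $addEdge($other, $code);   // $other runs before $code
        }
        foreach ($def['before'] ?? array() as $other) {
            $addEdge($code, $other);   // $code runs before $other
        }
    }

    // Start with all totals that have no predecessor, in declaration order
    $queue = array();
    foreach ($codes as $code) {
        if ($inDegree[$code] === 0) {
            $queue[] = $code;
        }
    }

    $sorted = array();
    while ($queue) {
        $code     = array_shift($queue);
        $sorted[] = $code;
        foreach ($edges[$code] as $next) {
            if (--$inDegree[$next] === 0) {
                $queue[] = $next;
            }
        }
    }

    if (count($sorted) !== count($codes)) {
        throw new RuntimeException('Cyclic before/after constraints');
    }
    return $sorted;
}

$totals = array(
    'freeshipping' => array(),
    'shipping'     => array(),
    'tax'          => array(),
    'discount'     => array(),
    'tax_subtotal' => array('after' => array('freeshipping'), 'before' => array('tax', 'discount')),
    'tax_shipping' => array('after' => array('shipping', 'tax_subtotal'), 'before' => array('tax', 'discount')),
);
$order = sortTotals($totals);
```

Unlike a comparison-based sort, this approach either returns an order in which all constraints hold or fails loudly when the constraints contradict each other.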

Update: Thanks to @Daniel_Sloof, more information from @VinaiKopp

Update 2: I found even more on this topic on Stackoverflow from @s3lf

Domains, Domains, Domains

I dumped a shop from my customer and imported it into my local system. As you can guess, the local domain is a different one. All my domains look like this: <customer>.dev.

Where to change domains

Secure/Unsecure Base URL

From core_config_data:

default	0	web/unsecure/base_url	http://customer.dev/
default	0	web/secure/base_url		http://customer.dev/

Check the skin/media/js URLs as well!

default	0	web/unsecure/base_link_url	{{unsecure_base_url}}
default	0	web/unsecure/base_skin_url	{{unsecure_base_url}}skin/
default	0	web/unsecure/base_media_url	{{unsecure_base_url}}media/
default	0	web/unsecure/base_js_url	{{unsecure_base_url}}js/
default	0	web/secure/base_link_url	{{secure_base_url}}
default	0	web/secure/base_skin_url	{{secure_base_url}}skin/
default	0	web/secure/base_media_url	{{secure_base_url}}media/
default	0	web/secure/base_js_url		{{secure_base_url}}js/
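Why the check matters: values which use the placeholders follow the base URL automatically, while hard-coded values keep pointing to the old domain. The following minimal sketch illustrates this; resolveBaseUrlPlaceholders() is a hypothetical helper and not Magento's own code, and www.customer-shop.example is a made-up production domain.

```php
<?php
/**
 * Replace the {{unsecure_base_url}} / {{secure_base_url}} placeholders
 * in a config value with the configured base URLs.
 */
function resolveBaseUrlPlaceholders($value, array $baseUrls)
{
    return strtr($value, array(
        '{{unsecure_base_url}}' => $baseUrls['unsecure'],
        '{{secure_base_url}}'   => $baseUrls['secure'],
    ));
}

$baseUrls = array(
    'unsecure' => 'http://customer.dev/',
    'secure'   => 'http://customer.dev/',
);

// Placeholder value: picks up the new local domain
$skinUrl = resolveBaseUrlPlaceholders('{{unsecure_base_url}}skin/', $baseUrls);

// Hard-coded value: still points to the production domain and must be changed by hand
$mediaUrl = resolveBaseUrlPlaceholders('http://www.customer-shop.example/media/', $baseUrls);
```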

Cookies

For cookies you should set the domain and the path to secure your cookie. In your development environment, you can just use these settings:

default	0	web/cookie/cookie_path		NULL
default	0	web/cookie/cookie_domain	NULL

SSL Everywhere or HSTS

We have to secure all the data of our users, not only registration, checkout and login. We need to secure the session data too.

SSL Everywhere or HTTP Strict Transport Security

(I hope) everyone knows that it is important to secure (read: encrypt) our customers' data, because of all the evil hackers in the world and the bad ISPs which intercept all our data.

TL;DR

Use HTTP Strict Transport Security!

The first problem: unencrypted personal data

But the real problems are the everyday security problems:

  • I'm sitting at Starbucks, surfing over their unencrypted wifi, enjoying my coffee and working. Hopefully, all connections are encrypted: email, Jabber, VPN to the office, WhatsApp... But often this is not the case. What happens if a connection is not encrypted? Everyone in the wifi can listen.

You can encrypt the wifi, but WEP, for example, doesn't solve the problem, because all users share the same key and there is no isolation between users. WPA helps, though.

If the connection is not encrypted, anyone can read your emails, your WhatsApp messages or your email login.

First solution: TLS (formerly known as SSL)

Because of the scenario described above, all our login and registration pages are secured with SSL. We encrypt every transmission in which important data is sent. You can use TLS for nearly everything:

  • SMTP -> SMTPS
  • IMAP -> IMAPS
  • POP3 -> POP3S
  • HTTP -> HTTPS
  • and so on

Second problem: unencrypted session data

Do you only encrypt the login and registration page? Oh, the checkout too. Great! But what about all the other pages, like »Home«, »Privacy Policy«, »About Us« and so on?
Do you think no important data is transmitted there? You are wrong. With every request, the session ID is sent in a cookie. This means you transmit a (possibly authenticated) session ID, which is effectively personal data, unencrypted in your HTTP header.

With a stolen session, the possible attacks (in Magento) are:

  • getting personal data: address, order history, wishlist, payment data
  • order with the customer’s account to a different address, billing the account owner
  • spending the bonus points or the customer’s credit
  • download already paid virtual content which is found in the account

Second solution: SSL Everywhere

Use SSL everywhere. On every request. If a user comes to your page via HTTP, redirect them to HTTPS immediately. Use a certificate signed by a trusted certificate authority, so the user knows they can trust you.
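A minimal sketch of such a redirect, assuming plain PHP in front of the application and a web server which sets $_SERVER['HTTPS'] for encrypted requests (behind a load balancer or proxy this is often not the case). httpsRedirectTarget() is a hypothetical helper, not part of Magento.

```php
<?php
/**
 * Return the HTTPS URL a plain-HTTP request should be redirected to,
 * or null if the request already arrived via HTTPS.
 * $server is expected to look like PHP's $_SERVER.
 */
function httpsRedirectTarget(array $server)
{
    $isHttps = !empty($server['HTTPS']) && strtolower($server['HTTPS']) !== 'off';
    if ($isHttps) {
        return null;
    }
    return 'https://' . $server['HTTP_HOST'] . $server['REQUEST_URI'];
}

// In a front controller this could be used roughly like this:
// $target = httpsRedirectTarget($_SERVER);
// if ($target !== null) {
//     header('Location: ' . $target, true, 301);
//     exit;
// }

$target = httpsRedirectTarget(array(
    'HTTP_HOST'   => 'www.my-shop.example',
    'REQUEST_URI' => '/checkout/cart/',
));
```

In a real shop you would usually do this in the web server configuration or let Magento's secure URL settings handle it; the sketch only shows the logic.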

Third problem: SSL Stripping and ARP Spoofing

There is still at least one problem left. It is called an "HTTPS stripping attack". Moxie Marlinspike implemented a tool called sslstrip and recorded a nice video to demonstrate it.

A hacker can pipe all your traffic (under the correct circumstances) through his own machine and transform all the secured (https) links to insecure links (http).

ARP Spoofing

ARP (Address Resolution Protocol) resolves an IP address inside the local network to the hardware (MAC) address of the machine which owns it, for example when your computer wants to reach the router in the network.
[Figure: visualization made by 0x55534C, thanks for that.]

ARP spoofing means you send forged ARP packets into the network which claim that the router's IP address belongs to your computer. This way, all the traffic is piped through your machine.

Now you have the full control over all the traffic. You can read it, you can change it or you can drop some or all packets.

Encryption helps

If your packets are encrypted, you don't have this problem, because the man in the middle (MITM) can't read or change anything without you noticing. SSL verifies the integrity of the transmitted data itself.

The precise problem: Redirection at the beginning

But do you remember? The very first connection to your shop goes to HTTP://www.my-shop.example. This means the connection is unencrypted and an attacker can do their job.

The attacker (let's call her Mallory) ARP-spoofs the victim's (Alice's) laptop, reroutes Alice's traffic through her own machine and removes the HTTP Location header. Then Mallory loads the https version of the site Alice wants, changes every https:// to http:// and pipes it to Alice's computer. Now Mallory can do her evil work.

It is important to understand that most users don't notice or check whether they are on an https:// site, or whether their address bar is green or blue. Don't rely on the user.

Third/Final solution: Use HTTP Strict Transport Security (HSTS)

HSTS is a server-side HTTP header, specified in RFC 6797.

It does two things:

  1. It ensures that the connection is secure. If it is not possible to connect to the server over a secure connection, an error is shown. Chrome, Firefox and Opera implement this in such a way that a connection via http is no longer possible.
  2. It makes the browser turn every http request to your site into an https request before it is sent, so http links on the page end up as https.

How HSTS works

HSTS is an HTTP header:

Strict-Transport-Security: max-age=31536000
Strict-Transport-Security: max-age=31536000; includeSubDomains

This means: Don't allow insecure connections for the next 365 days to my domain, with or without subdomains.
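A minimal sketch of how such a header could be built and sent from plain PHP. buildHstsHeader() is a hypothetical helper for illustration and not taken from any module; browsers only honour the header on responses delivered via HTTPS.

```php
<?php
/**
 * Build the Strict-Transport-Security header line.
 *
 * @param int  $maxAgeSeconds     how long the browser should insist on HTTPS
 * @param bool $includeSubDomains whether the policy also covers subdomains
 */
function buildHstsHeader($maxAgeSeconds, $includeSubDomains = false)
{
    $value = 'max-age=' . (int) $maxAgeSeconds;
    if ($includeSubDomains) {
        $value .= '; includeSubDomains';
    }
    return 'Strict-Transport-Security: ' . $value;
}

$oneYear = 365 * 24 * 60 * 60; // 31536000 seconds
$header  = buildHstsHeader($oneYear, true);

// On a response delivered via HTTPS:
// header($header);
```

Start with a short max-age while testing: once a browser has stored the policy, it refuses plain http for that host until the time has expired.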

There is still one problem: the very first request may still be http. This objection is correct. But the idea is: hopefully the user is at home or in some other secure network when they make this request. If not, they are doomed. ;-)

However, once this first request has been made, the user is secure for the next 365 days (or whatever the timespan is), at least on this device!

As you can see, the final solution I showed you doesn't guarantee complete security, but it minimizes the risk of a security breach.

Advertisement: Magento Module for HSTS

I implemented a module which does all this for Magento: Ikonoshirt_StrictTransportSecurity

More Information