• Complex Query with XPDO - Haversine formula#

  • BluewireMedia Reply #1, 4 months ago

    Reply
    Hi xPDO Guru's,

    Wondering if it's possible & how you would write a complex query like below using xPDO?

    SELECT id, ( 3959 * acos( cos( radians($centre_lat) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(($centre_lng) ) + sin( radians(($centre_lat) ) * sin( radians( lat ) ) ) ) AS distance FROM markers HAVING distance < 25 ORDER BY distance LIMIT 0 , 20;


    It's a SQL query to find nearest locations using the Haversine formula.

    From: http://code.google.com/apis/maps/articles/phpsqlsearch_v3.html#findnearsql
    Could it be done via xPDO or should I just stick to a manual query?


  • lazylegs Reply #2, 3 months ago

    Reply
    Hello,

    Yes this is possible. Am actually writing an article on this right at the moment. Just need to get few things fixed before working it out and to get that query to run smoothly also so it will be safe and secure.

    Will get back to topic when got one thing done.. Will be a sneak peak to article then. Which will have haversine and other geo searching with gmaps in the end


  • lazylegs Reply #3, 3 months ago

    Reply
    This will be a bit long reply,

    Two possible ways for haversine to query. I am pasting the queries I use as just using haversine calculations for db query results in that all rows in the table are checked even they would be indexed properly. So I limit the possible rows using bounding box calculations in the query (pre calculated with php)

    The PHP for it before querying, this a snippet that I use for testing before going to full production:
    $lat = $modx->getOption('latitude', $scriptProperties, '');
    $lng = $modx->getOption('longitude', $scriptProperties, '');
    $distance = $modx->getOption('distance', $scriptProperties, 50);
    $unit = $modx->getOption('unit', $scriptProperties, 'km');
    
    if ($unit == 'km') {
        $dist = $distance / 1.609344; // KM to Miles
    } else {
        $dist = $distance;
    }
    
    /*
    * Below code for perimeter corner points
    * The perimiter is square in shape and creates a bounding box
    */
    
    $lng1 = $lng - $dist / (cos(deg2rad($lat)) * 69);
    $lng2 = $lng + $dist / (cos(deg2rad($lat)) * 69);
    $lat1 = $lat - ($dist / 69);
    $lat2 = $lat + ($dist / 69);
    


    So now we got distance and bounding box. The 50km is bit long but I got database consisting 117k rows of different cities around the globe.

    Then by using xPDOQuery (Cities is my schema table)
    $criteria = $modx->newQuery('Cities');
    $criteria->select(array($modx->getSelectColumns(
        'Cities',
        'Cities',
        '',
        array('name',
            'latitude',
            'longitude'
        )
    ), 'distance' => "3956 * 2 * ASIN(
                    SQRT(POWER(SIN((abs({$modx->quote($lat)}) - abs(latitude)) * pi() / 180  / 2), 2) +
                    COS(abs({$modx->quote($lat)}) * pi() / 180) *
                    COS(abs(latitude) * pi() / 180) *
                    POWER(SIN(({$modx->quote($lng)} - longitude) * pi() / 180 / 2), 2)))"));
    
    $criteria->where("(latitude BETWEEN {$modx->quote($lat1)} AND {$modx->quote($lat2)} 
        AND longitude BETWEEN {$modx->quote($lng1)} AND {$modx->quote($lng2)}) HAVING distance < {$dist}");
    $criteria->sortby('distance', 'ASC');


    Or by using xPDOCriteria (two different methods)
    $tableName = $modx->getTableName('Cities'); // Get the real table name from the class
    $query = "SELECT name, latitude, longitude, 
        3956 * 2 * ASIN( SQRT(POWER(SIN((abs({$modx->quote($lat)}) - abs(latitude)) * pi() / 180 / 2), 2) + 
        COS(abs({$modx->quote($lat)}) * pi() / 180) * COS(abs(latitude) * pi() / 180) * 
        POWER(SIN(({$modx->quote($lng)} - longitude) * pi() / 180 / 2), 2))) AS distance 
        FROM {$tableName} WHERE (latitude BETWEEN {$modx->quote($lat1)} AND {$modx->quote($lat2)} 
        AND longitude BETWEEN {$modx->quote($lng1)} AND {$modx->quote($lng2)}) HAVING distance < {$dist} ORDER BY distance ASC";
    $criteria = new xPDOCriteria($modx, $query);


    Same by using prepared statement and bindings

    $tableName = $modx->getTableName('Cities'); // Get the real table name from the class
    $query = "SELECT name, latitude, longitude,  
        3956 * 2 * ASIN( SQRT(POWER(SIN((abs(:lat) - abs(latitude)) * pi() / 180 / 2), 2) + 
        COS(abs(:lat) * pi() / 180) * COS(abs(latitude) * pi() / 180) * 
        POWER(SIN((:lng - longitude) * pi() / 180 / 2), 2))) AS distance 
        FROM {$tableName} WHERE (latitude BETWEEN :lat1 AND :lat2 
        AND longitude BETWEEN :lng1 AND :lng2) HAVING distance < :dist ORDER BY distance ASC";
    $criteria = new xPDOCriteria($modx, $query);
    $criteria->bind(array(':lat' => $lat, ':lat1' => $lat1, ':lat2' => $lat2, ':lng' =>
        $lng, ':lng1' => $lng1, ':lng2' => $lng2, ':dist' => $dist));

    And iterate though it:
    $results = $modx->getIterator('Cities', $criteria);


    All of these can be queried with getCollection or getIterator. I used getIterator as that is memory efficient way (but can be bit slower). To do just haversine, remove the BETWEEN clauses and you are ready. But like said, that will go through all your rows versus what I got when testing 899 lines where 188 was actually inside the distance. But 899 lines is lot less than 117k when table has latitude and longitude indexed. And the how many lines are outside the distance does vary a lot. Really depends where I would be querying it on globe.


  • lazylegs Reply #4, 3 months ago

    Reply
    Adding explains (had only 115k lines)

    The one with Haversine only:

    +----+-------------+--------+------+---------------+------+---------+------+--------+----------------+
    | id | select_type | table  | type | possible_keys | key  | key_len | ref  | rows   | Extra          |
    +----+-------------+--------+------+---------------+------+---------+------+--------+----------------+
    |  1 | SIMPLE      | Cities | ALL  | NULL          | NULL | NULL    | NULL | 115592 | Using filesort |
    +----+-------------+--------+------+---------------+------+---------+------+--------+----------------+


    And the one with bounding box calculated and used with BETWEEN


    +----+-------------+--------+-------+---------------+------+---------+------+------+-----------------------------+
    | id | select_type | table  | type  | possible_keys | key  | key_len | ref  | rows | Extra                       |
    +----+-------------+--------+-------+---------------+------+---------+------+------+-----------------------------+
    |  1 | SIMPLE      | Cities | range | lat,lng       | lng  | 8       | NULL |  890 | Using where; Using filesort |
    +----+-------------+--------+-------+---------------+------+---------+------+------+-----------------------------+


  • BluewireMedia Reply #5, 3 months ago

    Reply
    Thanks for the reply lazylegs! I ended up just using regular SQL query but next time I'll have a go at implementing your xPDO version. Thanks for posting for everyone else though, it's a very handy little formula!

    Cheers,
    Sam