If you run XPath on the datafeed, I think you can avoid the CDATA wonkiness entirely by passing the LIBXML_NOCDATA option when you load the file. It tells the parser to merge CDATA sections into ordinary text nodes, and after that XPath basically works as normal.
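To see what the flag changes, here is a minimal sketch that parses the same snippet with and without it. The inline sample string and the variable names are mine, made up for illustration. As far as I can tell, casting a single element to a string gives you the text either way, and the difference mostly shows up in dumps and array/JSON conversions.

```php
<?php
// Tiny inline sample (hypothetical, modeled on the feed shown in this post).
$feed = '<foxydata><transactions><transaction>'
      . '<id><![CDATA[3247]]></id>'
      . '<customer_city><![CDATA[Pleasantville]]></customer_city>'
      . '</transaction></transactions></foxydata>';

// Default parse: CDATA sections stay as separate CDATA nodes.
$plain = simplexml_load_string($feed);

// With LIBXML_NOCDATA: CDATA sections are merged in as ordinary text nodes.
$merged = simplexml_load_string($feed, 'SimpleXMLElement', LIBXML_NOCDATA);

$plainTx  = $plain->transactions->transaction;
$mergedTx = $merged->transactions->transaction;

// Casting one element to a string returns its text in both cases.
echo (string) $plainTx->id, "\n";
echo (string) $mergedTx->id, "\n";

// Conversions of the whole element are where the two parses can differ.
echo json_encode($plainTx), "\n";
echo json_encode($mergedTx), "\n";
```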
Assuming a truncated example foxy datafeed file of:
<foxydata>
<transactions>
<transaction>
<id><![CDATA[3247]]></id>
<store_id><![CDATA[9]]></store_id>
<transaction_date><![CDATA[2009-10-19 12:51:18]]></transaction_date>
<processor_response><![CDATA[Authorize.net Transaction ID:2150746594]]></processor_response>
<processor_response_details>
<merchantReferenceCode><![CDATA[3797]]></merchantReferenceCode>
<requestToken><![CDATA[Azj//wFEsrZ+43+52sqEe7OAAigjD]]></requestToken>
<ccAuthReply__cvCode><![CDATA[M]]></ccAuthReply__cvCode>
<ccAuthReply__authorizationCode><![CDATA[193189]]></ccAuthReply__authorizationCode>
<ccAuthReply__authorizedDateTime><![CDATA[2010-04-21T04:37:29Z]]></ccAuthReply__authorizedDateTime>
<ccAuthReply__avsCode><![CDATA[Y]]></ccAuthReply__avsCode>
<ccAuthReply__reconciliationID><![CDATA[]]></ccAuthReply__reconciliationID>
</processor_response_details>
<customer_id><![CDATA[116]]></customer_id>
<is_anonymous><![CDATA[0]]></is_anonymous>
<customer_first_name><![CDATA[John]]></customer_first_name>
<customer_last_name><![CDATA[Doe]]></customer_last_name>
<customer_company><![CDATA[ACME Inc.]]></customer_company>
<customer_address1><![CDATA[555 Mulberry Dr.]]></customer_address1>
<customer_address2><![CDATA[#200]]></customer_address2>
<customer_city><![CDATA[Pleasantville]]></customer_city>
<!-- lots more foxy fields, info and transactions and junk snipped here -->
</transaction>
</transactions>
</foxydata>
Then running this code (note LIBXML_NOCDATA as the last param):
$xml = simplexml_load_file('foxydatafeed.xml', NULL, LIBXML_NOCDATA);
$parsed_transactions = $xml->xpath("/foxydata/transactions/transaction");
var_dump($parsed_transactions);
...will get you this in the $parsed_transactions var, all from one simple XPath query, '/foxydata/transactions/transaction':
array(1) {
[0]=> object(SimpleXMLElement)#2 (14) {
["id"]=> string(4) "3247"
["store_id"]=> string(1) "9"
["transaction_date"]=> string(19) "2009-10-19 12:51:18"
["processor_response"]=> string(39) "Authorize.net Transaction ID:2150746594"
["processor_response_details"]=> object(SimpleXMLElement)#3 (7) {
["merchantReferenceCode"]=> string(4) "3797"
["requestToken"]=> string(29) "Azj//wFEsrZ+43+52sqEe7OAAigjD"
["ccAuthReply__cvCode"]=> string(1) "M"
["ccAuthReply__authorizationCode"]=> string(6) "193189"
["ccAuthReply__authorizedDateTime"]=> string(20) "2010-04-21T04:37:29Z"
["ccAuthReply__avsCode"]=> string(1) "Y"
["ccAuthReply__reconciliationID"]=> object(SimpleXMLElement)#5 (1) {[0]=> string(0) "" }}
["customer_id"]=> string(3) "116"
["is_anonymous"]=> string(1) "0"
["customer_first_name"]=> string(4) "John"
["customer_last_name"]=> string(3) "Doe"
["customer_company"]=> string(9) "ACME Inc."
["customer_address1"]=> string(16) "555 Mulberry Dr."
["customer_address2"]=> string(4) "#200"
["customer_city"]=> string(13) "Pleasantville"
["comment"]=> object(SimpleXMLElement)#4 (0) { }}}
Now your XML doc is broken down into separate transactions by this simple XPath, "/foxydata/transactions/transaction". You get back a PHP array with one SimpleXMLElement per transaction, and each field in it is parsed out and ready to do wutevah with....
....stuff them into QuickBooks. Save them to a DB. Save them to human-readable flat files. Print them out, let your dog eat them. Send them to the IRS. Send them to your ex so she sees how rich you’ve become and can add them to your alimony payments, etc.
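As a rough sketch of that "do whatever" step, something like the following turns each transaction into a plain PHP array you could hand to a DB insert or a CSV writer. The helper name foxy_transactions_to_rows, the array keys, and the inline sample are all made up for illustration; only the element names come from the feed above.

```php
<?php
// Hypothetical helper: turn each <transaction> into a plain PHP array.
function foxy_transactions_to_rows(string $feedXml): array
{
    $xml = simplexml_load_string($feedXml, 'SimpleXMLElement', LIBXML_NOCDATA);
    if ($xml === false) {
        throw new RuntimeException('Could not parse the datafeed XML');
    }

    $rows = [];
    foreach ($xml->xpath('/foxydata/transactions/transaction') as $tx) {
        $details = $tx->processor_response_details;
        $rows[] = [
            // Cast to string so you store plain values, not SimpleXMLElement objects.
            'id'         => (string) $tx->id,
            'date'       => (string) $tx->transaction_date,
            'first_name' => (string) $tx->customer_first_name,
            'last_name'  => (string) $tx->customer_last_name,
            'city'       => (string) $tx->customer_city,
            // Nested elements are reached the same way.
            'auth_code'  => (string) $details->ccAuthReply__authorizationCode,
        ];
    }
    return $rows;
}

// Cut-down sample using a few of the fields from the feed above.
$sample = <<<'XML'
<foxydata>
<transactions>
<transaction>
<id><![CDATA[3247]]></id>
<transaction_date><![CDATA[2009-10-19 12:51:18]]></transaction_date>
<processor_response_details>
<ccAuthReply__authorizationCode><![CDATA[193189]]></ccAuthReply__authorizationCode>
</processor_response_details>
<customer_first_name><![CDATA[John]]></customer_first_name>
<customer_last_name><![CDATA[Doe]]></customer_last_name>
<customer_city><![CDATA[Pleasantville]]></customer_city>
</transaction>
</transactions>
</foxydata>
XML;

$rows = foxy_transactions_to_rows($sample);
print_r($rows);
```

For the real feed you would swap simplexml_load_string for simplexml_load_file and add whichever other fields you care about.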