eZ Community » Forums » Developer » Memory problem while running cronjob

Memory problem while running cronjob


Monday 16 August 2010 3:50:00 pm - 8 replies

Hi there,

I have a problem with a simple script:

I have to fetch all the nodes in a folder, and there are a lot of them (18,000+).
For each node I have to add an entry to an indexed array for comparison with a CSV file.

So this is the code :

$nodes = eZContentObjectTreeNode::subTreeByNodeID( null, 2283 );
$existing = array();
foreach ( $nodes as $node )
{
    if ( $node->ClassIdentifier == "myclass" )
    {
        $dm = $node->dataMap();
        $existing[$dm["n_siren"]->DataText . $dm["siret"]->DataText] = $node->NodeID;
    }

    unset( $dm );
}

As you can see, I try to unset variables that are no longer used, but I can never get through this part of the code without an error: not enough memory.

I'm running PHP 5.2.6 on GNU/Linux (Debian Lenny) and I can't upgrade to PHP 5.3 and its garbage collector.

Can someone help me ?

Modified on Monday 16 August 2010 4:30:25 pm by Damien MARTIN

Monday 16 August 2010 4:45:58 pm

Hi,

You can use offset/limit to achieve this.

Monday 16 August 2010 5:04:12 pm

Thanks Yannick,

But I really need to fetch all the nodes in one pass.

This is how the script works in its complete form:

  1. Load the elements in the given directory (the part where the problem is)
  2. Load a CSV file containing essentially the same data as the stored elements
  3. Add new elements to eZ
  4. Modify existing elements in eZ
  5. Delete obsolete elements (an element is removed if it exists in eZ but not in the CSV file)

So the first part has to load all the nodes to make the comparison with the CSV file.
If all the elements are not loaded, I won't be able to tell whether an element from the CSV has to be added or modified.

The key used for comparison is composed of two attributes (not the name, because that would be too easy...), which is why I have to load the data map of each node...

I hope this makes it clearer.
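The comparison in steps 3 to 5 can be sketched with plain PHP arrays, independent of eZ Publish (the keys and node IDs below are made up for illustration):

```php
<?php
// $existing maps "siren.siret" keys to node IDs already stored in eZ
// (what the loop in my first post builds); $csv maps the same kind of
// key to the corresponding row read from the CSV file.
$existing = array( '111A' => 101, '222B' => 102, '333C' => 103 );
$csv      = array( '111A' => array( 'name' => 'Acme' ),
                   '444D' => array( 'name' => 'Newco' ) );

// In the CSV but not in eZ: objects to create.
$toAdd    = array_diff_key( $csv, $existing );
// In both: objects to update.
$toModify = array_intersect_key( $csv, $existing );
// In eZ but not in the CSV: objects to remove.
$toDelete = array_diff_key( $existing, $csv );
```

Once the three sets are computed, each can be handled in its own pass, which keeps the create/update/delete logic separate.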

Monday 16 August 2010 6:28:58 pm

Loading all nodes at the same time is not the best approach. You can do it sequentially.

Just an example :

$limit = 50;
$offset = 0;
$continueRun = true;
while ( $continueRun )
{
    $continueRun = dothis( $offset, $limit );
    $offset += $limit;
}

function dothis( $offset, $limit )
{
    $nodes = eZContentObjectTreeNode::subTreeByNodeID(
        array( 'Limit' => $limit, 'Offset' => $offset ),
        2283
    );
    if ( empty( $nodes ) )
        return false;

    foreach ( $nodes as $node )
    {
        //your task here
    }

    unset( $nodes );
    return true; // keep iterating until a batch comes back empty
}

Modified on Monday 16 August 2010 6:38:27 pm by Yannick Komotir

Monday 16 August 2010 11:50:30 pm

Hello

eZ Publish uses an in-memory cache for optimizations. If you want to iterate over a long list of nodes/objects, you need to clear this cache:

$nodes = eZContentObjectTreeNode::subTreeByNodeID( null, 2283 );
$existing = array();
foreach ( $nodes as $node )
{
    if ( $node->ClassIdentifier == "myclass" )
    {
        $dm = $node->dataMap();
        $existing[$dm["n_siren"]->DataText . $dm["siret"]->DataText] = $node->NodeID;
    }

    // Free the per-object caches so memory does not grow with each node
    $node->object()->resetDataMap();
    eZContentObject::clearCache( array( $node->attribute( 'contentobject_id' ) ) );
}

Modified on Monday 16 August 2010 11:52:57 pm by Jérôme Vieilledent

Tuesday 17 August 2010 9:20:43 am

Hi Jerome,

Your solution is very interesting because I can reuse it in my other import scripts. And it's wonderful not to have to edit the CSV file to re-run the cronjob from where it crashed!

I didn't think eZ stored so much data in its caches when you use PHP directly.

Thank you Yannick and Jérôme.

Tuesday 17 August 2010 10:08:55 am

> I didn't think eZ stored so much data in its caches when you use PHP directly.

It does. For years we have wanted to add a cache handler that manages the in-memory cache and provides a general cache API with handler support, so that we can move parts of the cache to, for instance, memcached. That would fix these memory issues, simplify the cache code, and possibly let us optimize things while we're at it.

Modified on Tuesday 17 August 2010 10:12:24 am by André R

Tuesday 17 August 2010 10:10:21 am

My pleasure!

About import: I plan to release a new import extension very soon, SQLIImport. You'll be able to handle any data source (XML, CSV...) with only one PHP class to write, and with a really simplified API for creating and retrieving content objects.

Stay tuned!

Tuesday 17 July 2012 4:46:00 pm

Hi, 

On the other hand, there is a problem with clearing the cache: the script takes much longer to execute. Imagine a query that retrieves 400,000 rows; if you loop over them and clear the cache on every iteration, it gets very slow. When I tested it, the script took over 1 hour.
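Since eZContentObject::clearCache() accepts an array of object IDs, one possible compromise is to clear the cache in batches of N nodes instead of on every iteration. A minimal, self-contained sketch, where clearNodeCache() is a hypothetical stand-in for the real resetDataMap()/clearCache() calls shown earlier:

```php
<?php
$clearCalls = 0;

// Hypothetical stand-in for the real cleanup:
// $node->object()->resetDataMap() + eZContentObject::clearCache( $ids ).
function clearNodeCache( array $ids )
{
    global $clearCalls;
    $clearCalls++;
}

$batch = array();
$batchSize = 100; // tune: larger batches run faster but hold more memory
$processed = 0;

foreach ( range( 1, 250 ) as $nodeId ) // stands in for the node loop
{
    // ... per-node work here ...
    $batch[] = $nodeId;
    $processed++;
    if ( count( $batch ) >= $batchSize )
    {
        clearNodeCache( $batch ); // one clear per 100 nodes, not per node
        $batch = array();
    }
}
if ( $batch ) // flush the remainder
    clearNodeCache( $batch );
```

With 400,000 rows this trades a bounded amount of extra memory (one batch of cached objects) for far fewer cache-clearing calls.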

 

thanks 
