Thursday 23 August 2012 7:44:28 pm - 4 replies

Hi. I have a cronjob process that is using almost 95% of one of the processors. It has been running for hours and is driving the load up.

/usr/local/bin/php runcronjobs.php -q

I have killed it before, but I don't understand what is keeping it stuck. This is new and has never happened before. How can I find out?

I have deleted the caches and checked a lot of things. Could it be the PHP time limits? I have them set to 240 seconds.

Could it be FCGI, or something else?

I have eZ Publish 4.4, and I am not sure whether there are any incompatibilities with my current PHP and MySQL versions (5.3.13 and 5.5).

Or what else could be the reason?

MySQL server is almost idle.

Any help is appreciated. Thanks.

EDIT: More info: the main cronjob task runs every day at 4 am, and right now it is 13:00 here, so it has been running for around 9 hours. I just killed it and nothing happened. Somehow it is not finishing its work on time. The load came down by around 2 points.

Thursday 23 August 2012 10:21:49 pm

Take off the -q; that might help.

Otherwise, if it's on a Linux system and you have command line access, the strace command is your friend.

Friday 24 August 2012 2:45:48 pm

Hi, Steven.

Yes I have command line access. How do I run strace?

If I remove the -q (quiet), does that mean that it is going to log something somewhere?

The problem is still there today.

Friday 24 August 2012 3:05:30 pm


Using the process trace in WHM, I found something that looks like it is cycling (although not exactly: I found some differences between cycles). I removed the database name and user name. I really don't understand a thing here:

munmap(0x2afc64c9f000, 4096)            = 0
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
write(4, "\306\0\0\0\3SELECT id, parent, lang_mas"..., 202) = 202
read(4, "\1\0\0\1\5A\0\0\2\3def\rdatabase\rezur"..., 16384) = 528
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
write(4, "\375\10\0\0\3SELECT DISTINCT\n           "..., 2305) = 2305
read(4, "\1\0\0\1!_\0\0\2\3def\rdatabase\17ezco"..., 16384) = 2736
read(4, "rt_order\f?\0\v\0\0\0\3\0\0\0\0\0m\0\0\36\3def\rqu"..., 16384) = 534
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
write(4, "\325\0\0\0\3SELECT contentobject_state_"..., 217) = 217
read(4, "\1\0\0\1\2q\0\0\2\3def\rdatabase\21ezco"..., 16384) = 227
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
write(4, "J\1\0\0\3SELECT l.contentobject_stat"..., 334) = 334
read(4, "\1\0\0\1\3a\0\0\2\3def\rdatabase\1l\21ez"..., 16384) = 317
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
write(4, "\306\0\0\0\3SELECT id, parent, lang_mas"..., 202) = 202
read(4, "\1\0\0\1\5A\0\0\2\3def\rdatabase\rezur"..., 16384) = 539
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
write(4, "9\0\0\0\3SELECT remote_id FROM ezcon"..., 61) = 61
read(4, "\1\0\0\1\1S\0\0\2\3def\rdatabase\17ezco"..., 16384) = 147
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
write(4, "\376\5\0\0\3SELECT ezcontentobject.*,\n "..., 1538) = 1538
read(4, "\1\0\0\1!_\0\0\2\3def\rdatabase\17ezco"..., 16384) = 2736
read(4, "rt_order\f?\0\v\0\0\0\3\0\0\0\0\0m\0\0\36\3def\rqu"..., 16384) = 814
access("var/ezflow_site/cache/template/compiled/generate-b7deefd8bdbef308cbc829a19c31bf6a.php", F_OK) = 0
getcwd("/home/user/public_html/"..., 4096) = 28
lstat("/home/user/public_html/./lib/ezc/var/ezflow_site/cache/template/compiled/generate-b7deefd8bdbef308cbc829a19c31bf6a.php", 0x7fffc8842f30) = -1 ENOENT (No such file or directory)
getcwd("/home/user/public_html/v2"..., 4096) = 28
open("/home/user/public_html/var/ezflow_site/cache/template/compiled/generate-b7deefd8bdbef308cbc829a19c31bf6a.php", O_RDONLY) = 8
fstat(8, {st_mode=S_IFREG|0666, st_size=235548, ...}) = 0
lseek(8, 0, SEEK_SET)                   = 0
fcntl(8, F_GETFL)                       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
fstat(8, {st_mode=S_IFREG|0666, st_size=235548, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2afc64c9f000
lseek(8, 0, SEEK_CUR)                   = 0
open("/home/user/public_html/var/ezflow_site/cache/template/compiled/generate-b7deefd8bdbef308cbc829a19c31bf6a.php", O_RDONLY) = 9
fstat(9, {st_mode=S_IFREG|0666, st_size=235548, ...}) = 0
mmap(NULL, 235548, PROT_READ, MAP_SHARED, 9, 0) = 0x2afc64ca0000
munmap(0x2afc64ca0000, 235548)          = 0
close(9)                                = 0

Friday 24 August 2012 3:54:56 pm

The -q is for quiet. Cron should be generating a mail with the output of the command, but I think the mail only gets sent when the process finishes, and since this one never finishes you won't ever get that mail.

strace is a command, but it does pretty much what you displayed here, so you already have what I was after.

I'm seeing it trying to access a lot of cache, so I'm guessing it's maybe rebuilding the cache. Do you have the eztc.php script in your cronjobs?

It's possible that if you have a hanging bracket in one of your templates, something could be doing something weird when the template gets compiled.
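
To check whether the trace really cycles over the same queries and cache files, the calls can be counted rather than read by eye. This is a rough sketch, assuming Python 3 and a trace with one system call per line (as strace writes it to a file); the function name summarize is made up, and the parsing is approximate, not a full strace parser.

```python
import re
from collections import Counter

# First "name(" on the line is taken as the system call name.
SYSCALL_RE = re.compile(r"\b([a-z_][a-z0-9_]*)\(")
# First double-quoted argument, allowing backslash escapes inside it.
STRING_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')
# Start of an SQL statement inside a MySQL packet written to the socket.
SQL_RE = re.compile(r"(?:SELECT|INSERT|UPDATE|DELETE)\b.*")


def summarize(lines):
    """Count system calls, SQL statement prefixes and file paths in a trace."""
    syscalls, queries, paths = Counter(), Counter(), Counter()
    for line in lines:
        call = SYSCALL_RE.search(line)
        if not call:
            continue
        name = call.group(1)
        syscalls[name] += 1
        arg = STRING_RE.search(line)
        if not arg:
            continue
        text = arg.group(1)
        if name == "write":
            sql = SQL_RE.search(text)
            if sql:
                queries[sql.group(0)] += 1
        elif name in ("open", "access", "stat", "lstat"):
            paths[text] += 1
    return syscalls, queries, paths


# Usage, with a hypothetical trace file name:
# with open("/tmp/runcronjobs.strace") as fh:
#     syscalls, queries, paths = summarize(fh)
# print(queries.most_common(10))
# print(paths.most_common(10))
```

strace truncates string arguments, so the query counts are keyed on the visible prefix only. A few prefixes or paths with very high counts would point at the loop.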

Otherwise, do you have a cluster setup/static setup?

I think this is the once-a-day cron, so if you run that command from the command line, without the -q, from your ezroot, you can maybe get a clue about where it's getting stuck. (Beware that people will get notifications when they aren't expecting them if they have them set to send once a day. It might also have unintended consequences if you have a pending newsletter or something like that.) If you can reproduce the problem on the command line, then you can try commenting out the Scripts[] entries in cronjob.ini.append.php one by one until you figure out which script is hanging.

If it's not the once-a-day cron that hangs, then you can try the other parts (infrequent, frequent, etc., each defined as CronjobPart-<name>) until you find the culprit.

If it's an intermittent thing that only happens when a particular set of conditions coincide, it'll be hard to debug.
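
Before commenting entries out, it helps to list which scripts each part runs. A minimal sketch in Python 3, assuming the usual layout of the file: [Section] headers, one Scripts[]=name.php line per script, and a bare Scripts[] line that resets the list. The function name list_cronjob_scripts is made up, and it reads a single file only, so settings merged in from other override or siteaccess files are not seen.

```python
import re

SECTION_RE = re.compile(r"^\[(.+)\]$")
SCRIPT_RE = re.compile(r"^Scripts\[\]=(.+)$")


def list_cronjob_scripts(ini_text):
    """Map each section name to the scripts it lists, in file order."""
    sections = {}
    current = None
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out entry
        header = SECTION_RE.match(line)
        if header:
            current = header.group(1)
            sections.setdefault(current, [])
            continue
        if current is None:
            continue
        if line == "Scripts[]":
            sections[current] = []  # assumed meaning: reset the array
            continue
        script = SCRIPT_RE.match(line)
        if script:
            sections[current].append(script.group(1).strip())
    return sections


# Usage, with an assumed path relative to the eZ Publish root:
# with open("settings/override/cronjob.ini.append.php") as fh:
#     for section, scripts in list_cronjob_scripts(fh.read()).items():
#         print(section, scripts)
```

Entries already commented out with a leading # are skipped, so the output shows what is still active in that file.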
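
The part-by-part search can be automated with a time limit per run. A rough sketch in Python 3; find_hanging is a made-up name, the part names are the ones mentioned in this thread, and it assumes runcronjobs.php takes the part name as its argument. The same warning applies: this really executes the cronjobs, so notifications and newsletters may go out.

```python
import subprocess


def find_hanging(commands, timeout_s):
    """Run each (label, argv) pair in turn; return labels that exceed timeout_s."""
    hanging = []
    for label, argv in commands:
        try:
            # Output stays attached to the terminal so the last message is visible.
            subprocess.run(argv, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child process before raising this.
            hanging.append(label)
    return hanging


# Assumed part names and call convention; adjust to your cronjob.ini settings.
PARTS = ["frequent", "infrequent"]
COMMANDS = [(part, ["/usr/local/bin/php", "runcronjobs.php", part]) for part in PARTS]

# To run it from the eZ Publish root, with a 30 minute limit per part:
# print(find_hanging(COMMANDS, timeout_s=1800))
```

Pick a limit comfortably above the normal run time, otherwise a slow but healthy part gets reported as hanging.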


