eZ Publish meets Drupal: building a...

Thursday 19 January 2012 10:44:51 am - 6 replies


Introduction

In this blog post I will explain how to develop a simple scraper using eZ Publish and a bit of Drupal code. To my knowledge it's the first time eZ Publish and Drupal meet in public, although I have used the same Drupal functions for the Mollom spam filter extension.

Monday 23 January 2012 10:31:10 am

I think it's the Content-type: application/x-www-form-urlencoded that is messing up the xml in sendHTTPRequest...

I've rewritten sendHTTPRequest a couple of times to use stream_context_create, whose context can then be passed to file_get_contents and file_put_contents.
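For illustration, here is a minimal sketch of that approach. It is not the actual sendHTTPRequest rewrite: the function name buildPostContext, its parameters, and the example URL are hypothetical. It sets the Content-Type explicitly so an XML body is not sent as application/x-www-form-urlencoded.

```php
<?php
// Hypothetical helper, not part of eZ Publish or Drupal.
// Builds a stream context for an HTTP POST with an explicit content type.
function buildPostContext($body, $contentType = 'text/xml', $timeout = 10.0)
{
    $opts = array(
        'http' => array(
            'method'        => 'POST',
            // Explicit Content-Type, so the XML payload is not labelled as form data.
            'header'        => "Content-Type: $contentType\r\n" .
                               "Content-Length: " . strlen($body) . "\r\n",
            'content'       => $body,
            'timeout'       => $timeout,
            // Return the response body even on 4xx/5xx instead of failing.
            'ignore_errors' => true,
        ),
    );
    return stream_context_create($opts);
}

// Usage (placeholder URL, assumed for the example):
// $response = file_get_contents('http://example.com/endpoint', false, buildPostContext('<ping/>'));
```

The same context can be handed to file_put_contents as its fourth argument.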

getDataByURL is also problematic since it's used for the link checker - which fails (sometimes) on weird things like sites that are proxied.

The ezc lib also has ezcAuthenticationUrl::getUrl - which I've never used but also uses stream_context_create and file_get_contents - but no way to pass $opts to send a POST request etc.

Monday 23 January 2012 10:42:16 am

Obligatory spamvertising: why not use the http client of ggwebservices instead of going all the way of writing a new one? See http://svn.projects.ez.no/ggwebservices/trunk/extension/ggwebservices/classes/ggwebservicesclient.php

I concede that the code using stream wrapper looks cleaner than mine (based on my previous work on phpxmlrpc lib, which had php 4.0.x compatibility), but all the features should be in there...

Wednesday 25 January 2012 10:42:21 am

What I like about the Drupal function is the fact that it's default Drupal functionality. The http client of ggwebservices does look good as well, but I think this is functionality that has the potential to be used so often that it should really be included in the kernel. The other thing that I like about the Drupal function is the list of error codes. It's small things like that that make Drupal an 'accessible' and user-friendly CMS. It shows that Drupal development is pushed beyond a pure technical implementation.

Wednesday 25 January 2012 11:56:59 am

About response codes: this is nice, and the ggws client does not really have it developed - but wouldn't the client treat a 204 as an error? The next version of the eZ rest api (the one with write access) will make heavy use of 204 responses as an OK return - which forces you into wrapping it up in a 2nd call.
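To make the 204 point concrete, here is a rough sketch of a check that accepts any 2xx status as OK. This is neither the Drupal code nor the ggws code: httpStatusCode and isSuccessStatus are hypothetical names. It assumes the header lines come from PHP's $http_response_header, which the http stream wrapper fills in after file_get_contents.

```php
<?php
// Hypothetical helpers, not taken from Drupal or ggwebservices.

// Returns the status code of the last "HTTP/x.y NNN" line in the headers,
// or null if none is found. The last one is used because, when redirects
// are followed, the header list contains one status line per hop.
function httpStatusCode(array $headers)
{
    $code = null;
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Treats the whole 2xx range as success, so 204 No Content is an OK return.
function isSuccessStatus($code)
{
    return $code !== null && $code >= 200 && $code < 300;
}
```

With this, a caller does not need a second call just to turn a 204 into a success.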

Other differences:

. ggws supports proxies, the drupal code (afaict) does not

. ggws supports basic/digest/ntlm auth, the drupal code only basic

. ggws supports http 1.1 and 1.0, the drupal code always sends out http 1.0 headers

. drupal code supports redirects, ggws does not

I agree that this should be in the eZ kernel - or in the ZetaC library (you can see some of my mails on the ZC mailing list).

Why not join forces and try to push a combined solution?

Wednesday 25 January 2012 2:52:11 pm

Hi Gaetano,

That sounds like a good plan. Do you have some time this week to discuss?

Kind regards,
Sebastiaan

Wednesday 25 January 2012 2:58:34 pm

@Sebastiaan why not? You can reach me on skype / google chat
