Exporting ello.co

ello exports a feed as JSON; the URL has the form http://ello.co/username.json. You need to be logged in to get it, god knows why; if you are not, the URL returns a version of the profile data instead. This post thus deals mainly with how to automate ello’s login; the problem of converting the JSON to XML is dealt with in a post of the same name. Along the way this post morphed into a more generic one about web login via scripts.

Here are some previous attempts to reverse engineer the API:

  1. https://github.com/grant/ello
  2. http://www.programmableweb.com/api/unofficial-ello
  3. https://gist.github.com/conatus/cc665f917d5558c123bc

The third URL discusses using the unofficial API to do this, but documents the fact that the ello team oppose it and have recently introduced an xscripting defence which makes it much harder to reuse a previously issued cookie. The authentication hack is not well documented. At the moment the authentication is session and cookie based, which is why my experiments with wget and curl have failed, i.e. they both bring down the profile, not the feed.
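Because the authentication is session and cookie based, a script has to keep whatever cookies the login sets and send them back with the feed request, which a one-off wget or curl call does not do by default. Here is a minimal sketch of that idea using only the Python standard library. It assumes the feed URL pattern given at the top of this post; the function names are my own, and it does not perform the login itself, so used on its own it would still bring down the profile.

```python
import http.cookiejar
import urllib.parse
import urllib.request

BASE_URL = "http://ello.co"  # taken from the feed URL quoted in this post


def feed_url(username):
    """Build the JSON feed URL for a user, following the pattern in this post."""
    return "{}/{}.json".format(BASE_URL, urllib.parse.quote(username))


def make_session():
    """Return an opener that stores cookies and resends them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_feed(opener, username, timeout=30):
    """Fetch the feed with the given opener (needs network access).

    Unless the opener has already been used to log in, expect the
    profile data rather than the feed.
    """
    with opener.open(feed_url(username), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

The point of the design is that the same opener is used for the login request and for the feed request, so any session cookie set by the first is sent with the second.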

I am considering a scripting solution, starting with Python:

  1. http://stackoverflow.com/questions/3876070/how-to-script-firefox-or-any-mozilla-based-browser
  2. https://docs.python.org/3/howto/urllib2.html

Some detailed examples using Python tools:

  1. http://stackoverflow.com/questions/2910221/how-can-i-login-to-a-website-with-python (this one also mentions the requests package)
  2. http://stackoverflow.com/questions/8316818/login-to-website-using-python
  3. http://stackoverflow.com/questions/13925983/login-to-website-using-urllib2-python-2-7

For more, check out a Google search for “web login using python”.

However, the first of these links contains proposals pointing at alternative scripting tools: Selenium, mechanize and twill.

  1. http://docs.seleniumhq.org/projects/ide/
  2. http://wwwsearch.sourceforge.net/mechanize/
  3. http://twill.idyll.org/

I have tried twill; the login is tricky and not well documented, or I am too stupid. The page hosting the form, http://ello.co/enter, has form field names containing [], and something isn’t working. (It refuses to recognise the form fields.) ello also detects the twill go statement and places a captcha/unblock request screen in the way. Do I continue down this path, or shall I look at the basic Python libraries?
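If the basic libraries are the route taken, field names containing [] should not be a problem in themselves, since they are simply percent-encoded in the POST body. The sketch below shows this with the standard library. The field names user[email] and user[password] are hypothetical placeholders, not taken from the real form, and the real form may well need further hidden fields (a token, for example) that are not shown here.

```python
import urllib.parse
import urllib.request

LOGIN_URL = "http://ello.co/enter"  # the form hosting page mentioned above


def build_login_request(email, password):
    """Build (but do not send) a POST request for the login form.

    The bracketed field names are hypothetical; read the real ones
    from the form's HTML before using this.
    """
    fields = {
        "user[email]": email,        # hypothetical field name
        "user[password]": password,  # hypothetical field name
    }
    # urlencode percent-encodes the brackets, e.g. user%5Bemail%5D=...
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(LOGIN_URL, data=body)
```

The request would then be sent through an opener that keeps cookies, so that the session set by the login carries over to the feed request.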

I was also pointed at MaxQ, an automated web site test harness (for more see http://wiki.davelevy.info/web-site-testing-with-maxq/), but I haven’t got this to work.

I found the reference to Beautiful Soup, a Python-based document parser, helpful, maybe for identifying the document structure and thus the arguments to twill.

  1. https://docs.python.org/2/howto/urllib2.html
  2. http://www.voidspace.org.uk/python/articles/authentication.shtml
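To illustrate what identifying the document structure might look like, here is a small sketch that lists the forms on a page and the names of their fields. It swaps in the standard library’s html.parser in place of Beautiful Soup so that it runs without extra packages; the class and function names are my own, and the HTML in the usage example is made up, not copied from ello.

```python
from html.parser import HTMLParser


class FormFieldLister(HTMLParser):
    """Collect each form's action, method and named fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action"),
                "method": attrs.get("method"),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "textarea", "select") and self._current is not None:
            name = attrs.get("name")
            if name:  # unnamed controls are not submitted, so skip them
                self._current["fields"].append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def list_form_fields(html):
    """Return a list of dicts, one per form found in the HTML text."""
    parser = FormFieldLister()
    parser.feed(html)
    parser.close()
    return parser.forms
```

For example, calling list_form_fields on the text of a saved copy of the login page would give the exact field names, brackets included, to use as arguments in twill or in a hand-built POST.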

In some of my reading I was advised to look at Tamper Data:

  1. https://addons.mozilla.org/En-us/firefox/addon/tamper-data/

I should bear in mind that they are promising RSS so I don’t want to invest too much time. (Perhaps I should invest in the other part of the problem, converting JSON to XML.)

Do I miss second brain?
