Daniel Thomas' Blog

Posts Tagged ‘code’

Communicating with a Firefox extension from Selenium

Monday, May 20th, 2013

Edit: I think this now longer works with more recent versions of Firefox, or at least I have given up on this strategy and gone for extending Webdriver to do what I want instead.

For something I am currently working on I wanted to use Selenium to automatically access some parts of Firefox which are not accessible from a page. The chosen method was to use a Firefox extension and send events between the page and the extension to carry data. Getting this working was more tedious than I was expecting, perhaps mainly because I have tried to avoid javascript whenever possible in the past.

The following code extracts set up listeners with Selenium and the Firefox extension and send one event in each direction. Using this to do proper communication and to run automated tests is left as an exercise for the author but hopefully someone else will find this useful as a starting point. The full code base this forms part of will be open sourced and made public at some future point when it does something more useful.

App.java

package uk.ac.cam.cl.dtg.sync;
import java.io.File; import java.io.IOException; import org.openqa.selenium.JavascriptExecutor; import org.openqa.selenium.WebDriver; import org.openqa.selenium.firefox.FirefoxDriver; import org.openqa.selenium.firefox.FirefoxProfile; public class App { private static final String SEND = "\"syncCommandToExtension\""; private static final String RECV = "\"syncCommandToPage\""; public static void main(String[] args) throws IOException { // This is where maven is configured to put the compiled .xpi File extensionFile = new File("target/extension.xpi"); // So that the relevant Firefox extension developer settings get turned on. File developerFile = new File("developer_profile-0.1-fn+fx.xpi"); FirefoxProfile firefoxProfile = new FirefoxProfile(); firefoxProfile.addExtension(extensionFile); firefoxProfile.addExtension(developerFile); WebDriver driver = new FirefoxDriver(firefoxProfile); driver.get("about:blank"); if (driver instanceof JavascriptExecutor) { AsyncExecute executor = new AsyncExecute(((JavascriptExecutor) driver)); executor.execute("document.addEventListener( " + RECV + ", function(aEvent) { document.title = (" + RECV + " + aEvent) }, true);"); executor.execute( "document.dispatchEvent(new CustomEvent(" + SEND + "));"); } else { System.err.println("Driver does not support javascript execution"); } } /** * Encapsulate the boilerplate code required to execute javascript with Selenium */ private static class AsyncExecute { private final JavascriptExecutor executor; public AsyncExecute(JavascriptExecutor executor) { this.executor = executor; }
public void execute(String javascript) { executor.executeAsyncScript("var callback = arguments[arguments.length - 1];"+ javascript + "callback(null);", new Object[0]); } } }

browserOverlay.js Originally cribbed from the XUL School hello world tutorial.

document.addEventListener( "syncCommandToExtension", function(aEvent) { window.alert("document syncCommandToExtension" + aEvent);/* do stuff*/ }, true, true);
// do not try to add a callback until the browser window has // been initialised. We add a callback to the tabbed browser // when the browser's window gets loaded. window.addEventListener("load", function () { // Add a callback to be run every time a document loads. // note that this includes frames/iframes within the document gBrowser.addEventListener("load", pageLoadSetup, true); }, false); function syncLog(message){ Application.console.log("SYNC-TEST: " + message); } function sendToPage(doc) { doc.dispatchEvent(new CustomEvent("syncCommandToPage")); } function pageLoadSetup(event) { // this is the content document of the loaded page. let doc = event.originalTarget;
if (doc instanceof HTMLDocument) { // is this an inner frame? if (doc.defaultView.frameElement) { // Frame within a tab was loaded. // Find the root document: while (doc.defaultView.frameElement) { doc = doc.defaultView.frameElement.ownerDocument; } } // The event listener is added after the page has loaded and we don't want to trigger // the event until the listener is registered. setTimeout(function () {sendToPage(doc);},1000); }; };

Tags: automation, code, CompSci, Computer Laboratory, events, firefox extension, selenium, testing, work
Posted in CompSci | No Comments »

Raspberry Pi Entropy server

Thursday, August 23rd, 2012

The Raspberry Pi project is one of the more popular projects the Computer Lab is involved with at the moment and all the incoming freshers are getting one.

One of the things I have been working on as a Research Assistant in the Digital Technology Group is on improving the infrastructure we use for research and my current efforts include using puppet to automate the configuration of our servers.

We have a number of servers which are VMs and hence can be a little short of entropy. One solution to having a shortage of entropy is an ‘entropy key‘ which is a little USB device which uses reverse biased diodes to generate randomness and has a little ARM chip (ARM is something the CL is rather proud of) which does a pile of crypto and analysis to ensure that it is good randomness. As has been done before (with pretty graphs) this can then be fed to VMs providing them with the randomness they want.

My solution to the need for some physical hardware to host the entropy key was a Raspberry Pi because I don’t need very much compute power and dedicated hardware means that it is less likely to get randomly reinstalled. A rPi can be thought of as the hardware equivalent of a small VM.

I got the rPi from Rob Mullins by taking a short walk down the corridor on the condition that there be photos. One of the interesting things about using rPis for servers is that the cost of the hardware is negligible in comparison with the cost of connecting that hardware to the network and configuring it.

The rPi is now happily serving entropy to various VMs from the back of a shelf in one of the racks in a server room (not the one shown, we had to move it elsewhere).

Initially it was serving entropy in the clear via the EGD protocol over TCP. Clearly this is rather bad as observable entropy doesn’t really gain you anything (and might lose you everything). Hence it was necessary to use crypto to protect the transport from the rPi to the VMs.
This is managed by the dtg::entropy, dtg::entropy::host and dtg::entropy::client classes which generate the relevant config for egd-linux and stunnel.

This generates an egd-client.conf which looks like this:
; This stunnel config is managed by Puppet.


sslVersion = TLSv1

client = yes
setuid = egd-client

setgid = egd-client

pid = /egd-client.pid

chroot = /var/lib/stunnel4/egd-client
socket = l:TCP_NODELAY=1

socket = r:TCP_NODELAY=1

TIMEOUTclose = 0
debug = 0

output = /egd-client.log
verify = 3
CAfile = /usr/local/share/ssl/cafile

[egd-client] accept = 7777 connect = entropy.dtg.cl.cam.ac.uk:7776
And a host config like:
; This stunnel config is managed by Puppet.


sslVersion = TLSv1
setuid = egd-host

setgid = egd-host

pid = /egd-host.pid

chroot = /var/lib/stunnel4/egd-host
socket = l:TCP_NODELAY=1

socket = r:TCP_NODELAY=1

TIMEOUTclose = 0
debug = 0

output = /egd-host.log
cert = /root/puppet/ssl/stunnel.pem

key = /root/puppet/ssl/stunnel.pem

CAfile = /usr/local/share/ssl/cafile

[egd-host] accept = 7776 connect = 777 cert = /root/puppet/ssl/stunnel.pem key = /root/puppet/ssl/stunnel.pem CAfile = /usr/local/share/ssl/cafile

Getting that right was somewhat tedious due to defaults not working well together.
openssl s_client -connect entropy.dtg.cl.cam.ac.uk:7776
and a python egd client were useful for debugging. In the version of debian in rasperian the stunnel binary points to an stunnel3 compatibility script around the actual stunnel4 binary which resulted in much confusion when trying to run stunnel manually.

Tags: #cl, automation, code, CompSci, Computer Laboratory, debian, DTG, entropy, hardware, linux, puppet, randomness, Raspberry Pi, rPi, servers, sysadmin, work
Posted in CompSci, Security | 6 Comments »

Vertical labels on gnuplot LaTeX graphs

Thursday, April 28th, 2011

This post exists because doing this was far harder than it should have been and hopefully this will help someone else in the future.

When creating a bar chart/histogram in gnuplot using the latex driver if there are a lot of bars and the labels for the bars are over a certain length then the labels overlap horribly. The solution to this would be to rotate them and the following LaTeX and gnuplot code allows that to happen and deals with various fallout that results.

The following defines a length for storing offsets of the labels in \verticallabeloffset as this offset will depend on the length of the text. It also stores a length which holds the maximum of those values \maxverticallabeloffset. It provides a command \verticallabel which does hte following: calculates the length of the label, moves the start position across by 1.4em, moves the label down by 2em (to provide space for the x axis label) and then down by the length of the text. It then uses a sideways environment to make the text vertical. Then it uses the ifthen package to work out if the \verticallabeloffset is bigger than any previously seen and if so it sets \globaldefs=1 so that the effect of the \setlength command will be global rather than being restricted to the local scope and then sets it back to 0 (this is a nasty hack).
It also provides the \xaxislabel command which shifts the x axis title up into the space between the x axis and the labels.
\newlength{\verticallabeloffset} \newlength{\maxverticallabeloffset} \setlength{\maxverticallabeloffset}{0pt} \providecommand{\verticallabel}[1]{\settowidth{\verticallabeloffset}{#1}\hspace{1.4em}\vspace{-2em}\vspace{-\verticallabeloffset}\begin{sideways} #1 \end{sideways}\ifthenelse{\lengthtest{\verticallabeloffset>\maxverticallabeloffset}}{\globaldefs=1\setlength\maxverticallabeloffset{\verticallabeloffset}\globaldefs=0}{}} \providecommand{\xaxislabel}[1]{\vspace{2em}#1}
Having defined that the following allows gnuplot histograms generated with the LaTeX driver to be included in a LaTeX document. It first resets the maximum offset and then ensures there is sufficient space for the labels
\begin{figure} \centering \setlength{\maxverticallabeloffset}{0pt} \include{figs/graphs/comparisonSummary_data} \vspace{-3em}\vspace{\maxverticallabeloffset} \caption{foo bar chart} \label{figs:graphs:comparisonSummary_data} \end{figure}
The following gnuplot code generates a histogram which uses this to get the labels to display correctly.
set terminal latex size 16cm, 7.5cm set style histogram errorbars


set ylabel "\\begin\{sideways\}Mean absolute error\\end\{sideways\}"

set xlabel "\\xaxislabel\{Data set\}"

set output "comparisonSummary_data.tex" plot "comparisonSummary.dat" index 0 using 2:($3*1.96):xtic("\\verticallabel\{" . stringcolumn(1) . "\}") with histogram title "Mean absolute error for different data sets"

I hope this helps someone. I should get around to actually patching the gnuplot latex driver so that it works properly – but that will have to wait until post exams.

Tags: bar chart, chart, code, global scope, gnuplot, graph, histogram, LaTeX, setlength
Posted in CompSci, Part II Project | No Comments »

IB Group Projects

Thursday, March 10th, 2011

On Wednesday the Computer Science IB students demonstrated the projects that they have been working on for the last term. This is my thoughts on them.

Some of the projects were really quite interesting, some of them even actually useful in real life, some of them didn’t work, were boring and simply gimmicks.

Alpha: “African SMS Radio” was a project to create a pretty GUI to a “byzantine and buggy” backend. It could allow a radio operator to run polls and examine stats of texts sent to a particular number. However it didn’t look particularly interesting and though there might be use cases for such a system I think only as a component of a larger more enterprise system and only after the “buggy” backend they had to use had been fixed up/rewritten.

Bravo: “Crowd control” was a project to simulate evacuations of buildings. It is a nice use of the Open Room Map project to provide the building data. It looked like it was still a little buggy – in particular it was allowing really quite nasty crushes to occur and the resulting edge effects as people were thrown violently across the room as the system tried to deal with multiple people being in the same place at the same time was a little amusing. With a little more work it could become quite useful as an extension in the Open Room Map ecosystem which could help it gain momentum and take off. I think that the Open Room Map project is really quite cool and useful – it is the way that data on the current structure and contents of buildings can be crowd sourced and kept up to date but then it is a project of my supervisor. ;-)

Charlie: “Digit[Ov]al automated cricket commentary” this was a project to use little location transmitters on necklaces and usb receivers plugged into laptops to determine the location of cricketers while they were playing and then automatically construct commentary on that. It won the prize for best technical project but it didn’t actually work. They hadn’t solved the problem of people being between the transmitter and the receiver reducing transmission strength by 1/3 or the fact that placing a hand over it reduced it by 1/3 or the fact that the transmitters were not omnidirectional and so orientation was a major issue. They were also limited to only four receivers due to only having four suitable laptops. They used a square arrangement to try and detect location. It is possible that a double triangle arrangement with three corners at ground level and then the other triangle higher up (using the ‘stadium’ to gain height) and offset so that the upper vertices lined up with the mid point of the lower edges would have given them a better signal. Calibrating and constructing algorithms to deal with the noise and poor data would probably have been quite difficult and required some significant work – which IB students haven’t really been taught enough for yet.

Delta: “Hand Wave, Hand Wave” was a project to use two sensors with gyroscopes and accelerometers to do gesture recognition and control. It didn’t really work in the demo and since it had reimplemented everything it didn’t manage to do anything particularly interesting. I think using such sensors for gesture control is probably a dead end as kinect and the like makes just using a camera so much easier and more interesting.

Echo: “iZoopraxiscope – Interactive Handheld Projector” this project was about using a phone with a build in pico projector as an interface. This was obviously using very prototype technology – using the projector would drain the phones battery very quickly, in some cases even when the phone was plugged in and fitting it in the (slightly clunky) phone clearly was at the expense of providing the normal processing power that is expected in an Android phone resulting in it being somewhat sluggish. Since the sensors were rather noisy and techniques for coping with that were not as advanced as they might have been (they just used an exponential moving average and manually tweaked the parameter) they had some difficulties with sluggishness in the controls of some of the games. However I think they produced several nice arcade style games (I didn’t play any of them) and so did demonstrate a wide range of uses. With better knowledge of how to deal with sensors (not really covered in any of the courses offered at the CL) and better technology this could be really neat. However getting a battery powered projector to compete with normal lighting is going to be quite a challenge.
The thing I really like about small projectors is that it could help make it easier to interact in lectures. Sometimes when asking a question or making a comment in lectures it might be useful to draw a diagram which the lecturer (and the rest of the audience) can see and currently doing so is really quite hard. (I should take to carrying around a laser pointer for use in these circumstances).

Foxtrot: “Lounge Star” this was a android app for making air passenger’s lives a little easier by telling them information such as which gate to use etc. without them having to go anywhere and integrating with various airlines systems. As someone who has ‘given up flying’ (not in an absolute sense but in a ‘while any other option (including not going) still remains’ sense) this was not vastly interesting but it could really work as a product if the airlines like it. So: “Oh it is another nice little Android app” (but then associated short attention span kicks in and “bored now”).

Golf: The Energy Forecast this was a project I really liked (it pushed the right buttons) it is a project to predict the energy production of all the wind farms in the country based on the predicted wind speed. It integrated various sources of wind speeds, power production profiles for different types of wind farm and the locations and types of many different wind farms (they thought all but I found some they were missing) and they had a very pretty GUI using google maps etc to show things geographically and were using a very pretty graph drawing javascript library. So I did the “oh you should use the SRCF to host that” thing (they were using a public IP on one of their own computers) and I am sort of thinking “I would really like to have your code” (Oh wait I know where that is kept, snarfle, snarfle ;-) It is something I would really like to make into a part of the ReadYourMeter ecosystem (I may try and persuade Andy he wants to get something done with it).
I love wind turbines all my (small) investments are in them, we have one in our back garden etc. this could be really useful. [end fanboyism]

Hotel: “Top Tips” this was a project to see whether the comments traders put on their trading tips actually told you anything about how good the trade was. The answer was no, not really, nothing to see here. Which is a little disappointing and not a particularly interesting project “lets do some data analysis!” etc.

India: “True Mobile Coverage” this was a project to crowd source the collection of real mobile signal strength data. It actually serves a useful purpose and could be really helpful. They needed to work on their display a little as it wasn’t very good at distinguishing between areas they didn’t know much about and areas with weak signal and unfortunately as with all projects it started working in a very last minute manner so they didn’t have that much data to show. Nice crowd sourcing data collection android app of the kind that loads of people in the CL love. Of course there will be large quantities they could do to improve it using the kind of research which has been done in the CL but it is a good start.

Juliet: “Twitter Dashboard” this was so obviously going to win from the beginning – a twitter project (yey bandwagon) which looks pretty. They did do a very good job, it looked pretty, it ate 200% of the SRCF’s CPU continuously during the demo (but was niced to 19 so didn’t affect other services) – there are probably efficiency savings to be made here but that isn’t a priority for a Group Project which is mainly about producing something that looks pretty and as if it works all other considerations are secondary. My thoughts were mainly “Oh another project to make it easier for Redgate to do more of their perpetual advertising. meh.” (they have lovely people working for them but I couldn’t write good enough Java for them)

Kilo: “Walk out of the Underground” this was a project to guide you from the moment you stepped out of the underground to your destination using an arrow on the screen of your phone. It was rather hard to demo inside the Intel Lab where there is both poor signal and insufficient scale to see whether it actually works. It might be useful, it might work, it is yet another app for the app store and could probably drum up a few thousand users as a free app.

Lima: “Who is my Customer?” this was a very enterprise project to do some rather basic Information Retrieval to find the same customer in multiple data sets. The use case being $company has a failsome information system and their data is poor quality and not well linked together. Unfortunately the project gave the impression of being something which one person could hack together in a weekend. I may be being overly harsh but I found it a little boring.

So in summary: I liked “The Energy Forcast” most because it pushed the right buttons, “True mobile coverage” is interesting and useful. Charlie could be interesting if it could be made to work but I think that the ‘cricket’ aspect is a little silly – if you want commentary use a human. iZoopraxiscope (what a silly name) points out some cool tech that will perhaps be useful in the future but really is not ready yet (they might need/be using some of the cool holgrams tech that Tim Wilkinson is working on (he gave a CUCaTS talk “Do We Really Need Pixels?” recently).

Idea for next year: have a competition after the end of the presentations to write up the project in a scientific paper style and then publish the ones that actually reach a sufficiently good standard in a IB Group Project ‘journal’ as this would provide some scientific skills to go with all the Software Engineering skills that the Group project is currently supposed to teach. (No this is so not going to happen in reality)

Tags: code, Competition, Computer Laboratory, Computer Science, education, Energy Forcast, Group Project, IB, science, Software Engineering, teaching
Posted in CompSci, University | No Comments »

dh_installdocs –link-doc when using jh_installjavadoc

Monday, January 3rd, 2011

This post exists to stop people (possibly including myself) spending hours being thoroughly confused as to why
override_dh_installdocs: dh_installdocs --link-doc=packagel
is not working. In fact it probably is working but your package-doc.javadoc probably looks something like:
docs/api
rather than like:
docs/api /usr/share/doc/package/api
and hence jh_installjavadoc is correctly creating the /usr/share/doc/package-doc/ directory when really you didn’t want that to happen and so dh_installdocs –link-doc silently ignores your instruction as it doesn’t make sense in context.

Relatedly I have ‘finished’ packaging JEval for Debian partly using the instructions on packaging java packages using git. It just needs some further testing and to be uploaded and sponsored.
(I am packaging the dependencies of my Part II Project)

Tags: code, compiling, debian, googlefodder, JEval, packaging
Posted in CompSci, Debian | 2 Comments »

Firesheep as applied to Cambridge

Tuesday, October 26th, 2010

Many of you will have already heard about Firesheep which is essentially a Firefox extension which allows you to login to other people’s Facebook, Amazon etc. accounts if they are on the same (unsecured) network to you. This post is on my initial thoughts on what this means to the people on Cambridge University networks.

Essentially this whole thing is nothing new – in one sense people who know anything about security already knew that this was possible and that programs for doing this existed. The only innovation is an easy to use User Interface and because Human Computer Interaction (HCI) is hard, this means that Eric Butler has won.

In Cambridge we have unsecured wireless networks such as Lapwing and the CLs shared key networks and I think that Firesheep should work fine on these and so for example in lectures where lots of students are checking Facebook et al. (especially in the CL) there is great potential for “pwned with Firesheep” becoming the status of many people. However this would be morally wrong and violate the Terms of Service of the CUDN/JANET etc. If that isn’t sufficient – the UCS has magic scripts that watch network traffic, they know where you live and if you do something really bad they can probably stop you graduating. So while amusing I don’t think that a sudden epidemic of breaking into people’s accounts would be sensible.

So what does that mean for the users of Cambridge networks? Use Eduroam. Eduroam is wonderful and actually provides security in this case (at least as long as you trust the UCS, but we have to do that anyway). If you are using Lapwing and you use a site listed on the handlers page for firesheep (though don’t visit that link on an unsecured network as GitHub is on that list) then you have to accept the risk that someone may steal your cookies and pretend to be you.

What does this mean for people running websites for Cambridge people? Use SSL, if you are using the SRCF then you win as we provide free SSL and it is simply a matter of using a .htaccess file to turn it on. It should also be pointed out that if you are using Raven for authentication (which you should be) then you still need to use SSL for all the pages which you are authenticated on or you lose^[0]. If you are not using the SRCF – then why not? The SRCF is wonderful!^[1] . If you are within *.cam.ac.uk and not using the SRCF then you can also obtain free SSL certificates from the UCS (though I doubt anyone likely to read this is).

So do I fail on this count? Yes I think I have multiple websites on the SRCF which don’t use SSL everywhere they should and I don’t think any uses secure cookies. I also feel slightly responsible for another website which both uses poorly designed cookies and no SSL.

Users – know the risks. Developers – someone is telling us to wake up again, and even though I knew I was sleeping.

[0]: Unfortunately I think that until the SRCF rolls out per user and society subdomains which will be happening RSN if you use raven to login to one site on the SRCF and then visit any non-SSL page on the SRCF then your Raven cookie for the SRCF has just leaked to anyone listening. Oops. ~~Using secure cookies would fix this though I haven’t worked out how to do this yet – I will post a HOWTO later~~ Update: if the original authentication is done to an SSL protected site then the Raven cookie will be set to be secure.
[1]: I may be wearing my SRCF Chairman hat while writing that – though that doesn’t mean it isn’t true.

Tags: code, firefox extension, firesheep, unsecured network, wireless
Posted in CompSci, Security, University | 6 Comments »

Proving Java Terminates (or runs forever) – Avoiding deadlock automatically with compile time checking

Tuesday, October 12th, 2010

This is one of the two projects I was considering doing for my Part II project and the one which I decided not to do.
This project would involve using annotations in Java to check that the thread safety properties which the programmer believes that the code has actually hold.
For example @DeadlockFree or @ThreadSafe.
Now you might think that this would be both incredibly useful and far too difficult/impossible however implementations of this exist at least in part: The MIT licensed CheckThread project and another implementation forms part of the much larger JSR-305 project to use annotations in java to detect software defects. JSR-305 was going to be part of Java 7 but development appears to be slow (over a year between the second most recent commit and the most recent commit (4 days ago)).

Essentially if you want to use annotations in Java to find your thread safety/deadlock bugs and prevent them from reoccurring the technology is out there (but needs some polishing). In the end I decided that since I could probably hack something working together and get it packaged in Debian in a couple of days/weeks by fixing problems in the existing implementations then it would not be particularly useful to spend several months reimplementing it as a Part II project.

If a library which allows you to use annotations in Java to prevent thread safety/deadlock problems is not in Debian (including new queue) by the end of next summer throw things at me until it is.

Tags: annotation, code, deadlock, debian, java, library, thread safety
Posted in CompSci, Part II Project | 1 Comment »

Updating file copyright information using git to find the files

Wednesday, September 1st, 2010

Update: I missed off --author="Your Name".
All source files should technically have a copyright section at the top. However updating this every time the file is changed is tiresome and tends to be missed out. So after a long period of development you find yourself asking the question “Which files have I modified sufficiently that I need to add myself to the list of copyright holders on the file?”.
Of course you are using a version control system so the information you need about which files you modified and how much is available it just needs to be extracted. The shell is powerful and a solution is just one (long) line (using long options and splitting over multiple lines for readability).

$ find -type f -exec bash -c 'git log --ignore-all-space --unified=0 --oneline  \
    --author="Your Name" {} | grep "^[+-]" | \
    grep --invert-match "^\(+++\)\|\(---\)" | wc --lines | \
    xargs test 20 -le ' \; -print > update_copyright.txt \
    && wc --lines update_copyright.txt

In words: find all files and run a command on them and if that command returns true (0) then print out that file name to the ‘update_copyright.txt’ file and then count how many lines are in the file. Where the command is: use git log to find all changes which changed things other than space and minimise things other than the changes themselves (--oneline to reduce commit message etc and --unified=0 to remove context) then strip out all lines which don’t start with + or – and then strip out all the lines which start with +++ or — then count how many lines we get from that and test whether this is larger than 20. If so return true (zero) else return false (non zero).

This should result in an output like:

 284 update_copyright.txt

I picked 20 for the ‘number of lines changed’ value because 10 lines of changes is generally the size at which copyright information should be updated (I think GNU states this) and we are including both additions and removals so we want to double that.

Now I could go from there to a script which then automatically updated the copyright rather than going through manually and updating it myself… however the output I have contains lots of files which I should not update. Then there are files in different languages which use different types of comments etc. so such a script would be much more difficult to write.

Apologies for the poor quality of English in this post and to those of you who have no idea what I am on about.

Tags: code, copyright, find, git, GNU, gnuprologjava, GSoC, script, source files, VCS, version control
Posted in CompSci | 3 Comments »

Packaging a java library for Debian as an upstream

Tuesday, August 17th, 2010

I have just finished my GSoC project working on GNU Prolog for Java which is a Java library for doing Prolog. I got as far as the beta release of version 0.2.5. One of the things I want to do for the final release is to integrate the making of .deb files to distribute the binary, source and documentation into the ant build system. This post exists to chronicle how I go about doing that.

First some background reading: There is a page on the debian wiki on Java Packaging. You will want to install javahelper as well as debain-policy and debhelper. You will then have the javahelper documentation in /usr/share/doc/javahelper/. The debian policy on java will probably also be useful.

Now most debian packages will be created by people working for debian using the releases made by upstreams but I want to be a good upstream and do it myself.

So I ran:

jh_makepkg --package=gnuprologjava --maintainer="Daniel Thomas" \
    --email="drt24-debian@srcf.ucam.org" --upstream="0.2.5" --library --ant --default

I then reported a bug in jh_makepkg where it fails to support the --default and --email options properly because I got “Invalid option: default”. Which might be fixed by the time you read this.
So I ran:

jh_makepkg --package=gnuprologjava --maintainer="Daniel Thomas" \
    --email="drt24-debian@srcf.ucam.org" --upstream="0.2.5" --library --ant

And selected ‘default’ when prompted by pressing F.
I got:

dch warning: Recognised distributions are:
{dapper,hardy,intrepid,jaunty,karmic,lucid,maverick}{,-updates,-security,-proposed,-backports} and UNRELEASED.
Using your request anyway.
dch: Did you see that warning?  Press RETURN to continue...

Which I think stems from the fact that javahelper is assuming debian but javahelper uses debhelper or similar which Ubuntu has configured for Ubuntu. However this doesn’t matter to me as I am trying to make a .deb for Debian (as it will then end up in Ubuntu and many other places – push everything as far upstream as possible).
Now because of the previously mentioned bug the --email option is not correctly handled so we need to fix changelog, control and copyright in the created debian folder to use the correct email address rather than user@host in the case of changelog and NAME in the case of control and copyright (two places in copyright).

copyright needs updating to have Upstream Author, Copyright list and licence statement and homepage.
For licence statement I used:

    This library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Library General Public
    License as published by the Free Software Foundation; either
    version 3 of the License, or (at your option) any later version.

    This library is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
    Library General Public License for more details.

    You should have received a copy of the GNU Library General Public
    License along with this library; if not, write to the Free Software
    Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA  02110-1301 USA

On Debian systems the full text of the GNU Library Public License, can be
found in the file /usr/share/common-licenses/LGPL-3.

control needs updating to add homepage, Short Description and Long Description (which needs to be indented with one space). I also added texinfo and libgetopt-java to Build-Depends and Suggests: libgetopt-java.
GNU Prolog for Java has a build dependency on gnu.getopt as gnu.prolog.test.GoalRunner uses it so I added
export CLASSPATH=/usr/share/java/gnu-getopt.jar to debian/rules. libgnuprologjava-java.javadoc needed build/api put in it so that it could find the javadoc.

So at this point I have a debian folder with the right files in it but I don’t have an ant task which will do everything neatly. Since debian package building really wants to happen on an unpacked release tarball of the source I modified my existing dist-src target to copy the debian folder across as well and added the following to my init target:

<!-- Key id to use for signing the .deb files -->
<property name="dist.keyid" value="" />
<property name="dist.name-version" value="${ant.project.name}-${version}"/>
<property name="dist.debdir" value="dist/${dist.name-version}/" />

I could then add the following dist-deb target to do all the work.

<target name="dist-deb" depends="dist-src" description="Produce a .deb to install with">
	<mkdir dir="${dist.debdir}"/>
	<unzip src="dist/${dist.name-version}-src.zip" dest="${dist.debdir}"/>
	<!-- Delete the getopt code as we want to use the libgetopt-java package for that -->
	<delete dir="${dist.debdir}src/gnu/getopt"/>
	<exec executable="dpkg-buildpackage" dir="${dist.debdir}">
		<arg value="-k${dist.keyid}"/>
	</exec>
</target>

So this gives me binary and source files for distribution and includes the javadoc documentation but lacks the manual.
To fix that I added libgnuprologjava-java.docs containing

build/manual/
docs/readme.txt

, libgnuprologjava-java.info containing build/gnuprologjava.info and libgnuprologjava-java.doc-base.manual containing:

Document: libgnuprologjava-java-manual
Title: Manual for libgnuprologjava-java
Author: Daniel Thomas
Abstract: This the Manual for libgnuprologjava-java
Section: Programming

Format: HTML
Index: /usr/share/doc/libgnuprologjava-java/manual
Files: /usr/share/doc/libgnuprologjava-java/manual/*.html

Now all of this might result in the creation of the relevant files needed to get GNU Prolog for Java into Debian but then again it might not :-). The .deb file does install (and uninstall) correctly on my machine but I might have violated some debian-policy.

Tags: .deb, ant, apt, build systems, code, debian, distributions, dpkg, GNU Prolog for Java, gnuprologjava, GSoC, java, packaging, prolog
Posted in CompSci | 2 Comments »

Compiling git from git fails if core.autocrlf is set to true

Monday, August 16th, 2010

If you get the following error when compiling git from a git clone of git:

: command not foundline 2: 
: command not foundline 5: 
: command not foundline 8: 
./GIT-VERSION-GEN: line 14: syntax error near unexpected token `elif'
'/GIT-VERSION-GEN: line 14: `elif test -d .git -o -f .git &&
make: *** No rule to make target `GIT-VERSION-FILE', needed by `git-am'. Stop.
make: *** Waiting for unfinished jobs....

and you have core.autocrlf set to true in your git config then consider the following “it’s probably not a good idea to convert newlines in the git repo” curtsey of wereHamster on #git.

So having core.autocrlf set to true may result in bad things happening in odd ways because line endings are far more complicated than they have any reason to be (thanks to decisions made before I was born). Bugs due to white space errors are irritating and happen to me far too often :-).

Today I used CVS – it was horrible, I tried to use git’s cvsimport and cvsexportcommit to deal with the fact that CVS is horrible unfortunately this did not work ;-(.

This post exists so that the next time someone is silly like me they might get a helpful response from Google.

Tags: code, compiling, error, git, make, white space
Posted in CompSci | No Comments »