
I’m pleased to announce the first stable release of Jericho Selector.

Jericho Selector is an extension to the well-known Jericho HTML Parser library that lets you select elements from an HTML document with CSS selectors, just as you would with jQuery.

Why Jericho? Unlike jsoup, it lets you modify a document while keeping its original formatting. Other libraries rewrite the entire document. Anyway, I like to have a choice! 😉

Jericho Selector is completely free. It is released under the MIT license.

How to use

Jericho Selector is available from the Maven Central Repository, so you just need to add the following dependency to your project:

<dependency>
    <groupId>br.com.starcode.jerichoselector</groupId>
    <artifactId>jericho-selector</artifactId>
    <version>1.0.1-RELEASE</version>
</dependency>

Import the static method $, which is the entry point for Jericho Selector:

import static br.com.starcode.jerichoselector.jerQuery.$;

Then you can query HTML elements just as you would with jQuery:

$(html, "p.my-text")

What has been done

Before implementing Jericho Selector, I had to implement a full CSS parser. To do that, I created another library called parCCser. It is based on the official W3C CSS3 specification and covers almost all of it, except for some details that only make sense in the context of a browser and some UTF-8 related support that I did not consider strictly necessary. It is also released under the MIT license.

Jericho Selector then uses the object tree generated by parCCser, together with the Jericho HTML Parser API, to find the elements of an HTML document that match a given CSS selector.

Both libraries have unit test coverage above 90%, not counting exceptional cases that the coverage plugin is unable to analyze.

What is coming

In the coming weeks I aim to add some fluent API features to Jericho Selector, similar to jQuery, so that you can perform operations using lambdas, for example. Methods like closest, parentsUntil, find and each are my priorities.
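To give a rough idea of what such a fluent, lambda-friendly API could look like, here is a minimal sketch. It is purely illustrative: Node and Selection are hypothetical stand-ins written for this example, not classes from Jericho HTML Parser or Jericho Selector, and the real API may well end up different. The semantics of closest and parentsUntil follow jQuery's.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical stand-in for an HTML element: just a name and a parent link.
class Node {
    final String name;
    final Node parent;

    Node(String name, Node parent) {
        this.name = name;
        this.parent = parent;
    }
}

// Hypothetical fluent wrapper around a list of nodes (not the library's API).
class Selection {
    private final List<Node> nodes;

    Selection(List<Node> nodes) {
        this.nodes = nodes;
    }

    // Runs the action on every selected node and returns this for chaining.
    Selection each(Consumer<Node> action) {
        nodes.forEach(action);
        return this;
    }

    // For each node, picks the nearest match among the node itself and its ancestors.
    Selection closest(Predicate<Node> test) {
        List<Node> out = new ArrayList<>();
        for (Node n : nodes) {
            for (Node c = n; c != null; c = c.parent) {
                if (test.test(c)) {
                    if (!out.contains(c)) out.add(c);
                    break;
                }
            }
        }
        return new Selection(out);
    }

    // Collects the ancestors of each node, stopping before the first one that matches.
    Selection parentsUntil(Predicate<Node> stop) {
        List<Node> out = new ArrayList<>();
        for (Node n : nodes) {
            for (Node p = n.parent; p != null && !stop.test(p); p = p.parent) {
                if (!out.contains(p)) out.add(p);
            }
        }
        return new Selection(out);
    }

    // Returns the names of the selected nodes, in order.
    List<String> names() {
        List<String> out = new ArrayList<>();
        for (Node n : nodes) out.add(n.name);
        return out;
    }

    public static void main(String[] args) {
        // Tiny tree: html > body > div > p
        Node html = new Node("html", null);
        Node body = new Node("body", html);
        Node div = new Node("div", body);
        Node p = new Node("p", div);

        Selection sel = new Selection(List.of(p));
        sel.each(n -> System.out.println("selected: " + n.name));
        System.out.println(sel.closest(n -> n.name.equals("div")).names());
        System.out.println(sel.parentsUntil(n -> n.name.equals("html")).names());
    }
}
```

Returning a Selection from every method is what makes the calls chainable, which is the part of jQuery's style this sketch tries to capture.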

Another area to improve is performance. Some selectors can be optimized with caching or by delegating to specific Jericho HTML Parser methods such as getAllElementsByClass.
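As a rough illustration of both ideas, the sketch below caches the result of parsing each selector string and detects the simple case of a lone class selector, which could then be handed to a faster lookup. This is an assumption about how such an optimization might look, not the library's actual code: SelectorCache, singleClassName and the toy parser are hypothetical names, and the pattern accepts only a simplified subset of CSS identifiers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names come from Jericho Selector or parCCser.
class SelectorCache<T> {
    // Matches a lone class selector such as ".my-text" (simplified identifier grammar).
    private static final Pattern SINGLE_CLASS =
            Pattern.compile("\\.[A-Za-z_][A-Za-z0-9_-]*");

    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> parser;

    SelectorCache(Function<String, T> parser) {
        this.parser = parser;
    }

    // Returns the cached parse result, invoking the parser only on a cache miss.
    T get(String selector) {
        return cache.computeIfAbsent(selector, parser);
    }

    // Returns the class name if the selector is a lone class selector, else null.
    static String singleClassName(String selector) {
        String s = selector.trim();
        return SINGLE_CLASS.matcher(s).matches() ? s.substring(1) : null;
    }

    public static void main(String[] args) {
        AtomicInteger parses = new AtomicInteger();
        // Toy "parser" that just splits on whitespace and counts its invocations.
        SelectorCache<String[]> cache = new SelectorCache<>(s -> {
            parses.incrementAndGet();
            return s.trim().split("\\s+");
        });

        cache.get("div p.my-text");
        cache.get("div p.my-text"); // served from the cache
        System.out.println("parse calls: " + parses.get()); // prints "parse calls: 1"

        System.out.println(singleClassName(".my-text"));  // prints "my-text"
        System.out.println(singleClassName("p.my-text")); // prints "null"
    }
}
```

When singleClassName returns a non-null name, the query could skip the generic matching and call getAllElementsByClass with that name directly.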

What you can do

Report any problem and suggest new features!

Source code

You can check out the source code from my GitHub account:

https://github.com/utluiz/jericho-selector/