Java :: HTML parsing

I used TagSoup some years ago, but last week I came across jsoup. It also parses ‘real world HTML’, and comes with a really neat API to fetch a page and select subsets of the document using CSS-style selectors.

See for yourself:

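import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

String url = "https://javaspecialists.teachable.com/p/refactoring2j8";

Document doc = Jsoup.connect(url).get();   // fetch and parse; throws IOException
Elements items = doc.select("a.item");     // all <a> elements with class "item"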
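To give an idea of what you might do with the selected elements, here is a minimal sketch that lists the text and absolute URL of every matching link. It parses an in-memory string with Jsoup.parse so it runs without network access. The class name ItemLinks, the helper itemLinks, the sample markup and the example.com base URI are all made up for illustration; I have not checked which elements the real page contains.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of jsoup.
public class ItemLinks {

    // Returns one "text -> absolute URL" entry per <a class="item"> in the HTML.
    // baseUri is used to resolve relative href values.
    static List<String> itemLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        Elements items = doc.select("a.item");
        List<String> result = new ArrayList<>();
        for (Element item : items) {
            result.add(item.text() + " -> " + item.absUrl("href"));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample markup; the second <li> is deliberately left unclosed.
        String html = "<ul>"
                + "<li><a class=\"item\" href=\"/p/one\">First</a></li>"
                + "<li><a class=\"item\" href=\"/p/two\">Second</a>"
                + "<li><a href=\"/p/skip\">No class</a></li>"
                + "</ul>";
        itemLinks(html, "https://example.com").forEach(System.out::println);
    }
}
```

The same loop works on a Document obtained from Jsoup.connect(url).get(); in that case the base URI is taken from the fetched URL, so absUrl("href") resolves relative links against it.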