Skip to content

Technologies

Edward Eldridge edited this page Apr 24, 2018 · 2 revisions

JSoup

jsoup

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods.

jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.

  • scrape and parse HTML from a URL, file, or string
  • find and extract data, using DOM traversal or CSS selectors
  • manipulate the HTML elements, attributes, and text
  • clean user-submitted content against a safe white-list, to prevent XSS attacks
  • output tidy HTML

jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree.

Example

Fetch the Wikipedia homepage, parse it to a DOM, and select the headlines from the In the news section into a list of Elements.

Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
log(doc.title());
Elements newsHeadlines = doc.select("#mp-itn b a");
for (Element headline : newsHeadlines) {
  log("%s\n\t%s", 
    headline.attr("title"), headline.absUrl("href"));
}

MongoDB

download

MongoDB is a free and open-source cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with schemas. MongoDB is developed by MongoDB Inc., and is published under a combination of the GNU Affero General Public License and the Apache License.

Features

  • Schema less
  • Structure of a single object is clear
  • No complex joins
  • Deep query-ability. MongoDB supports dynamic queries on documents using a document-based query language that's nearly as * Powerful as SQL
  • Easy tuning
  • Ease of scale-out
  • Conversion / mapping of application objects to database objects not needed
  • Uses internal memory for storing the (windowed) working set, enabling faster access of dat

Swing

javaswing

Swing is a GUI widget toolkit for Java. It is part of Oracle's Java Foundation Classes (JFC) – an API for providing a graphical user interface (GUI) for Java programs.

Swing was developed to provide a more sophisticated set of GUI components than the earlier Abstract Window Toolkit (AWT). Swing provides a look and feel that emulates the look and feel of several platforms, and also supports a pluggable look and feel that allows applications to have a look and feel unrelated to the underlying platform. It has more powerful and flexible components than AWT. In addition to familiar components such as buttons, check boxes and labels, Swing provides several advanced components such as tabbed panel, scroll panes, trees, tables, and lists.

Unlike AWT components, Swing components are not implemented by platform-specific code. Instead, they are written entirely in Java and therefore are platform-independent. The term "lightweight" is used to describe such an element.[1

Clone this wiki locally