ApacheCon NA 2011

Olivier Grisel

R&D Software Engineer at Nuxeo with a background in Machine Learning and Natural Language Processing. Works on text analytics and semantic knowledge extraction using OpenNLP, Mahout and Pig, and contributes to the Stanbol project.

Bridging traditional Open Source Content Management and the Web of Data with the Apache Stanbol Semantic Engine
November 11 10:00AM
This talk will introduce the Stanbol project and showcase how it can be integrated in traditional Enterprise Content Management solutions.

Stanbol is an Open Source project under incubation at the Apache Software Foundation. Its goal is to provide Web and CMS developers with a set of HTTP / RESTful services to help them integrate semantic technologies into their products and web sites.

The following Stanbol services are currently under active development:

- Enhancement engines: use Natural Language Processing tools such as Apache OpenNLP to extract knowledge (topics, named entities, facts) from unstructured content and link it to unambiguous URIs from reference knowledge bases;

- Entity Hub: a Linked Data indexing cache built on top of Apache Solr, Clerezza and Jena that comes with precomputed indexes and live connectors to popular knowledge bases such as DBpedia, Geonames, YAGO...

- Content Hub: a faceted search engine based on Solr to search for content using the knowledge automatically extracted by the enhancement engines;

- CMS bridges: lift the structured content of document repositories using the JCR and CMIS access protocols (using Apache Chemistry) and store the result in a triple store suitable for SPARQL access;

- Rules engine based on Apache Jena for knowledge refactoring (e.g. convert extracted knowledge into the rich snippet vocabulary for SEO), integrity checks, merging rules, deductive inference...
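
As a rough illustration of what calling such an HTTP service could look like from client code, here is a minimal sketch in Python. It assumes a Stanbol instance running locally and an enhancer endpoint at `http://localhost:8080/enhancer` that accepts plain text and returns RDF. The URL, the media types and the helper names `build_enhancement_request` and `enhance` are assumptions made for this example, not details taken from the talk.

```python
import urllib.request

# Assumed endpoint of a locally running Stanbol instance; adjust to your deployment.
STANBOL_ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancement_request(text, url=STANBOL_ENHANCER_URL):
    """Build (but do not send) an HTTP POST carrying plain text to enhance."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            # The content is sent as unstructured plain text.
            "Content-Type": "text/plain; charset=utf-8",
            # Assumed response format: the extracted knowledge as RDF/XML.
            "Accept": "application/rdf+xml",
        },
        method="POST",
    )


def enhance(text, url=STANBOL_ENHANCER_URL):
    """Send the text to the (assumed) enhancer endpoint and return the raw response body."""
    with urllib.request.urlopen(build_enhancement_request(text, url)) as response:
        return response.read().decode("utf-8")
```

With a server running, `enhance("Paris is the capital of France.")` would return the response body as a string; parsing the returned RDF is left to a library of your choice.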

The Semantic Web has made significant progress over the last few years. It has long held a lot of promise, and it can now be put to concrete use in enterprise solutions.

If you are curious about the Web of Data and want to see how it can be used and integrated in enterprise solutions today with software such as the Stanbol project, this session is for you.

You should also attend if you are interested in emerging technologies but have no prior knowledge of semantic technologies: this talk will give you good insight into how they can disrupt the usual way of developing applications.
