Apache Tika: 1 point Oh!
Apache Tika, an ASF top-level project since April 2010 and a thriving Apache community, has made tremendous strides over the past 4 years to grow and mature into a leading text extraction library and content detection framework. Tika is used in a number of search projects and data management systems, across a number of domains.
Those domains range from the technology industry to the sciences and the federal government.
Tika has been used as a teaching platform for computer science graduate students, to unlock information from NASA images and from the National Cancer Institute, and to provide rich meaning and information representation for content captured in pervasive document repositories and warehouses. These are only some of Tika's broad applications.
In November, we hope to have released Tika 1.0. The release will coincide with a number of other signs that Tika has become a mature community, including:
1. Concrete, stable features and core interfaces.
2. Tika's use in multiple programming languages and environments.
3. Our growth within Apache, and the election of new committers and PMC members (and ASF members).
4. Developer articles on Tika appearing quite frequently.
5. The culmination of a wealth of knowledge in the form of a book on Tika, to be published at the time of the ApacheCon meeting.
This talk will focus on how we got here and what's next for this thriving Apache community.